
COSC 2P95 – Assignment 1 – Section 02 – Steganography

Due: October 31st at 5:00pm
The topic of this assignment is steganography: either the act of livening up statistical charts with doodles of stegosauruses, or the process of transforming information so that it can be hidden within data that still remains readable.

Consider the following three images:

However, the zip isn't held in metadata, or somehow adjacent to the data. Rather, the zip is actually stored visually within the image!

How is this possible? Let's explore.

Can you see the difference? Possibly. You may have special eyes. But, at a glance, would you assume they were two different colours? Or that it was intentional?

Therein lies the core of our approach.

Interjection: Of course, this means we need to store the image either uncompressed, or with lossless compression. That's why the provided image files are .ppm and .png.

Once we know we can arbitrarily change three bits per pixel, that means we can hide other information within the image. For example, if a byte is 8 bits, then we can fully store it within three pixels, with another bit to spare. (In practice, one could argue that original ASCII really only needs 7 bits per character, so we could be even more efficient. However, for our purposes here, we want to use full bytes.)
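To make the "three bits per pixel" idea concrete, here is a minimal sketch. It assumes the hidden bit lives in the least significant bit of each 8-bit channel value; that choice, and the function names, are illustrative assumptions rather than part of the specification.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the hidden bit is the least significant bit of a channel value.
// Replace that bit with `bit` (0 or 1), leaving the other seven bits alone.
inline std::uint8_t setLowBit(std::uint8_t channel, int bit) {
    assert(bit == 0 || bit == 1);
    return static_cast<std::uint8_t>((channel & 0xFE) | bit);
}

// Read the hidden bit back out of a channel value.
inline int getLowBit(std::uint8_t channel) {
    return channel & 1;
}

// Number of whole pixels needed to hold `bytes` bytes at three bits per pixel.
inline long long pixelsNeeded(long long bytes) {
    return (bytes * 8 + 2) / 3;  // integer division, rounded up
}
```

With this scheme a channel value changes by at most 1 (190 either stays 190 or becomes 191), and pixelsNeeded(1) is 3, matching the "three pixels, with another bit to spare" arithmetic.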

[Figure: example channel values, labelled B6 (85), R7 (190), R1 (209), G1 (170)]

There are, of course, still several variations and decisions to consider:
• Should one consider the filename of the original stored file?
◦ Potentially, yes. But for this assignment, you're only storing contents; not the files themselves
• Should each byte be written in the sequence of most significant towards least significant, or vice versa?
◦ The loop might be slightly more natural to write starting with the least, but for this assignment, we'll be starting from the most
◦ Of course, it only matters if you want to be consistent with someone else storing data according to the same specification
• How do you mark the end of the hidden data?
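The bit-ordering decision can be sketched as follows. This is only an illustration: it assumes each hidden bit replaces the lowest bit of a channel value, that the channels are laid out consecutively (R, G, B, R, G, B, ...), and that bytes are packed back to back. The names embedByte and extractByte are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hide one byte in eight consecutive channel values starting at `pos`,
// most significant bit first.
void embedByte(std::vector<std::uint8_t>& channels, std::size_t pos, std::uint8_t value) {
    assert(pos + 8 <= channels.size());
    for (int i = 7; i >= 0; --i) {           // bit 7 (most significant) goes first
        int bit = (value >> i) & 1;
        channels[pos] = static_cast<std::uint8_t>((channels[pos] & 0xFE) | bit);
        ++pos;
    }
}

// Rebuild a byte from the low bits of eight consecutive channel values,
// treating the first one read as the most significant bit.
std::uint8_t extractByte(const std::vector<std::uint8_t>& channels, std::size_t pos) {
    assert(pos + 8 <= channels.size());
    std::uint8_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = static_cast<std::uint8_t>((value << 1) | (channels[pos + i] & 1));
    }
    return value;
}
```

Starting from the most significant bit means the extraction loop can simply shift left and append each bit as it is read, so the two functions mirror each other.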

For this assignment, we'll be using P3. This means you can store an entire 24-bit image in uncompressed ASCII. As you can (hopefully) guess, those files will be bloody huge; in part because there's no compression, and partly because each 8-bit channel uses multiple bytes.

(For comparison, for the unmodified version of the provided image, the .png version is about 595KB, while the .ppm is 8.7MB)

(Note that, for this assignment, you don't need to handle #commented lines, or strictly limit the number of characters per line, but there are a few restrictions listed below)
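As a rough sketch of what reading a P3 file involves, the loader below parses the header and then the channel values as whitespace-separated ASCII numbers. It assumes no comment lines and a maximum value of 255 (8 bits per channel); the Image struct and function names are hypothetical, not something the assignment prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for a loaded image.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> channels;  // R, G, B, R, G, B, ... row by row
};

// Read a plain (P3) PPM. Returns false if the header or pixel data is malformed.
bool loadPPM(std::istream& in, Image& img) {
    std::string magic;
    int maxValue = 0;
    if (!(in >> magic >> img.width >> img.height >> maxValue)) return false;
    if (magic != "P3" || maxValue != 255 || img.width <= 0 || img.height <= 0) return false;

    const std::size_t count = static_cast<std::size_t>(img.width) * img.height * 3;
    img.channels.clear();
    img.channels.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        int value = 0;
        if (!(in >> value) || value < 0 || value > 255) return false;
        img.channels.push_back(static_cast<std::uint8_t>(value));
    }
    assert(img.channels.size() == count);
    return true;
}

// Convenience wrapper for trying the loader on a small in-memory example.
bool loadPPMFromText(const std::string& text, Image& img) {
    std::istringstream in(text);
    return loadPPM(in, img);
}
```

Because operator>> skips any whitespace, the same loop works whether the values are separated by spaces, tabs, or newlines. For a real file, pass a std::ifstream to loadPPM.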

Requirements:
Your assignment will have two modes: an interactive mode, and a non-interactive mode.

$ ./steg original.ppm
Width: 1024 Height: 768
Capacity: 294908
Stored: none

$ ./steg seekrit.ppm
Width: 1024 Height: 768
Capacity: 294908
Stored: 258177
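The capacity figure above is consistent with three hidden bits per pixel: 1024 × 768 pixels give 2359296 bits, or 294912 bytes, which is 4 more than the 294908 reported. One plausible reading, and it is only an assumption here, is that 4 bytes are set aside for bookkeeping such as the stored length. A small sketch of that arithmetic, with a hypothetical function name:

```cpp
#include <cassert>

// Assumption: 3 usable bits per pixel, minus `reservedBytes` kept back
// for bookkeeping (the sample output is consistent with 4).
long long capacityBytes(long long width, long long height, long long reservedBytes) {
    assert(width > 0 && height > 0 && reservedBytes >= 0);
    const long long totalBits = width * height * 3;
    return totalBits / 8 - reservedBytes;
}
```

Under these assumptions, capacityBytes(1024, 768, 4) gives 294908, which comfortably fits the 258177 bytes stored in seekrit.ppm.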

• It presents a menu which repeats, processing each command, until the user chooses to exit

It has four options:
1. Store data
◦ Prompt for the input image filename
◦ Prompt for the input text(/data) filename
◦ Assuming the files load, and the image's capacity is adequate to hold the data, prompt for the output image filename, and create the new PPM
2. Retrieve data
◦ Prompt for the input image filename
◦ Assuming the file loads, and data appears to be present, prompt for the text(/data) output filename, and save the extracted bytes to it
◦ The original (steganographic) image is not modified
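A repeating menu of this kind might be structured like the sketch below. The menu wording is taken from the sample session; everything else (the function name, reading whole lines, the placeholder handlers, the wording of the error message) is an assumption made for illustration.

```cpp
#include <iostream>
#include <sstream>   // only needed if you drive the menu from a string in tests
#include <string>

// Show the menu, read a choice, and repeat until "0" or end of input.
// The real work is left as a placeholder. Returns how many commands
// (other than quit) were recognised.
int runMenu(std::istream& in, std::ostream& out) {
    int handled = 0;
    std::string line;
    while (true) {
        out << "Menu:\n-----\n"
            << "1. Store data into image.\n"
            << "2. Retrieve data from image.\n"
            << "3. Inspect image.\n"
            << "4. Scrub image.\n"
            << "0. Quit.\n";
        if (!std::getline(in, line)) break;  // end of input: stop quietly
        if (line == "0") break;
        if (line == "1" || line == "2" || line == "3" || line == "4") {
            // Placeholder: call the matching store/retrieve/inspect/scrub routine.
            ++handled;
        } else {
            out << "Unrecognised choice: " << line << "\n";
        }
    }
    return handled;
}
```

In a real program this would be called as runMenu(std::cin, std::cout). Taking the streams as parameters just makes it easy to feed in scripted input when producing sample executions.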

$ ./steg
Menu:
-----
1. Store data into image.
2. Retrieve data from image.
3. Inspect image.
4. Scrub image.
0. Quit.
1
Input image filename: original.ppm
Capacity: 294908
Input text filename: 1and5.zip
258177 bytes to store.
Output image filename: seekrit.ppm

Menu:

[Fragments of further sample sessions: "Output text filename: recovered.zip"; menu choice "4"; "Width: 1024"; "Stored: 258177"; "Capacity: 294908"; "Stored: none"]

For the PPMs:
• When creating the PPM, put the width and height on the same row, separated by either a space or tab

• Use one line per image row
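The two layout rules above could be followed with something like this sketch. It builds the file contents as a string, assuming a maximum value of 255 and channels stored in R, G, B order, row by row. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Produce plain (P3) PPM text: magic number, width and height on the same
// row, the maximum value, then one text line per image row.
std::string toPPMText(int width, int height, const std::vector<std::uint8_t>& channels) {
    assert(width > 0 && height > 0);
    assert(channels.size() == static_cast<std::size_t>(width) * height * 3);

    std::ostringstream out;
    out << "P3\n" << width << ' ' << height << "\n255\n";
    std::size_t i = 0;
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width * 3; ++col) {
            if (col > 0) out << ' ';
            out << static_cast<int>(channels[i++]);  // print as a number, not a char
        }
        out << '\n';
    }
    return out.str();
}
```

The cast to int matters: streaming a std::uint8_t directly would write it as a character rather than as a decimal number. The returned string can then be written to a std::ofstream in one go.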

• Remember to include a header file, with appropriate function prototypes
• Consider how you'd want to represent a single byte. We've discussed a data type with two variations, and either might be useful
• You may choose to use arrays (if so, you'll probably have to dynamically allocate them; if so, remember

been provided, and ask for help if you need additional explanations
• It's highly advisable to first start small

◦ e.g. write your functions for loading and saving PPMs
▪ You'll be able to confirm visually that they look right
▪ If you have a problem with a later function, you won't need to wonder what the cause of missing bytes could be
◦ Then maybe try to extract just the inspection information
▪ Remember that you've been given a sample file with stored data

Submission:
Submission will be done strictly electronically, through sandcastle.

Include:
• All of your source (.cpp and .h) files
• .pdf copies of your source files
• Sample executions (.txt or .pdf, per your preference)
• Any additional instructions or notes, if you wish, in another text file