As part of a programming project to show the capabilities of MATLAB, I created an algorithm that takes data (in the form of UTF-8 characters) and hides it inside an image in a way that makes the image appear unchanged, leaves the file size the same, and doesn’t need a password to decode (the data and the image are the only inputs to the algorithm).
The algorithm hides data in images by taking each character of the data, finding its UTF-8 byte value, translating that number to base 4, and adjusting the image pixel by pixel so that the new color values encode the data. The number of pixels a character occupies is the number of digits it has in base 4. In each data-carrying pixel, one color channel (red, green, or blue) is changed so that its value modulo 4 equals one base-4 digit of the character. The result is an image where each data-carrying pixel changes by at most about 1% in one of its 3 color channels and by about 0.4% overall, so it looks identical to the original pixel. Because the digits are read back with the modulo operation, decoding does not rely on any password; the image with its data-carrying pixels is all that is needed to extract the data. The process is admittedly a bit difficult to understand from an overview alone; a more detailed examination follows below.
UTF-8 to base 4 number
The algorithm I wrote takes UTF-8 plain text and hides it in a non-compressed image. To understand how it works, it’s first important to understand how UTF-8 formats characters. UTF-8 represents every character using one to four bytes. A byte is a piece of data that can hold any number between 0 and 255 (more technically, it’s 8 bits, where a bit is a 1 or a 0). This means that any character can be represented by 8, 16, 24, or 32 bits. For a more thorough look at UTF-8, you can refer to its Wikipedia page.
My algorithm feeds data into the image one character at a time. Let’s look at the capital letter K. Its code point is U+004B, which UTF-8 encodes as the single byte 0x4B, or 75 in decimal. In base 4, the number 75 is 1023 (1 x 64 + 0 x 16 + 2 x 4 + 3). This number has 4 digits, which means that it will take 4 pixels to represent the letter K in the image.
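To make the conversion concrete, here is a minimal sketch in Python (not the MATLAB code from the project). The function name char_to_base4 is made up for illustration, and it assumes that a character's UTF-8 bytes are read as one big-endian integer before converting to base 4, which is a guess for multi-byte characters.

```python
def char_to_base4(ch):
    """Return the base-4 digits of a character's UTF-8 value, most significant first."""
    # Assumption: the UTF-8 bytes are treated as one big-endian integer.
    n = int.from_bytes(ch.encode("utf-8"), "big")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % 4)  # peel off the lowest base-4 digit
        n //= 4
    return digits[::-1]


print(char_to_base4("K"))  # [1, 0, 2, 3]
```

Each digit in the returned list would occupy one pixel, so K needs four.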
How images work
Almost all image file types can be represented as 3 matrices, each the size of the image’s width x height in pixels, with each matrix value between 0 and 255. The three matrices are the red, green, and blue (RGB) color channels of the image. For example, a 100 x 100 pixel image is represented by 3 matrices that each have 100 rows and 100 columns. If the image is pure red, then every value in the first matrix, the red channel, is 255, and the other two matrices are filled with 0s.
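As a rough picture of that layout, the pure red square can be built from plain nested lists in Python. This is only an illustration of the three-matrix idea, not how the project's MATLAB code stores the image, and the variable names are made up.

```python
width, height = 100, 100

# One matrix per color channel, each with `height` rows and `width` columns.
red = [[255] * width for _ in range(height)]
green = [[0] * width for _ in range(height)]
blue = [[0] * width for _ in range(height)]

# A single pixel is the triple of values at the same row and column.
row, col = 0, 0
first_pixel = (red[row][col], green[row][col], blue[row][col])
print(first_pixel)  # (255, 0, 0)
```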
How do we hide data in the image? Let’s go back to the letter K, which is 1023 in base 4. As I mentioned above, it’ll take 4 pixels, one for every digit, to represent it. Let’s also go back to the 100 x 100 pixel red square. The very first pixel has a value of 255 in its red channel, and 255 modulo 4 is 3. The first digit of the letter K is 1, so we change the red value of the first pixel so that its remainder is 1. We can do this by subtracting 2 from that cell, turning the 255 into 253. The next pixel’s red channel also has a remainder of 3, and since the second digit of the letter K is 0, we subtract 3 so that the second pixel’s red channel is 252. With this process, each pixel’s red channel moves by at most 3 from its original value.
At maximum, this change would be 3/256 ≈ 1% of the red channel’s range, and 3/(256 x 3 channels) ≈ 0.4% of the pixel overall. And that’s just the maximum; most pixels will change by less than that. A change this small is effectively indistinguishable to the eye, so the changes to the image are invisible.
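The per-pixel adjustment can be sketched in a few lines of Python (again, not the original MATLAB). The helper name embed_digit is hypothetical. It follows the subtracting approach from the example, and adds the assumption that a value which would drop below 0 is moved up by 4 instead, a case the walkthrough does not cover.

```python
def embed_digit(value, digit, modulus=4):
    """Return a channel value near `value` whose remainder mod `modulus` is `digit`."""
    delta = (value - digit) % modulus  # how far to step down
    new_value = value - delta
    if new_value < 0:
        # Assumption: near 0 we step up instead so the result stays in 0..255.
        new_value += modulus
    return new_value


red_row = [255, 255, 255, 255]  # first four red values of the pure red square
digits_of_k = [1, 0, 2, 3]      # the letter K in base 4
encoded = [embed_digit(v, d) for v, d in zip(red_row, digits_of_k)]
print(encoded)  # [253, 252, 254, 255]
```

Reading the digits back only needs the remainder of each value, for example [v % 4 for v in encoded], which is why no password or copy of the original image is involved.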
Bookkeeping and decoding
UTF-8 is a variable-width encoding, which means that different characters will have different lengths (i.e., a different number of base-4 digits, and so a different number of pixels). How do we know when one character ends and another begins? There are quite a few ways to mark the boundary between characters. I decided that the most elegant solution would be the one that takes up the fewest pixels possible, which meant using only one pixel to indicate a boundary. I did this by changing the modulus from 4 to 5 and making the remainder 4 a special value. The remainders 0 to 3 still carry the base-4 digits; when a character is complete, the next pixel is set to have a remainder of 4, which signals that one character is ending and another is beginning. We can also use multiple 4s to indicate other special actions. In the version of the program I made for the project, two 4s back to back indicated a newline character, and three 4s back to back indicated the end of the data.
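One way to picture the marker scheme is as a stream of remainders, one per pixel. The Python sketch below (not the project's MATLAB code) uses hypothetical helper names. It assumes the last character's terminating 4 is followed by two more 4s to form the end marker, and it leaves out the two-4s newline rule, since the description does not pin down how that interacts with the single terminator.

```python
def to_base4(n):
    """Base-4 digits of a non-negative integer, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        digits.append(n % 4)
        n //= 4
    return digits[::-1]


def encode_residues(text):
    """Turn text into the remainders (mod 5) that successive pixels should carry."""
    stream = []
    for ch in text:
        # Assumption: a character's UTF-8 bytes are read as one big-endian integer.
        value = int.from_bytes(ch.encode("utf-8"), "big")
        stream.extend(to_base4(value))
        stream.append(4)  # a single 4 closes the character
    # Assumption: pad with 4s so the stream always ends in three 4s.
    stream.extend([4, 4] if text else [4, 4, 4])
    return stream


print(encode_residues("K"))  # [1, 0, 2, 3, 4, 4, 4]
```

Each remainder would then be written into a pixel by nudging a channel value until that value modulo 5 equals it.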
When it comes time to decode the image, the program goes through each pixel, looking at its channel value modulo 5. It concatenates the remainders until it reaches a 4, and then it translates the resulting base-4 number to its UTF-8 character. When it reaches three 4s in a row, it stops looking at pixels. The data cannot be wiped from the image to restore the original, since the original values of the color channels aren’t saved, but since the image looks identical to what it was before, this is not a big issue.
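The decoding loop might look something like the following Python sketch (not the project's MATLAB code; the function names are made up). It assumes the data lives in a flat list of channel values, that a character's base-4 number maps back to UTF-8 bytes as one big-endian integer, and it ignores the two-4s newline rule.

```python
def digits_to_char(digits):
    """Rebuild a character from its base-4 digits (most significant first)."""
    n = 0
    for d in digits:
        n = n * 4 + d
    # Assumption: the integer is the character's UTF-8 bytes, big-endian.
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big").decode("utf-8")


def decode_values(values, modulus=5):
    """Read hidden text from channel values until three 4s in a row appear."""
    text = []
    digits = []
    run_of_fours = 0
    for v in values:
        r = v % modulus
        if r == 4:
            run_of_fours += 1
            if digits:  # a 4 closes the character collected so far
                text.append(digits_to_char(digits))
                digits = []
            if run_of_fours == 3:  # end-of-data marker
                break
        else:
            run_of_fours = 0
            digits.append(r)
    return "".join(text)


# Red values whose remainders mod 5 are 1 0 2 3 4 4 4, then untouched pixels.
print(decode_values([251, 250, 252, 253, 254, 254, 254, 255, 255]))  # K
```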