
CaracalSteg: A Dart JPEG Steganography Library and Companion App

A Technical Report
presented to the faculty of the
School of Engineering and Applied Science
University of Virginia

Dylan Cao

Technical advisor: David Wu, Department of Computer Science

JPEG preserves low-frequency data well, though, and we can exploit this by embedding our data into the low-frequency components. While the DCT used by JPEG is tricky to implement, simpler discrete wavelet transforms are able to decompose images into low- and high-frequency components in a sufficiently similar way. Specifically, we have implemented a slight modification of the Haar discrete wavelet transform described by Xu et al. (2004). Though the algorithm is pre-existing, we could not find an implementation of it suitable for public use.

Other researchers found that some existing steganography programs may produce images suitable for upload to Facebook, which is similar to what our program aims to do (Hiney et al., 2015). However, two of the three programs identified as suitable for this use case are command-line only, and the third, JP Hide & Seek, has a rather unfriendly interface (that does not even render correctly on our Windows 10 laptop; the image below was not cropped, the UI

characters between 32 and 126, inclusive; for simplicity of implementation, arbitrary data is not handled. The message then needs to be encoded into bits that can be inserted into a cover file. The program reads in the cover file, inserts the encoded data, and produces an output JPEG file.
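
The message-to-bits step can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the library's Dart, and it is not the library's actual code: the function names and the choice of 8 bits per character, most-significant bit first, are assumptions made for illustration.

```python
def message_to_bits(message: str) -> list[int]:
    """Convert a printable-ASCII message to a flat list of bits (MSB first)."""
    bits = []
    for ch in message:
        code = ord(ch)
        # Only character codes 32..126 are accepted, as described above.
        if not 32 <= code <= 126:
            raise ValueError(f"unsupported character: {ch!r}")
        # Assumption: each character occupies 8 bits, most significant first.
        bits.extend((code >> shift) & 1 for shift in range(7, -1, -1))
    return bits


def bits_to_message(bits: list[int]) -> str:
    """Inverse of message_to_bits; trailing bits that do not fill a byte are ignored."""
    chars = []
    usable = len(bits) - len(bits) % 8
    for i in range(0, usable, 8):
        code = 0
        for bit in bits[i:i + 8]:
            code = (code << 1) | bit
        chars.append(chr(code))
    return "".join(chars)
```

Under these assumptions, a three-character message such as "BAR" becomes 24 bits, and `bits_to_message(message_to_bits("BAR"))` returns the original string.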

Least-Significant Bit (LSB) Steganography
We started by implementing LSB steganography as a baseline for testing purposes. LSB is not a robust technique on its own, so this is primarily for experimentation. First, the message is encoded into a bitstring. Originally, our implementation just used the bits of the ASCII message; later on, we used a repeated 256-bit Hadamard code for redundancy. After the bitstring to embed is generated, it is inserted into the file using the nth least-significant bit of each RGB value of a pixel. The value of n is configurable, but defaults to the 4th least-significant bit; for JPEGs, less significant bits result in unusable levels of data loss even with little compression applied to the output image. Pixels in the image are used in left-to-right, top-to-bottom order: the first pixel modified is in the upper-left corner of the image, then the pixel to the right of that, etc. The channels are used in RGB order. Given 3 bits of information to embed, the first bit goes in the nth LSB of the R channel, the second in the G channel, and the third in the B channel.
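
The embedding order described above can be sketched as follows. This is an illustrative Python sketch, not the library's Dart code; the image representation (a list of rows of `(r, g, b)` tuples) and the function names `embed_lsb` and `extract_lsb` are assumptions.

```python
def embed_lsb(pixels, bits, n=4):
    """Write bits into the nth least-significant bit of each channel.

    Pixels are visited left to right, top to bottom, and channels in R, G, B
    order. `pixels` is a list of rows of (r, g, b) tuples (an assumed layout).
    """
    height, width = len(pixels), len(pixels[0])
    if len(bits) > height * width * 3:
        raise ValueError("message does not fit in the cover image")
    mask = 1 << (n - 1)
    out = [[list(px) for px in row] for row in pixels]
    for i, bit in enumerate(bits):
        px_index, channel = divmod(i, 3)   # three bits per pixel
        y, x = divmod(px_index, width)     # raster order
        value = out[y][x][channel]
        out[y][x][channel] = (value & ~mask) | (mask if bit else 0)
    return [[tuple(px) for px in row] for row in out]


def extract_lsb(pixels, count, n=4):
    """Read back `count` bits in the same pixel and channel order."""
    width = len(pixels[0])
    bits = []
    for i in range(count):
        px_index, channel = divmod(i, 3)
        y, x = divmod(px_index, width)
        bits.append((pixels[y][x][channel] >> (n - 1)) & 1)
    return bits
```

With the default n = 4, each modified channel value changes by at most 8, which is the trade-off between visibility and survival under compression discussed above.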

for two repetitions of our message “BAR,” it is embedded as “BARBAR.” It is also possible to repeat in a character-wise fashion, such as “BBAARR.” Repeating the message as a whole is theoretically better for error correction in our application. With character-wise repetition, all copies of a character are embedded in a relatively small region of the image; if compression destroys that whole region, the character is lost even with repetition. With message-wise repetition, the copies of each character are spread throughout the image, so local errors still allow us to recover the original character.

In decoding, we chose to assume the most-decoded character in each position was the

1 is the correct value; otherwise, we would assume it is 0. Experiments with this decoding scheme showed that it is generally less accurate, so the first scheme was used.

To use this with LSB steganography, we simply use the repeated message converted to bits as the bitstring passed to the LSB embedding step. We found this did somewhat improve our accuracy, but further improvements can be made by combining error-correcting codes.
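
Message-wise repetition and per-position majority voting can be sketched as below. This is a hypothetical Python illustration that operates on characters rather than bits for readability; the function names are assumptions, and ties are broken in favor of the character seen first, which may differ from the library's behavior.

```python
from collections import Counter


def repeat_message(message: str, repetitions: int) -> str:
    """Message-wise repetition: "BAR" repeated twice becomes "BARBAR"."""
    return message * repetitions


def decode_repeated(received: str, length: int) -> str:
    """Recover a message of known `length` by voting on each position."""
    decoded = []
    for pos in range(length):
        # Every `length`-th character starting at `pos` is a copy of the
        # same original character.
        votes = Counter(received[pos::length])
        decoded.append(votes.most_common(1)[0][0])
    return "".join(decoded)
```

For example, if a burst of errors turns "BARBARBAR" into "BARB??BAR", each position still has two intact copies and `decode_repeated("BARB??BAR", 3)` recovers "BAR". The same burst applied to the character-wise form "BBBAAARRR" would wipe out every copy of "A".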

Repeated Hadamard Error-Correcting Code
We can combine error-correcting codes for even more robustness. Instead of embedding our Hadamard-coded message once in a file, we can repeat it in the file. A message with two characters M1 and M2 is first converted into Hadamard codes C1 and C2. The codes are then embedded in the file as C1C2C1C2C1… in the manner described in the repetition codes section. Each repetition is then decoded into a character, and the most commonly occurring character in each message position is assumed to be the correct one. We have thus combined our Hadamard code with our repetition code.
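
The combined scheme can be sketched as follows in Python. This is not the library's code. In particular, the codeword construction is an assumption: each 8-bit character is mapped to the 256-bit Hadamard codeword whose bit x is the parity of (character AND x), which is one standard way to obtain a 256-bit Hadamard code; the library may order or construct its codewords differently. Decoding picks the nearest codeword by Hamming distance.

```python
from collections import Counter

CODE_LEN = 256

# Assumed construction: bit x of the codeword for message m is the parity
# of (m AND x). Any two distinct codewords differ in exactly 128 positions.
CODEBOOK = [[bin(m & x).count("1") % 2 for x in range(CODE_LEN)]
            for m in range(256)]


def hadamard_decode(bits):
    """Return the character whose codeword is closest in Hamming distance."""
    distances = [sum(a != b for a, b in zip(bits, word)) for word in CODEBOOK]
    return chr(distances.index(min(distances)))


def encode(message, repetitions):
    """Emit C1 C2 ... Cn, repeated `repetitions` times, as one bit list."""
    bits = []
    for _ in range(repetitions):
        for ch in message:
            bits.extend(CODEBOOK[ord(ch)])
    return bits


def decode(bits, length):
    """Decode every repetition, then vote on each message position."""
    repetitions = len(bits) // (CODE_LEN * length)
    votes = [Counter() for _ in range(length)]
    for rep in range(repetitions):
        for pos in range(length):
            start = (rep * length + pos) * CODE_LEN
            votes[pos][hadamard_decode(bits[start:start + CODE_LEN])] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)
```

Under this construction, a single codeword tolerates up to 63 flipped bits, and the vote across repetitions covers the case where a whole codeword is destroyed.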

Haar-Wavelet Transform Method of Embedding
Experiments showed yet more improvement in decoding accuracy, but still not enough

The integer Haar DWT described by Xu et al. is implemented using only basic matrix operations, and produces a cA3 matrix that also consists of 8-bit integers. It is fully and losslessly invertible as well. This simplicity is a major advantage over attempting to implement a DCT, which would output floating-point values and be fairly vulnerable to floating-point rounding errors. The fact that cA3 consists of 8-bit integers is also highly advantageous for our application: we can embed our coded messages into the least-significant bits of cA3, then invert the Haar DWT to produce our output bitmap. Xu et al. suggest using the 3rd significant bit of cA3, which we do by default (though our implementation is configurable).
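
To illustrate how an integer Haar transform can be lossless, the sketch below uses the standard S-transform lifting step (floored average and difference of each pair). This is a swapped-in, well-known variant chosen for illustration; it is not necessarily the exact formulation in Xu et al. or in our library, and it is written in Python rather than Dart. All function names are hypothetical, the image is assumed to be a single channel stored as a list of rows, and its dimensions are assumed to be divisible by 8 so that three levels can be computed.

```python
def _forward_pairs(seq):
    # Floored average (low-pass) and difference (high-pass) of each pair.
    lows = [(seq[i] + seq[i + 1]) >> 1 for i in range(0, len(seq), 2)]
    highs = [seq[i] - seq[i + 1] for i in range(0, len(seq), 2)]
    return lows, highs


def _inverse_pairs(lows, highs):
    out = []
    for s, d in zip(lows, highs):
        a = s + ((d + 1) >> 1)   # exact integer inverse of the step above
        out.extend((a, a - d))
    return out


def _transpose(m):
    return [list(col) for col in zip(*m)]


def _forward_rows(m):
    pairs = [_forward_pairs(row) for row in m]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def _inverse_rows(lows, highs):
    return [_inverse_pairs(l, h) for l, h in zip(lows, highs)]


def haar2d(m):
    """One 2D level: transform rows, then columns. Returns (cA, details)."""
    low, high = _forward_rows(m)
    ll, lh = _forward_rows(_transpose(low))
    hl, hh = _forward_rows(_transpose(high))
    return _transpose(ll), (lh, hl, hh)


def ihaar2d(approx, details):
    lh, hl, hh = details
    low = _transpose(_inverse_rows(_transpose(approx), lh))
    high = _transpose(_inverse_rows(hl, hh))
    return _inverse_rows(low, high)


def decompose(image, levels=3):
    """Return the level-`levels` approximation (cA3 by default) and details."""
    approx, stack = image, []
    for _ in range(levels):
        approx, details = haar2d(approx)
        stack.append(details)
    return approx, stack


def reconstruct(approx, stack):
    for details in reversed(stack):
        approx = ihaar2d(approx, details)
    return approx


def set_bit(value, bit, n=3):
    """Overwrite the nth least-significant bit of `value` (assumed indexing)."""
    mask = 1 << (n - 1)
    return (value & ~mask) | (mask if bit else 0)
```

Because each approximation coefficient is a floored average of 8-bit values, cA3 stays within 0 to 255 in this sketch. Embedding would then amount to calling `decompose`, applying `set_bit` to the entries of cA3, and calling `reconstruct`. The sketch does not clamp the reconstructed pixels, which can fall slightly outside 0 to 255 after cA3 is modified; a real implementation would need to handle that case.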

Otherwise, the major difference between our implementation and theirs is that they use a different error-correcting code prior to embedding.

Decoding the hidden message involves producing cA3 from the input JPEG file, extracting the least-significant bits of cA3, Hadamard decoding them, accounting for the repetition codes, and finally outputting the original message. The program must know the number of characters in the hidden message, or corruption will occur due to misaligned repetitions.

JPEG Quality
Users can provide a quality parameter between 0 and 100 to the program, corresponding

These results specifically pertain to the Haar algorithm; we will not report results on the less interesting LSB algorithm.

First, JPEG quality levels of roughly 30 or higher generally allow for a message to be fully or nearly fully decodable before recompression as long as the image is sufficiently large.

Figure 2 – Left shows a sample image without a message; right shows one with a Haar-embedded message. Artifacts resembling odd squares, as well as color shifting, can be seen.

In-Progress: Flutter Mobile App Front-End
We are developing a front-end for the steganography library using the Flutter mobile application framework. The code, as well as an APK binary release for Android, is available online. At this time, the app supports Android only; it has not been tested on iOS. However, the code base should be able to run on iOS with little or no changes. Flutter also has desktop app support in beta, so in the future the app could also be ported to desktop platforms with little effort.

Figure 3 - Encoding screen

Figure 5 - Decoding screen with a dialog box displaying the text embedded within the image

As previously mentioned, the UI also could use some work. The existing startup hang should be fixed. A progress indicator should be added, Haar DWT vs. LSB should be configurable, resizing functionality should be added, and there should be a way to cancel an

Al-Ani, M., & Awad, F. (2013). The JPEG image compression algorithm. International Journal of Advances in Engineering & Technology, 6, 1055–1062.

Chang, E., Fernando, U., & Hu, J. (n.d.). Lossy data compression: JPEG. Data Compression.

Xu, J., Sung, A. H., Shi, P., & Liu, Q. (2004). JPEG compression immune steganography using wavelet transform. International Conference on Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004, 2, 704–708. https://doi.org/10.1109/ITCC.2004.1286737