Custom Image Compression Formats

This project contains implementations of lossy and lossless image compression techniques that come together to create custom-designed image formats IM3 and IN3. The IM3 format is lossy, meaning that we gain higher compression ratios but at the cost of image quality. This design closely resembles the JPEG pipeline. The IM3 format, however, is lossless, meaning we can construct the original image exactly from the compressed data.

This project was a deeper dive into compression techniques than I had anticipated I would go, but was super fascinating and was a worthwhile exercise. Some of my hardest work went into this project, as implementation details required clever planning, data manipulation, and strong understanding of the techniques, not to mention it was a nightmare to debug.

Source code: https://github.com/CurtisBuckoll/IMCompress

Using This Program

Upon execution, the program will prompt to open a file. There are only three types that the program will accept: .bmp, .in3, or .im3, and the .bmp file type must ONLY be the 24-bit format. If a .bmp file type is selected, a prompt will appear to select either lossless (.in3) or lossy (.im3) encoding. An .in3 or .im3 compressed image of the same name as the source image will then be created in the same directory, depending on the selected encoding. If either an .in3 or .im3 file is initially opened instead, then the file will be decoded and displayed to the screen.

The encoder is designed to work with image heights and widths that are multiples of 8. Strange artifacts of behaviour may occur otherwise.

Project Overview

The lossless IN3 coder uses a fairly straightforward pipeline, involving difference (or delta) coding and Huffman coding. Encoding neighbouring pixel differences typically improves the compression ratio achieved by Huffman coding if we assume we are working with ‘smooth’ images, and for most images, this generally is the case. When the difference between neighbouring pixels is low, we get difference values clustered towards zero reducing overall information ‘entropy’, allowing Huffman coding can achieve better compression ratios. The pipeline can be visualized like this:

The Huffman coding step examines the frequency distribution of bytes of the file, creates a histogram of the count of the 256 possible bytes, and systematically constructs the Huffman tree assigning shorter symbols in terms of bits to more frequent bytes in the file. This gives us 256 symbol table entries, where for each entry we record the old symbol, new symbol, and the new symbol length. This table must be stored in the header as part of the compressed file which increases the size of the file size, however, the header remains relatively small compared to the size of the image data.

The difference/delta coding step can be thought of as approximating 1D derivatives in the image represented as a surface, and in this sense, recovering the image during decoding can kind of be pictured as numerical integration where the value of the first pixel is saved in the header. Typical compression ratios hover at about 1.5, depending on the image size and content. Without difference coding, compression ratios are insubstantial.

The IM3 format follows a longer pipeline than IN3, but can achieve better compression. By design, it follows a very similar pipeline as used by JPEG image compression, involving a Discrete Cosine Transform (DCT) step transforming separate chunks of the image into the frequency domain, where we strip out low energy spatial frequency information within the image. This can give us large savings with little effect to the perceived quality of the image after reconstruction in the spatial domain.

As a first step, we transform the image into YUV colour space and downsample the U, V channels,. Although this operation is lossy, we immediately reduce the data by half. For simplicity, we only do this if both the image height and width are multiples of 16. This allows for all three channels to use 8x8 blocks while performing DCT and IDCT

The highest values in the quantization matrix above correspond with higher frequency information in the block, as this is what we can remove the most of with the least impact the overall appearance. After this quantization step, we end up with many zeroes in the DCT block which is compactly stored by run-length coding. The results applied to all 8x8 blocks are ultimately fed to the Huffman coder, and this becomes the data to our compressed .im3 file.

The values in the quantization matrix can be scaled by an adjustable internal parameter to achieve higher compression ratios. Following are some results at varying compression ratios.

Decompressing an IM3 image simply follows the ‘inverse’ operations of compression: extract the Huffman symbol table and perform Huffman decoding, reconstruct the 8x8 quantization coefficient blocks from the run-length encoded arrays, multiply these coefficients by the the scaled quantization table entries, perform Inverse Discrete Cosine Transform on each block, and finally transform from YUV back to RGB space, upsampling U, V if necessary.

This compression scheme works reasonably well, although the JPEG standard remains superior when targeting roughly the same level of quality:

Curtis Buckoll.

Custom Image Compression Formats

Sample Image

Results

Comparison With JPEG