emCompress – Walkthrough Examples

emCompress, the compression tool for embedded systems, includes a set of sample applications and walkthroughs that demonstrate its use in different scenarios.

Single-File Compression – emCompress

Integrating a compressed file into an application is straightforward:

  • Compress the file with emCompress, specifying any restrictions on decompressor memory and the codecs to use.
  • Write a function that processes the decoded file a chunk at a time.
  • Call the decompressor, providing the file to decode and some working storage.

This example compresses a small text file; the application then decompresses it and prints the output.


Compress the File

First, the file is compressed with the emCompress application.

C:> emCompress.exe Jabberwocky.txt

(c) 2015 SEGGER Microcontroller GmbH & Co. KG    www.segger.com
emCompress V2.10 compiled Jun 23 2015 17:16:26
Input File:    Jabberwocky.txt
Optimization:  Level 5 (Balanced)
Restriction:   None (assume unlimited decompressor RAM)
Codec:         DEFLATE(2k,3,258) chosen from 125 candidates
Decoding:      4472 bytes required for decompression
Compression:   48.0% (52.0% of original removed)
- Sizes        1089 -> 523 bytes
Output File:   Compressed_Jabberwocky.c
Elapsed time:  1.432 s
C:> _

Call the Decompressor

The output file is included in the target application. Decompression requires a static workspace whose size is shown in the compression information (4,472 bytes). The compressed file control structure is passed to the decompressor together with the workspace and a callback function that receives the decompressed data.
The callback function for this example simply prints the decompressed output:

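#include <stdio.h>

static int _PrintData(void *pUserContext, void *pData, unsigned NumBytesData) {
  /* "%.*s" expects an int precision and a char pointer, hence the casts. */
  return printf("%.*s", (int)NumBytesData, (const char *)pData);
}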

Process the Decompressed Output

When the decoder has filled its local buffer in the workspace with decompressed data, it passes the data to the provided callback function. When the callback returns a positive value, the decoder continues until all data is decompressed. In general, the callback function is called multiple times, with decompressed fragments of varying lengths, as the decoder works its way through the compressed bitstream.
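The calling pattern can be sketched without the emCompress library. In the sketch below, DemoDecode is a hypothetical stand-in for the decompressor: it only copies its input through a small work buffer and does not decode anything. OUTPUT_FUNC, COUNTER, and _CountData are illustrative names as well and are not part of the emCompress API. Treating a negative return value from the callback as a request to abort is an assumption.

```c
#include <string.h>

/* Hypothetical callback type, modelled on _PrintData. */
typedef int (OUTPUT_FUNC)(void *pUserContext, void *pData, unsigned NumBytesData);

/* Illustrative user context: counts callback invocations and bytes received. */
typedef struct {
  unsigned NumCalls;
  unsigned NumBytes;
} COUNTER;

static int _CountData(void *pUserContext, void *pData, unsigned NumBytesData) {
  COUNTER *pCounter = (COUNTER *)pUserContext;
  (void)pData;
  pCounter->NumCalls += 1;
  pCounter->NumBytes += NumBytesData;
  return (int)NumBytesData;
}

/*
 * Stand-in for the decompressor (NOT the emCompress API). Copies the input
 * through the work buffer and hands each filled buffer to the callback.
 * Returns the number of bytes delivered, or a negative value on abort.
 */
static int DemoDecode(const unsigned char *pIn, unsigned InLen,
                      unsigned char *pWork, unsigned WorkLen,
                      OUTPUT_FUNC *pfOutput, void *pUserContext) {
  unsigned Done = 0;
  if (WorkLen == 0) {
    return -1;                       /* No room to stage any output. */
  }
  while (Done < InLen) {
    unsigned Chunk = InLen - Done;
    if (Chunk > WorkLen) {
      Chunk = WorkLen;               /* Deliver at most one buffer per call. */
    }
    memcpy(pWork, pIn + Done, Chunk);
    if (pfOutput(pUserContext, pWork, Chunk) < 0) {
      return -1;                     /* Callback asked to stop (assumed convention). */
    }
    Done += Chunk;
  }
  return (int)Done;
}
```

With a 4-byte work buffer and an 11-byte input, the callback is invoked three times, with fragments of 4, 4, and 3 bytes.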


Compile and Test

The example application can be built for Windows or any embedded target. Running it prints the original content of the file:

C:> Ex1.exe
Jabberwocky
  BY LEWIS CARROLL
'Twas brillig, and the slithy toves
      Did gyre and gimble in the wabe:
All mimsy were the borogoves,
      And the mome raths outgrabe.
"Beware the Jabberwock, my son!
      The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
      The frumious Bandersnatch!"
[...]

Decompressed 1089 bytes.
C:> _

Decompression Into Memory

If there is enough RAM available to hold all decompressed content, an all-at-once decompression function can be used. Continuing with the previous example, emCompress reported that the original content is 1,089 bytes, so the output buffer must hold at least 1,089 bytes.


Call the Decompressor

Instead of a callback function, a buffer of at least 1,089 bytes is passed to the decompressor together with the compressed file control structure and the workspace for decompression.
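One way to picture all-at-once decompression is as a callback that appends every fragment to a caller-supplied buffer. The sketch below illustrates that idea only; MEM_SINK and _StoreData are hypothetical names, and this is not the emCompress all-at-once function, whose name and signature are not given here. The bounds check shows why the buffer must be at least as large as the original content.

```c
#include <string.h>

/* Hypothetical memory sink: destination buffer, its capacity, bytes stored so far. */
typedef struct {
  unsigned char *pBuf;
  unsigned       Capacity;
  unsigned       Len;
} MEM_SINK;

/* Appends one decompressed fragment; returns a negative value if it would overflow. */
static int _StoreData(void *pUserContext, void *pData, unsigned NumBytesData) {
  MEM_SINK *pSink = (MEM_SINK *)pUserContext;
  if (NumBytesData > pSink->Capacity - pSink->Len) {
    return -1;                       /* Buffer too small for the remaining output. */
  }
  memcpy(pSink->pBuf + pSink->Len, pData, NumBytesData);
  pSink->Len += NumBytesData;
  return (int)NumBytesData;
}
```

For the Jabberwocky example, the sink would be set up with a 1,089-byte buffer and Len set to zero before decompression starts.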


Group Compression

If an application has many small files, on the order of a few kilobytes each, it may well be worth compressing those files in group mode. Group compression concatenates all input files into a single image, which is then compressed. The advantage of this scheme is that the encoders have more opportunity to find redundancy in the combined image than when considering each file individually (as in the single-file example). It is quite common for read-only static content to span multiple files, for instance the content of a web server in an embedded device. Running emCompress on an example set of HTML files may reduce the uncompressed 33 kByte to 13 kByte, which is already good. However, in group mode emCompress can do better still: the 33 kByte input may be reduced to only 6.5 kByte.


Compress the Files

This example compresses a set of small text files, and an application prints the output of each file. Unlike single-file compression, group compression generates a single output file for all input files. emCompress must be invoked with the -g option and the name of the output file.

C:> emCompress.exe -gCompressed_Poems.c *.txt
(c) 2015 SEGGER Microcontroller GmbH & Co. KG    www.segger.com
emCompress V2.10 compiled Jun 23 2015 17:16:26
Codec:         DEFLATE(1k,3,66) chosen from 126 candidates
Encoding:      33563936 bytes used during compression
Decoding:      3220 bytes required for decompression (66 for window)
Compression:   49.9% (50.1% of original removed)
- Sizes        3114 -> 1554 bytes
Elapsed time:  1.401 s
C:> _

Call the Decompressor

Now that all files are compressed in group mode, extracting each file is no different from single-file decompression. The output file is included in the target application. Decompression requires a static workspace whose size is shown in the compression information (3,220 bytes). The compressed file control structure of the file to be extracted is passed to the decompressor together with the workspace and a callback function that receives the decompressed data.


Compromises with Group Compression

An application can deal transparently with files compressed either in group mode or as single files: emCompress takes care of the details of decompressing both. As such, there is no need to alter the code that calls the emCompress API to decompress a file if the compression is switched between group mode and single-file compression. All features, such as data integrity checks, work just the same in both modes.
There are, however, compromises to be aware of when using group mode.

Access time

In order to decompress the contents of a file compressed in group mode, emCompress must decode the bitstream from the start, passing over compressed content of other files, before starting to decompress the content of the requested file. Whilst there is no memory overhead associated with this process, there is a time overhead: it takes longer to access files at the end of the compressed bitstream than at the start.
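The time overhead can be estimated from the position of a file within the group. The sketch below is illustrative only: BytesBeforeFile is a hypothetical helper, not an emCompress function, and it assumes that the skipping cost grows with the amount of decompressed content that precedes the requested file.

```c
/*
 * Returns the number of decompressed bytes that precede file Index in a
 * group image, i.e. the content that has to be decoded and discarded
 * before the first byte of the requested file becomes available.
 * paFileSize holds the original size of each file in group order.
 */
static unsigned BytesBeforeFile(const unsigned *paFileSize, unsigned NumFiles, unsigned Index) {
  unsigned Sum = 0;
  unsigned i;
  if (Index > NumFiles) {
    Index = NumFiles;                /* Clamp out-of-range requests. */
  }
  for (i = 0; i < Index; ++i) {
    Sum += paFileSize[i];
  }
  return Sum;
}
```

With hypothetical file sizes of 1000, 1200, and 914 bytes, the first file is available immediately, whereas reaching the third file means decoding and discarding 2200 bytes first. Placing frequently accessed files early in the group keeps this overhead small.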

Dead data

All files that are compressed in group mode are packaged into a single compressed bitstream. If only a subset of the compressed files is used, the entire bitstream is linked into the application, including the compressed content of all files, even if they are never referenced. The linker has no opportunity to remove redundant data. The advice would be to group only the files that are used in the application. emCompress provides the capabilities to make appropriate decisions on how to compress files and structure the compression that best suits the application's use.


Defensive Decompression

Applications may wish to verify the integrity of the input and output bitstreams. This may be particularly appropriate when programming FPGA devices with a configuration bitstream.

Using Built-in Integrity Checks

The compressed file control structure includes the CRC of the decompressed data as well as the CRC of the compressed bitstream. Verification is straightforward: the CRC calculation function is passed to the decompressor together with the other parameters, and the check is performed while the data is decompressed. A CRC-32 implementation is included in the emCompress shipment and can be used directly.

The verification functions perform two CRC checks, on the compressed and uncompressed bitstreams, to ensure data integrity:

  • Before decompression, the compressed bitstream’s stored CRC is checked against a newly computed CRC calculated over the compressed bitstream. If the CRCs do not match, indicating a failure in the integrity of the compressed bitstream, the bitstream is not decompressed and emCompress signals a decompression failure.
  • If the compressed bitstream is intact, the bitstream is decompressed, passing the decompressed output to the application, and a running CRC is maintained. At the end of decompression, if the uncompressed bitstream’s stored CRC does not match the calculated CRC, emCompress signals a decompression failure.

The two CRC checks are made as it is very difficult to detect errors in compressed bitstreams during decompression: checking the compressed bitstream’s integrity before decompressing ensures that the decompressors are presented with a clean bitstream. Checking that the decompressed output matches what is expected gives an extra level of assurance that the decompression algorithm executed correctly and has not suffered data corruption during decompression.
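The two checks can be sketched as follows. The exact CRC-32 parameters used by emCompress are not stated here, so this sketch assumes the widely used reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). Crc32Update and VerifyCrc are hypothetical names; in a real project the CRC-32 implementation shipped with emCompress would be used instead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (assumed parameters, see above). Pass 0 as Crc for the
 * first fragment and the previous result for each following fragment, which
 * is how a running CRC over decompressed output can be maintained.
 */
static uint32_t Crc32Update(uint32_t Crc, const void *pData, size_t NumBytes) {
  const unsigned char *p = (const unsigned char *)pData;
  int i;
  Crc = ~Crc;
  while (NumBytes--) {
    Crc ^= *p++;
    for (i = 0; i < 8; ++i) {
      /* Shift out one bit; apply the polynomial if that bit was set. */
      Crc = (Crc >> 1) ^ (0xEDB88320u & (0u - (Crc & 1u)));
    }
  }
  return ~Crc;
}

/* Returns 0 if the CRC computed over the data matches the stored CRC, else -1. */
static int VerifyCrc(uint32_t StoredCrc, const void *pData, size_t NumBytes) {
  return (Crc32Update(0, pData, NumBytes) == StoredCrc) ? 0 : -1;
}
```

The first check corresponds to calling VerifyCrc on the complete compressed bitstream before decoding starts. The second corresponds to feeding every decompressed fragment through Crc32Update and comparing the final value with the stored CRC once the last fragment has been produced.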

Extra-Defensive Decompression

When working with a decompressed bitstream that requires absolute integrity before being processed, it may be necessary to ensure that both the compressed and the decompressed data are intact before any of it is used. For instance, a bitstream sent to an FPGA for configuration needs to be correct, and checking the CRC at the end of decompression is simply too late. In this case a "dry run" decompression can be performed: the bitstream is decompressed and the CRCs are checked, but the decompressed data is not sent to a callback or stored in a buffer. When the "dry run" decompression succeeds, the bitstream can be decompressed again and processed.
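The two-pass idea can be sketched as follows. All names here (OUTPUT_FUNC, DECODE_FUNC, DEMO_IMAGE, DemoDecode, DecodeVerified) are hypothetical and not part of the emCompress API. How a dry run is requested from emCompress is not described here, so the sketch assumes a callback that discards its input. DemoDecode is a stand-in that reports its error only after delivering data, which mimics a CRC mismatch detected at the end of decompression.

```c
#include <stddef.h>
#include <string.h>

typedef int (OUTPUT_FUNC)(void *pUserContext, void *pData, unsigned NumBytesData);
typedef int (DECODE_FUNC)(const void *pImage, OUTPUT_FUNC *pfOutput, void *pUserContext);

/* Stand-in image: the content to deliver and a flag simulating a final CRC mismatch. */
typedef struct {
  const char *sText;
  int         IsCorrupt;
} DEMO_IMAGE;

/* Stand-in decoder (NOT emCompress): delivers the data, then reports the error. */
static int DemoDecode(const void *pImage, OUTPUT_FUNC *pfOutput, void *pUserContext) {
  const DEMO_IMAGE *p = (const DEMO_IMAGE *)pImage;
  unsigned Len = (unsigned)strlen(p->sText);
  if (pfOutput(pUserContext, (void *)p->sText, Len) < 0) {
    return -1;
  }
  return p->IsCorrupt ? -1 : (int)Len;
}

/* Dry-run sink: accepts every fragment and throws it away. */
static int _DiscardData(void *pUserContext, void *pData, unsigned NumBytesData) {
  (void)pUserContext;
  (void)pData;
  return (int)NumBytesData;
}

/* Example consumer: adds up the bytes it actually receives. */
static int _CountBytes(void *pUserContext, void *pData, unsigned NumBytesData) {
  (void)pData;
  *(unsigned *)pUserContext += NumBytesData;
  return (int)NumBytesData;
}

/* Pass 1 verifies without delivering; pass 2 delivers only if pass 1 succeeded. */
static int DecodeVerified(DECODE_FUNC *pfDecode, const void *pImage,
                          OUTPUT_FUNC *pfOutput, void *pUserContext) {
  int r = pfDecode(pImage, _DiscardData, NULL);
  if (r < 0) {
    return r;                        /* The consumer never saw any data. */
  }
  return pfDecode(pImage, pfOutput, pUserContext);
}
```

The cost of this approach is that every bitstream is decoded twice, so it is best reserved for consumers, such as an FPGA configuration port, that cannot undo data once it has been received.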