Archive Size & Decompression Performance

For a project I've been working on recently, I need to store a large number of plain text .vtk data set files. To reduce the disk space these data sets take up, I wanted to compress them into an archive and decompress individual files only when the user requests them to be loaded. Below, you can find the read (decompression) performance for loading one example file from different archive formats, along with the compression ratio of that file in each archive.
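To illustrate the kind of measurement described above, here is a minimal Python sketch that reads a single member from a .zip archive and reports how long the decompression took. It is not the benchmark code used for the results: the helper names (build_example_archive, read_member), the file names and the synthetic sample data are made up for illustration, and the ratio is assumed to mean compressed size divided by original size.

```python
import io
import time
import zipfile


def build_example_archive(members):
    """Create an in-memory .zip archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    buffer.seek(0)
    return buffer


def read_member(archive_file, name):
    """Decompress one member; return its data, the read time and the size ratio."""
    with zipfile.ZipFile(archive_file) as archive:
        info = archive.getinfo(name)
        start = time.perf_counter()
        data = archive.read(name)  # only this member is decompressed
        elapsed = time.perf_counter() - start
    # Assumption: ratio = compressed size / original size (smaller is better).
    ratio = info.compress_size / info.file_size if info.file_size else 1.0
    return data, elapsed, ratio


# Hypothetical stand-in for a plain text .vtk data set.
sample = ("# vtk DataFile Version 3.0\n" + "0.0 1.0 2.0\n" * 10000).encode("ascii")
archive_file = build_example_archive({"example.vtk": sample, "other.vtk": b"unused"})

data, elapsed, ratio = read_member(archive_file, "example.vtk")
print(f"read {len(data)} bytes in {elapsed * 1000:.3f} ms, "
      f"compressed to {ratio:.1%} of the original size")
```

With a real archive on disk, the path to the .zip file can be passed to read_member in place of the in-memory buffer. A single read like this is noisy, so repeating the measurement several times and averaging gives more reliable numbers.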

In my case, the time it takes to compress the files was irrelevant, as the files only need to be compressed once but are decompressed many times. Fast loading times, on the other hand, were very important to get a responsive program. In the end, I decided to stick with .zip files. gzip (.tar.gz) may have slightly better read times, but its compression ratio was the worst of all formats tested. The decompression times for the other formats were quite high compared to .zip and .tar.gz, which was something I wanted to avoid.

If one has other requirements, for example transferring files over a network connection, the better compression ratio of .tar.xz, .tar.lzma or .7z might be more important than faster decompression times.
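To weigh this trade-off for one's own data, a rough comparison along the following lines may help. This is only a sketch under some assumptions: it uses Python's gzip and lzma modules on raw in-memory data without a tar container, the measure helper and the synthetic payload are hypothetical, and real .vtk files will give different numbers.

```python
import gzip
import lzma
import time


def measure(name, compress, decompress, payload):
    """Return (name, compressed size in bytes, decompression time in seconds)."""
    packed = compress(payload)
    start = time.perf_counter()
    restored = decompress(packed)
    elapsed = time.perf_counter() - start
    if restored != payload:
        raise ValueError(f"{name} round trip changed the data")
    return name, len(packed), elapsed


# Hypothetical stand-in for plain text data set content.
payload = ("0.125 0.250 0.375\n" * 20000).encode("ascii")

results = [
    measure("gzip", gzip.compress, gzip.decompress, payload),
    measure("xz", lzma.compress, lzma.decompress, payload),  # lzma defaults to the .xz format
]

for name, size, elapsed in results:
    print(f"{name}: {size} bytes ({size / len(payload):.2%} of original), "
          f"{elapsed * 1000:.3f} ms to decompress")
```

If the files are mostly transferred rather than loaded interactively, the size column is the one to watch. For interactive loading, the decompression time matters more.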