bzip2 Command in Linux



The bzip2 command compresses and decompresses files. It uses the Burrows-Wheeler block-sorting text compression algorithm together with Huffman coding.

  • Burrows-Wheeler block sorting is a data transformation that rearranges the input so that identical characters tend to cluster together, which makes the data more compressible.
  • Huffman coding is an encoding method that reduces the size of the data by assigning shorter codes to more frequent symbols.

Difference between bzip2 and gzip Commands

bzip2 generally produces smaller files than gzip, so files compressed with bzip2 usually take up less space. The trade-off is speed: bzip2 is slower than gzip, so extracting a bzip2 file takes longer.

In addition, bzip2 uses more memory than gzip during compression and decompression. Its options are deliberately similar to those of gzip, which makes bzip2 easier to pick up for users familiar with gzip, but the two option sets are not identical.
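One way to see the size difference on your own data is to compress the same file with both tools and compare byte counts. The sketch below assumes gzip and bzip2 are both installed. The file name sample.txt and its generated contents are placeholders, and the actual numbers depend entirely on the input.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Build a repetitive sample text file (sample.txt is a placeholder name)
i=0
while [ "$i" -lt 5000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > sample.txt

# -c writes to stdout, so the original file is left in place
orig_size=$(wc -c < sample.txt)
gzip_size=$(gzip -c sample.txt | wc -c)
bzip2_size=$(bzip2 -c sample.txt | wc -c)

echo "original: $orig_size bytes"
echo "gzip:     $gzip_size bytes"
echo "bzip2:    $bzip2_size bytes"
```

Wrapping each command in time (for example, time bzip2 -c sample.txt > /dev/null) gives a rough idea of the speed difference as well.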

bzip2 expects a list of file names along with the command-line flags. Each file is replaced by a compressed version whose name is the original file name with the extension .bz2 added. The compressed file keeps the same modification date and permissions as the original and, when possible, the same ownership, so that these properties can be restored when the file is decompressed.
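The in-place replacement described above can be observed with a throwaway file. This is a minimal sketch in a temporary directory; notes.txt is a hypothetical file name used only for illustration.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

printf 'hello bzip2\n' > notes.txt   # notes.txt is a placeholder name

bzip2 notes.txt          # notes.txt is replaced by notes.txt.bz2
ls -l                    # only notes.txt.bz2 is listed now

bzip2 -d notes.txt.bz2   # notes.txt.bz2 is replaced by notes.txt again
ls -l
```

Comparing the timestamps in the two ls -l listings is a simple way to check that the modification date is carried over.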

bzip2 has no mechanism for preserving original file names, permissions, ownerships, or dates on file systems that lack these concepts or have serious file name length restrictions (e.g., MS-DOS).

Syntax of bzip2 Command

The following is the general syntax for the bzip2 command.

bzip2 [OPTIONS] filenames ...

bzip2 Command Options

The following are different options available with the bzip2 command −

  • -c, --stdout : Compress or decompress to standard output. The result is written to stdout, where it is typically redirected to a file or piped to another command, instead of replacing the input file.
  • -d, --decompress : Force decompression. This flag tells bzip2 to decompress the file, regardless of the name used to invoke the program.
  • -z, --compress : The complement to -d: forces compression, regardless of the invocation name.
  • -t, --test : Check the integrity of the specified file(s). bzip2 performs a trial decompression and discards the result, so no output file is written.
  • -f, --force : Force overwrite of output files. This flag also forces bzip2 to break hard links to files and to pass through files without the correct magic header bytes.
  • -k, --keep : Keep (don’t delete) input files during compression or decompression.
  • -s, --small : Reduce memory usage during compression, decompression, and testing. During compression, -s selects a block size of 200k, which keeps memory use to roughly 2 MB at the expense of compression ratio. This is useful if your system has limited memory.
  • -q, --quiet : Suppress non-essential warning messages. Messages pertaining to I/O errors and other critical events will not be suppressed.
  • -v, --verbose : Show the compression ratio for each file processed. Further -v's increase the verbosity level, producing a lot of information that is primarily of interest for diagnostic purposes.
  • -L, --license : Display the software version, license terms, and conditions.
  • -1 (or --fast) to -9 (or --best) : Set the block size to 100k, 200k ... 900k when compressing. The --fast and --best aliases exist primarily for GNU gzip compatibility. In particular, --fast doesn't make things significantly faster, and --best merely selects the default behavior.
  • -- : Treats all subsequent arguments as file names, even if they start with a dash.
  • --repetitive-fast, --repetitive-best : These flags are redundant in versions 0.9.5 and above. They were used in earlier versions to control the sorting algorithm.
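The -- separator is easiest to understand with a file whose name begins with a dash. The following sketch uses a hypothetical file called -draft.txt in a temporary directory. Without --, bzip2 would try to interpret the name as a group of option letters.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Create a file whose name starts with a dash (placeholder name)
printf 'some text\n' > ./-draft.txt

# Everything after "--" is treated as a file name, not as an option
bzip2 -- -draft.txt

ls   # -draft.txt.bz2
```

Prefixing the name with ./ (as in bzip2 ./-draft.txt) is another common way to avoid the ambiguity.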

Examples of bzip2 Command in Linux

In this section, we'll explore various examples of the bzip2 command using the options we discussed −

  • Compress a File and Output to Standard Output
  • Decompress a File
  • Test the Integrity of a Compressed File
  • Force Overwrite of an Existing Compressed File
  • Keep the Original File after Compression
  • Reduce Memory Usage during Compression
  • Suppress Non-essential Warning Messages
  • Verbose Mode to Show Compression Ratio
  • Compress a File with the Highest Compression Ratio
  • Combining and Compressing Files in a Single Command

Compress a File and Output to Standard Output

To compress a file and save the compressed data under a file name of your choice, use the following command:

bzip2 -c sample1.txt > compressedfile.bz2

The "-c" flag tells bzip2 to write the compressed output to standard output (stdout) instead of replacing the original file.
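Because -c works on standard output, it combines naturally with -d and with pipelines. The sketch below, run in a temporary directory with placeholder file contents, checks that a compress and decompress round trip reproduces the original bytes, then shows bzip2 reading from standard input when no file name is given.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"
printf 'alpha\nbeta\ngamma\n' > sample1.txt

# Compress to stdout and redirect; sample1.txt itself is left in place
bzip2 -c sample1.txt > compressedfile.bz2

# Decompress to stdout (-d plus -c) and compare with the original byte for byte
if bzip2 -dc compressedfile.bz2 | cmp -s - sample1.txt; then
    echo "round trip OK"
fi

# With no file name, bzip2 reads stdin, so it can sit in a pipeline
printf 'streamed text\n' | bzip2 -c | bzip2 -dc
```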

Decompress a File

To decompress a file that was compressed with bzip2, use the -d flag:

bzip2 -d sample2.txt.bz2
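Since bzip2 decides what to do partly from the name it is invoked under, the companion commands bunzip2 and bzcat can be used in place of the -d and -dc flags. This sketch assumes both are installed alongside bzip2, which is the usual case. The file names a.txt and b.txt are placeholders.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"
printf 'first\n' > a.txt
printf 'second\n' > b.txt
bzip2 a.txt b.txt    # produces a.txt.bz2 and b.txt.bz2

bunzip2 a.txt.bz2    # same effect as: bzip2 -d a.txt.bz2
bzcat b.txt.bz2      # same effect as: bzip2 -dc b.txt.bz2 (b.txt.bz2 is kept)
```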

Test the Integrity of a Compressed File

To test the integrity of a compressed file, you can use the bzip2 command with the "-t" flag:

bzip2 -t sample3.txt.bz2

This command checks if the file is intact and not corrupted. If the file passes the test, there will be no output, indicating that the file has no errors. If there are any issues, an error message will be displayed.
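In scripts, the exit status of bzip2 -t is usually more convenient than its messages. The sketch below defines a small helper, check (a hypothetical name), and runs it against one valid file and one deliberately invalid file created in a temporary directory.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

printf 'payload\n' > good.txt
bzip2 good.txt                                  # valid file: good.txt.bz2
printf 'this is not bzip2 data\n' > broken.bz2  # deliberately invalid file

# Report per-file status based on the exit code of bzip2 -t
check() {
    if bzip2 -t "$1" 2>/dev/null; then
        echo "$1: OK"
    else
        echo "$1: FAILED integrity check"
    fi
}

check good.txt.bz2
check broken.bz2
```

Error messages are sent to /dev/null here only to keep the output short. Dropping the redirection shows the reason bzip2 gives for the failure.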

Force Overwrite of an Existing Compressed File

To force compression of a file even if a compressed file with the same name already exists, use the -f flag:

bzip2 -f sample1.txt

This command compresses sample1.txt and overwrites any existing sample1.txt.bz2.
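To see why -f is needed, it helps to trigger the conflict first. In this sketch, run in a temporary directory with placeholder contents, the second compression attempt is declined because sample1.txt.bz2 already exists, and only the -f run replaces it.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

printf 'version 1\n' > sample1.txt
bzip2 -k sample1.txt               # creates sample1.txt.bz2, keeps sample1.txt

printf 'version 2\n' > sample1.txt # change the original afterwards

# Without -f, bzip2 declines to overwrite the existing sample1.txt.bz2
if bzip2 -k sample1.txt 2>/dev/null; then
    echo "existing file was overwritten"
else
    echo "refused: sample1.txt.bz2 already exists"
fi

bzip2 -f sample1.txt               # overwrite the existing compressed file
bzip2 -dc sample1.txt.bz2          # prints the newer contents
```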

Keep the Original File after Compression

To compress sample4.txt into sample4.txt.bz2 while keeping the original sample4.txt intact, use the -k flag:

bzip2 -k sample4.txt

Reduce Memory Usage during Compression

To reduce memory usage during the compression process, you can simply use the bzip2 command with the "-s" flag −

bzip2 -s sample5.txt

This option is particularly useful on a system with limited memory, as it makes the compression process less memory-intensive. The trade-off is a somewhat lower compression ratio, because -s limits the block size to 200k.

Suppress Non-essential Warning Messages

To suppress non-essential warning messages, you can use the bzip2 command with the "-q" option −

bzip2 -q sample6.txt

This command compresses sample6.txt into sample6.txt.bz2 while keeping the output quiet by not displaying non-essential warnings.

Verbose Mode to Show Compression Ratio

To compress a file and display detailed information, including the compression ratio, you can use the bzip2 command with the "-v" option:

bzip2 -v my_file1.txt

Compress a File with the Highest Compression Ratio

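To compress a file with the largest block size (900k), which usually gives the best compression ratio, use the -9 flag. Because -9 is also the default, this is equivalent to running bzip2 without a level flag: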

bzip2 -9 my_file.txt
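The effect of the block size can be measured by compressing the same input at different levels. The sketch below generates a placeholder file of roughly 1 MB (data.txt, a hypothetical name) and prints the compressed size for -1, -9, and the default. How much -1 and -9 differ depends on the data, so no particular gap should be expected.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Generate about 1 MB of text so that more than one block is involved
i=0
while [ "$i" -lt 20000 ]; do
    echo "record $i: sample payload for the bzip2 block size demo"
    i=$((i + 1))
done > data.txt

size1=$(bzip2 -1 -c data.txt | wc -c)        # 100k blocks
size9=$(bzip2 -9 -c data.txt | wc -c)        # 900k blocks
size_default=$(bzip2 -c data.txt | wc -c)    # no level flag

echo "-1:      $size1 bytes"
echo "-9:      $size9 bytes"
echo "default: $size_default bytes"
```

The -9 and default results match because both use 900k blocks.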

Combining and Compressing Files in a Single Command

You can combine a set of files, such as MP3 music files, into one compressed archive with a single command by using tar with the -cjf flags:

tar -cjf music.tar.bz2 1.mp3 2.mp3 3.mp3

In this command −

  • tar creates a new archive.
  • -c : stands for "create".
  • -j : specifies that the archive should be compressed using bzip2.
  • -f : specifies the name of the archive file to create (music.tar.bz2).
  • 1.mp3, 2.mp3, and 3.mp3 are the music files being added to the archive.

After running this command, the resulting file music.tar.bz2 contains all the specified MP3 files, archived with tar and compressed with bzip2. Because the archive is binary, cat is not a useful way to inspect it; use tar -tjf music.tar.bz2 to list its contents instead.
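Listing and extracting the archive use the same -j flag together with -t (list) or -x (extract). The sketch below assumes a tar build with bzip2 support, which is the case for GNU tar when bzip2 is installed. It uses small text files as stand-ins for the MP3 files, and the directory name extracted is a placeholder.

```shell
#!/bin/sh
set -eu
workdir=$(mktemp -d)   # scratch directory so no real files are touched
cd "$workdir"

# Placeholder files standing in for the MP3 files
for n in 1 2 3; do
    printf 'track %s\n' "$n" > "$n.mp3"
done

tar -cjf music.tar.bz2 1.mp3 2.mp3 3.mp3   # create the compressed archive

tar -tjf music.tar.bz2                     # list the archive members

mkdir extracted
tar -xjf music.tar.bz2 -C extracted        # unpack into ./extracted
```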

Conclusion

By understanding the bzip2 command and its available options, you can effectively manage file compression tasks to suit your specific needs, whether for reducing file size, preserving file attributes, or handling system limitations.