Tar Files Explained

In UNIX, tar files are often used as a form of file archival. Such archives have many uses: backups, migrations, installations, versioning, and so on. You will often hear the term tarball, which simply means a tar archive.

Tar simply stands for Tape ARchive. It was originally developed in 1979 as a way to archive files to tape, and the utility has since evolved into a standard archival tool.

By itself, tar does not provide a checksum over file contents to assure that the files you extract are the same as the ones put into the tarball; its headers carry only a small checksum of the header fields themselves. To get an integrity check, tar files are often compressed with a utility like gzip or bzip2, which store a CRC of the data. This does not prevent the file from changing, but it does mean that corruption is detected when the archive is decompressed. Should the checksum not match, the decompression errors out and the extraction fails.

Enough talking; let’s look at some usage examples. Say you have a directory that you would like to archive, for example /home. To do that, you would type the following command:

tar cvf home.tar /home

The option “c” is for create, “v” for verbose, and “f” to specify the tar file.

When you press Enter, you will see the verbose output, produced by the “v” option, listing every file included in the archive. I am not showing the actual output here because it is long and might break privacy rules.
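To get a feel for that output without touching /home, here is a small sketch that archives a throwaway directory instead. The names demo, a.txt and b.txt are made up for the example, and the “t” option (list the contents of an archive) is one the article does not otherwise cover.

```shell
#!/bin/sh
# Sketch only: demo/, a.txt and b.txt are hypothetical names for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p demo/docs
echo 'first file'  > demo/a.txt
echo 'second file' > demo/docs/b.txt

# c = create, v = verbose, f = name of the archive file
tar cvf demo.tar demo

# t = list the table of contents of the archive without extracting anything
listing=$(tar tf demo.tar)
echo "$listing"
```

The exact verbose format and the order of the entries can differ between tar implementations, so treat what you see as illustrative.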

When the command finishes, you will find home.tar in the directory from which you ran tar. At this point the tar file is uncompressed and carries no checksum over its contents. You can compress the archive with the following command:

gzip home.tar

After the command runs, home.tar is replaced by a file called home.tar.gz.
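If you want to see the effect on file size, the sketch below runs the same two steps on a throwaway directory. The directory demo and the file big.txt are made-up names, and the repetitive text is chosen on purpose because it compresses well; real data will shrink less.

```shell
#!/bin/sh
# Sketch only: demo/ and big.txt are hypothetical names for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir demo
# Write 5000 identical lines; repetitive text compresses very well.
i=0
while [ "$i" -lt 5000 ]; do
    echo 'the quick brown fox jumps over the lazy dog'
    i=$((i + 1))
done > demo/big.txt

# Step 1: archive only, no compression.
tar cf demo.tar demo
tar_size=$(($(wc -c < demo.tar)))

# Step 2: compress; gzip replaces demo.tar with demo.tar.gz.
gzip demo.tar
gz_size=$(($(wc -c < demo.tar.gz)))

echo "demo.tar was $tar_size bytes, demo.tar.gz is $gz_size bytes"
```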

Now, tar gives you a shortcut to archive and compress at the same time. Let’s combine the archive and compression steps into one command:

tar cvfz home.tgz /home

Option “z” simply means, also compress the archive with gzip.

You should see the same output as before; however, the resulting file home.tgz will usually be much smaller. In addition, gzip stores a checksum as part of the compression, so corruption of the file will be detected when it is decompressed.
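To see that checksum at work, here is a rough sketch that deliberately damages a small archive and then asks gzip to test it with “gzip -t”. It relies on the gzip file format storing the CRC-32 in the four bytes that start eight bytes before the end of the file; the names demo and note.txt are made up for the example.

```shell
#!/bin/sh
# Sketch only: demo/ and note.txt are hypothetical names for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir demo
echo 'hello tar' > demo/note.txt
tar czf demo.tgz demo

# An intact archive passes gzip's integrity test.
gzip -t demo.tgz && intact=ok

# Flip every bit of the first CRC-32 byte (8 bytes before the end of the file).
size=$(($(wc -c < demo.tgz)))
offset=$((size - 8))
old=$(od -An -tu1 -j "$offset" -N1 demo.tgz | tr -d '[:space:]')
new=$((old ^ 255))
printf "\\$(printf '%03o' "$new")" |
    dd of=demo.tgz bs=1 seek="$offset" conv=notrunc 2>/dev/null

# The stored CRC no longer matches the data, so the test should now fail.
if gzip -t demo.tgz 2>/dev/null; then
    damaged=undetected
else
    damaged=detected
fi
echo "intact: $intact, damaged: $damaged"
```

Note that the checksum only detects accidental damage. Anyone who rebuilds the archive on purpose gets a fresh, valid checksum, so it is not a protection against tampering.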

Let’s assume that you are migrating the home directory to a different server. You can copy the file to the destination and, with the opposite command, uncompress and extract the archive file:

cd /
tar xvfz home.tgz

The “c” option was replaced with “x” for extract. This assumes home.tgz was copied to /; otherwise, give tar the full path to the file. You should now see each file listed as it is extracted and, assuming the file was copied without transport errors, no errors during the un-archival.

When the command is finished, you will have the files deployed on the new server.
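If you would like to rehearse the whole round trip before doing it on a real server, the sketch below archives a throwaway directory, extracts it somewhere else, and compares the two trees. The names demo and restore are made up, and the “-C” option (change to a directory before extracting) is not covered elsewhere in this article.

```shell
#!/bin/sh
# Sketch only: demo/ and restore/ are hypothetical names for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p demo/docs restore
echo 'first file'  > demo/a.txt
echo 'second file' > demo/docs/b.txt

# Archive and compress in one step.
tar czf demo.tgz demo

# Extract into restore/ instead of the current directory.
tar xzf demo.tgz -C restore

# diff -r exits with 0 only when both directory trees have the same contents.
if diff -r demo restore/demo; then
    roundtrip=identical
else
    roundtrip=different
fi
echo "round trip: $roundtrip"
```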

This article is only meant as an introduction to tar. Tar can be a very complicated command. Use the following command to see all the options:

tar --help

I want to conclude this post by comparing zip and tar. Zip is an archival and compression utility, while tar is only an archival utility. To achieve the same result, tar relies on an external compression utility like gzip or bzip2. Put together, zip and tar+gzip/bzip2 offer roughly the same functionality.

Hopefully, now you are more familiar with this interesting utility.

Have fun and type away!
