17.8. Archiving and Compressing Files
The tar (tape archive) utility is used to package up a set of files
into a single file (an archive file or a tar file). The tar command
is also used to extract the files out of a tar file. An archive file (or tar
file) contains all the file data from the files that make up the tar file plus
additional meta-data that describe its contents. The meta-data are necessary
for tar to extract the individual file(s) out of a tar file.
Tar files are often used to distribute a set of files or to transfer a set of
files from one machine to another. They also may be used to store backups of
sets of files (file archives). For example, if a user wants to transfer
a set of files from one system to another, they would first run tar with
the -c command line option to create a single tar file containing all of
the files, then scp the resulting tar file to the remote machine.
On the remote machine they would then run tar with -x to extract
files from the tar file.
The tar command has many command line options for configuring its behavior in
different ways. Table 1 lists some examples.
Option | Description |
---|---|
-c | create a tar file from a set of files |
-x | extract, or untar, a tar file |
-t | list the contents of a tar file |
-v | verbose mode, prints information as the command runs |
-f | specifies the name of the tar file |
To create a tar file you need to use -c, specify the name of the tar output
file with -f, and then give the list of files (and directories)
to include in the tar file (full directory contents will be included).
Here is an example of how to create a tar file from three regular files
(main.c, process.c, Makefile) and the contents of a directory
(inputfiles/):
$ tar -c -v -f mytarfile.tar main.c process.c Makefile inputfiles/
-v is the verbose command line option. It lists some information about
the files being tar’ed up as tar executes.
The command line options to tar can also be specified using an alternate,
more compact, syntax (this is often used in tar examples you see):
$ tar -cvf mytarfile.tar main.c process.c Makefile inputfiles/
$ tar cvf mytarfile.tar main.c process.c Makefile inputfiles/
Alternatively, you can create a directory, copy the files you want to tar into it, and generate a tar file from that directory. For example:
$ mkdir mytardir
$ cp *.c mytardir/.
$ cp -r inputfiles mytardir/.
$ tar cvf mytarfile.tar mytardir
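The staging-directory approach can be sketched as a small script. This is only an illustration, run in a scratch directory created with mktemp so that no real files are touched; the file names (main.c, in1.txt) are made up for the example. It shows that every member of the resulting archive is stored under the mytardir/ prefix, so extracting the archive recreates that one directory rather than scattering files into the current directory.

```shell
#!/bin/sh
# Sketch: stage files in a directory, archive it, and inspect member paths.
# All file and directory names below are hypothetical.
set -e
work=$(mktemp -d)                   # scratch area; remove it when done
cd "$work"

mkdir -p mytardir/inputfiles
echo 'int main(void) { return 0; }' > mytardir/main.c
echo 'sample input' > mytardir/inputfiles/in1.txt

tar -cf mytarfile.tar mytardir      # archive the whole staging directory
listing=$(tar -tf mytarfile.tar)    # member names, one per line
echo "$listing"
```

Each name printed by the listing begins with mytardir/, which is a quick way to check what an archive will create before extracting it.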
To list the files in a tar file, use the -t command line option:
$ tar -tvf mytarfile.tar
To extract files from a tar file, use the -x command line option:
$ tar xvf mytarfile.tar
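To see create, list, and extract working together, here is a minimal round-trip sketch. It assumes a scratch directory from mktemp and invented file contents; the src and dest directory names are hypothetical. The script archives a few files, lists the archive, extracts it in a different directory, and uses diff -r to confirm that the extracted copy matches the original.

```shell
#!/bin/sh
# Sketch: create a tar file, list it, extract it elsewhere, and compare.
# Directory names (src, dest) and file contents are hypothetical.
set -e
work=$(mktemp -d)                   # scratch area; remove it when done
cd "$work"

mkdir -p src/inputfiles dest
printf 'int main(void) { return 0; }\n' > src/main.c
printf 'all: main\n' > src/Makefile
printf '1 2 3\n' > src/inputfiles/data.txt

# Create the archive from inside src so member names have no src/ prefix.
(cd src && tar -cf ../mytarfile.tar main.c Makefile inputfiles)

tar -tf mytarfile.tar               # list the archive without extracting

# Extract inside dest; tar recreates files relative to the current directory.
(cd dest && tar -xf ../mytarfile.tar)

# diff -r exits 0 only if both directory trees have identical contents.
if diff -r src dest; then roundtrip=ok; else roundtrip=differs; fi
echo "round trip: $roundtrip"
```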
17.8.1. File Compression
File compression is often used along with tar to reduce the size of
the tar file. This reduces both the storage space required
for the tar file and the amount of data sent over
the network when a tar file is copied to a remote machine using scp.
The gzip and gunzip commands are
used to compress and uncompress a file using gzip compression, and bzip2 and
bunzip2 are used to compress and uncompress a file using bzip2 compression.
Here are some example calls:
$ ls -l
-rw-r--r-- 1 sarita users 40840 Apr 12 16:55 file
$ gzip file          # compress using gzip
$ ls -l
-rw-r--r-- 1 sarita users 13258 Apr 12 16:55 file.gz
$ gunzip file.gz     # uncompress a gzipped file
$ ls -l
-rw-r--r-- 1 sarita users 40840 Apr 12 16:55 file
$ bzip2 file         # compress using bzip2
$ ls -l
-rw-r--r-- 1 sarita users 12568 Apr 12 16:55 file.bz2
$ bunzip2 file.bz2   # uncompress a bzip2ed file
$ ls -l
-rw-r--r-- 1 sarita users 40840 Apr 12 16:55 file
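The calls above replace the original file with its compressed version. When you want to keep the original, one option is the -c flag, which writes the result to standard output instead. The sketch below is an illustration under assumptions: it builds a made-up, repetitive sample file in a scratch directory, and it only runs the bzip2 part if bzip2 happens to be installed.

```shell
#!/bin/sh
# Sketch: compress with -c so the original file is kept, then verify
# that decompressing gives back identical bytes. Sample data is made up.
set -e
work=$(mktemp -d)                   # scratch area; remove it when done
cd "$work"

# Build a repetitive (and therefore compressible) sample file.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > file

gzip -c file > file.gz              # -c writes to stdout; file is untouched
gunzip -c file.gz > file.copy       # decompress into a separate copy

orig_size=$(( $(wc -c < file) ))    # arithmetic strips any padding from wc
gz_size=$(( $(wc -c < file.gz) ))
echo "original: $orig_size bytes, gzip: $gz_size bytes"
cmp file file.copy && echo "gzip round trip is lossless"

# bzip2 is not installed everywhere, so only try it when it is available.
if command -v bzip2 >/dev/null 2>&1; then
    bzip2 -c file > file.bz2
    bunzip2 -c file.bz2 | cmp - file && echo "bzip2 round trip is lossless"
fi
```

The exact sizes depend on the data and the tool versions, so the script prints them rather than assuming particular values.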
tar and file compression
File compression is often used on tar files. A user can run file
compression utilities on a tar file after tar creates it and run utilities to
uncompress the tar file before running tar to extract files from the tar
file. However, because file compression is so commonly used with tar files,
tar has command line options to create compressed tar files (and to extract
files from compressed tar files) using different compression algorithms (for
example, -j uses bzip2, and -z uses gzip).
Here are some examples of creating a compressed tar file in one step by
passing tar the -z option (gzip) or the -j option (bzip2):
# use -z option to compress tar file using gzip:
$ tar -czvf myproj.tar.gz myproj/
$ ls
myproj/ myproj.tar.gz
# or use -j option to compress tar file using bzip2:
$ tar -cjvf myproj.tar.bz2 myproj/
$ ls
myproj/ myproj.tar.bz2
Alternatively, here is an example of separating the tar file creation and
compression into two steps (create the tar file, then compress it using
gzip or bzip2):
$ tar cvf myproj.tar myproj/   # create a tar file from myproj
$ ls
myproj/ myproj.tar
$ gzip myproj.tar              # gzip the tar file
$ ls
myproj/ myproj.tar.gz
# OR, instead of gzip, use bzip2 on the uncompressed tar file
$ bzip2 myproj.tar             # bzip2 the tar file
$ ls
myproj/ myproj.tar.bz2
You can also use the one-step or two-step method to extract files
from a compressed tar file. For example, here is the single-step way
(-z tells tar to uncompress the file first using gzip):
$ tar -xzvf myproj.tar.gz
$ ls
myproj/ myproj.tar.gz
Alternatively, here is the two-step method to extract files from a compressed tar file (first uncompress with gunzip or bunzip2, then untar):
$ ls
myproj.tar.gz
$ gunzip myproj.tar.gz
$ ls
myproj.tar
$ tar -xvf myproj.tar
$ ls
myproj/ myproj.tar
# or using bunzip2 for bzip'ed file
$ ls
myproj.tar.bz2
$ bunzip2 myproj.tar.bz2
$ ls
myproj.tar
$ tar -xvf myproj.tar
$ ls
myproj/ myproj.tar
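The one-step and two-step extraction methods should produce the same files. The following sketch checks that on a small, made-up project in a scratch directory; the one_step and two_step directory names are hypothetical, and gunzip -c is used so that the compressed archive is left in place for both methods.

```shell
#!/bin/sh
# Sketch: extract the same .tar.gz with the one-step and two-step methods
# and confirm the results match. Project contents are hypothetical.
set -e
work=$(mktemp -d)                   # scratch area; remove it when done
cd "$work"

mkdir -p myproj one_step two_step
printf 'int main(void) { return 0; }\n' > myproj/main.c
printf 'all: main\n' > myproj/Makefile

tar -czf myproj.tar.gz myproj/      # create and gzip in one step

# One-step extraction: tar does the gzip decompression itself (-z).
(cd one_step && tar -xzf ../myproj.tar.gz)

# Two-step extraction: decompress to a plain tar file, then untar it.
gunzip -c myproj.tar.gz > two_step/myproj.tar
(cd two_step && tar -xf myproj.tar && rm myproj.tar)

# diff -r exits 0 only if both extracted trees are identical.
if diff -r one_step two_step; then same=yes; else same=no; fi
echo "one-step and two-step results identical: $same"
```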
17.8.2. Putting It All Together
Below is an example of using tar to transfer a directory’s contents from
one machine (laptop) to a remote machine (ada.cs.college.edu). It
uses the scp command to remotely copy the file between the two machines
(see Section 17.3 for more information about scp).
- First, the user creates the compressed tar file on laptop (note: we show
  the two-step method to create and compress the tar file in order to show
  the listing of the files in the tar file, but the compressed tar file could
  also be created in one step using the -czvf command line options):
  laptop$ ls myproj/
  Makefile input1.txt input2.txt main.c mux.c mux.h
  laptop$ tar -cvf myproj.tar myproj/   # create a tar file from myproj
  laptop$ ls
  myproj/ myproj.tar
  laptop$ tar -tf myproj.tar            # list tar file contents
  myproj/
  myproj/Makefile
  myproj/input1.txt
  myproj/input2.txt
  myproj/main.c
  myproj/mux.c
  myproj/mux.h
  laptop$ gzip myproj.tar               # compress the file (creates myproj.tar.gz)
- Then, they scp the single compressed tar file to the remote machine:
  laptop$ scp myproj.tar.gz sam@ada.cs.college.edu:.   # scp single tar file
  sam@ada.cs.college.edu's password:
- Then the user can ssh into the remote machine and untar the myproj.tar.gz
  file to extract the files:
  laptop$ ssh sam@ada.cs.college.edu
  sam@ada.cs.college.edu's password:
  ada$ ls
  classes/ letters/ myproj.tar.gz projects/
  ada$ tar -xzvf myproj.tar.gz          # extract contents of tar file
  ada$ ls
  classes/ letters/ myproj/ myproj.tar.gz projects/
If the user wanted to untar the directory somewhere other than in their
home directory, they could use mv to move the directory where they want
(or they could have moved the tar file to where they want it before
untarring it). For example, to move it as a subdirectory under the
projects directory:
ada$ ls
classes/ letters/ myproj/ myproj.tar.gz projects/
ada$ mv myproj projects/.
ada$ ls
classes/ letters/ myproj.tar.gz projects/
ada$ cd projects
ada$ ls
myproj/
Often after untarring a tar file, the user removes the compressed tar file:
$ rm myproj.tar.gz
$ ls
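As an alternative to extracting in the home directory and then moving the result with mv, tar can be told where to extract with its -C option, which changes to the given directory before unpacking. The sketch below is an illustration only: it simulates the remote home directory with a scratch directory from mktemp, and the project contents are made up.

```shell
#!/bin/sh
# Sketch: extract a compressed tar file directly into a target directory.
# The scratch directory stands in for the user's home directory on the
# remote machine; file contents are hypothetical.
set -e
home=$(mktemp -d)                   # scratch area; remove it when done
cd "$home"

mkdir -p myproj projects
printf 'int main(void) { return 0; }\n' > myproj/main.c
tar -czf myproj.tar.gz myproj/
rm -r myproj                        # pretend only the archive was copied over

tar -xzf myproj.tar.gz -C projects  # -C: change to projects/ before extracting
ls projects                         # the extracted myproj directory is here
rm myproj.tar.gz                    # optional cleanup after checking the result
```

The target directory must already exist; tar does not create it.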
17.8.3. References
For more information see:
- The man pages for these commands (e.g., man tar, man gunzip)
- most used Unix commands from cheat-sheets.org
- Bash Reference Manual from gnu.org.