[linux] Tar a directory, but don't store full absolute paths in the archive

I have the following command as part of a backup shell script:

tar -cjf site1.bz2 /var/www/site1/

When I list the contents of the archive, I get:

tar -tf site1.bz2
var/www/site1/style.css
var/www/site1/index.html
var/www/site1/page2.html
var/www/site1/page3.html
var/www/site1/images/img1.png
var/www/site1/images/img2.png
var/www/site1/subdir/index.html

But I would like to remove the leading /var/www/site1 from the directory and file names within the archive, to simplify extraction and avoid a useless constant directory structure. You never know: I might need to extract the backed-up websites somewhere web data isn't stored under /var/www.

For the example above, I would like to have:

tar -tf site1.bz2
style.css
index.html
page2.html
page3.html
images/img1.png
images/img2.png
subdir/index.html

That way, when I extract, the files land in the current directory and I don't need to move them afterwards, and the sub-directory structure is preserved.

There are already many questions about tar and backups on Stack Overflow and elsewhere on the web, but most of them ask about dropping the entire sub-directory structure (flattening), or about adding or removing the initial / in the names (I don't know exactly what that changes when extracting), and nothing more.

After reading some of the solutions found here and there, as well as the manual, I tried:

tar -cjf site1.bz2 -C . /var/www/site1/
tar -cjf site1.bz2 -C / /var/www/site1/
tar -cjf site1.bz2 -C /var/www/site1/ /var/www/site1/
tar -cjf site1.bz2 --strip-components=3 /var/www/site1/

But none of them worked the way I want. Some do nothing; others no longer archive sub-directories.

This runs inside a backup shell script launched by cron, so I don't know for sure which user runs it, what the PATH is, or what the current directory is. Writing absolute paths everywhere is therefore required, and I would prefer not to change the current directory, to avoid breaking something further along in the script (it doesn't only back up websites, but also databases, then sends everything over FTP, etc.).

How can I achieve this?

Have I just misunderstood how the option -C works?

This question is related to linux bash backup tar

The answer is


The following command will create a root directory "." and put all the files from the specified directory into it.

tar -cjf site1.tar.bz2 -C /var/www/site1 .
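
As a rough illustration of why the attempts in the question stored the full path, the sketch below contrasts an absolute operand with a relative one after -C. It uses a throwaway directory from mktemp as a stand-in for /var/www/site1 (an assumption for the demo, not part of the original setup) and an uncompressed archive; -j can be added as in the command above.

```shell
# Throwaway tree standing in for /var/www/site1 (hypothetical demo paths).
work=$(mktemp -d)
src="$work/site1"
mkdir -p "$src/images"
touch "$src/index.html" "$src/images/img1.png"

# Absolute operand: -C has no effect on it, so the whole path is stored
# (tar only drops the leading "/" and warns about it on stderr).
tar -cf "$work/absolute.tar" -C "$src" "$src" 2>/dev/null
absolute_listing=$(tar -tf "$work/absolute.tar")

# Relative operand ".": resolved after the -C, so names start at "./".
tar -cf "$work/relative.tar" -C "$src" .
relative_listing=$(tar -tf "$work/relative.tar")

printf 'absolute operand:\n%s\n\nrelative operand:\n%s\n' \
  "$absolute_listing" "$relative_listing"
```

In other words, -C only changes where relative names are looked up; an operand that is itself an absolute path is archived under that path regardless.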

If you want to put all files in the root of the tar file, @chinthaka is right. Just cd into the directory and do:

tar -cjf target_path/file.tar.bz2 *

This will put all the files in the cwd into the tar file as root files.
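
If changing directory in the middle of a longer script is a concern, one option is to do the cd inside a subshell, so the rest of the script keeps its working directory. A minimal sketch, again with hypothetical throwaway paths; note that * does not match hidden files such as .htaccess, so those are left out of the archive.

```shell
# Throwaway tree standing in for the site directory (hypothetical demo paths).
work=$(mktemp -d)
src="$work/site1"
mkdir -p "$src/images"
touch "$src/index.html" "$src/.htaccess" "$src/images/img1.png"

archive="$work/site1.tar"   # absolute path, so it stays valid after the cd
start_dir=$PWD

# The parentheses run a subshell: the cd is forgotten when it exits.
(
  cd "$src" || exit 1
  tar -cf "$archive" -- *
)

listing=$(tar -tf "$archive")
printf '%s\n' "$listing"
printf 'still in: %s\n' "$PWD"
```

The archive path has to be absolute (or otherwise independent of the cd), since a relative target would be resolved inside the source directory.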


If you want to archive a subdirectory and trim the path leading to it, this command will be useful:

tar -cjf site1.bz2 -C /var/www/ site1

It seems that, up to tar v2.8.3, the -C option does not work consistently on all platforms (OSes). The -C option is said to add a directory to the archive, but on Mac and Ubuntu it adds the absolute path prefix inside the generated tar.gz file.

tar target_path/file.tar.gz -C source_path/source_dir

Therefore the consistent and robust solution is to cd into source_path (the parent directory of source_dir) and run

tar -czf target_path/file.tar.gz source_dir

or, for an uncompressed archive,

tar -cf target_path/file.tar source_dir

in your script. This removes the absolute path prefix from the directory structure of the generated archive.


Low reputation (too many years of lurking, sigh) so I can't yet comment inline, but I found the answer from @laktak to be the only one that worked as intended on Ubuntu 18.04 -- using tar -cjf site1.tar.bz2 -C /var/www/site1 . on my machine resulted in all the files I wanted being under ./ inside the tar.bz2 file, which is probably ok but there is some risk of inconsistent behavior across OSs when un-tarring.


I found tar -cvf site1-$seqNumber.tar -C /var/www/ site1 to be a friendlier solution than tar -cvf site1-$seqNumber.tar -C /var/www/site1 . (notice the . in the second solution), for the following reasons:

  • The tar file name no longer matters, since the original folder is now an archive entry
  • Because the tar file name is independent of the content, it can be used for other purposes such as sequence numbers, periodic backups, etc.

Using the dot (".") leads to the creation of a folder named "." (on Ubuntu 16).

tar -cjf site1.bz2 -C /var/www/site1/ .

I looked into this in more detail and prepared an example: a multi-line command, plus an exclusion.

tar -cjf site1.bz2 \
    -C /var/www/site1/ style.css \
    -C /var/www/site1/ index.html \
    -C /var/www/site1/ page2.html \
    -C /var/www/site1/ page3.html \
    --exclude='images/*.zip' \
    -C /var/www/site1/ images/ \
    -C /var/www/site1/ subdir/
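
A -C stays in effect for the operands that follow it until the next -C, so it should not need repeating for every member. The sketch below, using hypothetical throwaway paths rather than the real site, checks that a single -C plus a quoted --exclude pattern keeps the .zip files out; quoting stops the shell from expanding the pattern itself.

```shell
# Throwaway tree standing in for /var/www/site1 (hypothetical demo paths).
work=$(mktemp -d)
src="$work/site1"
mkdir -p "$src/images" "$src/subdir"
touch "$src/style.css" "$src/images/img1.png" "$src/images/old.zip" "$src/subdir/index.html"

# One -C, then several relative members; --exclude comes before the members it filters.
tar -cf "$work/site1.tar" \
    -C "$src" \
    --exclude='images/*.zip' \
    style.css images subdir

listing=$(tar -tf "$work/site1.tar")
printf '%s\n' "$listing"
```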

The option -C works; just for clarification I'll post 2 examples:

  1. creation of a tarball without the full path: the full path is /home/testuser/workspace/project/application.war and what we want is just project/application.war, so:

    tar -cvf output_filename.tar  -C /home/testuser/workspace project
    

    Note: there is a space between workspace and project; tar will replace the full path with just project.

  2. extraction of a tarball into a different target path (the default is ., i.e. the current directory)

    tar -xvf output_filename.tar -C /home/deploy/
    

    tar will extract the tarball under the given path while preserving the path stored at creation; in our example the file application.war will be extracted to /home/deploy/project/application.war.

    /home/deploy: given on extract
    project: given on creation of tarball

Note: if you want to place the created tarball in a target directory, just add the target path before the tarball name, e.g.:

tar -cvf /path/to/place/output_filename.tar  -C /home/testuser/workspace project
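
The two steps above can be checked end to end with a small round trip. This is only a sketch: the workspace and deploy directories are created under a mktemp directory as stand-ins for /home/testuser/workspace and /home/deploy.

```shell
# Hypothetical demo layout mirroring workspace/project/application.war.
work=$(mktemp -d)
mkdir -p "$work/workspace/project" "$work/deploy"
echo 'demo' > "$work/workspace/project/application.war"

# Step 1: store the file as project/application.war, not under its full path.
tar -cf "$work/output_filename.tar" -C "$work/workspace" project
listing=$(tar -tf "$work/output_filename.tar")

# Step 2: extract below the deploy directory instead of the current one.
tar -xf "$work/output_filename.tar" -C "$work/deploy"
extracted="$work/deploy/project/application.war"

printf '%s\n' "$listing"
ls -l "$extracted"
```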

One minor detail:

tar -cjf site1.tar.bz2 -C /var/www/site1 .

adds the files as

tar -tf site1.tar.bz2
./style.css
./index.html
./page2.html
./page3.html
./images/img1.png
./images/img2.png
./subdir/index.html

If you really want

tar -tf site1.tar.bz2
style.css
index.html
page2.html
page3.html
images/img1.png
images/img2.png
subdir/index.html

You should either cd into the directory first or run

tar -cjf site1.tar.bz2 -C /var/www/site1 $(ls /var/www/site1)
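
Parsing ls output breaks on names containing spaces and skips hidden files. A possible alternative, assuming GNU find (for -printf) and GNU tar (for --null and -T), is to pass the top-level names as a NUL-separated list. The paths below are hypothetical demo paths, and the archive is left uncompressed; -j can be added as above.

```shell
# Throwaway tree standing in for /var/www/site1 (hypothetical demo paths).
work=$(mktemp -d)
src="$work/site1"
mkdir -p "$src/images"
touch "$src/style.css" "$src/.htaccess" "$src/my page.html" "$src/images/img1.png"

# %P prints each name relative to the starting point, \0 terminates it with NUL.
# tar reads that list from stdin (-T -) and resolves the names inside $src (-C).
find "$src" -mindepth 1 -maxdepth 1 -printf '%P\0' |
  tar -cf "$work/site1.tar" -C "$src" --null -T -

listing=$(tar -tf "$work/site1.tar")
printf '%s\n' "$listing"
```

Only the top-level names are listed; tar still recurses into directories such as images on its own, so the sub-directory structure is preserved without a ./ prefix.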
