[linux] Using Rsync include and exclude options to include directory and file by pattern

I'm having problems getting my rsync syntax right and I'm wondering if my scenario can actually be handled with rsync. First, I've confirmed that rsync is working just fine between my local host and my remote host. Doing a straight sync on a directory is successful.

Here's what my filesystem looks like:

uploads/
  1260000000/
    file_11_00.jpg
    file_11_01.jpg
    file_12_00.jpg
  1270000000/
    file_11_00.jpg
    file_11_01.jpg
    file_12_00.jpg
  1280000000/
    file_11_00.jpg
    file_11_01.jpg
    file_12_00.jpg

What I want to do is run rsync only on files that begin with "file_11_" in the subdirectories and I want to be able to run just one rsync job to sync all of these files in the subdirectories.

Here's the command that I'm trying:

rsync -nrv --include="**/file_11*.jpg" --exclude="*" /Storage/uploads/ /website/uploads/

This results in 0 files being marked for transfer in my dry run. I've tried various other combinations of --include and --exclude statements, but either continued to get no results or got everything as if no include or exclude options were set.

Anyone have any idea how to do this?



Here's my "teach a person to fish" answer:

Rsync's syntax is definitely non-intuitive, but it is worth understanding.

  1. First, use -vvv to see the debug info for rsync.
$ rsync -nr -vvv --include="**/file_11*.jpg" --exclude="*" /Storage/uploads/ /website/uploads/

[sender] hiding directory 1280000000 because of pattern *
[sender] hiding directory 1260000000 because of pattern *
[sender] hiding directory 1270000000 because of pattern *

The key concept here is that rsync applies the include/exclude patterns to each directory as it recurses. Every file or directory name is checked against the patterns in the order given, and the first pattern that matches decides whether it is included or excluded; later patterns are not consulted for that name.

The first directory it evaluates is /Storage/uploads, which contains the directories 1280000000/, 1260000000/ and 1270000000/. None of them matches the include pattern **/file_11*.jpg, but all of them match the exclude pattern *. So they are excluded, rsync never descends into them, and the run ends.

  2. The solution is to include all directories (*/) first. At the top level, 1260000000/, 1270000000/ and 1280000000/ now match */, so rsync descends into them. Inside 1260000000/, file_11_00.jpg matches --include="file_11*.jpg", so it is included, while file_12_00.jpg matches neither include and falls through to --exclude="*". And so forth for the other directories.
$ rsync -nrv --include='*/' --include="file_11*.jpg" --exclude="*" /Storage/uploads/ /website/uploads/

./
1260000000/
1260000000/file_11_00.jpg
1260000000/file_11_01.jpg
1270000000/
1270000000/file_11_00.jpg
1270000000/file_11_01.jpg
1280000000/
1280000000/file_11_00.jpg
1280000000/file_11_01.jpg
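
To try this without touching real data, the layout from the question can be rebuilt in scratch directories. The following is only a sketch: the temporary directories and the variable names (src, dst_ok, dst_bad) are made up for illustration, and it does a real copy instead of a dry run so the result can be counted. It also runs the same rules with --exclude="*" placed first, to show that the order of the rules matters.

```shell
#!/bin/sh
# Scratch directories (hypothetical stand-ins for /Storage/uploads and /website/uploads).
src=$(mktemp -d)
dst_ok=$(mktemp -d)
dst_bad=$(mktemp -d)

# Recreate the three subdirectories and their files from the question.
for d in 1260000000 1270000000 1280000000; do
    mkdir -p "$src/$d"
    touch "$src/$d/file_11_00.jpg" "$src/$d/file_11_01.jpg" "$src/$d/file_12_00.jpg"
done

# Working order: directories first, then the wanted files, then exclude the rest.
rsync -r --include='*/' --include='file_11*.jpg' --exclude='*' "$src/" "$dst_ok/"

# Same rules, but the exclude comes first: "*" matches every name,
# so the includes after it are never reached.
rsync -r --exclude='*' --include='*/' --include='file_11*.jpg' "$src/" "$dst_bad/"

# Count the regular files that arrived in each destination.
n_ok=$(find "$dst_ok" -type f | wc -l | tr -d ' ')
n_bad=$(find "$dst_bad" -type f | wc -l | tr -d ' ')
echo "includes first: $n_ok files, exclude first: $n_bad files"
```

With the includes first, only the two file_11_ files from each of the three directories are copied; with the exclude first, nothing is.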

https://download.samba.org/pub/rsync/rsync.1


rsync include exclude pattern examples:

"*"         means everything
"dir1"      transfers empty directory [dir1]
"dir*"      transfers empty directories like: "dir1", "dir2", "dir3", etc...
"file*"     transfers files whose names start with [file]
"dir**"     transfers every path that starts with [dir] like "dir1/file.txt", "dir2/bar/ffaa.html", etc...
"dir***"    same as above
"dir1/*"    does nothing
"dir1/**"   does nothing
"dir1/***"  transfers [dir1] directory and all its contents like "dir1/file.txt", "dir1/fooo.sh", "dir1/fold/baar.py", etc...

A final note: don't rely on leading asterisks for matching paths, as in "**dir" (they are fine for single folder or file names, but not for paths), and note that more than two asterisks don't work for file names.
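
As a quick check of the "dir1/***" entry from the list above, here is a small sketch that uses throwaway directories. The names src and dst and the file names are assumptions chosen to mirror the examples in the list, and the pattern is followed by --exclude="*" so that everything else is dropped.

```shell
#!/bin/sh
# Throwaway source and destination (hypothetical, created only for this test).
src=$(mktemp -d)
dst=$(mktemp -d)

# dir1 has nested content; dir2 should be left behind.
mkdir -p "$src/dir1/fold" "$src/dir2"
touch "$src/dir1/file.txt" "$src/dir1/fold/baar.py" "$src/dir2/other.txt"

# "dir1/***" matches the directory dir1 itself and everything beneath it.
rsync -r --include='dir1/***' --exclude='*' "$src/" "$dst/"

# Show what was transferred.
(cd "$dst" && find . | sort)
```

Here dir1, dir1/file.txt and dir1/fold/baar.py should arrive in the destination, while dir2 should not.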


Add -m to the recommended answer above to prune empty directories.
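
For illustration, a sketch of what -m (long form: --prune-empty-dirs) changes. The extra directory 1290000000 is not in the question; it is made up here as a directory that contains no file_11_ files, and src and dst are scratch directories.

```shell
#!/bin/sh
# Scratch directories (hypothetical, not the real upload paths).
src=$(mktemp -d)
dst=$(mktemp -d)

# One directory with a matching file, one (invented) directory without any.
mkdir -p "$src/1260000000" "$src/1290000000"
touch "$src/1260000000/file_11_00.jpg" "$src/1290000000/file_12_00.jpg"

# -m drops directories that would end up empty after filtering.
rsync -rm --include='*/' --include='file_11*.jpg' --exclude='*' "$src/" "$dst/"

ls "$dst"
```

Without -m, the --include='*/' rule would create 1290000000/ in the destination as an empty directory; with -m it is left out.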

