[linux] How to exclude a directory in find . command

I'm trying to run a find command for all JavaScript files, but how do I exclude a specific directory?

Here is the find code we're using.

for file in $(find . -name '*.js')
do 
  java -jar config/yuicompressor-2.4.2.jar --type js $file -o $file
done

Tags: linux, shell, find

Answers:


To exclude multiple directories:

find . -name '*.js' -not \( -path "./dir1/*" -o -path "./dir2/*" \)

To add directories, add -o -path "./dirname/*":

find . -name '*.js' -not \( -path "./dir1/*" -o -path "./dir2/*" -o -path "./dir3/*" \)

But perhaps you should use a regular expression if there are many directories to exclude.
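For example, a rough sketch using GNU find's -regex (dir1, dir2 and dir3 stand in for the directories to skip); like the -not -path form above, this still traverses those directories and only filters them from the output:

find . -regextype posix-egrep -name '*.js' -not -regex '\./(dir1|dir2|dir3)(/.*)?'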


For what I needed, it worked like this: finding landscape.jpg across the whole server, starting from root, and excluding the /var directory from the search:

find / -maxdepth 1 -type d | grep -v /var | xargs -I '{}' find '{}' -name landscape.jpg

find / -maxdepth 1 -type d lists all directories in /

grep -v /var excludes /var from the list

xargs -I '{}' find '{}' -name landscape.jpg runs a command (here another find) against each directory/result from the list
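If any of the top-level directory names might contain spaces, a null-delimited variant of the same pipeline (an untested sketch, GNU tools assumed) is safer:

find / -maxdepth 1 -type d -print0 | grep -vz '^/var$' | xargs -0 -I '{}' find '{}' -name landscape.jpg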


# find command in Linux
def: find is used to locate/search for files on a Unix/Linux system;
     it searches for files in a directory hierarchy.
1) -D exec  Show diagnostic information relating to -exec, -execdir, -ok and -okdir
2) options
  -H = do not follow symbolic links, except while processing the command-line arguments
  -L = follow symbolic links
  -P = never follow symbolic links

  -type c
         File is of type c:
         b      block (buffered) special
         c      character (unbuffered) special
         d      directory
         p      named pipe (FIFO)
         f      regular file
         l      symbolic link; this is never true if the -L option or the -follow option is in effect,
                unless the symbolic link is broken. If you want to search for symbolic links when -L
                is in effect, use -xtype.
         s      socket
         D      door (Solaris)

  -delete
         Delete files; true if removal succeeded. If the removal failed, an error message is issued
         and find's exit status will be nonzero (when it eventually exits).


find -type d -name "pattern"
find -type f -name "pattern"
find /path/ -type f -iname "pattern"   # -iname is case insensitive

# find directories a/b/c and delete only the *.txt files inside the c directory
find /home/mohan/a -mindepth 3 -maxdepth 3 -type f -name "*.txt" | xargs rm -rf
find /home/mohan/a -mindepth 3 -maxdepth 3 -type f -name "*.txt" -delete


# delete only the empty *.txt files under a particular directory
find /home/mohan -type f -name "*.txt" -empty -delete


# find empty files matching two or more name patterns
find /home/mohan -type f \( -name "*.sh" -o -name "*.txt" \) -empty

# delete empty files matching two or more name patterns
find /home/mohan -type f \( -name "*.sh" -o -name "*.txt" \) -empty -delete


#How to append contents of multiple files into one file
find . -type f -name '*.txt' -exec cat {} + >> output.file

# files modified less than 1 minute ago (-n means less than n)
find . -type f -mmin -1

# files modified more than 1 minute ago (+n means more than n)
find . -type f -mmin +1

# files modified exactly 1 minute ago
find . -type f -mmin 1

# files modified exactly 10 days ago (-mtime counts in days)
find . -type f -mtime 10

# files modified more than 10 days ago
find . -type f -mtime +10

# files modified less than 10 days ago
find . -type f -mtime -10

# find files and folders modified from a given date up to the latest date
find . -type f -newermt "2020-11-17"

# find *.sh files accessed in the last 30 days
find . -type f -iname "*.sh" -atime -30

# find files whose status changed in the last day (-ctime -1 means less than 1 day)
find . -type f -ctime -1 -ls

Better use the exec action than the for loop:

find . -path "./dirtoexclude" -prune \
    -o -exec java -jar config/yuicompressor-2.4.2.jar --type js '{}' -o '{}' \;

The -exec ... '{}' ... '{}' \; action will be executed once for every matching file, replacing the braces '{}' with the current file name.

Notice that the braces are enclosed in single quote marks to protect them from interpretation as shell script punctuation*.


Notes

* From the EXAMPLES section of the find (GNU findutils) 4.4.2 man page


I consider myself a bash junkie, BUT ... for the last 2 years I have not found a single user-friendly bash solution for this. By "user-friendly" I mean just a single call away, one which does not require me to remember complicated syntax and lets me use the same find syntax as before. So the following solution works best for that:

Copy-paste this into your shell and then source ~/.bash_aliases:

cat << "EOF" >> ~/.bash_aliases
# usage: source ~/.bash_aliases , instead of find type findd + rest of syntax
findd(){
   dir="$1"; shift
   find "$dir" -not -path "*/node_modules/*" -not -path "*/build/*" \
      -not -path "*/.cache/*" -not -path "*/.git/*" -not -path "*/venv/*" "$@"
}
EOF

Of course, in order to add or remove excluded dirs, you have to edit this function with your dirs of choice.
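Usage then looks just like plain find; for example (illustrative):

source ~/.bash_aliases
findd . -type f -name '*.js'      # same search as before, minus the excluded dirs
findd src -iname '*.md' -mtime -7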


find . -name '*.js' -\! -name 'glob-for-excluded-dir' -prune

how-to-use-prune-option-of-find-in-sh is an excellent answer by Laurence Gonsalves on how -prune works.

And here is the generic solution:

find /path/to/search                    \
  -type d                               \
    \( -path /path/to/search/exclude_me \
       -o                               \
       -name exclude_me_too_anywhere    \
     \)                                 \
    -prune                              \
  -o                                    \
  -type f -name '*\.js' -print

To avoid typing /path/to/search/ multiple times, wrap the find in a pushd .. popd pair.

pushd /path/to/search;                  \
find .                                  \
  -type d                               \
    \( -path ./exclude_me               \
       -o                               \
       -name exclude_me_too_anywhere    \
     \)                                 \
    -prune                              \
  -o                                    \
  -type f -name '*\.js' -print;         \
 popd

This is the format I used to exclude some paths:

$ find ./ -type f -name "pattern" ! -path "excluded path" ! -path "excluded path"

I used this to find all files not in ".*" paths:

$ find ./ -type f -name "*" ! -path "./.*" ! -path "./*/.*"

I found the function name in C source files, excluding *.o files, *.swp files, non-regular files, and the output directory, with this command:

find .  \( ! -path "./output/*" \) -a \( -type f \) -a \( ! -name '*.o' \) -a \( ! -name '*.swp' \) | xargs grep -n soc_attach

-prune definitely works and is the best answer because it prevents descending into the dir that you want to exclude. -not -path still searches the excluded dir; it just doesn't print the result, which could be an issue if the excluded dir is a mounted network volume or you don't have permissions.

The tricky part is that find is very particular about the order of the arguments, so if you don't get them just right, your command may not work. The order of arguments is generally as such:

find {path} {options} {action}

{path}: Put all the path related arguments first, like . -path './dir1' -prune -o

{options}: I have the most success when putting -name, -iname, etc as the last option in this group. E.g. -type f -iname '*.js'

{action}: You'll want to add -print when using -prune

Here's a working example:

# setup test
mkdir dir1 dir2 dir3
touch dir1/file.txt; touch dir1/file.js
touch dir2/file.txt; touch dir2/file.js
touch dir3/file.txt; touch dir3/file.js

# search for *.js, exclude dir1
find . -path './dir1' -prune -o -type f -iname '*.js' -print

# search for *.js, exclude dir1 and dir2
find . \( -path './dir1' -o -path './dir2' \) -prune -o -type f -iname '*.js' -print

Use the -prune option. So, something like:

find . -type d -name proc -prune -o -name '*.js'

The '-type d -name proc -prune' part matches only directories named proc, in order to exclude them.
The '-o' is an 'OR' operator.
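Note that, as written, the command also prints the pruned proc directory itself; appending -print to the second branch suppresses that (consistent with the other answers here):

find . -type d -name proc -prune -o -name '*.js' -print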


There are plenty of good answers; it just took me some time to understand what each element of the command was for and the logic behind it.

find . -path ./misc -prune -o -name '*.txt' -print

find will start finding files and directories in the current directory, hence the find ..

The -o option stands for a logical OR and separates the two parts of the command :

[ -path ./misc -prune ] OR [ -name '*.txt' -print ]

Any directory or file that is not the ./misc directory will not pass the first test -path ./misc. But they will be tested against the second expression. If their name corresponds to the pattern *.txt they get printed, because of the -print option.

When find reaches the ./misc directory, this directory only satisfies the first expression. So the -prune option will be applied to it. It tells the find command to not explore that directory. So any file or directory in ./misc will not even be explored by find, will not be tested against the second part of the expression and will not be printed.


I have found the suggestions on this page and a lot of other pages just do not work on my Mac OS X system. However, I have found a variation which does work for me.

The big idea is to search the Macintosh HD but avoid traversing all the external volumes, which are mostly Time Machine backups, image backups, mounted shares, and archives, but without having to unmount them all, which is often impractical.

Here is my working script, which I have named "findit".

#!/usr/bin/env bash
# inspired by http://stackoverflow.com/questions/4210042/exclude-directory-from-find-command Daniel C. Sobral
# using special syntax to avoid traversing. 
# However, logic is refactored because the Sobral version still traverses 
# everything on my system

echo ============================
echo find - from cwd, omitting external volumes
date
echo Enter sudo password if requested
sudo find . -not \( \
-path ./Volumes/Archive -prune -o \
-path ./Volumes/Boot\ OS\ X -prune -o \
-path ./Volumes/C -prune -o \
-path ./Volumes/Data -prune -o \
-path ./Volumes/jas -prune -o \
-path ./Volumes/Recovery\ HD -prune -o \
-path ./Volumes/Time\ Machine\ Backups -prune -o \
-path ./Volumes/SuperDuper\ Image -prune -o \
-path ./Volumes/userland -prune \
\) -name "$1" -print
date
echo ============================

The various paths have to do with external archive volumes, Time Machine, Virtual Machines, other mounted servers, and so on. Some of the volume names have spaces in them.

A good test run is "findit index.php", because that file occurs in many places on my system. With this script, it takes about 10 minutes to search the main hard drive. Without those exclusions, it takes many hours.


For FreeBSD users:

 find . -name '*.js' -not -path '*exclude/this/dir*'

If -prune doesn't work for you, this will:

find -name "*.js" -not -path "./directory/*"

Caveat: requires traversing all of the unwanted directories.


For those of you on older versions of UNIX who cannot use -path or -not

Tested on SunOS 5.10 bash 3.2 and SunOS 5.11 bash 4.4

find . -type f -name "*" -o -type d -name "*excluded_directory*" -prune -type f

One option would be to exclude all results that contain the directory name with grep. For example:

find . -name '*.js' | grep -v excludeddir

You can also use regular expressions to include/exclude files and dirs in your search, using something like this:

find . -regextype posix-egrep -regex ".*\.(js|vue|s?css|php|html|json)$" -and -not -regex ".*/(node_modules|vendor)/.*" 

This will give you all the js, vue, css, etc. files, but exclude everything in the node_modules and vendor folders.


I prefer the -not notation ... it's more readable:

find . -name '*.js' -and -not -path directory

A good trick for avoiding printing the pruned directories is to put -print (this works for -exec as well) at the end of the right-hand side of the -or that follows -prune. For example, ...

find . -path "*/.*" -prune -or -iname "*.j2"

will print the path of every file beneath the current directory with the ".j2" extension, skipping all hidden directories. Neat. But it will also print the full path of each directory being skipped, as noted above. However, the following does not, ...

find . -path "*/.*" -prune -or -iname "*.j2" -print

because logically there's a hidden -and after the -iname operator and before the -print. This binds it to the right part of the -or clause due to boolean order of operations and associativity. But the docs say there's a hidden -print if it (or any of its cousins ... -print0, etc.) is not specified. So why isn't the left part of the -or printing? Apparently (and I didn't understand this from my first reading of the man page), that is true only if there is no -print or -exec ANYWHERE, in which case -print is logically sprinkled around such that everything gets printed. If even ONE print-style operation is expressed in any clause, all those hidden logical ones go away and you get only what you specify. Now frankly, I might have preferred it the other way around, but then a find with only descriptive operators would apparently do nothing, so I guess it makes sense as it is. As mentioned above, this all works with -exec as well, so the following gives a full ls -la listing for each file with the desired extension, but does not list the first level of each hidden directory, ...

find . -path "*/.*" -prune -or -iname "*.j2" -exec ls -la -- {} +

For me (and others on this thread), find syntax gets pretty baroque pretty quickly, so I always throw in parens to make SURE I know what binds to what, so I usually create a macro for type-ability and form all such statements as ...

find . \( \( ... description of stuff to avoid ... \) -prune \) -or \
\( ... description of stuff I want to find ... [ -exec or -print] \)
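As a concrete (purely illustrative) instance of that template, skipping .git and node_modules while printing the *.j2 files:

find . \( \( -path '*/.git' -o -path '*/node_modules' \) -prune \) -or \
  \( -type f -iname '*.j2' -print \)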

It's hard to go wrong by splitting the world into two parts this way. I hope this helps, though it seems unlikely that anyone will read down to the 30+th answer and vote it up, but one can hope. :-)


The following command works:

find . -path ./.git -prune -o -print

If you have a problem with find, use the -D tree option to view the expression analysis information.

find -D tree . -path ./.git -prune -o -print

Or the -D all, to see all the execution information.

find -D all . -path ./.git -prune -o -print

I wanted to know the number of directories, the number of files and the size in MB of just the current directory, and this code does exactly what I want :-)

the source

- ...    2791037 Jun  2  2011 foo.jpg
- ... 1284734651 Mär 10 16:16 foo.tar.gz
- ...          0 Mär 10 15:28 foo.txt
d ...       4096 Mär  3 17:12 HE
d ...       4096 Mär  3 17:21 KU
d ...       4096 Mär  3 17:17 LE
d ...          0 Mär  3 17:14 NO
d ...          0 Mär  3 17:15 SE
d ...          0 Mär  3 17:13 SP
d ...          0 Mär  3 17:14 TE
d ...          0 Mär  3 19:20 UN

the code

format="%s%'12d\n"

find . -type d -not -path "./*/*" | wc -l | awk -v fmt="$format" '{printf fmt, " Anzahl Ordner  = ", $1-1}'
find . -type f -not -path "./*/*" | wc -l | awk -v fmt="$format" '{printf fmt, " Anzahl Dateien = ", $1}'
du . -hmS --max-depth=0 | awk -v fmt="$format" '{printf fmt, " Groesse (MB)   = ", $1}'

note: the extra format="%s%'12d\n" is necessary for awk to format the numbers.

the result

Anzahl Ordner  =            8
Anzahl Dateien =            3
Groesse (MB)   =        1.228

You can use the prune option to achieve this. As in for example:

find ./ -path "./beta/*" -prune -o -iname example.com -print

Or the inverse grep “grep -v” option:

find -iname example.com | grep -v beta

You can find detailed instructions and examples in Linux find command exclude directories from searching.


The -path -prune approach also works with wildcards in the path. Here is a find statement that will find the directories for a git server serving multiple git repositories, leaving out the git-internal directories:

find . -type d \
   -not \( -path '*/objects' -prune \) \
   -not \( -path '*/branches' -prune \) \
   -not \( -path '*/refs' -prune \) \
   -not \( -path '*/logs' -prune \) \
   -not \( -path '*/.git' -prune \) \
   -not \( -path '*/info' -prune \) \
   -not \( -path '*/hooks' -prune \)

find . \( -path '.**/.git' -o -path '.**/.hg' \) -prune -o -name '*.js' -print

The example above finds all *.js files under the current directory, excluding the folders .git and .hg, no matter how deep these .git and .hg folders are.

Note: this also works:

find . \( -path '.*/.git' -o -path '.*/.hg' \) -prune -o -name '*.js' -print

but I prefer the ** notation for consistency with some other tools which would be off topic here.


Instead of:

for file in $(find . -name '*.js')
do 
  java -jar config/yuicompressor-2.4.2.jar --type js $file -o $file
done

...and since you don't define which subdirectory you want to exclude, you could use:

for file in $(find *.js -maxdepth 0 -name '*.js')
do 
  java -jar config/yuicompressor-2.4.2.jar --type js $file -o $file
done

This syntax will exclude all subdirectories.

Take a look at the example below: under my tmp directory I have a huge "archive" subdirectory which contains 17000-4640=12360 files, and this directory is located on a slow NFS mount. While the 1st syntax scans the "archive" subdirectory and performs poorly, the 2nd syntax only scans the "*pdf" files contained in my current dir and performs... not that badly.

[tmp]$ time (find . -name "*pdf" | wc -l)
17000

real    0m40.479s
user    0m0.423s
sys     0m5.606s

[tmp]$ time (find *pdf -maxdepth 0 -name "*pdf" | wc -l)
4640

real    0m7.778s
user    0m0.113s
sys     0m1.136s

That 2nd syntax is quite interesting: in the following example I want to check if file or60runm50958.pdf exists and is more than 20 minutes old. See for yourself how the 2nd syntax is more efficient. This is because it avoids scanning the archive subdirectory.

[tmp]$ time find . -name or60runm50958.pdf -mmin +20
./or60runm50958.pdf

real    0m51.145s
user    0m0.529s
sys     0m6.243s

[tmp]$ time find or60runm50958.pdf -maxdepth 0 -name or60runm50958.pdf -mmin +20
or60runm50958.pdf

real    0m0.004s
user    0m0.000s
sys     0m0.002s

I tried the commands above, but none of those using "-prune" worked for me. Eventually I got it working with the command below:

find . \( -name "*" \) -prune -a ! -name "directory"

I find the following easier to reason about than other proposed solutions:

find build -not \( -path build/external -prune \) -name \*.js
# you can also exclude multiple paths
find build -not \( -path build/external -prune \) -not \( -path build/blog -prune \) -name \*.js

Important Note: the paths you type after -path must exactly match what find would print without the exclusion. If this sentence confuses you, just make sure to use full paths throughout the whole command, like this: find /full/path/ -not \( -path /full/path/exclude/this -prune \) .... See note [1] if you'd like a better understanding.

Inside \( and \) is an expression that will match exactly build/external (see important note above), and will, on success, avoid traversing anything below. This is then grouped as a single expression with the escaped parenthesis, and prefixed with -not which will make find skip anything that was matched by that expression.

One might ask whether adding -not will make all other files hidden by -prune reappear, and the answer is no. The way -prune works is that once a path matching it is reached, the files below that directory are permanently ignored.

This comes from an actual use case, where I needed to call yui-compressor on some files generated by wintersmith, but leave out other files that need to be sent as-is.


Note [1]: If you want to exclude /tmp/foo/bar and you run find like this "find /tmp \(..." then you must specify -path /tmp/foo/bar. If on the other hand you run find like this cd /tmp; find . \(... then you must specify -path ./foo/bar.


This is the only one that worked for me.

find / -name MyFile ! -path '*/Directory/*'

Searching for "MyFile" excluding "Directory". Give emphasis to the stars * .


There is clearly some confusion here as to what the preferred syntax for skipping a directory should be.

GNU Opinion

To ignore a directory and the files under it, use -prune

From the GNU find man page

Reasoning

-prune stops find from descending into a directory. Just specifying -not -path will still descend into the skipped directory, but -not -path will be false whenever find tests each file.

Issues with -prune

-prune does what it's intended to do, but there are still some things you have to take care of when using it.

  1. find prints the pruned directory.

    • TRUE. That's intended behavior; it just doesn't descend into the directory. To avoid printing the directory altogether, use a syntax that logically omits it.
  2. -prune only works with -print and no other actions.

    • NOT TRUE. -prune works with any action except -delete (see the sketch after this list). Why doesn't it work with -delete? For -delete to work, find needs to traverse the directory in DFS order, since -delete will first delete the leaves, then the parents of the leaves, etc. But for specifying -prune to make sense, find needs to hit a directory and stop descending it, which clearly makes no sense with -depth or -delete on.
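For instance, to remove files while still pruning a directory, one workaround (a sketch; dirtoexclude and *.tmp are placeholders) is to use -exec rm instead of -delete:

find . -path ./dirtoexclude -prune -o -type f -name '*.tmp' -exec rm -f {} +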

Performance

I set up a simple test of the three top upvoted answers on this question (replaced -print with -exec bash -c 'echo $0' {} \; to show another action example). Results are below

----------------------------------------------
# of files/dirs in level one directories
.performance_test/prune_me     702702    
.performance_test/other        2         
----------------------------------------------

> find ".performance_test" -path ".performance_test/prune_me" -prune -o -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 23513814

> find ".performance_test" -not \( -path ".performance_test/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 10670141

> find ".performance_test" -not -path ".performance_test/prune_me*" -exec bash -c 'echo "$0"' {} \;
.performance_test
.performance_test/other
.performance_test/other/foo
  [# of files] 3 [Runtime(ns)] 864843145

Conclusion

Both f10bit's syntax and Daniel C. Sobral's syntax took 10-25ms to run on average. GetFree's syntax, which doesn't use -prune, took 865ms. So, yes this is a rather extreme example, but if you care about run time and are doing anything remotely intensive you should use -prune.

Note: Daniel C. Sobral's syntax performed the better of the two -prune syntaxes, but I strongly suspect this is the result of some caching, as switching the order in which the two ran produced the opposite result, while the non-prune version was always slowest.

Test Script

#!/bin/bash

dir='.performance_test'

setup() {
  mkdir "$dir" || exit 1
  mkdir -p "$dir/prune_me/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p/q/r/s/t/u/w/x/y/z" \
    "$dir/other"

  find "$dir/prune_me" -depth -type d -exec mkdir '{}'/{A..Z} \;
  find "$dir/prune_me" -type d -exec touch '{}'/{1..1000} \;
  touch "$dir/other/foo"
}

cleanup() {
  rm -rf "$dir"
}

stats() {
  for file in "$dir"/*; do
    if [[ -d "$file" ]]; then
      count=$(find "$file" | wc -l)
      printf "%-30s %-10s\n" "$file" "$count"
    fi
  done
}

name1() {
  find "$dir" -path "$dir/prune_me" -prune -o -exec bash -c 'echo "$0"'  {} \;
}

name2() {
  find "$dir" -not \( -path "$dir/prune_me" -prune \) -exec bash -c 'echo "$0"' {} \;
}

name3() {
  find "$dir" -not -path "$dir/prune_me*" -exec bash -c 'echo "$0"' {} \;
}

printf "Setting up test files...\n\n"
setup
echo "----------------------------------------------"
echo "# of files/dirs in level one directories"
stats | sort -k 2 -n -r
echo "----------------------------------------------"

printf "\nRunning performance test...\n\n"

echo \> find \""$dir"\" -path \""$dir/prune_me"\" -prune -o -exec bash -c \'echo \"\$0\"\'  {} \\\;
name1
s=$(date +%s%N)
name1_num=$(name1 | wc -l)
e=$(date +%s%N)
name1_perf=$((e-s))
printf "  [# of files] $name1_num [Runtime(ns)] $name1_perf\n\n"

echo \> find \""$dir"\" -not \\\( -path \""$dir/prune_me"\" -prune \\\) -exec bash -c \'echo \"\$0\"\' {} \\\;
name2
s=$(date +%s%N)
name2_num=$(name2 | wc -l)
e=$(date +%s%N)
name2_perf=$((e-s))
printf "  [# of files] $name2_num [Runtime(ns)] $name2_perf\n\n"

echo \> find \""$dir"\" -not -path \""$dir/prune_me*"\" -exec bash -c \'echo \"\$0\"\' {} \\\;
name3
s=$(date +%s%N)
name3_num=$(name3 | wc -l)
e=$(date +%s%N)
name3_perf=$((e-s))
printf "  [# of files] $name3_num [Runtime(ns)] $name3_perf\n\n"

echo "Cleaning up test files..."
cleanup

Not sure if this covers all edge cases, but the following is pretty straightforward and simple to try:

ls -1|grep -v -e ddl -e docs| xargs rm -rf

This should remove all files/directories from the current directory except those matching 'ddl' and 'docs'.
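A rough find-based equivalent (just a sketch; note that ! -name matches names exactly, unlike the substring match grep -v performs above) avoids parsing ls output and copes with unusual file names:

find . -mindepth 1 -maxdepth 1 ! -name ddl ! -name docs -exec rm -rf {} +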


find -name '*.js' -not -path './node_modules/*' -not -path './vendor/*'

seems to work the same as

find -name '*.js' -not \( -path './node_modules/*' -o -path './vendor/*' \)

and is easier to remember IMO.


TLDR: understand your root directories and tailor your search from there, using the -path <excluded_path> -prune -o option. Do not include a trailing / at the end of the excluded path.

Example:

find / -path /mnt -prune -o -name "*libname-server-2.a*" -print


To effectively use find I believe it is imperative to have a good understanding of your file system directory structure. On my home computer I have multi-TB hard drives, with about half of that content backed up using rsnapshot (i.e., rsync). Although backed up to a physically independent (duplicate) drive, it is mounted under my system root (/) directory: /mnt/Backups/rsnapshot_backups/:

/mnt/Backups/
+-- rsnapshot_backups/
    +-- hourly.0/
    +-- hourly.1/
    +-- ...
    +-- daily.0/
    +-- daily.1/
    +-- ...
    +-- weekly.0/
    +-- weekly.1/
    +-- ...
    +-- monthly.0/
    +-- monthly.1/
    +-- ...

The /mnt/Backups/rsnapshot_backups/ directory currently occupies ~2.9 TB, with ~60M files and folders; simply traversing those contents takes time:

## As sudo (#), to avoid numerous "Permission denied" warnings:

time find /mnt/Backups/rsnapshot_backups | wc -l
60314138    ## 60.3M files, folders
34:07.30    ## 34 min

time du /mnt/Backups/rsnapshot_backups -d 0
3112240160  /mnt/Backups/rsnapshot_backups    ## 3.1 TB
33:51.88    ## 34 min

time rsnapshot du    ## << more accurate re: rsnapshot footprint
2.9T    /mnt/Backups/rsnapshot_backups/hourly.0/
4.1G    /mnt/Backups/rsnapshot_backups/hourly.1/
...
4.7G    /mnt/Backups/rsnapshot_backups/weekly.3/
2.9T    total    ## 2.9 TB, per sudo rsnapshot du (more accurate)
2:34:54          ## 2 hr 35 min

Thus, anytime I need to search for a file on my / (root) partition, I need to deal with (avoid if possible) traversing my backups partition.


EXAMPLES

Among the approaches variously suggested in this thread (How to exclude a directory in find . command), I find that searches using the accepted answer are much faster -- with caveats.

Solution 1

Let's say I want to find the system file libname-server-2.a, but I do not want to search through my rsnapshot backups. To quickly find a system file, use the exclude path /mnt (i.e., use /mnt, not /mnt/, or /mnt/Backups, or ...):

## As sudo (#), to avoid numerous "Permission denied" warnings:

time find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
/usr/lib/libname-server-2.a
real    0m8.644s              ## 8.6 sec  <<< NOTE!
user    0m1.669s
 sys    0m2.466s

## As regular user (victoria); I also use an alternate timing mechanism, as
## here I am using 2>/dev/null to suppress "Permission denied" warnings:

$ START="$(date +"%s")" && find 2>/dev/null / -path /mnt -prune -o \
    -name "*libname-server-2.a*" -print; END="$(date +"%s")"; \
    TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
/usr/lib/libname-server-2.a
find command took 3 sec     ## ~3 sec  <<< NOTE!

... finds that file in just a few seconds, while this takes much longer (appearing to recurse through all of the "excluded" directories):

## As sudo (#), to avoid numerous "Permission denied" warnings:

time find / -path /mnt/ -prune -o -name "*libname-server-2.a*" -print
find: warning: -path /mnt/ will not match anything because it ends with /.
/usr/lib/libname-server-2.a
real    33m10.658s            ## 33 min 11 sec (~231-663x slower!)
user    1m43.142s
 sys    2m22.666s

## As regular user (victoria); I also use an alternate timing mechanism, as
## here I am using 2>/dev/null to suppress "Permission denied" warnings:

$ START="$(date +"%s")" && find 2>/dev/null / -path /mnt/ -prune -o \
    -name "*libname-server-2.a*" -print; END="$(date +"%s")"; \
    TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
/usr/lib/libname-server-2.a
find command took 1775 sec    ## 29.6 min

Solution 2

The other solution offered in this thread (SO#4210042) also performs poorly:

## As sudo (#), to avoid numerous "Permission denied" warnings:

time find / -name "*libname-server-2.a*" -not -path "/mnt"
/usr/lib/libname-server-2.a
real    33m37.911s            ## 33 min 38 sec (~235x slower)
user    1m45.134s
 sys    2m31.846s

time find / -name "*libname-server-2.a*" -not -path "/mnt/*"
/usr/lib/libname-server-2.a
real    33m11.208s            ## 33 min 11 sec
user    1m22.185s
 sys    2m29.962s

SUMMARY | CONCLUSIONS

Use the approach illustrated in "Solution 1"

find / -path /mnt -prune -o -name "*libname-server-2.a*" -print

i.e.

... -path <excluded_path> -prune -o ...

noting that whenever you add the trailing / to the excluded path, the find command then recursively enters (all those) /mnt/* directories -- which in my case, because of the /mnt/Backups/rsnapshot_backups/* subdirectories, additionally includes ~2.9 TB of files to search! By not appending a trailing / the search should complete almost immediately (within seconds).

"Solution 2" (... -not -path <exclude path> ...) likewise appears to recursively search through the excluded directories -- not returning excluded matches, but unnecessarily consuming that search time.


Searching within those rsnapshot backups:

To find a file in one of my hourly/daily/weekly/monthly rsnapshot backups:

$ START="$(date +"%s")" && find 2>/dev/null /mnt/Backups/rsnapshot_backups/daily.0 -name '*04t8ugijrlkj.jpg'; END="$(date +"%s")"; TIME="$((END - START))"; printf 'find command took %s sec\n' "$TIME"
/mnt/Backups/rsnapshot_backups/daily.0/snapshot_root/mnt/Vancouver/temp/04t8ugijrlkj.jpg
find command took 312 sec   ## 5.2 minutes: despite apparent rsnapshot size
                            ## (~4 GB), it is in fact searching through ~2.9 TB)

Excluding a nested directory:

Here, I want to exclude a nested directory, e.g. /mnt/Vancouver/projects/ie/claws/data/* when searching from /mnt/Vancouver/projects/:

$ time find . -iname '*test_file*'
./ie/claws/data/test_file
./ie/claws/test_file
0:01.97

$ time find . -path '*/data' -prune -o -iname '*test_file*' -print
./ie/claws/test_file
0:00.07

Aside: Adding -print at the end of the command suppresses the printout of the excluded directory:

$ find / -path /mnt -prune -o -name "*libname-server-2.a*"
/mnt
/usr/lib/libname-server-2.a

$ find / -path /mnt -prune -o -name "*libname-server-2.a*" -print
/usr/lib/libname-server-2.a

I was using find to provide a list of files for xgettext, and wanted to omit a specific directory and its contents. I tried many permutations of -path combined with -prune but was unable to fully exclude the directory which I wanted gone.

Although I was able to ignore the contents of the directory which I wanted ignored, find then returned the directory itself as one of the results, which caused xgettext to crash as a result (doesn't accept directories; only files).

My solution was to simply use grep -v to skip the directory that I didn't want in the results:

find /project/directory -iname '*.php' -or -iname '*.phtml' | grep -iv '/some/directory' | xargs xgettext

Whether or not there is an argument for find that will work 100%, I cannot say for certain. Using grep was a quick and easy solution after some headache.


This works because find TESTS the files for the pattern "*foo*":

find ! -path "dir1" ! -path "dir2" -name "*foo*"

but it does NOT work if you don't use a pattern (find then does not TEST the file), so find makes no use of the previously evaluated "true" & "false" booleans. Example of a non-working use case with the above notation:

find ! -path "dir1" ! -path "dir2" -type f

There is no find TESTING! So if you need to find files without any pattern matching, use -prune. Also, with -prune find is always faster because it really skips those directories instead of descending into them and then discarding the matches. So in that case use something like:

find dir -not \( -path "dir1" -prune \) -not \( -path "dir2" -prune \) -type f

or:

find dir -not \( -path "dir1" -o -path "dir2" -prune \) -type f

Regards


For a working solution (tested on Ubuntu 12.04 (Precise Pangolin))...

find ! -path "dir1" -iname "*.mp3"

will search for MP3 files in the current folder and subfolders except in dir1 subfolder.

Use:

find ! -path "dir1" ! -path "dir2" -iname "*.mp3"

...to exclude dir1 AND dir2


This is suitable for me on a Mac:

find . -name '*.php' -or -path "./vendor" -prune -or -path "./app/cache" -prune

It excludes the vendor and app/cache dirs when searching for names ending in .php.


None of the previous answers worked for me on Ubuntu. Try this:

find . ! -path "*/test/*" -type f -name "*.js" ! -name "*-min-*" ! -name "*console*"

I found this here.


If the directories to search follow a pattern (as mine do most of the time), you can simply do it like below:

find ./n* -name "*.tcl" 

In the above example, it searches in all the sub-directories starting with "n".

