[bash] How to loop through file names returned by find?

x=$(find . -name "*.txt")
echo $x

If I run the code above in a Bash shell, I get a single string containing the file names separated by blanks, not a list.

Of course, I could split that string on blanks to get a list, but I'm sure there is a better way to do it.

So what is the best way to loop through the results of a find command?


The answer is


If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:

readarray -t x < <(find . -name '*.txt')

Note:

  • -t causes readarray to strip newlines.
  • The array won't be populated if readarray is the last command of a pipeline, because it then runs in a subshell; hence the process substitution.
  • readarray is available since Bash 4.

Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character instead of newline to delimit the file names also works in the rare case that the file names contain newlines:

readarray -d '' x < <(find . -name '*.txt' -print0)

readarray can also be invoked as mapfile with the same options.

Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream
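
To see the difference between the two forms, here is a minimal sketch. It builds a throwaway directory (the directory and file names are made up for illustration) with one file whose name contains a newline, then loads the find output both ways. It assumes Bash 4.4 or later and a find that supports -print0.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch 'plain.txt' 'with space.txt' $'with\nnewline.txt'

# Newline-delimited: the name containing a newline is split into two entries.
readarray -t by_line < <(find . -name '*.txt')

# NUL-delimited: every entry is exactly one file name.
readarray -d '' by_nul < <(find . -name '*.txt' -print0)

echo "newline-delimited entries: ${#by_line[@]}"
echo "NUL-delimited entries:     ${#by_nul[@]}"
```

With three files on disk, the newline-delimited array ends up with four entries, while the NUL-delimited array has exactly three.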


TL;DR: If you're just here for the most correct answer, you probably want my personal preference, find . -name '*.txt' -exec process {} \; (see the bottom of this post). If you have time, read through the rest to see several different ways and the problems with most of them.


The full answer:

The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:

for i in $x; do # Not recommended, will break on whitespace
    process "$i"
done

Marginally better, cut out the temporary variable x:

for i in $(find . -name \*.txt); do # Not recommended, will break on whitespace
    process "$i"
done

It is much better to glob when you can. White-space safe, for files in the current directory:

for i in *.txt; do # Whitespace-safe but not recursive.
    process "$i"
done

By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:

# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
    process "$i"
done
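
As a quick check of the recursive glob, the following sketch creates a small nested tree (all directory and file names are hypothetical) and collects the matches in an array. It assumes Bash 4 or later, where globstar is available. Hidden files and directories are not matched unless dotglob is also set.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
mkdir -p 'sub dir/deeper'
touch 'top.txt' 'sub dir/two words.txt' 'sub dir/deeper/c.txt' 'sub dir/skip.log'

shopt -s globstar   # let ** match across directory levels

matches=()
for f in **/*.txt; do
    matches+=("$f")  # each element is one complete path, spaces included
done

printf '%s\n' "${matches[@]}"
```

The three .txt files are found at every depth, including top.txt in the starting directory, and the .log file is skipped.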

In some cases, e.g. if the file names are already in a file, you may need to use read:

# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
    process "$line"
done < filename

read can be used safely in combination with find by setting the delimiter appropriately:

find . -name '*.txt' -print0 | 
    while IFS= read -r -d '' line; do 
        process "$line"
    done

For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:

# execute `process` once for each file
find . -name \*.txt -exec process {} \;

# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +

# using xargs*
find . -name \*.txt -print0 | xargs -0 process

# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument

find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).

*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length limit but need them all on the same command line, you're out of luck and will have to find a different way.
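
The difference between the per-file and the batched forms can be observed by counting invocations. In this sketch the scratch directory and file names are hypothetical, and sh -c 'echo "$#"' stands in for the process command: it prints one line per invocation, giving the number of file arguments it received.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch a.txt 'b c.txt' d.txt

# "\;" form: one invocation per file, so one output line per file.
runs_single=$(find . -name '*.txt' -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# "+" form: the files are batched; three short names fit into one invocation.
runs_batched=$(find . -name '*.txt' -exec sh -c 'echo "$#"' sh {} + | wc -l)

# xargs -0 batches in the same way.
runs_xargs=$(find . -name '*.txt' -print0 | xargs -0 sh -c 'echo "$#"' sh | wc -l)

echo "one per file: $runs_single, batched: $runs_batched, xargs: $runs_xargs"
```

The extra sh after the quoted script fills $0, so that all file names are counted in $#.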


Filenames can include spaces and even control characters. Spaces are among the default delimiters for word splitting in Bash, so x=$(find . -name "*.txt") from the question is not recommended at all. If find returns a filename with spaces, e.g. "the file.txt", you will get two separate strings when you process x in a loop. You can improve this by changing the delimiter (the Bash IFS variable), e.g. to \r\n, but filenames can include control characters too, so this is not a completely safe method.

From my point of view, there are 2 recommended (and safe) patterns for processing files:

1. Use for loop & filename expansion:

for file in ./*.txt; do
    [[ ! -e $file ]] && continue  # continue, if file does not exist
    # single filename is in $file
    echo "$file"
    # your code here
done

2. Use find-read-while & process substitution

while IFS= read -r -d '' file; do
    # single filename is in $file
    echo "$file"
    # your code here
done < <(find . -name "*.txt" -print0)

Remarks

on Pattern 1:

  1. Bash returns the search pattern itself ("*.txt") if no matching file is found, so the extra line "continue, if file does not exist" is needed. See Bash Manual, Filename Expansion.
  2. The shell option nullglob can be used to avoid this extra line.
  3. "If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
  4. shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." see Bash Manual, Shopt Builtin
  5. other options for filename expansion: extglob, nocaseglob, dotglob & shell variable GLOBIGNORE
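
The effect described in remarks 1 and 2 can be checked in an empty directory. This is a minimal sketch (the scratch directory and variable names are made up) that counts loop iterations with and without nullglob, assuming failglob is not set.

```shell
#!/usr/bin/env bash
# Hypothetical empty scratch directory: nothing in it matches *.txt.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

# Default behaviour: the unmatched pattern is passed through literally,
# so the loop body runs once with file='./*.txt'.
shopt -u nullglob
default_iterations=0
for file in ./*.txt; do
    default_iterations=$((default_iterations + 1))
done

# With nullglob the pattern expands to nothing and the body never runs.
shopt -s nullglob
nullglob_iterations=0
for file in ./*.txt; do
    nullglob_iterations=$((nullglob_iterations + 1))
done
shopt -u nullglob   # restore the default for any later code

echo "default: $default_iterations, nullglob: $nullglob_iterations"
```

Since nullglob changes how every unmatched pattern in the shell expands, it is usually worth switching it off again after the loop, as done here.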

on Pattern 2:

  1. Filenames can contain spaces, tabs, newlines and other control characters. To process filenames safely, find is used with -print0: each filename is printed with all its control characters intact and terminated with NUL. See also the GNU Findutils manual sections on unsafe file name handling, safe file name handling and unusual characters in file names, and David A. Wheeler's detailed discussion of this topic.

  2. There are several possible patterns for processing find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:

    files_found=1
    find . -name "*.txt" -print0 | 
       while IFS= read -r -d '' file; do
           # single filename in $file
           echo "$file"
           files_found=0   # not working example
           # your code here
       done
    [[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"
    

    When you try this piece of code, you will see that it does not work: files_found is still 1 after the loop, so the code always echoes "no files found". The reason is that each command of a pipeline is executed in a separate subshell, so a variable changed inside the loop (in the subshell) does not change the variable in the main shell script. This is why I recommend process substitution as the better, more useful, more general pattern.
    See "I set variables in a loop that's in a pipeline. Why do they disappear..." (from Greg's Bash FAQ) for a detailed discussion of this topic.
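
The two variants can be compared side by side. This sketch uses a hypothetical scratch directory and counter names, and assumes Bash's default settings (in particular, that the lastpipe option is off).

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch a.txt 'b c.txt'

# Pipeline: the loop runs in a subshell, so the increments are lost.
piped_count=0
find . -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do
        piped_count=$((piped_count + 1))
    done

# Process substitution: the loop runs in the current shell.
direct_count=0
while IFS= read -r -d '' file; do
    direct_count=$((direct_count + 1))
done < <(find . -name '*.txt' -print0)

echo "pipeline: $piped_count, process substitution: $direct_count"
```

With two files, the pipeline variant leaves its counter at 0, while the process-substitution variant reports 2.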



(Updated to include @Socowi's excellent speed improvement)

With any $SHELL that supports it (dash/zsh/bash...):

# The extra "$SHELL" after the script fills $0, so every file name lands in "$@".
find . -name "*.txt" -exec "$SHELL" -c '
    for i in "$@" ; do
        echo "$i"
    done
' "$SHELL" {} +

Done.
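
One detail worth checking with this pattern: in sh -c 'script' arg0 arg1 ..., the first argument after the script becomes $0, and "$@" starts at the one after it. The sketch below (hypothetical scratch directory and file names, with sh standing in for $SHELL) counts how many names the inner loop sees with and without a placeholder for $0.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch a.txt b.txt c.txt

# No placeholder: the first file name becomes $0 and is not part of "$@".
without_placeholder=$(find . -name '*.txt' -exec sh -c '
    for i in "$@"; do echo "$i"; done
' {} + | wc -l)

# With a placeholder ("sh" here), every file name lands in "$@".
with_placeholder=$(find . -name '*.txt' -exec sh -c '
    for i in "$@"; do echo "$i"; done
' sh {} + | wc -l)

echo "without placeholder: $without_placeholder, with placeholder: $with_placeholder"
```

With three files, the loop without a placeholder prints only two names per batch; with the placeholder it prints all three.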


Original answer (shorter, but slower):

find . -name "*.txt" -exec $SHELL -c '
    echo "$0"
' {} \;

Based on other answers and a comment by @phk, using fd 3
(which still allows stdin to be used inside the loop):

while IFS= read -r f <&3; do
    echo "$f"

done 3< <(find . -iname "*filename*")
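
To illustrate why the separate descriptor helps, here is a small sketch in which the loop body reads an answer from stdin for every file. The scratch directory, file names and answers are hypothetical; a here-document stands in for interactive input, and the find output is sorted only to make the pairing predictable. Like the loop above, it assumes file names without newlines.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file names, for illustration only.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1
touch a.txt b.txt

answers=()
while IFS= read -r f <&3; do
    # This read takes its input from stdin, not from the find output on fd 3.
    IFS= read -r reply
    answers+=("$f:$reply")
done 3< <(find . -name '*.txt' | sort) <<'EOF'
yes
no
EOF

printf '%s\n' "${answers[@]}"
```

If the file list were fed through stdin instead, the inner read would consume file names rather than the answers.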

I like to first assign the output of find to a variable and then switch IFS to a newline, as follows:

FilesFound=$(find . -name "*.txt")

IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
    echo "${counter}: ${file}"
    let counter++;
done
IFS="$IFSbkp"

As @Konrad Rudolph commented, this will not work with newlines in file names. I still think it is handy, as it covers most cases where you need to loop over command output.


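How about if you use grep instead of find?

ls | grep '\.txt$' > out.txt

Now you can read this file; the filenames are listed in it one per line. Note that this only covers the current directory and, like any approach that parses the output of ls, breaks on file names containing newlines.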


# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
  process_one $x
done

or

# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one

You can put the filenames returned by find into an array like this:

array=()
while IFS= read -r -d ''; do
    array+=("$REPLY")
done < <(find . -name '*.txt' -print0)

Now you can just loop through the array to access individual items and do whatever you want with them.

Note: It's white space safe.


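find . -name "*.txt" | while IFS= read -r fname; do
  echo "$fname"
done

Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names. IFS= keeps leading and trailing whitespace intact, and -r stops read from interpreting backslashes.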

In order to also have the - somewhat exotic - case of newlines in the file/folder names covered, you will have to resort to the -exec predicate of find like this:

find . -name '*.txt' -exec echo "{}" \;

The {} is the placeholder for the found item and the \; is used to terminate the -exec predicate.

And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:

find . -name '*.txt' -print0|xargs -0 -n 1 echo

This would separate the printed items with a \0 character that isn't allowed in any of the file systems in file or folder names, to my knowledge, and therefore should cover all bases. xargs picks them up one by one then ...


You can store the find output in an array if you wish to use it later (note that this splits on any whitespace, so it only works if no file name contains spaces, tabs or newlines):

array=($(find . -name "*.txt"))

Now, to print each element on a new line, you can either use a for loop that iterates over all the elements of the array, or you can use a printf statement.

for i in "${array[@]}"; do echo "$i"; done

or

printf '%s\n' "${array[@]}"

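You can also use:

for file in "`find . -name "*.txt"`"; do echo "$file"; done

This prints each filename on its own line, but note that the loop body runs only once: the quotes turn the whole find output into a single word, so $file holds all the names together.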

To only print the find output in list form, you can use either of the following:

find . -name "*.txt" -print 2>/dev/null

or

find . -name "*.txt" -print 2>&1 | grep -v 'Permission denied'

The first form discards all error messages; the second merges stderr into the pipe so that grep can filter out the "Permission denied" lines. Either way, only the filenames are printed, one per line.

If you wish to do something with the filenames, storing them in an array is good; otherwise there is no need to consume that space, and you can print the output from find directly.


find <path> -xdev -type f -name '*.txt' -exec ls -l {} \;

This will list the files and give details about attributes.


Whatever you do, don't use a for loop:

# Don't do this
for file in $(find . -name "*.txt")
do
    …code using "$file"
done

Three reasons:

  • For the for loop to even start, the find must run to completion.
  • If a file name has any whitespace (including space, tab or newline) in it, it will be treated as two or more separate names.
  • Although now unlikely to matter, the whole output of find is expanded into one word list in memory before the first iteration. A for loop is a shell construct rather than an external command, so the kernel's argument-length limit does not apply and nothing is silently dropped, but a very large result is still held in memory all at once.

Always use a while read construct:

find . -name "*.txt" -print0 | while IFS= read -r -d $'\0' file
do
    …code using "$file"
done

The loop will execute while the find command is executing. Plus, this command will work even if a file name is returned with whitespace in it. And the names are streamed one at a time instead of being collected up front.

The -print0 will use the NUL character as the file separator instead of a newline, and the -d $'\0' will use NUL as the separator while reading. (In Bash, $'\0' expands to the empty string, so -d '' is equivalent.)