[bash] Find files containing a given text

In bash I want to return file name (and the path to the file) for every file of type .php|.html|.js containing the case-insensitive string "document.cookie" | "setcookie"

How would I do that?

This question is related to bash find

The answer is


find . -type f -name '*php' -o -name '*js' -o -name '*html' |\
xargs grep -liE 'document\.cookie|setcookie'

Sounds like a perfect job for grep or perhaps ack

Or this wonderful construction:

find . -type f \( -name *.php -o -name *.html -o -name *.js \) -exec grep "document.cookie\|setcookie" /dev/null {} \;

find them and grep for the string:

This will find all files of your 3 types in /starting/path and grep for the regular expression '(document\.cookie|setcookie)'. Split over 2 lines with the backslash just for readability...

find /starting/path -type f -name "*.php" -o -name "*.html" -o -name "*.js" | \
 xargs egrep -i '(document\.cookie|setcookie)'

Just to include one more alternative, you could also use this:

find "/starting/path" -type f -regextype posix-extended -regex "^.*\.(php|html|js)$" -exec grep -EH '(document\.cookie|setcookie)' {} \;

Where:

  • -regextype posix-extended tells find what kind of regex to expect
  • -regex "^.*\.(php|html|js)$" tells find the regex itself filenames must match
  • -exec grep -EH '(document\.cookie|setcookie)' {} \; tells find to run the command (with its options and arguments) specified between the -exec option and the \; for each file it finds, where {} represents where the file path goes in this command.

    while

    • E option tells grep to use extended regex (to support the parentheses) and...
    • H option tells grep to print file paths before the matches.

And, given this, if you only want file paths, you may use:

find "/starting/path" -type f -regextype posix-extended -regex "^.*\.(php|html|js)$" -exec grep -EH '(document\.cookie|setcookie)' {} \; | sed -r 's/(^.*):.*$/\1/' | sort -u

Where

  • | [pipe] send the output of find to the next command after this (which is sed, then sort)
  • r option tells sed to use extended regex.
  • s/HI/BYE/ tells sed to replace every First occurrence (per line) of "HI" with "BYE" and...
  • s/(^.*):.*$/\1/ tells it to replace the regex (^.*):.*$ (meaning a group [stuff enclosed by ()] including everything [.* = one or more of any-character] from the beginning of the line [^] till' the first ':' followed by anything till' the end of line [$]) by the first group [\1] of the replaced regex.
  • u tells sort to remove duplicate entries (take sort -u as optional).

...FAR from being the most elegant way. As I said, my intention is to increase the range of possibilities (and also to give more complete explanations on some tools you could use).


Try something like grep -r -n -i --include="*.html *.php *.js" searchstrinhere .

the -i makes it case insensitlve

the . at the end means you want to start from your current directory, this could be substituted with any directory.

the -r means do this recursively, right down the directory tree

the -n prints the line number for matches.

the --include lets you add file names, extensions. Wildcards accepted

For more info see: http://www.gnu.org/software/grep/