[git] How to grep (search) committed code in the Git history

I have deleted a file or some code in a file sometime in the past. Can I grep in the content (not in the commit messages)?

A very poor solution is to grep the log:

git log -p | grep <pattern>

However, this doesn't return the commit hash straight away. I played around with git grep to no avail.


I took Jeet's answer and adapted it to Windows (thanks to this answer):

FOR /F %x IN ('"git rev-list --all"') DO @git grep <regex> %x >> out.txt

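Note that for me, the actual commit that deleted the matching text did not appear in the output of the command, but rather the commit just before it. That makes sense: git grep searches the files as they exist in each commit, and the deleting commit no longer contains the text.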


For simplicity, I'd suggest using a GUI: gitk, the Git repository browser. It's pretty flexible:

  1. It can search code.
  2. It can search files.
  3. Of course, it also supports regular expressions.

And you can navigate through the results using the up/down arrows.


You should use the pickaxe (-S) option of git log.

To search for Foo:

git log -SFoo -- path_containing_change
git log -SFoo --since=2009.1.1 --until=2010.1.1 -- path_containing_change

See Git history - find lost line by keyword for more.


As Jakub Narebski commented:

  • this looks for differences that introduce or remove an instance of <string>. It usually means "revisions where you added or removed line with 'Foo'".

  • the --pickaxe-regex option allows you to use extended POSIX regex instead of searching for a string. Example (from git log): git log -S"frotz\(nitfol" --pickaxe-regex


As Rob commented, this search is case-sensitive - he opened a follow-up question on how to search case-insensitively.
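
To see the pickaxe behaviour on something small, here is a minimal sketch that builds a throwaway repository and runs git log -S against it. The file names (a.c, b.c), commit messages and user identity are made up for the demo; only the git commands themselves come from the answer above.

```shell
#!/bin/sh
# Sketch: build a throwaway repository and see which commits -S reports.
# File names, commit messages and the user identity are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'int Foo = 1;\n' > a.c
git add a.c
git commit -q -m "add Foo"

printf 'int unrelated = 2;\n' > b.c
git add b.c
git commit -q -m "unrelated change"

git rm -q a.c
git commit -q -m "remove Foo"

# -S lists only the commits that changed the number of "Foo" occurrences,
# so the unrelated commit in the middle is skipped.
pickaxe_subjects=$(git log -SFoo --format=%s)
printf '%s\n' "$pickaxe_subjects"
```

The commit that deleted a.c shows up as well, which is exactly what you want when hunting for removed code.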


If you want to browse code changes (see what has actually been changed with the given word in the whole history), go for patch mode. I found this combination very useful:

git log -p
# Hit '/' for search mode.
# Type in the word you are searching.
# If the first search is not relevant, hit 'n' for next (like in Vim ;) )

Whenever I find myself in your position, I use the following command line:

git log -S "<words/phrases i am trying to find>" --all --oneline --graph

Explanation:

  1. git log - Shows the commit logs, newest first by default.
  2. -S "<words/phrases i am trying to find>" - Limits the log to commits in which the number of occurrences of the words/phrases changed in some file, that is, commits that added or removed them. Type the words/phrases without the '<>' symbols.
  3. --all - Searches across all branches (all refs), not just the current one.
  4. --oneline - Compresses each commit to a single line.
  5. --graph - Draws a text-based graph of the commit history next to the log.

git log can be a more effective way of searching for text across all branches, especially if there are many matches, and you want to see more recent (relevant) changes first.

git log -p --all -S 'search string'
git log -p --all -G 'match regular expression'

These log commands list commits that add or remove the given search string/regex, (generally) more recent first. The -p option causes the relevant diff to be shown where the pattern was added or removed, so you can see it in context.

Having found a relevant commit that adds the text you were looking for (for example, 8beeff00d), find the branches that contain the commit:

git branch -a --contains 8beeff00d
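
As a rough end-to-end sketch of those two steps (find the commit, then find the branches), assuming a throwaway repository with hypothetical branch names trunk and feature and a made-up file file.txt:

```shell
#!/bin/sh
# Sketch: locate a commit with -S on any branch, then list the branches
# that contain it. Branch names and file names are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # fix the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo "search string" >> file.txt
git commit -q -am "add the text on feature"

git checkout -q trunk   # the text is not on the current branch any more

# --all makes the log walk every ref, so the feature commit is still found.
found=$(git log --all -S 'search string' --format=%h)

# --format (Git 2.13 or later) prints bare branch names.
branches=$(git branch -a --contains "$found" --format='%(refname:short)')
printf 'commit %s is on: %s\n' "$found" "$branches"
```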

Adding more to the answers already present: if you know the file in which you might have made the change, do this:

git log --follow -p -S 'search-string' <file-path>

--follow: continues listing the history of a file beyond renames (it works only for a single file)
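
A small sketch of why --follow matters here, assuming a file that was renamed after the text was added. The names old.txt and new.txt are made up for the demo:

```shell
#!/bin/sh
# Sketch: the text was added while the file was still called old.txt;
# --follow lets the search on new.txt reach back past the rename.
# File names and commit messages are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'search-string\n' > old.txt
git add old.txt
git commit -q -m "add search-string"

git mv old.txt new.txt
git commit -q -m "rename file"

# Ask about the current name only; --follow tracks it back to old.txt.
followed=$(git log --follow --format=%s -S 'search-string' -- new.txt)
printf '%s\n' "$followed"
```

Add -p to the last command, as in the answer, to see the matching diffs as well.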


Okay, twice just today I've seen people wanting a closer equivalent for hg grep, which is like git log -pS but confines its output to just the (annotated) changed lines.

Which I suppose would be handier than /pattern/ in the pager if you're after a quick overview.

So here's a diff-hunk scanner that takes git log --pretty=%h -p output and spits annotated change lines. Put it in diffmarkup.l, say e.g. make ~/bin/diffmarkup, and use it like

git log --pretty=%h -pS pattern | diffmarkup | grep pattern

%option main 8bit nodefault
        // vim: tw=0
%top{
        #define _GNU_SOURCE 1
}
%x commitheader
%x diffheader
%x hunk
%%
        char *afile=0, *bfile=0, *commit=0;
        int aline,aremain,bline,bremain;
        int iline=1;

<hunk>\n        ++iline; if ((aremain+bremain)==0) BEGIN diffheader;
<*>\n   ++iline;

<INITIAL,commitheader,diffheader>^diff.*        BEGIN diffheader;
<INITIAL>.*     BEGIN commitheader; if(commit)free(commit); commit=strdup(yytext);
<commitheader>.*

<diffheader>^(deleted|new|index)" ".*   {}
<diffheader>^"---".*            if (afile)free(afile); afile=strdup(strchrnul(yytext,'/'));
<diffheader>^"+++".*            if (bfile)free(bfile); bfile=strdup(strchrnul(yytext,'/'));
<diffheader,hunk>^"@@ ".*       {
        BEGIN hunk; char *next=yytext+3;
        #define checkread(format,number) { int span; if ( !sscanf(next,format"%n",&number,&span) ) goto lostinhunkheader; next+=span; }
        checkread(" -%d",aline); if ( *next == ',' ) checkread(",%d",aremain) else aremain=1;
        checkread(" +%d",bline); if ( *next == ',' ) checkread(",%d",bremain) else bremain=1;
        break;
        lostinhunkheader: fprintf(stderr,"Lost at line %d, can't parse hunk header '%s'.\n",iline,yytext), exit(1);
        }
<diffheader>. yyless(0); BEGIN INITIAL;

<hunk>^"+".*    printf("%s:%s:%d:%c:%s\n",commit,bfile+1,bline++,*yytext,yytext+1); --bremain;
<hunk>^"-".*    printf("%s:%s:%d:%c:%s\n",commit,afile+1,aline++,*yytext,yytext+1); --aremain;
<hunk>^" ".*    ++aline, ++bline; --aremain; --bremain;
<hunk>. fprintf(stderr,"Lost at line %d, Can't parse hunk.\n",iline), exit(1);

Search in any revision, any file (unix/linux):

git rev-list --all | xargs git grep <regexp>

Search only in some given files, for example XML files:

git rev-list --all | xargs -I{} git grep <regexp> {} -- "*.xml"

The result lines should look like this: 6988bec26b1503d45eb0b2e8a4364afb87dde7af:bla.xml: text of the line it found...

You can then get more information like author, date, and diff using git show:

git show 6988bec26b1503d45eb0b2e8a4364afb87dde7af
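
Here is a minimal sketch of the first command on a throwaway repository, assuming a made-up pattern secret_token and a file bla.xml that gets deleted later. It also shows that the commit which removed the file is not among the matches, because git grep looks at the content of each commit:

```shell
#!/bin/sh
# Sketch: grep every commit's tree for text that was deleted later on.
# The pattern, file names and commit messages are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf '<secret_token>42</secret_token>\n' > bla.xml
git add bla.xml
git commit -q -m "add token"

printf 'notes\n' > readme.txt
git add readme.txt
git commit -q -m "unrelated"

git rm -q bla.xml
git commit -q -m "remove token"

# -l prints one "<commit>:<file>" line per commit whose tree has a match.
matches=$(git rev-list --all | xargs git grep -l 'secret_token')
printf '%s\n' "$matches"

match_count=$(printf '%s\n' "$matches" | grep -c ':bla.xml$')
head=$(git rev-parse HEAD)
head_hits=$(printf '%s\n' "$matches" | grep -c "^$head:" || true)
echo "commits containing the text: $match_count, HEAD among them: $head_hits"
```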

Scenario: You did a big clean-up of your code by using your IDE. Problem: The IDE cleaned up more than it should have, and now your code does not compile (missing resources, etc.)

Solution:

git grep --cached "text_to_find"

It will find the files in the index (staging area) that still contain "text_to_find", provided the clean-up has not been staged yet.

You can now undo this change and compile your code.
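
A quick sketch of that scenario, assuming the clean-up has only touched the working tree and has not been staged. The file name res.txt is made up:

```shell
#!/bin/sh
# Sketch: text removed from the working tree is still visible in the index.
# The file name and its contents are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'text_to_find\n' > res.txt
git add res.txt
git commit -q -m "add resource"

# Simulate the overeager clean-up (unstaged change).
printf 'cleaned\n' > res.txt

# Plain git grep looks at the working tree and finds nothing (exit code 1).
worktree_hits=$(git grep -l "text_to_find" || true)

# --cached looks at the index, which still has the old content.
index_hits=$(git grep -l --cached "text_to_find")
echo "index still has the text in: $index_hits"

# Undo the change by restoring the file from the index.
git checkout -- res.txt
restored=$(cat res.txt)
```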


git rev-list --all | xargs -n 5 git grep EXPRESSION

is a tweak to Jeet's solution, so it shows results while it searches and not just at the end (which can take a long time in a large repository).


So are you trying to grep through older versions of the code to see where something last existed?

If I were doing this, I would probably use git bisect. Using bisect, you can specify a known good version, a known bad version, and a simple script that does a check to see if the version is good or bad (in this case a grep to see if the code you are looking for is present). Running this will find when the code was removed.
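
As a sketch of that idea, assuming the text is a made-up word needle in a hypothetical file.txt: the check script is just grep, which exits 0 (good) while the text is present and non-zero (bad) once it is gone.

```shell
#!/bin/sh
# Sketch: let git bisect find the commit that removed a piece of text.
# The word "needle", file names and commit messages are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'keep\nneedle\n' > file.txt
git add file.txt
git commit -q -m "add needle"
good=$(git rev-parse HEAD)        # known good: the text is present

for i in 1 2; do
    echo "note $i" >> notes.txt
    git add notes.txt
    git commit -q -m "unrelated $i"
done

printf 'keep\n' > file.txt
git commit -q -am "drop needle"

for i in 3 4; do
    echo "note $i" >> notes.txt
    git add notes.txt
    git commit -q -m "unrelated $i"
done

# HEAD is known bad (text missing); bisect between it and the good commit.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c 'grep -q needle file.txt' >/dev/null

# refs/bisect/bad now points at the first commit without the text.
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
printf 'text was removed in: %s\n' "$first_bad"
```

Bisect needs only a handful of checkouts even on long histories, but it assumes the text was removed once and never came back in between.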


Jeet's answer works in PowerShell.

git grep -n <regex> $(git rev-list --all)

The following displays all files, in any commit, that contain the string "password".

# Store intermediate result
$result = git grep -n "password" $(git rev-list --all)

# Display unique file names
$result | select -unique { $_ -replace "(^.*?:)|(:.*)", "" }

My favorite way to do it is with git log's -G option (added in version 1.7.4).

-G<regex>
       Look for differences whose added or removed line matches the given <regex>.

There is a subtle difference between the way the -G and -S options determine if a commit matches:

  • The -S option essentially counts the number of times your search matches in a file before and after a commit. The commit is shown in the log if the before and after counts are different. This will not, for example, show commits where a line matching your search was moved.
  • With the -G option, the commit is shown in the log if your search matches any line that was added, removed, or changed.

Take this commit as an example:

diff --git a/test b/test
index dddc242..60a8ba6 100644
--- a/test
+++ b/test
@@ -1 +1 @@
-hello hello
+hello goodbye hello

Because the number of times "hello" appears in the file is the same before and after this commit, it will not match using -Shello. However, since there was a change to a line matching hello, the commit will be shown using -Ghello.
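
If you want to check this yourself, here is a minimal sketch that recreates the example commit in a throwaway repository and runs both options. The commit messages "first" and "second" are made up; the file name test and its contents follow the diff above.

```shell
#!/bin/sh
# Sketch: reproduce the "hello hello" example and compare -S with -G.
# Commit messages and the user identity are hypothetical.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'hello hello\n' > test
git add test
git commit -q -m "first"

printf 'hello goodbye hello\n' > test
git commit -q -am "second"

# -S: "hello" occurs twice before and after "second", so only "first" matches.
s_hits=$(git log -Shello --format=%s)

# -G: the changed line matches the regex, so both commits are shown.
g_hits=$(git log -Ghello --format=%s)

printf -- '-S: %s\n' "$s_hits"
printf -- '-G: %s\n' "$g_hits"
```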


For anyone else trying to do this in Sourcetree, there is no direct command in the UI for it (as of version 1.6.21.0). However, you can use the commands specified in the accepted answer by opening the Terminal window (button available in the main toolbar) and copy/pasting them there.

Note: Sourcetree's Search view can partially do text searching for you. Press Ctrl + 3 to go to Search view (or click Search tab available at the bottom). From far right, set Search type to File Changes and then type the string you want to search. This method has the following limitations compared to the above command:

  1. Sourcetree only shows the commits that contain the search word in one of the changed files. Finding the exact file that contains the search text is again a manual task.
  2. RegEx is not supported.
