Thursday, November 18, 2021

[SOLVED] How to grep Git commit diffs or contents for a certain word

Issue

In a Git code repository I want to list all commits that contain a certain word. I tried this

git log -p | grep --context=4 "word"

but it does not necessarily give me back the filename (unless it's less that five lines away from the word I searched for. I also tried

git grep "word"

but it gives me only present files and not the history.

How do I search the entire history so I can follow changes on a particular word? I intend to search my codebase for occurrences of word to track down changes (search in files history).


Solution

If you want to find all commits where the commit message contains a given word, use

$ git log --grep=word

If you want to find all commits where "word" was added or removed in the file contents (to be more exact: where the number of occurrences of "word" changed), i.e., search the commit contents, use a so-called 'pickaxe' search with

$ git log -Sword

In modern Git there is also

$ git log -Gword

to look for differences whose added or removed line matches "word" (also commit contents).

Note that -G by default accepts a regex, while -S accepts a string, but it can be modified to accept regexes using the --pickaxe-regex.

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
...
-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);

While git log -G"regexec\(regexp" will show this commit, git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).


With Git 2.25.1 (Feb. 2020), the documentation is clarified around those regexes.

See commit 9299f84 (06 Feb 2020) by Martin Ågren (``). (Merged by Junio C Hamano -- gitster -- in commit 0d11410, 12 Feb 2020)

diff-options.txt: avoid "regex" overload in the example \

Reported-by: Adam Dinwoodie
Signed-off-by: Martin Ågren
Reviewed-by: Taylor Blau

When we exemplify the difference between -G and -S (using --pickaxe-regex), we do so using an example diff and git diff invocation involving "regexec", "regexp", "regmatch", etc.

The example is correct, but we can make it easier to untangle by avoiding writing "regex.*" unless it's really needed to make our point.

Use some made-up, non-regexy words instead.

The git diff documentation now includes:

To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

+    return frotz(nitfol, two->ptr, 1, 0);
...
-    hit = frotz(nitfol, mf2.ptr, 1, 0);

While git log -G"frotz\(nitfol" will show this commit, git log -S"frotz\(nitfol" --pickaxe-regex will not (because the number of occurrences of that string did not change).



Answered By - Jakub Narębski