Issue
I have used git grep for years to search for fixed strings and haven't used it much for doing regular expression searches.
I have places in the code with non-localized strings. For example:
JLabel label = buildLabel("Alphabet");
In this case buildLabel() is an inherited utility method. There are also buildBoldLabel(), buildMultiLineLabel(), and buildTextArea().
So I would like to search my code for calls to these methods that pass a string literal directly instead of looking up the localized string. The correct call should be:
JLabel label = buildLabel(getString("Alphabet"));
I am very familiar with regular expressions and I see that git grep supports Perl character classes. So I figured that it would be very easy:
$ git grep -P "buildLabel(\"\w+\")"
This returns no results. So I tried it without the Perl extension.
$ git grep "buildLabel(\"[a-zA-Z_]+\")"
Still ... no results. I verified that I could search with a fixed string.
$ git grep "buildLabel(\"Alphabet\")"
That returned the instance in the code that I already knew existed. However ...
$ git grep -P "buildLabel(\"Alphabet\")"
Returns no results.
I also tried changing the quote characters and got the same results.
$ git grep -P 'buildLabel("\w+")'
... no results
$ git grep -P 'buildLabel("Alphabet")'
... no results
$ git grep 'buildLabel("Alphabet")'
... 1 expected result
I tried on Linux with the same results.
UPDATE:
Thanks to @wiktor-stribiżew for commenting that with PCRE the parentheses need to be escaped (I am always confused by that).
$ git grep -P 'buildLabel\("\w+"\)'
... returns 1 expected result.
However, why don't these work?
$ git grep 'buildLabel("[a-zA-Z_]+")'
$ git grep 'buildLabel\("[a-zA-Z_]+"\)'
$ git grep 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)'
(in case + isn't implemented)
So what am I doing wrong with git grep? Or is it broken?
FYI: I am using git version 2.35.1 from Homebrew on macOS Big Sur.
Solution
Regex vs. fixed string search
Please refer to the git grep help:
-G
--basic-regexp
Use POSIX extended/basic regexp for patterns. Default is to use basic regexp.
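So, by default, git grep treats the pattern string as a POSIX BRE regex, not as a fixed string. That is why git grep 'buildLabel("Alphabet")' found the expected line: in POSIX BRE, unescaped parentheses are literal characters.
To make git grep treat the pattern as a fixed string, you need -F: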
-F
--fixed-strings
Use fixed strings for patterns (don’t interpret pattern as a regex).
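As a minimal sketch of the difference between the default mode and -F, the script below writes a hypothetical Sample.java into a scratch directory and searches it with git grep --no-index, so no repository is needed. The file name, its contents, and the variable names are assumptions made purely for illustration.

```shell
# Hypothetical sample file in a scratch directory (not from the original code base).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > Sample.java <<'EOF'
JLabel a = buildLabel("Alphabet");
JLabel b = buildLabel(getString("Alphabet"));
JLabel c = buildLabel("[a-z]+");
EOF

# Default mode is POSIX BRE: "[a-z]" is a bracket expression, while the
# unescaped "+" and the parentheses are literal, so no sample line matches.
bre_hits=$(git grep --no-index -h 'buildLabel("[a-z]+")' -- Sample.java)

# -F treats every character literally, so only the third line matches.
fixed_hits=$(git grep --no-index -h -F 'buildLabel("[a-z]+")' -- Sample.java)

echo "BRE:   ${bre_hits:-<no match>}"
echo "fixed: ${fixed_hits:-<no match>}"
```

The same pattern text therefore means two different things depending on whether -F is given.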
Regex issues
You can enable PCRE regex syntax with the -P option, and in that case you should refer to the PCRE documentation.
In your git grep -P "buildLabel(\"\w+\")", the unescaped parentheses form a capturing group, so they must be escaped in order to be matched as literal parentheses, i.e. it should be git grep -P "buildLabel\(\"\w+\"\)".
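The PCRE behavior can be checked with a small sketch like the one below. It assumes a Git build with PCRE support (otherwise -P is rejected) and uses a hypothetical Sample.java in a scratch directory. The file and variable names are illustrative only.

```shell
# Hypothetical sample file in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > Sample.java <<'EOF'
JLabel a = buildLabel("Alphabet");
JLabel b = buildLabel(getString("Alphabet"));
EOF

# Unescaped parentheses form a capturing group in PCRE, so this pattern
# looks for buildLabel immediately followed by a quoted word: no match.
group_hits=$(git grep --no-index -h -P 'buildLabel("\w+")' -- Sample.java)

# Escaped parentheses are literal, so only the non-localized call matches.
literal_hits=$(git grep --no-index -h -P 'buildLabel\("\w+"\)' -- Sample.java)

echo "group:   ${group_hits:-<no match>}"
echo "literal: ${literal_hits:-<no match>}"
```

Single quotes around the pattern keep the shell from touching the backslashes, which avoids the extra \" escaping needed inside double quotes.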
In git grep 'buildLabel("[a-zA-Z_]+")', you are using POSIX BRE, where + is parsed as a literal + character, not as a one-or-more quantifier. You can use git grep 'buildLabel("[a-zA-Z_]\{1,\}")' in POSIX BRE, though. With a GNU regex implementation, you could use git grep 'buildLabel("[a-zA-Z_]\+")' (I am not sure it works with git).
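To see the BRE quantifier rule in isolation, here is a hedged sketch against a hypothetical one-line Sample.java. It only exercises the literal + and the POSIX interval \{1,\}. The GNU-specific \+ is left out because its behavior depends on the regex library Git was built with.

```shell
# Hypothetical sample file in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'JLabel a = buildLabel("Alphabet");' > Sample.java

# BRE: "+" is a literal plus sign, so the sample line does not match.
plus_hits=$(git grep --no-index -h 'buildLabel("[a-zA-Z_]+")' -- Sample.java)

# BRE: the interval \{1,\} means "one or more" and is part of POSIX BRE.
interval_hits=$(git grep --no-index -h 'buildLabel("[a-zA-Z_]\{1,\}")' -- Sample.java)

echo "plus:     ${plus_hits:-<no match>}"
echo "interval: ${interval_hits:-<no match>}"
```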
The git grep 'buildLabel\("[a-zA-Z_]+"\)' does not work because, in POSIX BRE, \(...\) (an escaped pair of parentheses) defines a capturing group and thus does not match literal parentheses. The + is also still a literal character there.
The git grep 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)' is still POSIX BRE, so it fails for the same reason: the escaped parentheses form a group. To make it POSIX ERE, where \( and \) match literal parentheses, you need to use the -E option: git grep -E 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)'. Or use git grep -E 'buildLabel\("[a-zA-Z_]+"\)', since the unescaped + is a quantifier in POSIX ERE.
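The BRE versus ERE difference for escaped parentheses can be sketched as follows. Sample.java and the variable names are hypothetical, and --no-index is used only so that the script runs outside a repository.

```shell
# Hypothetical sample file in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > Sample.java <<'EOF'
JLabel a = buildLabel("Alphabet");
JLabel b = buildLabel(getString("Alphabet"));
EOF

# BRE (default): \( ... \) is a group, so the pattern expects a quote right
# after buildLabel and neither sample line matches.
bre_hits=$(git grep --no-index -h 'buildLabel\("[a-zA-Z_][a-zA-Z_]*"\)' -- Sample.java)

# ERE (-E): \( and \) are literal parentheses and + is a quantifier,
# so only the non-localized call matches.
ere_hits=$(git grep --no-index -h -E 'buildLabel\("[a-zA-Z_]+"\)' -- Sample.java)

echo "BRE: ${bre_hits:-<no match>}"
echo "ERE: ${ere_hits:-<no match>}"
```

In short, the escaping rule for parentheses flips between BRE and ERE, which is why the same pattern text behaves differently with and without -E.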
Also, see What special characters must be escaped in regular expressions?
Answered By - Wiktor Stribiżew Answer Checked By - Marilyn (WPSolving Volunteer)