Issue
I have searched extensively and cannot figure out what I am doing wrong here. I have a text file that may contain a string similar to the following:
/dev/dir1/dir2 200G 22G 179G 11% /usr/dir3/dir4
I generally know what the string will look like up to the disk percentage indicator (i.e. 11%), but for the final part of the string I need to determine whether it ends in /usr or one of its subdirectories.
I want to use grep to do this search but am having problems. For example, the following command gives me output, but once I replace any of the "." characters with the "G" or "%" that would appear there, or try to add "/usr/.*" at the end, it returns nothing.
$ egrep ^/dev/dir1/dir2\s*\d*.\s*\d*.\s*\d*.\s*\d*.\s*.*$ testfile
/dev/dir1/dir2 200G 22G 179G 11% /usr/dir3/dir4
Solution
grep's extended regular expressions do not support \d for matching digits. Instead, use [0-9] or [[:digit:]]. Also quote the pattern so that the shell does not strip the backslashes before grep sees them. You can use the following grep command:
egrep '^/dev/dir1/dir2\s*[0-9]*G\s*[0-9]*G\s*[0-9]*G\s*[0-9]*%\s*.*$' testfile
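For comparison, here is a rough sketch of the same pattern written with POSIX character classes only, which may help if your grep does not understand \s. The temporary file and the variable names (sample, pattern, matches) are made up for illustration; the sample line is the one from the question.

```shell
# Hypothetical sample file containing the line from the question.
sample=$(mktemp)
printf '%s\n' '/dev/dir1/dir2 200G 22G 179G 11% /usr/dir3/dir4' > "$sample"

# POSIX classes such as [:digit:] and [:space:] are only valid inside a
# bracket expression, hence the doubled brackets.
sp='[[:space:]]*'
num='[[:digit:]]*'
pattern="^/dev/dir1/dir2${sp}${num}G${sp}${num}G${sp}${num}G${sp}${num}%${sp}.*\$"

# Quoting "$pattern" keeps the shell from altering it before grep sees it.
matches=$(grep -E "$pattern" "$sample")
printf '%s\n' "$matches"

rm -f "$sample"
```

This should print the sample line unchanged, since every field matches its slot in the pattern.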
You can also pass grep the -P option to enable Perl-compatible regular expressions, which do support \d:
grep -P '^/dev/dir1/dir2\s*\d*G\s*\d*G\s*\d*G\s*\d*%\s*.*$' testfile
Note the use of grep instead of egrep in the above command; -P is incompatible with egrep.
As a side note, I prefer to use + instead of * when I can, because it is stricter and makes errors apparent sooner. For example, I assume there will always be at least one space and one digit in each place in the input, so you can use \s+ and [0-9]+ (or \d+ with -P). If your original pattern had used +, it would not have matched at all in the first place (whether it was quoted or not), and you would have known you had a problem even before adding the G or % to it. A working example is:
egrep '^/dev/dir1/dir2\s+[0-9]+.\s+[0-9]+.\s+[0-9]+.\s+[0-9]+.\s+.+$' testfile
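To cover the part of the question about the line ending in /usr or a subdirectory, here is one possible sketch that combines the stricter + quantifiers with a final /usr check. The second input line (mounted under /var) is a hypothetical non-matching entry added for contrast, and the variable names are made up for illustration. I use [[:space:]] here as the POSIX spelling of \s.

```shell
# Hypothetical sample file: the first line is from the question, the second
# is an invented entry that should NOT match.
sample=$(mktemp)
printf '%s\n' \
  '/dev/dir1/dir2 200G 22G 179G 11% /usr/dir3/dir4' \
  '/dev/dir1/dir2 200G 22G 179G 11% /var/dir3/dir4' > "$sample"

sp='[[:space:]]+'
num='[0-9]+'

# The last field must be exactly /usr or a path below it;
# (/.*)? prevents a match on names such as /usrlocal.
pattern="^/dev/dir1/dir2${sp}${num}G${sp}${num}G${sp}${num}G${sp}${num}%${sp}/usr(/.*)?\$"

usr_matches=$(grep -E "$pattern" "$sample")
printf '%s\n' "$usr_matches"

rm -f "$sample"
```

Only the /usr/dir3/dir4 line should be printed; the /var line is filtered out by the trailing /usr(/.*)?$ part.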
Answered By - Lithis