Issue
I want to find and print every 4-digit number (just the number, not the whole line) in a file, using a regex in bash.
My sample file looks like this:
12
123
1234
2345 foo foo foo
foo foo 3456 foo foo
# 8912 foo foo foo foo
#7654
-8999
\6478
/9023
$7654
A3356
8349B
1439$
1762\
12345
123456
0000
0001
I would like my output to include only the standalone 4-digit numbers (any number that has letters or special characters attached to it, before or after, must be ignored):
1234
2345
3456
8912
0000
0001
The closest I have been able to come to this is:
grep -E '(^|[^0-9])[0-9]{4}($|[^0-9])' file_with_numbers.txt
which wrongly matches numbers that have special characters attached, and also prints the whole line when I only want the six values shown above:
1234
2345 foo foo foo foo
foo foo 3456 foo foo
# 8912 foo foo foo foo
#7654
-8999
\6478
/9023
$7654
A3356
1439$
1762\
0000
0001
Any suggestions on how I can get the exact desired output are appreciated. I am having trouble finding info for the appended special character exclusion as well as showing only the number and not the whole line.
Solution
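The groups (^|[^0-9]) and ($|[^0-9]) consume the character next to the number, so that character becomes part of the match. That is why values such as #7654 and 1439$ are accepted. In addition, without -o grep prints the whole matching line rather than only the matched text.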
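To see the consumed characters, the original pattern can be run with -o so that only the matched text is printed. This is a minimal sketch; the variable name pattern and the two input lines are chosen here for illustration.

```shell
# The original pattern from the question, stored in a variable for reuse.
pattern='(^|[^0-9])[0-9]{4}($|[^0-9])'

# -o prints only the matched text. The neighbouring character ('#' in the
# first line, the surrounding spaces in the second) is included in each match.
printf '%s\n' '#7654' 'foo 3456 foo' | grep -oE "$pattern"
```

So adding -o alone is not enough: the match still contains the characters around the digits.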
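Instead, you can use lookarounds that assert a whitespace boundary on the left and right without consuming any characters: (?<!\S) requires that the number is not preceded by a non-whitespace character, and (?!\S) requires that it is not followed by one.
Lookarounds need -P, which enables Perl-compatible regular expressions (a GNU grep option), and -o prints only the matched part of each line.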
grep -Po '(?<!\S)[0-9]{4}(?!\S)' file_with_numbers.txt
Output
1234
2345
3456
8912
0000
0001
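If grep -P is not available (it is a GNU extension), a similar result can be sketched with awk by checking each whitespace-separated field. The function name four_digit_fields is hypothetical, and this is an alternative technique rather than the approach of the answer above. The four digits are spelled out because some awk implementations do not support the {4} interval syntax.

```shell
# Hypothetical helper: print every whitespace-delimited field that consists
# of exactly four digits. Reads the files given as arguments, or stdin.
four_digit_fields() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[0-9][0-9][0-9][0-9]$/) print $i
  }' "$@"
}

# Example on a few lines from the sample file, fed through stdin.
printf '%s\n' '1234' '#7654' 'foo foo 3456 foo foo' '12345' | four_digit_fields
```

With the sample file, the call would be four_digit_fields file_with_numbers.txt.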
Answered By - The fourth bird
Answer Checked By - Cary Denson (WPSolving Admin)