Issue
I want to use my id_file to search my big_file extracting lines that match the id at the beginning of the line in big_file.
I'm a beginner and I'm struggling with grep (version grep (BSD grep) 2.5.1-FreeBSD) and with understanding the solutions cited below.
My id_file contains ids:
67b
84D
118
136
166
My big_file looks something like this:
118 ABL1_BCR
118 AC005258
166 HSP90AB1
166 IKZF2_SP
166 IL1RAP_D
136 ABL1_BCR
136 ABL1_BCR
555 BCR_136
555 BCR_136
555 BCR_136
59 UNC45B_M 166
59 WASF2_GN 166
59 YPEL5_CX 166
As suggested by Chris Seymour here
Try 1: I used
grep -wFf id_file big_file
That obviously didn't work, as the ids also occur elsewhere in the lines of big_file.
Try 2: I modified the id_file:
^67b
^84D
^118
^136
^166
And ran grep -wFf id_file big_file again.
Of course, that didn't work either.
I looked at batimar's take here but I'm failing to implement the suggestion.
Better usage is taking only some patterns from some file and this patterns use for your file
grep '^PAT' patterns.txt | grep -f - myfile
This will take all patterns from file patterns.txt starting with PAT and use this patterns from the next grep to search in myfile.
I tried to reproduce the code above with my example in several ways but apparently I just don't get what they mean there as none of it worked.
My tinkering had two outcomes: either the error No such file or directory, or no output at all.
Is there even a way to do this with grep only?
I'd greatly appreciate if anyone was able to break it down for me.
Solution
This seems to be an issue with BSD grep. See https://unix.stackexchange.com/questions/352977/why-does-this-bsd-grep-result-differ-from-gnu-grep for similar issues.
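If you want to stay with grep, one thing worth knowing is that -F makes grep treat every pattern as a fixed string, so a leading ^ in id_file is searched for as a literal character rather than as an anchor. Below is a minimal sketch of a grep-only approach that drops -F and builds anchored patterns instead. It assumes the ids contain no regex metacharacters; the scratch directory and the patterns file name are made up for the illustration, the sample data is a shortened version of the question's, and I have not tested it on BSD grep 2.5.1 specifically.

```shell
#!/bin/sh
# Scratch directory with a shortened copy of the question's sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 67b 84D 118 136 166 > "$tmpdir/id_file"
cat > "$tmpdir/big_file" <<'EOF'
118 ABL1_BCR
166 HSP90AB1
136 ABL1_BCR
555 BCR_136
59 UNC45B_M 166
EOF

# Turn each id into an anchored extended regex:
# "118" becomes "^118([[:space:]]|$)", i.e. the id at the start of the line,
# followed by whitespace or the end of the line.
# "patterns" is a hypothetical helper file, not something from the question.
sed 's/.*/^&([[:space:]]|$)/' "$tmpdir/id_file" > "$tmpdir/patterns"

# -E: extended regex; no -F, so ^ keeps its meaning as an anchor.
grep_result=$(grep -E -f "$tmpdir/patterns" "$tmpdir/big_file")
printf '%s\n' "$grep_result"
```

With this sample, only the lines beginning with 118, 166 and 136 should be printed. Writing the patterns to a real file avoids relying on grep -f - reading from standard input, which not every grep supports.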
You can use awk as an alternative (there's probably a duplicate somewhere with this exact solution):
awk 'NR==FNR{a[$1]; next} $1 in a' id_file big_file
NR==FNR{a[$1]; next} builds an associative array with the first field of each line of id_file as keys. NR==FNR is true only while the first file is being read, and next skips the rest of the script for those lines.
$1 in a will be true if the first field of a line from big_file matches any of the keys in array a. If so, the entire line will be printed.
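To try the awk command end to end, here is a small sketch that recreates the sample data from the question in a scratch directory and runs it. The directory and the result variable are only there for the illustration.

```shell
#!/bin/sh
# Recreate the question's sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 67b 84D 118 136 166 > "$tmpdir/id_file"
cat > "$tmpdir/big_file" <<'EOF'
118 ABL1_BCR
118 AC005258
166 HSP90AB1
166 IKZF2_SP
166 IL1RAP_D
136 ABL1_BCR
136 ABL1_BCR
555 BCR_136
555 BCR_136
555 BCR_136
59 UNC45B_M 166
59 WASF2_GN 166
59 YPEL5_CX 166
EOF

# First file: store each id as an array key.
# Second file: print lines whose first field is one of those keys.
result=$(awk 'NR==FNR{a[$1]; next} $1 in a' "$tmpdir/id_file" "$tmpdir/big_file")
printf '%s\n' "$result"
```

This prints the seven lines that begin with 118, 166 or 136. The 555 and 59 lines are skipped even though 136 and 166 appear later in them, because only the first field is compared.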
Answered By - Sundeep Answer Checked By - Pedro (WPSolving Volunteer)