Issue
Below are some lines of a CSV file. I want to extract only the Population column, using only the grep
command.
id,Association,Population,Variant(s),Gene(s),PubMed
1,non-significant,Dutch,HLA-B40,HLA-B,1859103
2,non-significant,Dutch,HLA-DRB5,HLA-DRB5,1859103
3,non-significant,Finnish,APOB,APOB,8018664
4,significant,Finnish,APOC3,APOC3,8018664
5,significant,Finnish,E2/E3/E4,APOE,8018664
6,significant,French,I/D,ACE,8136829
Results I want:
Dutch
Dutch
Finnish
Finnish
Finnish
French
The command I came up with for this problem was
cat information.csv | grep -Eo '^([^,]*,){2}[^,]*'
which produced the results below:
id,Association,Population
1,non-significant,Dutch
2,non-significant,Dutch
3,non-significant,Finnish
4,significant,Finnish
5,significant,Finnish
6,significant,French
How can I get rid of the rest without using awk, sed, or any other tools?
Solution
You may use GNU grep with a PCRE pattern (the -P option):
grep -Po '^([^,]*,){2}\K[^,]*' file
Here,
^ - start of string
([^,]*,){2} - two occurrences of any zero or more chars other than , and then a ,
\K - match reset operator discarding all text matched so far
[^,]* - zero or more chars other than a comma.
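As a rough sketch of how this behaves on the sample data from the question: the file name information.csv is taken from the asker's command, and the variables result and no_header are introduced here purely for illustration. Note that the pattern also matches the header row, so "Population" is printed first. The second command is one possible way to skip the header, and it assumes the first field of every data row is purely numeric, as it is in the sample.

```shell
# Recreate the sample data from the question.
cat > information.csv <<'EOF'
id,Association,Population,Variant(s),Gene(s),PubMed
1,non-significant,Dutch,HLA-B40,HLA-B,1859103
2,non-significant,Dutch,HLA-DRB5,HLA-DRB5,1859103
3,non-significant,Finnish,APOB,APOB,8018664
4,significant,Finnish,APOC3,APOC3,8018664
5,significant,Finnish,E2/E3/E4,APOE,8018664
6,significant,French,I/D,ACE,8136829
EOF

# \K discards the first two fields from the reported match,
# so -o prints only the third field of each line (header included).
result=$(grep -Po '^([^,]*,){2}\K[^,]*' information.csv)
printf '%s\n' "$result"

# Assumption: data rows start with a numeric id, the header does not.
# Requiring digits in the first field therefore skips the header row.
no_header=$(grep -Po '^[0-9]+,[^,]*,\K[^,]*' information.csv)
printf '%s\n' "$no_header"
```

Both commands need a grep built with PCRE support (GNU grep's -P option); other grep implementations may not provide it.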
Answered By - Wiktor Stribiżew Answer Checked By - Mary Flores (WPSolving Volunteer)