Issue
Imagine that we have this data:
##sequence-region P51451 1 505
##sequence-region P22223 1 829
P22223 UniProtKB Transmembrane 655 677 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
##sequence-region Q01518 1 475
##sequence-region Q96MP8 1 289
##sequence-region Q9HCJ2 1 640
Q9HCJ2 UniProtKB Transmembrane 528 548 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
##sequence-region P48059 1 325
##sequence-region Q9UHB6 1 759
##sequence-region P16581 1 610
P16581 UniProtKB Transmembrane 557 578 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
The desired output is only the rows that contain the word 'Transmembrane', each preceded by the line directly above it:
##sequence-region P22223 1 829
P22223 UniProtKB Transmembrane 655 677 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
##sequence-region Q9HCJ2 1 640
Q9HCJ2 UniProtKB Transmembrane 528 548 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
##sequence-region P16581 1 610
P16581 UniProtKB Transmembrane 557 578 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
I am trying to do this with grep, but I am a little bit stuck.
Thanks!
Solution
If you've got GNU grep (the standard grep on Linux) and your data are in the file data.txt, you can use:
grep -w Transmembrane --before-context=1 --no-group-separator data.txt
- The -w option causes the match to apply only to whole words in the input. So, for instance, Transmembrane123 won't be matched. That might not be what you want.
- --before-context=1 causes grep to print one line of the input before every matched line.
- --no-group-separator causes grep to print no separator between each group made up of a matched line and the line before it. Normally it prints a separator line containing -- between groups that are not adjacent in the input.
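For comparison, here is a minimal sketch of the same idea on a small sample taken from the question. It first runs the grep command using the short form -B 1 (equivalent to --before-context=1), then an awk version that may help where GNU grep is not available. The temporary file and the variable names (data, grep_out, awk_out, prev, prev_printed) are assumptions made for illustration. Note that the awk pattern matches Transmembrane as a substring, so it does not reproduce the whole-word behaviour of -w.

```shell
#!/bin/sh
# Sample input: a few lines from the question, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
##sequence-region P51451 1 505
##sequence-region P22223 1 829
P22223 UniProtKB Transmembrane 655 677 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
##sequence-region Q01518 1 475
##sequence-region Q9HCJ2 1 640
Q9HCJ2 UniProtKB Transmembrane 528 548 . . . Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255
EOF

# GNU grep with the short option: -B 1 is the same as --before-context=1.
grep_out=$(grep -w -B 1 --no-group-separator Transmembrane "$data")

# awk sketch: remember the previous line and print it before each match,
# unless it was itself a match that has already been printed.
# The pattern is a plain substring match, not a whole-word match.
awk_out=$(awk '
  /Transmembrane/ {
    if (NR > 1 && !prev_printed) print prev
    print
    prev = $0
    prev_printed = 1
    next
  }
  { prev = $0; prev_printed = 0 }
' "$data")

printf '%s\n' "$awk_out"
rm -f "$data"
```

On this sample both commands should select the same four lines: each Transmembrane row together with the ##sequence-region line above it.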
Answered By - pjh Answer Checked By - Dawn Plyler (WPSolving Volunteer)