Saturday, April 9, 2022

[SOLVED] How to select a row by name and also the previous row in bash or python?

Issue

Imagine that we have this data:

##sequence-region P51451 1 505
##sequence-region P22223 1 829
P22223  UniProtKB   Transmembrane   655 677 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255 
##sequence-region Q01518 1 475
##sequence-region Q96MP8 1 289
##sequence-region Q9HCJ2 1 640
Q9HCJ2  UniProtKB   Transmembrane   528 548 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255 
##sequence-region P48059 1 325
##sequence-region Q9UHB6 1 759
##sequence-region P16581 1 610
P16581  UniProtKB   Transmembrane   557 578 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255

The desired output is only the rows that contain the word 'Transmembrane', each preceded by the row directly above it:

##sequence-region P22223 1 829
P22223  UniProtKB   Transmembrane   655 677 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255 
##sequence-region Q9HCJ2 1 640
Q9HCJ2  UniProtKB   Transmembrane   528 548 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255 
##sequence-region P16581 1 610
P16581  UniProtKB   Transmembrane   557 578 .   .   .   Note=Helical;Ontology_term=ECO:0000255;evidence=ECO:0000255

I am trying with grep, but I am a little bit stuck.

Thanks!


Solution

If you've got GNU grep (the standard grep on Linux) and your data are in the file data.txt, you can use:

grep -w Transmembrane --before-context=1 --no-group-separator data.txt
  • The -w option restricts the match to whole words in the input. So, for instance, Transmembrane123 won't be matched. Drop -w if you want such lines to match as well.
  • --before-context=1 causes grep to print one line in the input before every matched line.
  • --no-group-separator stops grep from printing a separator between groups, where each group is a matched line plus the line before it. By default, grep prints a separator line containing -- between groups.
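
Since the question also asks about Python, here is a minimal sketch of the same idea for systems without GNU grep. It is an illustration, not part of the original answer: the names matches_with_previous and main are made up for this example, and the \b word boundaries only roughly mirror what -w does.

```python
import re
import sys


def matches_with_previous(lines, word="Transmembrane"):
    """Yield each line containing `word` as a whole word, preceded by the line before it."""
    # \b approximates grep's -w: "Transmembrane123" does not match.
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    previous = None  # most recent line that has not been emitted yet
    for line in lines:
        if pattern.search(line):
            if previous is not None:
                yield previous
            yield line
            previous = None  # already emitted, so never repeat it as context
        else:
            previous = line


def main(path):
    # Stream the file line by line; nothing is loaded into memory at once.
    with open(path) as handle:
        sys.stdout.writelines(matches_with_previous(handle))
```

Calling main("data.txt") should print the same pairs of lines as the grep command above. As with --no-group-separator, no separator is written between groups, and a matching line that directly follows another matching line is printed only once.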


Answered By - pjh
Answer Checked By - Dawn Plyler (WPSolving Volunteer)