Issue
I am working with log files arranged in the following format:
Finding intramodel H-bonds
Constraints relaxed by 0.5 angstroms and 20 degrees
Models used:
1.1 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.2 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.3 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.4 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.5 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.6 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.7 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.8 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.9 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.10 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.11 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.12 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.13 SarsCov2_structure49R_nsp5holo_rep1.pdb
1.14 SarsCov2_structure49R_nsp5holo_rep1.pdb
14 H-bonds
H-bonds (donor, acceptor, hydrogen, D..A dist, D-H..A dist):
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.1/? ASN 142 ND2 SarsCov2_structure49R_nsp5holo_rep1.pdb #1.1/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.1/? ASN 142 1HD2 3.102 2.145
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.3/? GLU 166 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.3/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.3/? GLU 166 H 3.011 2.024
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.4/? GLU 166 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.4/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.4/? GLU 166 H 3.037 2.132
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.5/? HIS 163 NE2 SarsCov2_structure49R_nsp5holo_rep1.pdb #1.5/A UNL 888 O no hydrogen 3.388 N/A
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.5/? GLU 166 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.5/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.5/? GLU 166 H 2.806 1.792
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/? THR 26 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/? THR 26 H 3.093 2.142
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/? GLY 143 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.7/? GLY 143 H 3.030 2.193
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.9/? GLN 189 NE2 SarsCov2_structure49R_nsp5holo_rep1.pdb #1.9/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.9/? GLN 189 2HE2 3.052 2.301
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.10/? GLU 166 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.10/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.10/? GLU 166 H 2.854 1.868
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.12/? GLY 143 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.12/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.12/? GLY 143 H 3.103 2.070
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/? GLY 143 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/? GLY 143 H 3.161 2.224
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/? CYS 145 SG SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.13/? CYS 145 HG 3.421 2.842
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/? ASN 142 ND2 SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/? ASN 142 2HD2 3.055 2.465
SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/? CYS 145 N SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/A UNL 888 O SarsCov2_structure49R_nsp5holo_rep1.pdb #1.14/? CYS 145 H 2.924 2.143
I need to find the first occurrence of the "GLU 166 N" pattern and print the number that appears just before it on the same line, inside the #1.number/? token. In this example the detected number should be 3 (since the associated token is #1.3/?).
I would start with basic pattern detection:
awk '/GLU 166 N/' file
but how do I correctly extract the number that precedes the pattern and print it as output? Finally, if the pattern cannot be found, I would like the script to print 1.
Solution
$ awk -vn=1 '/GLU 166 N/ {gsub(/.*\.|\/\?/,"",$2); n=$2; exit} END {print n}' file
3
$ awk -vn=1 '/GLU 166 N/ {gsub(/.*\.|\/\?/,"",$2); n=$2; exit} END {print n}' /dev/null
1
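What you are looking for is in the second field ($2). gsub(/.*\.|\/\?/,"",$2) replaces, in $2, all leading characters up to (and including) the period, as well as the trailing /?, with the empty string, so #1.3/? becomes 3. The variable n is initialized to 1 on the command line (-vn=1), so 1 is printed when no line matches. exit stops reading at the first match, and the END block still runs and prints n.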
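The same idea can be wrapped in a small shell function. This is only a sketch, not the answerer's method: the function name first_model_index and its argument order are illustrative assumptions. It differs from the one-liner in two ways. It matches the pattern as a literal string followed by a space, so that a pattern ending in "N" does not also match atom names such as "NE2", and it uses split() instead of gsub() to pull the number out of the #1.number/? token.

```shell
# Hypothetical helper (name and arguments are assumptions, not from the answer):
# print the model sub-number of the first line containing the given
# residue/atom pattern, or 1 if no line contains it.
#   $1 = pattern, e.g. "GLU 166 N"
#   $2 = log file
first_model_index() {
    awk -v pat="$1" -v n=1 '
        # Literal substring test; the trailing space keeps "N" from matching "NE2".
        index($0, pat " ") {
            # "#1.3/?" is split on "." and "/" into "#1", "3", "?".
            split($2, parts, "[./]")
            n = parts[2]
            exit    # stop at the first match; the END block still runs
        }
        END { print n }
    ' "$2"
}
```

Called as first_model_index "GLU 166 N" file on the log shown in the question, this prints 3. With /dev/null as the file it prints 1.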
Answered By - Renaud Pacalet Answer Checked By - Katrina (WPSolving Volunteer)