Issue
I have a tab-delimited file that looks like this:
[ moleculetype ]
; Name nrexcl
AL7 3
[ atoms ]
; nr type resnr resid atom cgnr charge mass
1 CB 1 AL6 C4 1 -0.1435 12.0110
2 CB 1 AL6 C5 2 -0.1500 12.0110
3 CB 1 AL6 C6 3 -0.1500 12.0110
4 CB 1 AL6 C7 4 0.0825 12.0110
5 CB 1 AL6 O8 5 -0.1500 12.0110
[ bonds ]
; ai aj fu b0 kb, b0 kb
16 7 1 0.10930 287014.9 0.10930 287014.9
15 7 1 0.10930 287014.9 0.10930 287014.9
7 8 1 0.14180 303937.5 0.14180 303937.5
7 17 1 0.10930 287014.9 0.10930 287014.9
8 9 1 0.13550 349343.9 0.13550 349343.9
20 12 1 0.10190 390836.6 0.10190 390836.6
I want the output to be:
[ moleculetype ]
; Name nrexcl
AL7 3
[ atoms ]
; nr type resnr resid atom cgnr charge mass
1 CB 1 AL6 C 1 -0.1435 12.0110
2 CB 1 AL6 C 2 -0.1500 12.0110
3 CB 1 AL6 C 3 -0.1500 12.0110
4 CB 1 AL6 C 4 0.0825 12.0110
5 CB 1 AL6 O 5 -0.1500 12.0110
[ bonds ]
; ai aj fu b0 kb, b0 kb
16 7 1 0.10930 287014.9 0.10930 287014.9
15 7 1 0.10930 287014.9 0.10930 287014.9
7 8 1 0.14180 303937.5 0.14180 303937.5
7 17 1 0.10930 287014.9 0.10930 287014.9
8 9 1 0.13550 349343.9 0.13550 349343.9
20 12 1 0.10190 390836.6 0.10190 390836.6
Only the section under [ atoms ] should change: its fifth column should keep only the letters and drop the digits. Please suggest a way to do this.
The problem is that a plain awk substitution on the fifth column cannot be used, because the fifth column holds not only C6/C7/O8 but also other values, as can be seen under [ bonds ]. I have tried grep and awk:
grep -A307 -P 'atoms' filename | awk -F, 'sub("[0-9]+\s""",$9)' OFS=,
But this processes the whole file, which is not what I want.
Solution
This might work for you (GNU sed):
sed -E '/^\[/h;G;/\n\[ atoms \]/{/^;|^$/!{s/(\S)\S*/\1/5}};P;d' file
Make a copy of any line beginning with [ in the hold space.
Append the hold space to every line.
If the second line begins with [ atoms ], process the line; otherwise print the first line and delete the remainder.
If the current line starts with ; or is empty, print the first line and delete the remainder.
Otherwise, replace the fifth field with its first character.
Print the first line and delete the remainder.
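For comparison, the same section-tracking idea can be sketched in awk. This is not part of the original answer; it is a minimal sketch that assumes the file really is tab-delimited, as the question states. The function name strip_atom_digits is hypothetical. Unlike the sed command, which keeps only the first character of the fifth field, this version removes every digit from it, so a name such as CL1 would become CL rather than C.

```shell
# Hypothetical helper: strip digits from column 5, but only inside [ atoms ].
# Assumes tab-separated fields (an assumption taken from the question).
strip_atom_digits() {
  awk 'BEGIN { FS = OFS = "\t" }
    # A line starting with "[" opens a new section; remember whether it is [ atoms ].
    /^\[/ { in_atoms = ($0 ~ /^\[ *atoms *\]/) }
    # Inside [ atoms ], skip headers and ";" comments, then drop digits from field 5.
    in_atoms && !/^\[/ && !/^;/ && NF >= 5 { gsub(/[0-9]+/, "", $5) }
    { print }' "$1"
}
```

It could be called as strip_atom_digits file > file.new. Every other section, including [ bonds ], passes through unchanged because in_atoms is reset at each new section header.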
Answered By - potong Answer Checked By - Timothy Miller (WPSolving Admin)