Sunday, March 13, 2022

[SOLVED] sed - how to delete multiple sections in a file that start with a line with a known pattern and end with a line with a known pattern

Issue

I have a file testing123.txt with this content:

01 this is the start of the file
02 start of section one with pattern abc
03 first line
04 second line
05 third line
06 fourth line
07 the_end
08 start of section two with pattern xyz
09 first line
10 second line
11 third line
12 the_end
13 start of section three with pattern abc
14 first line
15 second line
16 third line
17 fourth line
18 fifth line
19 the_end
20 start of section four with pattern klm
21 first line
22 second line
23 third line
24 fouth line
25 fifth line
26 sixth line
27 the_end
28 start of section five with pattern abc
29 first line
30 3second line
31 third line
32 fourth line
33 fifth line
34 sixt line
35 seventh line
36 eighth line
37 the_end
38 start of section six with pattern tuv
39 first line
40 second line
41 the_end
42 start of section seven with pattern abc
43 first line
44 second line
45 the_end
46 start of section eight with pattern abc
47 first line
48 second line
49 third line
50 fourth line
51 the_end
52 this is the end of the file

The aim is to delete all sections that start with lines containing pattern 'pattern abc' up to and including the next line with pattern 'the_end'. Sections can be as long as 14 lines or as short as 3 lines or any number of lines in between. The file testing123.txt can be as long as 1400 lines. The end result in this example should be:

01 this is the start of the file
08 start of section two with pattern xyz
09 first line
10 second line
11 third line
12 the_end
20 start of section four with pattern klm
21 first line
22 second line
23 third line
24 fouth line
25 fifth line
26 sixth line
27 the_end
38 start of section six with pattern tuv
39 first line
40 second line
41 the_end
52 this is the end of the file

This is what I have now (although I have tried a lot more):

#!/bin/bash
PATTERN_1='pattern abc'
ENDLINE='the_end'
sed -i "/$PATTERN_1/,/$ENDLINE/d" testing123.txt

However, in this example that would just delete all lines starting at the second line of the file up to and including the next to last line of the file with pattern 'the_end', leaving me with an almost empty file, which is obviously not what I want.

This is for a bash script I'm writing in Linux (Mint) with GNU/sed. Can sed actually do this? If so, can you tell me how?


Solution

This is much easier with a flag variable in awk:

start='pattern abc'
end='the_end'

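awk -v f=1 -v st="$start" -v end="$end" '
match($0, st) {f=0}   # start pattern seen: stop printing
f                     # print the line while the flag is set
match($0, end){f=1}   # end pattern seen: resume printing from the next line
' file
# prints your desired output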
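
Unlike sed -i, the awk command above only prints to standard output. As a minimal sketch of how the same flag logic could edit the file in place, assuming a temporary file is acceptable: the helper name delete_sections, the variable names skip and en, and the demo file contents are hypothetical and not part of the original answer.

```shell
#!/bin/bash
# delete_sections FILE START END
# Hypothetical helper: removes every block that begins on a line matching
# START and ends on the next line matching END (both treated as regexes).
delete_sections() {
    local file=$1 start=$2 end=$3 tmp
    tmp=$(mktemp) || return 1
    # Same flag idea as the answer, with the flag inverted and named "skip".
    if awk -v st="$start" -v en="$end" '
        $0 ~ st         { skip = 1 }   # start pattern: begin skipping
        !skip                          # print lines outside a skipped block
        skip && $0 ~ en { skip = 0 }   # end pattern: stop skipping afterwards
    ' "$file" > "$tmp"; then
        cat "$tmp" > "$file"           # overwrite contents, keep permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Small demonstration on a throwaway file (contents are made up).
demo=$(mktemp)
printf '%s\n' 'keep 1' 'x pattern abc' 'drop me' 'the_end' 'keep 2' > "$demo"
delete_sections "$demo" 'pattern abc' 'the_end'
result=$(cat "$demo")
rm -f "$demo"
printf '%s\n' "$result"
```

For the file in the question, the call would be delete_sections testing123.txt 'pattern abc' 'the_end'. Note that awk interprets backslash escapes in values passed with -v, so patterns containing backslashes may need extra escaping.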

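With GNU sed, the range form also works, because sed opens a new range each time a line matches the start pattern and closes it at the next line that matches the end pattern: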

sed "/${start}/,/${end}/d" file
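
To check the range behaviour before running it against a real file, a reduced test along the following lines may help. This is only a sketch: the sample contents are a shortened, made-up version of the file in the question, and the variable names sample and result are assumptions.

```shell
#!/bin/bash
# Build a small throwaway sample with two "abc" sections and one "xyz" section.
sample=$(mktemp)
cat > "$sample" <<'EOF'
01 header
02 start of section one with pattern abc
03 first line
04 the_end
05 start of section two with pattern xyz
06 first line
07 the_end
08 start of section three with pattern abc
09 the_end
10 footer
EOF

start='pattern abc'
end='the_end'

# Lines 02-04 and 08-09 form two separate ranges; line 07 matches the end
# pattern but is outside any range, so it is kept.
result=$(sed "/${start}/,/${end}/d" "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

Because the patterns are interpolated into the sed script, a pattern containing a slash would end the address early. In that case a different delimiter can be used for the address, for example \%pattern%.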


Answered By - dawg
Answer Checked By - Dawn Plyler (WPSolving Volunteer)