Issue
I am trying to split a column of text on a substring that may or may not be present.
sample file:
TEXT1D1NEWBWP210HTEXT2
TEXT1D1BWP210HTEXT2
expected output:
TEXT1D1NEWBWP210HTEXT2 NEWBWP 210H
TEXT1D1BWP210HTEXT2 BWP 210H
cmd used (I expected "?" to check whether the substring "NEW" is present and to print it if it is):
cat <text_file> | sed -e 's/.*\(\s*\)\(NEW\)\?\(BWP\)\([0-9]\+\)H.*/\0 \2\3 \4H/'
Output from the above cmd is
TEXT1D1NEWBWP210HTEXT2 BWP 210H
TEXT1D1BWP210HTEXT2 BWP 210H
Not sure what I am doing wrong here... :)
Solution
Use GNU or BSD sed for -E so you don't need all those backslashes (you were already using GNU sed for \s):
$ sed -Ee 's/((NEW)?BWP)([0-9]+)H.*/& \1 \3H/' file
TEXT1D1NEWBWP210HTEXT2 NEWBWP 210H
TEXT1D1BWP210HTEXT2 BWP 210H
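If -E is not available, the same substitution can probably be written with POSIX basic regular expressions instead. This is only a sketch, not part of the original answer: it assumes a POSIX-conforming sed and uses the interval forms \{0,1\} and \{1,\} in place of ? and +. The variable name portable is made up for illustration.

```shell
# Assumption: plain POSIX sed (no -E). In a BRE, \{0,1\} means "optional"
# and \{1,\} means "one or more"; the group numbers match the -E version.
portable=$(printf '%s\n' 'TEXT1D1NEWBWP210HTEXT2' 'TEXT1D1BWP210HTEXT2' |
  sed 's/\(\(NEW\)\{0,1\}BWP\)\([0-9]\{1,\}\)H.*/& \1 \3H/')
printf '%s\n' "$portable"
```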
The main problem with your regexp was that the initial .* is greedy, so it would consume the optional NEW if it was present, leaving \(NEW\)\? to match the empty string.
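To see the effect side by side, here is a small sketch that runs both patterns on the same input line. It assumes a sed that supports -E (GNU or BSD); the variable names greedy and anchored are made up for illustration.

```shell
line='TEXT1D1NEWBWP210HTEXT2'

# Leading .* grabs as much as it can, including NEW, so (NEW)? matches nothing.
greedy=$(printf '%s\n' "$line" | sed -E 's/.*(NEW)?(BWP)([0-9]+)H.*/& \1\2 \3H/')

# Without the leading .*, the match starts at NEW and group 2 captures it.
anchored=$(printf '%s\n' "$line" | sed -E 's/((NEW)?BWP)([0-9]+)H.*/& \1 \3H/')

printf '%s\n' "$greedy" "$anchored"
# TEXT1D1NEWBWP210HTEXT2 BWP 210H
# TEXT1D1NEWBWP210HTEXT2 NEWBWP 210H
```

In the second command & is only the matched part of the line (from NEW onward); sed leaves the unmatched TEXT1D1 prefix in place, which is why the whole original line still appears in the output.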
Answered By - Ed Morton Answer Checked By - Pedro (WPSolving Volunteer)