Tuesday, November 2, 2021

[SOLVED] Wildcard sed search/remove within other text in the same line

Issue

I'm trying to remove a matching string with partial wildcards using sed, and the searches I've done for answers on this site either don't seem to apply or I can't convert them to my situation.

Below is the string of text I need to remove:

www.foo.com.cp123.bar.com

It is in a file with other entries on the same line. The line that contains my entries always starts with serveralias:, as shown below:

serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com

I can identify what I need to remove via the 'cp123.bar.com' text as that always stays the same. It's the preceding 'www.foo.com' that changes. It can appear just once or multiple times within the line, but it will always end in 'cp123.bar.com'. I've tried the following two commands based on my research:

sed 's/\ .*cp123.bar.com\ //g' file.txt

sed 's/\ [^:]+$cp123.bar.com\ //g' file.txt

I'm using the spaces between each entry as the start and stop point for the find/replace(delete), but that's a band-aid and not always going to work since the entry I need to delete is occasionally at the end of the line (without a space afterward). If I don't include the spaces, though, everything gets removed since I'm using wildcards, including the www.domain.com, mail.domain.com, etc. text I need to keep there. Running either of the sed commands above doesn't do anything, just prints what's currently in the file.

Any ideas on what I need to change? I'm happy to clarify anything if need be.


Solution

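By default, sed uses basic regular expressions, in which an unescaped + is a literal character rather than a "one or more" quantifier. GNU sed needs the -r flag (or -E, which BSD/macOS sed also accepts) to switch to extended regular expressions. Thus, a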

sed -r 's/ +[^ ]+\.cp123\.bar\.com//g' file.txt

will do what you want. It removes every substring that consists of:

  • one or more spaces
  • followed by one or more non-space characters
  • followed by the literal text .cp123.bar.com
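
As a quick check, the pattern described above can be run against the sample line from the question. This is only a sketch: the function name strip_cp_aliases is made up for illustration, -E is used in place of -r for portability, and the /^serveralias:/ address is an added assumption that restricts the substitution to lines starting with serveralias:.

```shell
# Hypothetical helper (name is illustrative, not from the original answer).
# Reads lines on stdin and, on lines starting with "serveralias:", deletes
# every space-separated entry that ends in .cp123.bar.com, together with
# the space(s) in front of it.
strip_cp_aliases() {
  sed -E '/^serveralias:/ s/ +[^ ]+\.cp123\.bar\.com//g'
}

# Sample line taken from the question.
line='serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com'

printf '%s\n' "$line" | strip_cp_aliases
# prints: serveralias: www.domain.com mail.domain.com domain.com
```

Because the pattern consumes the space before each entry instead of the space after it, it also works when the unwanted entry is the last one on the line.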


Answered By - peterh