Thursday, February 3, 2022

[SOLVED] Replace a specific character at the beginning and end of any word in bash

Issue

I need to remove the hyphen '-' character only when it matches the pattern 'space-[A-Z]' or '[A-Z]-space'. (Assume all letters are uppercase, and that 'space' can be either a space or a newline.)

sample.txt

I AM EMPTY-HANDED AND I- WA-
-ANT SOME COO- COOKIES

I want the output to be

I AM EMPTY-HANDED AND I WA
ANT SOME COO COOKIES

I've looked around for answers using sed, awk, and perl, but I could only find ones about removing all characters between two patterns or specific strings, not about removing a specific character between [A-Z] and a space.

Thanks heaps!!


Solution

remove the hyphen '-' character only when it matches the pattern 'space-[A-Z]' or '[A-Z]-space'. Assuming all letters are uppercase, and space could be a space, or newline

It's:

sed 's/\( \|^\)-\([A-Z]\)/\1\2/g; s/\([A-Z]\)-\( \|$\)/\1\2/g'
  • s - substitute
    • /
    • \( \|^\) - space or beginning of the line
    • - - hyphen...
    • \([A-Z]\) - a single upper case character
    • /
    • \1\2 - the replacement. \1 expands to whatever the first \(...\) group matched (a space, or nothing at the beginning of the line), and \2 expands to the single upper case character that was found. The net effect is that the - is removed.
    • /
    • g - apply the substitution to every match on the line, not just the first
  • ; - separate two s commands
  • s
    • Same structure as above, mirrored: \([A-Z]\) comes first, then -, then \( \|$\), which matches a space or the end of the line ($).
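
To check the command against the sample from the question, here is a minimal sketch. The variable names (input, bre_out, ere_out) are made up for this illustration. Note that \| inside a basic regex is a GNU sed extension, so the second variant rewrites the same two substitutions with -E (extended regex), assuming your sed supports that option.

```shell
# Sample input from the question, kept in a variable for the demo.
input='I AM EMPTY-HANDED AND I- WA-
-ANT SOME COO- COOKIES'

# The command from the answer (basic regex; \| is a GNU sed extension).
bre_out=$(printf '%s\n' "$input" |
  sed 's/\( \|^\)-\([A-Z]\)/\1\2/g; s/\([A-Z]\)-\( \|$\)/\1\2/g')

# The same two substitutions written as extended regex (assumes sed -E).
ere_out=$(printf '%s\n' "$input" |
  sed -E 's/(^| )-([A-Z])/\1\2/g; s/([A-Z])-( |$)/\1\2/g')

printf '%s\n' "$bre_out"
# I AM EMPTY-HANDED AND I WA
# ANT SOME COO COOKIES
```

The hyphen in EMPTY-HANDED is untouched because it has a letter on both sides, so neither pattern matches it.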


Answered By - KamilCuk
Answer Checked By - Marilyn (WPSolving Volunteer)