Thursday, October 28, 2021

[SOLVED] Substituting with sed ignores whitespace

Issue

Let's say I have a file named m1.txt, whose contents (- . ... - / -. --- / .----) I want to decode from Morse to text. Here's what I have written:

sed -i 's/.- /A/g' m1.txt
sed -i 's/-... /B/g' m1.txt
sed -i 's/-.-. /C/g' m1.txt

and so on, including numbers, and later on:

sed -i 's:/ : :g' m1.txt
cat m1.txt

in order to clear the separating slash and output the message.

The expected output is TEST NO 1; however, the program outputs D...AE-ED.--A instead, ignoring any whitespace and returning the wrong message. What have I done wrong?


Solution

Two things:

  1. You need to escape the ., since an unescaped . matches any character.
  2. You need to consider interactions between the rules: the pattern for E (.) matches a single dot, so it will also match the trailing dot of N (-.) unless you apply the rules in the right order or preserve the separators (on both the left and right).
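To see point (1) in isolation, here is a small sketch (the variable names are just for illustration). The input "-- " contains no dot at all, yet the unescaped pattern from the question still matches it:

```shell
# Unescaped: '.' matches any character, so ".- " also matches "-- ".
unescaped=$(printf '%s\n' '-- ' | sed 's/.- /A/g')

# Escaped: '\.' matches only a literal dot, so "-- " is left alone.
escaped=$(printf '%s\n' '-- ' | sed 's/\.- /A/g')

# Brackets make the trailing space visible.
printf '[%s] [%s]\n' "$unescaped" "$escaped"
```

This should print [A] [-- ], i.e. only the escaped version leaves the non-matching input untouched.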

(1) is easy to solve: simply escape the . character. For (2), we can first map the beginning and end of the line to spaces (so that we have separators consistently on the left and right), and then preserve the separators in each replacement (to avoid interactions). Then, in the second-to-last line, remove all the spaces. Finally, in the last line, map the slashes to spaces; these word breaks are the only spaces you want in the output.
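The interaction in (2) can be sketched with just the E rule applied to the code for N (variable names are illustrative only):

```shell
# Without separators in the pattern, the E rule also fires inside "-." (N).
bare=$(printf '%s\n' '-.' | sed 's/\./E/g')

# With a space required on both sides, " . " cannot match inside " -. ".
padded=$(printf '%s\n' '-.' | sed -e 's/^/ /' -e 's/$/ /' -e 's/ \. / E /g')

# Brackets make the padding spaces visible.
printf '[%s] [%s]\n' "$bare" "$padded"
```

This should print [-E] [ -. ]: the bare rule corrupts the N, while the padded rule leaves it intact for the N rule to pick up later.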

I didn't do the whole alphabet for you, just enough for you to see the idea.

Here's a working solution:

$ cat m1.txt 
- . ... - / -. --- / .----
$ cat morse.sed 
s/^/ /
s/$/ /
s/ - / T /g
s/ \. / E /g
s/ \.\.\. / S /g
s/ -\. / N /g
s/ --- / O /g
s/ \.---- / 1 /g
s/[ ]\+//g
s:/: :g
$ sed -f morse.sed m1.txt
TEST NO 1
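One caveat that is not covered by the script above: the g flag only replaces non-overlapping matches, and two adjacent codes share the single space between them. So an input with a repeated letter, such as "- -", would come out as "T-" rather than "TT". A possible workaround, offered as a sketch rather than as part of the original answer, is to double every space first so that each code gets its own separator on both sides. The decode function is a hypothetical helper, and only the T and E rules are included:

```shell
# Hypothetical helper: decodes a Morse string containing only T (-) and E (.).
decode() {
  printf '%s\n' "$1" | sed \
    -e 's/^/ /' -e 's/$/ /' \
    -e 's/ /  /g' \
    -e 's/ - / T /g' \
    -e 's/ \. / E /g' \
    -e 's/ //g' \
    -e 's:/: :g'
}

decode '- - / . .'
```

With the doubling step this should print TT EE. The space cleanup here uses the plain s/ //g form, which has the same effect as the [ ]\+ version but does not rely on the \+ extension.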

Note: this answer was revised because I didn't at first realize that the ( and ) were not part of the input. Also, @Bach Lien's idea to map the anchors to spaces in the first lines is a great one. It makes things a lot cleaner, so I've incorporated it.



Answered By - JawguyChooser