Thursday, February 3, 2022

[SOLVED] Why does `sed -e 's/[0-9]*/& &/'` have very different results with minor input changes?

Issue

I am trying to understand the results below. Why isn't the output abc 123 123, given that [0-9]* should match 123?

  • Why is this first answer not abc 123 123?

    $ sed -e 's/[0-9]*/& &/' <<<'abc 123'
     abc 123
    
  • Why is the second result 123 123 abc?

    $ sed -e 's/[0-9]*/& &/' <<<'123 abc'
    123 123 abc
    
  • Why does the third one have extra spaces?

    $ sed -e 's/[0-9]*/& &/g' <<<'abc 123'
     a b c  123 123
    

Solution

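sed replaces the first (leftmost) match. Since [0-9]* matches not just the 3 digits 123 but also 0 digits (the empty string), in abc 123 it matches the empty string at the very start of the line, marked with parentheses here: ()abc 123. That empty match is replaced with () ()abc 123, which is just a single leading space. In 123 abc the leftmost match is 123 itself, so you get 123 123 abc. With the g flag, the empty match before each non-digit character is replaced by a space as well, which is where the extra spaces come from.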
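One way to see what the pattern actually matched is to wrap & in brackets instead of duplicating it. The following is a minimal sketch, not part of the original answer: show_match and show_all_matches are hypothetical helper names, and the outputs in the comments are what GNU sed produces, consistent with the results in the question.

```shell
# Hypothetical helper: bracket whatever [0-9]* matched first,
# so that an empty match shows up as [].
show_match() {
    printf '%s\n' "$1" | sed 's/[0-9]*/[&]/'
}

# Same idea with the g flag, so every match on the line is marked.
show_all_matches() {
    printf '%s\n' "$1" | sed 's/[0-9]*/[&]/g'
}

show_match 'abc 123'        # prints: []abc 123
show_match '123 abc'        # prints: [123] abc
show_all_matches 'abc 123'  # prints: []a[]b[]c[] [123]
```

Each [] marks a place where the pattern matched zero digits. In the original command every one of those empty matches turns into a single space.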

If you instead use one of:

sed -r 's/[0-9]+/& &/'
sed 's/[0-9][0-9]*/& &/' 

then you require that it matches at least 1 digit, and then it'll behave the way you expect.
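As a quick check, here is a small sketch that runs the second form against the inputs from the question. dup_digits is a hypothetical helper name used only for illustration. It uses [0-9][0-9]* rather than + so that it does not depend on extended regex support.

```shell
# Hypothetical helper: duplicate the first run of one or more digits.
dup_digits() {
    printf '%s\n' "$1" | sed 's/[0-9][0-9]*/& &/'
}

dup_digits 'abc 123'   # prints: abc 123 123
dup_digits '123 abc'   # prints: 123 123 abc
dup_digits 'abc'       # prints: abc (no digits, so nothing is replaced)
```

Because the pattern can no longer match the empty string, a line without digits is left unchanged instead of gaining a leading space.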



Answered By - that other guy
Answer Checked By - Katrina (WPSolving Volunteer)