Thursday, February 3, 2022

[SOLVED] How does substituting the 3rd-to-last occurrence with the sed command work?

Issue

Let's consider a file, 'file.txt'.

I want to substitute the third-to-last occurrence of "p" on each line.

sed -E 's/(.*)p((.*p){2})/\1@\2/' file.txt

Here, "p" is replaced by "@". I want to know how it works. Can anyone explain it to me?


Solution

  • sed
  • -E - Use extended regex. Compatible with GNU and BSD sed.
  • ' - quote the argument.
  • s - substitute
    • / - separator
    • ( - start first group
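    • .* - match any sequence of characters, as long as possible (greedy).
    • ) - end first group.
    • p - match a literal p. Because .* is greedy and the rest of the pattern must still match, the first group ends up containing everything before the third-to-last p.
    • ( - start second group.
    • ( - start third group. Groups are numbered by the order of their opening parentheses, so this inner group is \3 even though it closes before the second group does.
    • .*p) - match anything up to and including a p, then end third group ...
    • {2} - ... two times. This makes sure there are at least two more p in the rest of the line.
    • ) - end second group. So the second group contains (possibly empty) text, a p, (possibly empty) text and another p.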
    • / separator. Next comes replacement.
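    • \1 - backreference to the first group. It is replaced by all characters from the beginning of the line up to, but not including, the matched p (the third-to-last one).
    • @ - a literal @, which takes the place of the matched p.
    • \2 - backreference to the second group. It is replaced by everything after the matched p: the text up to the second-to-last p, that p, the text up to the last p, and the last p. Anything after the last p is outside the match and is left untouched.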
    • / separator
  • ' - end single quote.
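
The walkthrough above can be checked on a small input. The following is a minimal sketch: the sample line 'p1p2p3p4p5' is a made-up stand-in (the contents of file.txt are not shown in the question), and the square brackets in the second command are only there to make the two captured groups visible.

```shell
# Hypothetical sample line; the original file.txt contents are not shown.
line='p1p2p3p4p5'

# The command from the question: replaces the third-to-last p with @.
out=$(printf '%s\n' "$line" | sed -E 's/(.*)p((.*p){2})/\1@\2/')
echo "$out"     # p1p2@3p4p5

# Same pattern, with brackets around \1 and \2 to show what each group captured.
groups=$(printf '%s\n' "$line" | sed -E 's/(.*)p((.*p){2})/[\1]@[\2]/')
echo "$groups"  # [p1p2]@[3p4p]5

# A line with fewer than three p does not match and is printed unchanged.
short=$(printf '%s\n' 'p1p2' | sed -E 's/(.*)p((.*p){2})/\1@\2/')
echo "$short"   # p1p2
```

The trailing 5 in the second output sits outside the brackets because the match ends at the last p; sed copies the unmatched remainder of the line as-is.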

The (.*p){2} matches the same text as .*p.*p
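
As a quick check of that equivalence, here is a small sketch that runs both forms on the same made-up line (the input 'p1p2p3p4p5' is an assumption, not taken from file.txt). The spelled-out form drops the inner group, so it has only \1 and \2.

```shell
# Hypothetical sample line, not from the original file.txt.
line='p1p2p3p4p5'

# Form used in the question, with the repeated group.
with_repeat=$(printf '%s\n' "$line" | sed -E 's/(.*)p((.*p){2})/\1@\2/')

# Repetition written out by hand; no third group is needed.
spelled_out=$(printf '%s\n' "$line" | sed -E 's/(.*)p(.*p.*p)/\1@\2/')

echo "$with_repeat"   # p1p2@3p4p5
echo "$spelled_out"   # p1p2@3p4p5
```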



Answered By - KamilCuk
Answer Checked By - Senaida (WPSolving Volunteer)