Monday, November 27, 2023

[SOLVED] Regex in sed to replace Json values

November 27, 2023 regex, sed

Issue

I have text like the following, in which I want to hide the values of only some fields: l1 and x2. Here is an example:

{
    "info":
    {
        "l1": 77,
        "x2": 77,
    },
    "user": "2323",
    "id": "xxxx",
    "time": 1679955931845,
    "msgType": "oyui"
}

I have come up with a regex that works fine as a standalone regex: (?<=(l1|x2)":)(.*?)(?=,) But now I want to use it on Linux with sed, which turns out to be far more complicated. In the end I got it working with two sed statements, but it still bothers me that I don't know how to do it with a single regex in sed.

UPDATE

There are good answers here for anyone who stumbles upon this issue. However, in my case I specifically need a sed statement, because that is the input required for configuration in other services (in my case Splunk and its field filtering option with sed: https://docs.splunk.com/Documentation/Splunk/9.0.4/Security/setfieldfiltering)

Solution

sed does not support the regex dialect you are trying to use (lookarounds and non-greedy quantifiers are Perl-style extensions), but Perl does.

perl -ne 'if (m/(?<=(l1|x2)":)(.*?)(?=,)/) { print "$1: $2\n" }'

Splunk basically borrows its regex engine from Perl (or PCRE?) so it should be convenient and natural to go back and forth between Perl and Splunk (though I should think you would never want to go back if you manage to leave ...)

Perl has some superficial similarities with sed, so you can say things like

perl -pe 's%(?<=foo)bar(?=baz)%quux%g'

which should be reasonably transparent if you are familiar with sed. There's even a tool s2p which automatically translates sed scripts to Perl scripts.

Parenthetically, many Splunk patterns seem to use named groups; you can use the built-in hash %+ in Perl to access these. ¹

perl -ne 'if (m/(?<=(?P<thing>l1|x2)":)(?P<value>.*?)(?=,)/) { print "$+{thing}: $+{value}\n" }'

If you genuinely need to use sed specifically, you need to refactor your regular expression into a BRE or an ERE. The latter is feasible if your sed has the (historically non-standard, but common) -r or -E option:

sed -nE 's/.*(l1|x2)":([^,]*),.*/"\1": "\2"/p'

This isn't exactly equivalent, obviously. The lookarounds have no real equivalent in traditional regex, so I just converted them to regular matches. Also, [^,]* isn't the same as .*? but in this case I'm guessing it's what you actually mean: [^,]* can never match a comma, whereas .*? before a comma can still match across a comma if that is what allows the overall regex to reach a match. Without seeing your actual data it's hard to tell, but I can't imagine a scenario here where the non-greedy regex would do something different.

Without more information about what exactly you want the parenthesized groups to do, this can only be a hint toward solving your actual problem.
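
If the goal is to hide the values while leaving the rest of the text intact, one possible single-statement form is to capture the key and write it back in front of a placeholder. This is only a sketch: the "***" placeholder and the variable names are assumptions, the sample input is a shortened copy of the question's text, and whether Splunk's field filtering accepts this exact expression has not been verified here.

```shell
# Shortened copy of the question's text, including its trailing comma.
input='{
    "info":
    {
        "l1": 77,
        "x2": 77,
    },
    "user": "2323"
}'

# \1 is the matched key including its quotes and colon; everything up to
# the next comma is treated as the value and replaced by the placeholder.
masked=$(printf '%s\n' "$input" | sed -E 's/("(l1|x2)":)[^,]*/\1 "***"/g')

printf '%s\n' "$masked"
```

Note that [^,]* runs to the end of the line when no comma follows, so a value that is the last item on a line shared with a closing brace would take the brace with it.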

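The corresponding BRE would need a backslash before each ( and ). Note that alternation is not part of POSIX BRE at all; \| works in GNU sed as an extension, so a strictly portable BRE would have to handle l1 and x2 in separate expressions.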
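
A sketch of a BRE form that sidesteps \| (a GNU extension rather than POSIX BRE) by using one expression per field. The sample lines, the variable names and the "***" placeholder are assumptions for illustration.

```shell
# Hypothetical sample lines shaped like the question's fields.
line_l1='        "l1": 77,'
line_x2='        "x2": 77,'

# Plain BRE: groups are written \( \), and each field gets its own -e
# expression instead of an alternation.
masked_bre=$(printf '%s\n%s\n' "$line_l1" "$line_x2" |
  sed -e 's/\("l1":\)[^,]*/\1 "***"/' -e 's/\("x2":\)[^,]*/\1 "***"/')

printf '%s\n' "$masked_bre"
```

This is still a single sed invocation, but it is two substitutions rather than one regex.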

¹ The hash is named %+ but an individual hash value is accessed like $+{"key"}. The mnemonic is that % is a sigil for the entire hash and $ is the sigil for a scalar such as an individual value out of the hash.

Many people are critical of Perl's "arcane" syntax but they clearly haven't seen Splunk's.

Answered By - tripleee

Answer Checked By - Clifford M. (WPSolving Volunteer)

This answer was collected from Stack Overflow and tested by PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
