Saturday, April 9, 2022

[SOLVED] Linux Grep Command - Extract multiple texts between strings

Issue

Context:

After running the following command on my server:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 > analisis.txt

I get a text file with thousands of lines like this example:

loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423

(X represents incrementing numbers)

Problem:

The output is full of unnecessary data. The only parts I need from each line are the MSISDN and the IMSI, comma-separated, like this:

5281181XXXXX,3341100102036XX

Steps I tried:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022| grep -o -P '(?<=SubId-).*?(?=, REQ)' > analisis1.txt

This gave me the first part of the solution:

5281181XXXXX

However, when I also tried to get the second string, located between '334110' and '-':

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022| grep -o -P '(?<=SubId-).*?(?=, REQ)' | grep -o -P '(?<=334110).*?(?=-)' > analisis1.txt

it didn't work.

Any input will be appreciated.


Solution

To match both 5281181XXXXX and the second string, located between '334110' and '-', you can use a pattern like:

\b(?:SubId-|334110)\K[^,\s-]+

The pattern matches:

  • \b A word boundary to prevent a partial word match
  • (?: Non-capturing group for the alternation
    • SubId- Match literally
    • | Or
    • 334110 Match literally
  • ) Close the non-capturing group
  • \K Forget what has been matched so far, so it is left out of the output
  • [^,\s-]+ Match 1+ occurrences of any character except a comma, a whitespace character, or a hyphen


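That will match the following, one value per line (because of \K, the second value does not include the 334110 prefix):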

5281181XXXXX
0102036XX
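
To try the pattern without access to the log files, here is a minimal sketch that feeds the sample line from the question to grep. It assumes GNU grep built with PCRE support (the -P option); the variable names line and matches are only for illustration.

```shell
# Sample log line copied from the question (X stands for redacted digits).
line='loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423'

# -o prints only the matched text; -P enables PCRE, which is needed for \K.
matches=$(printf '%s\n' "$line" | grep -oP '\b(?:SubId-|334110)\K[^,\s-]+')

# Each match is printed on its own line.
printf '%s\n' "$matches"
```

Each match ends up on a separate line, so this output is not yet in the comma-separated form the question asks for.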

The command could look like:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 | grep -oP '\b(?:SubId-|334110)\K[^,\s-]+' > analisis1.txt
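
If the goal is the MSISDN,IMSI pair on a single line, with the full IMSI including its 334110 prefix, one possible alternative is sed with two capture groups. This is a sketch and not part of the original answer. It assumes every matching line contains exactly one SubId- field and one ;334110...- field in that order, as in the sample line. The variable names line and pair are only for illustration.

```shell
# Sample log line copied from the question (X stands for redacted digits).
line='loggers1/PCRF1_17868/PCRF12_01_03_2022_00_15_39.log:[C]|01-03-2022:00:18:20:183401|140404464875264|TRACKING: CCR processing Compleated for SubId-5281181XXXXX, REQNO-1, REQTYPE-3, SId-mscp01.herpgwXX.epc.mncXXX.mccXXX.XXXXX.org;25b8510c;621dbaab;3341100102036XX-27cf0XXX, RATTYPE-1004, ResCode-5005 |processCCR|ProcessingUnit.cpp|423'

# Group 1: text after "SubId-" up to the next comma (the MSISDN).
# Group 2: "334110" plus everything up to the next hyphen (the IMSI),
#          anchored on the ";" that precedes it.
# -n with the trailing p prints only the lines where the substitution succeeded.
pair=$(printf '%s\n' "$line" | sed -nE 's/.*SubId-([^,]+),.*;(334110[^-]*)-.*/\1,\2/p')

printf '%s\n' "$pair"
```

In the original pipeline, the same sed expression would take the place of the grep -oP stage:

zgrep "ResCode-5005" /loggers1/PCRF*/_01_03_2022 | sed -nE 's/.*SubId-([^,]+),.*;(334110[^-]*)-.*/\1,\2/p' > analisis1.txt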


Answered By - The fourth bird
Answer Checked By - Senaida (WPSolving Volunteer)