Issue
I'm writing a bash script in which I use grep with a regular expression to extract an id that I will then use as a variable.
The goal is to extract all characters until it finds /, but the characters ' and } should be ignored.
file.txt:
{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}
command:
cat file.txt | grep -oP "[/]+^"
The current command isn't working.
desired output:
B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=
Solution
The regex you gave was: [/]+^
It has a few mistakes:
- Your use of ^ at the end seems to imply you think you can ask the software to search backwards. You can't.
- [/] matches only the slash character.
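To see why the original pattern finds nothing, here is a minimal sketch. It assumes GNU grep built with PCRE support (needed for -P); the variable names line and status are only illustrative. In PCRE, ^ is a start-of-line anchor wherever it appears outside a character class, so once [/]+ has consumed at least one slash the anchor can no longer be satisfied.

```shell
#!/usr/bin/env bash
# Sample line copied from the question.
line="{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}"

# -q suppresses output; only the exit status says whether anything matched.
if printf '%s\n' "$line" | grep -qP '[/]+^'; then
    status="match"
else
    status="no match"
fi

echo "$status"   # prints: no match
```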
Your sample shows what appears to be a malformed JSON object containing a key-value pair in which both the key and the value are enclosed in single quotes. JSON requires double quotes, so perhaps it is not JSON.
If several assumptions are made, it is possible to extract the section of the input that you seem to want:
- file contains a single line; and
- key and value are strings surrounded by single-quote; and
- either:
  - the value part is immediately followed by }; or
  - the name part cannot contain /
You are using the -P option to grep, so lookaround operators are available.
(?<=/)[^/]+(?=')
- lookbehind declares match is preceded by /
- one or more non-slash (the match)
- lookahead declares match is followed by '
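As a rough sketch of how this first pattern could be used to fill a variable, the script below recreates the sample in a temporary file. It assumes GNU grep with PCRE support; the names sample and id are placeholders, and in a real script you would point grep at file.txt directly.

```shell
#!/usr/bin/env bash
# Recreate the sample input from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' "{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}" > "$sample"

# -o prints only the matched text; -P enables the lookaround syntax.
# grep reads the file itself, so the extra cat is not needed.
id=$(grep -oP "(?<=/)[^/]+(?=')" "$sample")

echo "$id"   # prints: B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=

rm -f "$sample"
```

Only the last path segment is followed by a single quote, so every earlier segment fails the lookahead and is skipped.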
[^/]+(?='})
- one or more non-slash (the match)
- lookahead declares match is followed by ' then }
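The second pattern can be tried the same way. This is again only a sketch that assumes GNU grep with PCRE support and feeds the sample line through a pipe; line and id are illustrative names.

```shell
#!/usr/bin/env bash
# Sample line copied from the question.
line="{'name': 'projects/data/locations/us-central1/datasets/dataset/source1/messages/B0g2_e8gG_xaZzpbliWvjlShnVdRNEw='}"

# No lookbehind here: the match is any run of non-slash characters
# that is immediately followed by the two characters ' and }.
id=$(printf '%s\n' "$line" | grep -oP "[^/]+(?='})")

echo "$id"   # prints: B0g2_e8gG_xaZzpbliWvjlShnVdRNEw=
```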
Note that the match begins as early in the line as possible and, with greedy +, it is as long as possible.
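The note above is why the assumptions matter for the pattern without a lookbehind. The sketch below uses a made-up input whose value contains no slash at all (the line {'name': 'abc'} is hypothetical, not from the question) and assumes GNU grep with PCRE support. With nothing to stop it, the match starts at the opening brace.

```shell
#!/usr/bin/env bash
# Hypothetical input: neither the key nor the value contains a slash.
line="{'name': 'abc'}"

# The leftmost possible start is the opening brace, and greedy + then
# extends the match as far as the lookahead for '} allows.
out=$(printf '%s\n' "$line" | grep -oP "[^/]+(?='})")

echo "$out"   # prints: {'name': 'abc
```

If input like this is possible, the lookbehind variant is the safer choice, because it produces no match instead of a wrong one.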
Answered By - jhnc Answer Checked By - Katrina (WPSolving Volunteer)