Issue
I have files with lines such as these:
@Dacor 125#Apples were stored in Section 1.#Delivered on 02/03/2023. All ok.#
I am trying to develop a regex to use with gsub to produce:
Apples were stored in Section 1
However, I don't know how to use a regex to skip the middle section sandwiched between two # characters, even if I treat # as a delimiter.
So far, I have tried:
awk 'match($0, /@([^#]+)#(.*)#/, arr) {print arr[2]}'
This generates:
Apples were stored in Section 1.#Delivered on 02/03/2023. All ok.
I am unable to get the correct output.
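The output above is what a greedy .* produces: awk regexes match leftmost-longest, so .* runs all the way to the last # on the line. The sketch below reproduces the effect with POSIX awk only (match, RSTART and RLENGTH rather than the gawk-specific array argument) and contrasts it with a negated class, [^#]*, which cannot cross a #. The variable names line, greedy and bounded are only for illustration.

```shell
# Sample line from the question
line='@Dacor 125#Apples were stored in Section 1.#Delivered on 02/03/2023. All ok.#'

# Greedy: '.*' extends to the LAST '#', so two fields are captured.
greedy=$(printf '%s\n' "$line" |
  awk '{ if (match($0, /#.*#/)) print substr($0, RSTART + 1, RLENGTH - 2) }')

# Bounded: '[^#]*' stops at the NEXT '#', so only the second field is captured.
bounded=$(printf '%s\n' "$line" |
  awk '{ if (match($0, /#[^#]*#/)) print substr($0, RSTART + 1, RLENGTH - 2) }')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

In both cases substr drops the leading and trailing # from the matched text.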
Solution
Method 1:
Use cut to print the second column of the #-delimited file:
cut -f2 -d'#' in_file > out_file
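As a quick check, the cut command can be run on the sample line from the question. Since the question mentions awk and gsub, the sketch also shows an awk equivalent that splits on # with -F, plus an optional sub call that strips a trailing period. That last step is an assumption based on the expected output in the question, which has no final period. The variable names are only for illustration.

```shell
# Sample line from the question
line='@Dacor 125#Apples were stored in Section 1.#Delivered on 02/03/2023. All ok.#'

# cut: second '#'-delimited field
field_cut=$(printf '%s\n' "$line" | cut -d'#' -f2)

# awk: same idea, using '#' as the field separator
field_awk=$(printf '%s\n' "$line" | awk -F'#' '{ print $2 }')

# awk: additionally drop one trailing period from the second field
# (assumption: the period is unwanted, as in the question's expected output)
field_trimmed=$(printf '%s\n' "$line" | awk -F'#' '{ sub(/\.$/, "", $2); print $2 }')

printf '%s\n' "$field_cut" "$field_awk" "$field_trimmed"
```

Both cut and plain awk keep the period; only the last command removes it.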
Method 2:
Use GNU grep:
grep -Po '^[^#]*#\K[^#]*' in_file > out_file
Here, GNU grep uses the following options and regex elements:
-P: Use Perl regexes.
-o: Print the matches only (1 match per line), not the entire lines.
^[^#]*#: beginning of the line, then 0 or more non-# characters, followed by literal #.
\K: Cause the regex engine to "keep" everything it had matched prior to the \K and not include it in the match. Specifically, ignore the preceding part of the regex when printing the match.
[^#]*: 0 or more non-# characters after the first #. This is the second field, and it is the only part that gets printed.
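To see the effect of \K, the sketch below runs the same pattern with and without it on the sample line from the question. It assumes a GNU grep built with PCRE support (the -P option); the variable names are only for illustration.

```shell
# Sample line from the question
line='@Dacor 125#Apples were stored in Section 1.#Delivered on 02/03/2023. All ok.#'

# Without \K: -o prints the whole match, including the prefix up to the first '#'
with_prefix=$(printf '%s\n' "$line" | grep -Po '^[^#]*#[^#]*')

# With \K: the prefix must still match, but it is dropped from the printed result
without_prefix=$(printf '%s\n' "$line" | grep -Po '^[^#]*#\K[^#]*')

printf 'without \\K: %s\n' "$with_prefix"
printf 'with \\K:    %s\n' "$without_prefix"
```

Because the pattern is anchored with ^, each input line yields at most one match.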
Answered By - Timur Shtatland Answer Checked By - Mildred Charles (WPSolving Admin)