Sunday, March 13, 2022

[SOLVED] sed or awk: extract text between a string and end of line, keeping only the first match

Issue

I am writing an email processor that extracts some data from emails (Dovecot/Postfix based). The message file is located at

/home/moderator/Maildir/cur/1619183102.V97eI6001a560M865218.example.com:2,S

For brevity, let's call it

/home/moderator/Maildir/cur/file

The email file contains both a plain-text part and an HTML part:

Subject: New user
New user created 
User name:Billy Jean
<html><head><title>New user</title>
</head>
<body>
<p>New user created</p>
User name:Billy Jean<br>
</body>

The task is to extract exactly the user name (Billy Jean), i.e. everything between

User name:

and the end of the line,

but keep only the first instance to avoid duplicates (ignoring the HTML line User name:Billy Jean<br>).

I have already tested some variants from Stack Overflow, such as

awk '/^User name:/{print $NF}' /home/moderator/Maildir/cur/file

but it does not give the required result: it prints only the last word of each matching line (Jean, then Jean<br>) rather than the full name once.

Thanks for any ideas to try.


Solution

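With your shown samples, please try the following awk code. It sets the field separator to a colon so that $NF holds everything after the last colon, looks for the first line starting with the search string, prints the needed value, and exits immediately so that later matches are never reached.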

awk -F':' '/^User name:/{print $NF;exit}' /home/moderator/Maildir/cur/file
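To see why this works where the attempt from the question did not, here is a small self-contained sketch. It rebuilds the sample message in a temporary file (the mktemp path and the variable names attempt, first and via_sed are illustrative assumptions, not part of the original setup) and runs both commands side by side. Since the question also mentions sed, one possible sed equivalent is included as an assumption rather than as part of the original answer.

```shell
# Recreate the sample message from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Subject: New user
New user created
User name:Billy Jean
<html><head><title>New user</title>
</head>
<body>
<p>New user created</p>
User name:Billy Jean<br>
</body>
EOF

# Attempt from the question: fields are split on whitespace, so $NF is only
# the last word, and every matching line is printed.
attempt=$(awk '/^User name:/{print $NF}' "$sample")

# Answer: split on ":" so $NF is the whole name, then exit after the first match.
first=$(awk -F':' '/^User name:/{print $NF; exit}' "$sample")

# Possible sed equivalent: strip the prefix, print, and quit on the first match.
via_sed=$(sed -n '/^User name:/{s/^User name://p;q;}' "$sample")

printf 'attempt:\n%s\n' "$attempt"
printf 'first:   %s\n' "$first"
printf 'via_sed: %s\n' "$via_sed"

rm -f "$sample"
```

With the sample above, attempt holds two lines (Jean and Jean<br>), while first and via_sed both hold Billy Jean. Note that $NF with a colon separator returns only the text after the last colon, so a name that itself contains a colon would be cut short; the sed variant removes just the prefix and does not have that limitation.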

Bonus solution: if your awk program has more work to do and cannot exit before processing the rest of the input, add a simple counter check to the print condition so that only the very first occurrence of the string is printed.

awk -F':' '/^User name:/ && ++count==1{print $NF} {your rest of code here....}' /home/moderator/Maildir/cur/file
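As a rough illustration of the counter variant, the sketch below replaces the placeholder with a hypothetical END rule that reports how many lines matched and how many lines were read. The shortened sample file, the END rule and the report variable are assumptions made for the example only.

```shell
# Shortened sample: two "User name:" lines, as in the question.
sample=$(mktemp)
printf '%s\n' \
  'Subject: New user' \
  'User name:Billy Jean' \
  '<body>' \
  'User name:Billy Jean<br>' \
  '</body>' > "$sample"

report=$(awk -F':' '
  # count is incremented only on matching lines; print on the first one only.
  /^User name:/ && ++count == 1 { print "name=" $NF }
  # Hypothetical rest of the program: it still runs because awk never exits early.
  END { print "matches=" count, "lines=" NR }
' "$sample")

printf '%s\n' "$report"
rm -f "$sample"
```

This prints name=Billy Jean followed by matches=2 lines=5: the second match still increments count but fails the ==1 test, so the name is printed only once while the whole file is processed.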


Answered By - RavinderSingh13
Answer Checked By - Marilyn (WPSolving Volunteer)