Tuesday, March 15, 2022

[SOLVED] Extracting numbers from an email address prior to the @ sign

Issue

I am hoping someone can help me (I am a complete newbie to regex).

I am trying to extract only the email addresses that contain digits before the @ sign.

For example:

[email protected] would not be selected

[email protected] would be selected

The other problem I have is that each email address is followed by a phone number in a ten-digit format.

Example

[email protected] 5555511111

[email protected] 5555511112

Everything I have tried either selects the phone number too or selects nothing at all.

Thanks in advance, and apologies if there is not enough information. I will try to answer any questions people have.

Things I have tried so far:

grep -E [0-9][@] data.csv

grep -E [0-9] data.csv

grep -E \d+(?=@) data.csv (this does not complete due to a syntax error)

As I say, I'm a complete novice at this and still trying to get my head around regex.


Solution

You can use

grep -Eo '^[^@[:space:]0-9]*[0-9][^@[:space:]]*@[^[:space:]]+' file

Details

  • ^ - start of the line (grep matches each input line separately)
  • [^@[:space:]0-9]* - zero or more chars other than a @, whitespace and a digit
  • [0-9] - a digit
  • [^@[:space:]]* - zero or more chars other than a @ and whitespace (digits are allowed here)
  • @ - a @ char
  • [^[:space:]]+ - one or more chars other than whitespace.
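
To see how the pieces above fit together, here is a minimal sketch run against made-up data. The addresses and phone numbers below are hypothetical, invented only for illustration, and the input is assumed to hold one address per line, followed by a space and the phone number. The -o option is what keeps the phone number out of the output: it prints only the matched text, and the match stops at the first whitespace.

```shell
# Hypothetical sample data: these addresses and numbers are invented for illustration.
sample='john@example.com 5555511111
john123@example.com 5555511112
4you@example.com 6453453344'

# -E enables extended regex syntax; -o prints only the matched part of each line,
# so the phone number after the space is never included.
matches=$(printf '%s\n' "$sample" |
  grep -Eo '^[^@[:space:]0-9]*[0-9][^@[:space:]]*@[^[:space:]]+')

printf '%s\n' "$matches"
```

With this sample, only the two addresses that have a digit before the @ should be printed; the first line has no digit in its local part, so it is skipped.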

See an online demo:

s='[email protected] 5555511111
[email protected] 5555511112
[email protected] 6453453344'
grep -Eo '^[^@[:space:]0-9]*[0-9][^@[:space:]]*@[^[:space:]]+' <<< "$s"
# => [email protected]
#    [email protected]
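
As an alternative sketch (a different technique from the grep command above, not part of the original answer), awk can test only the first whitespace-separated field and print it. This assumes the address is always the first field on the line and is separated from the phone number by whitespace. The sample data is again hypothetical.

```shell
# Hypothetical sample data, invented for illustration.
sample='john@example.com 5555511111
john123@example.com 5555511112
4you@example.com 6453453344'

# $1 is the first whitespace-separated field (assumed to be the address).
# The regex requires at least one digit somewhere before the first @.
awk_matches=$(printf '%s\n' "$sample" |
  awk '$1 ~ /^[^@]*[0-9][^@]*@/ { print $1 }')

printf '%s\n' "$awk_matches"
```

Because awk splits each line into fields first, the phone number never takes part in the match, so the pattern itself does not need to exclude whitespace.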


Answered By - Wiktor Stribiżew
Answer Checked By - Gilberto Lyons (WPSolving Admin)