Thursday, October 28, 2021

[SOLVED] extract all email addresses from all csv files in working directory using Linux

Issue

I am trying to grep all of the email addresses from all CSV files in the working directory and write them to a newline-delimited text file. I tried:

egrep -o '.*@.*' *.csv > alltheemails.txt

But this seems to capture the entire line.

Then, I tried:

egrep -o ',.*@.*,' csv/*.csv > alltheemails.txt

I was attempting to copy only the email address and perhaps the , delimiter, which can change later. This also copied the entire line.

Then, I tried:

egrep -o ',.*@.*,' csv/*.csv | sed -e 's/^,...@//g' | tee alltheemails.txt

This still captured everything in front of the email. I tried:

egrep -o ',.*@.*,' csv/*.csv | sed -e 's/*^,.*@//g' | tee alltheemails.txt

And many other variations, including:

sed -e 's/.*^[[a-zA-Z0-9]*\.\_\-\+\*@[[a-zA-Z0-9]-\.]*\.[a-zA-Z0-9]{3}$]/.*^[[a-zA-Z0-9]*\.\_\-\+\*@[[a-zA-Z0-9]-\.]*\.[a-zA-Z0-9]{3}$/g' csv/*.csv | egrep -eo | tee alltheemails.txt

This produced:

firstname,surname,lead,ip,address,city,state,postal,phone,date,range,daytime,interest,sex,dob,worktime,profit_estim,extra2

Please help me. Thank you!


Solution

Here is a Perl solution that reads all .csv files in the current directory.
The email address can be in any field.

perl -lne 'print $1 if /([^,@"]+@[^,@"]+)/' *.csv > alltheemails.txt

This prints the capture group $1 for every line that matches
the regular expression /([^,@"]+@[^,@"]+)/ (only the first match on each line is printed).
[^,@"]+ = one or more occurrences of any character except , @ and "

input:

name,surname"[email protected],address
name,surname,nomail,address2
nam,test,[email protected]"new york, central park
al,ternative,[email protected],paris
alternative,[email protected],paris

output:

[email protected]
[email protected]
[email protected]
[email protected]
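
If you would rather stay with grep: the attempts in the question match the whole line because .* is greedy and also matches commas. Swapping in the same character class as above should avoid that. The following is only a sketch under assumptions: sample.csv and the example.com-style addresses are made up for illustration, and -o and -h are assumed to be available (they are in GNU and BSD grep). Unlike the Perl one-liner, grep -o prints every match on a line.

```shell
# Hypothetical sample data; the file name and addresses are invented for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'name,surname,alice@example.com,address' \
  'name,surname,nomail,address2' \
  'al,bob@example.org,carol@example.net,paris' > "$tmpdir/sample.csv"

# -o prints only the matched text, -h omits the file-name prefix,
# -E enables extended regular expressions (the current spelling of egrep).
emails=$(grep -ohE '[^,@"]+@[^,@"]+' "$tmpdir"/*.csv)
printf '%s\n' "$emails"

# Remove the temporary sample data again.
rm -rf "$tmpdir"
```

Against real files, the same pattern would be run as grep -ohE '[^,@"]+@[^,@"]+' *.csv > alltheemails.txt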

If you prefer awk (this needs GNU awk, because the third argument to match() is a gawk extension):

awk '{if (match($0, /[^,@"]+@[^,@"]+/, m)) print m[0]}' *.csv > alltheemails.txt
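
With an awk that lacks the array argument to match(), a possible variant relies on RSTART and RLENGTH, which match() sets in any POSIX awk. This is a minimal sketch, not a tested drop-in: sample.csv and the addresses are again made up for illustration. Like the Perl one-liner, it prints only the first match on each line.

```shell
# Hypothetical sample data; the file name and addresses are invented for this sketch.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'name,surname,alice@example.com,address' \
  'name,surname,nomail,address2' \
  'al,bob@example.org,carol@example.net,paris' > "$tmpdir/sample.csv"

# match() returns non-zero on success and records where the match starts (RSTART)
# and how long it is (RLENGTH); substr() then cuts that piece out of the line.
first_emails=$(awk 'match($0, /[^,@"]+@[^,@"]+/) { print substr($0, RSTART, RLENGTH) }' "$tmpdir"/*.csv)
printf '%s\n' "$first_emails"

# Remove the temporary sample data again.
rm -rf "$tmpdir"
```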



Answered By - Chris Koknat