Issue
How to merge two files based on column as a key, match to $1 column from a.txt and append b.txt based on $3 column
a.txt
aa; 2.5; 0.001;
ab; 1.5; 0.003;
ac; 0.4; 0.002;
b.txt
20-Nov-2014; 1775.00; aa;
20-Nov-2014; 1775.00; aa;
20-Nov-2014; 1463.40; ab;
20-Nov-2014; 1463.40; ac;
20-Nov-2014; 1463.40; ab;
Desired output look like this
20-Nov-2014; 1775.00; aa; aa; 2.5; 0.001;
20-Nov-2014; 1775.00; aa; aa; 2.5; 0.001;
20-Nov-2014; 1463.40; ab; ab; 1.5; 0.003;
20-Nov-2014; 1463.40; ac; ac; 0.4; 0.002;
20-Nov-2014; 1463.40; ab; ab; 1.5; 0.003;
Solution
$ awk -F';' 'FNR==NR{a[$1]=$0;next;} {print $0" " a[substr($3,2)];}' a.txt b.txt
20-Nov-2014; 1775.00; aa; aa; 2.5; 0.001;
20-Nov-2014; 1775.00; aa; aa; 2.5; 0.001;
20-Nov-2014; 1463.40; ab; ab; 1.5; 0.003;
20-Nov-2014; 1463.40; ac; ac; 0.4; 0.002;
20-Nov-2014; 1463.40; ab; ab; 1.5; 0.003;
How it works
awk
implicitly loops over every line in the files. Each line is divided into fields.
-F';'
This tells
awk
to use the semicolon as the field separator.FNR==NR{a[$1]=$0;next;}
NR is the number of lines that have been read in so far and FNR is the number of lines that have been read in so far from the current file. Consequently, when
FNR==NR
, we are still reading the first file,a.txt
. In that case, this sets assigns the whole line that was just read in,$0
, to arraya
under the key$1
, the third field.next
tellsawk
to skip the rest of the commands below and jump to the next line and start over.print $0" " a[substr($3,2)]
If we get here, that means we are working on the second file
b.txt
. In that case, print each line of this file followed by the line from arraya
with the key matching the third field.In file
b.txt
, the third field starts with a space. When looking up this field in arraya
, that space is removed with thesubstr
function.
Answered By - John1024 Answer Checked By - Timothy Miller (WPSolving Admin)