Issue
I'm trying to compare three files with some common and unique values in them, I want to club all the three files in one file. I want to find the common values between all three files, second, all the values which are atleast present in any two file third, the values which are unique to one file versus other two. I wrote a code for this but it is only giving me the common ones only.
I only want to make compare the "name" column between the file (Not all colum), however want to print all three columns from all three files
$ cat file_1
id,name,value
1,a,20
2,b,34
3,c,5
$ cat file_2
id,name,value
1,a,27
2,b,55
7,d,15
9,z,100
$ cat file_3
id,name,value
1,a,77
2,b,95
11,d,83
6,y,109
Expected output:
id,name,value,id2,name2,value2,id3,name3,value3
1,a,20,1,a,27,1,a,77
2,b,34,2,b,55,2,b,95
3,c,5,NA,NA,NA,NA,NA,NA
NA,NA,NA,7,d,15,11,d,83
NA,NA,NA,9,z,100,NA,NA,NA
NA,NA,NA,NA,NA,NA,6,y,109
I tried this code but is is only giving me the comparision between two file. and not giving me the unique values
#!/usr/bin/awk -f
BEGIN {
FS = OFS = ",";
print "id,name,value,id2,name2,value2,id3,name3,value3";
}
function print_na_row(name) {
print name, "NA", "NA", "NA", "NA", "NA", "NA";
}
# Read file1.csv and store the data in an array
NR == FNR && FNR > 1 {
id = $1;
name = $2;
value = $3;
file1_data[name] = id "," value;
next;
}
# Read file2.csv and file3.csv and merge with data from file1.csv
FNR > 1 {
id = $1;
name = $2;
value = $3;
current_row = id "," value;
if (name in file1_data) {
print name, file1_data[name], current_row;
delete file1_data[name]; # Remove the matched entry to handle unique values in file1.csv
} else {
print_na_row(name);
}
}
END {
for (id in file1_data) {
print_na_row(name);
}
}
Please help me with this.
Solution
One awk
idea:
awk '
BEGIN { FS = OFS = "," }
FNR==1 { fcnt++; next }
{ names[$2]; lines[$2 FS fcnt] = $0 }
END { print "id,name,value,id2,name2,value2,id3,name3,value3"
for (name in names) {
out = sep = ""
for (i=1; i<=fcnt; i++) {
out = out sep (lines[name FS i] ? lines[name FS i] : "NA,NA,NA")
sep = OFS
}
print out
}
}
' file_1 file_2 file_3
NOTES:
- per OP's comment we don't have to worry about ordering otherwise ...
- OP would need to specify the sorting criteria
This generates:
id,name,value,id2,name2,value2,id3,name3,value3
NA,NA,NA,NA,NA,NA,6,y,109
NA,NA,NA,9,z,100,NA,NA,NA
1,a,20,1,a,27,1,a,77
2,b,34,2,b,55,2,b,95
3,c,5,NA,NA,NA,NA,NA,NA
NA,NA,NA,7,d,15,11,d,83
Answered By - markp-fuso Answer Checked By - Timothy Miller (WPSolving Admin)