Issue
i need your help in unix/awk to order the data in csv file (comma separated file) to certain order given in another file. my knowledge is limited on this.
PRODUCTCODE SITE BATCHID LV1P_DESCRIPTION
PRD-0989 Static BATCH01 Estract
PRD-0990 Static BATCH02 Xstract
PRD-0991 Static BATCH03 8stract
PRD-0992 Static BATCH04 errs
PRD-0993 Static BATCH05 dgerrr
PRD-0994 Static BATCH06 Narros
PRD-0995 Static BATCH07 sef3r
and if pass input column order as [This could be stored in another file]
PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE
then my output file should come as below
PRODUCTCODE LV1P_DESCRIPTION BATCHID SITE
PRD-0989 Estract BATCH01 Static
PRD-0990 Xstract BATCH02 Static
PRD-0991 8stract BATCH03 Static
PRD-0992 errs BATCH04 Static
PRD-0993 dgerrr BATCH05 Static
PRD-0994 Narros BATCH06 Static
PRD-0995 sef3r BATCH07 Static
Solution
If:
- your CSV is simple (no quoted fields containing commas, one line per record...),
- and each record has exactly the same number of fields,
- and you specify the desired output order with exactly the same number of fields,
you can try (with any POSIX awk
, tested with GNU awk
and the awk
that comes with macOS, output prefixed with -|
):
awk -F, -v OFS=, -v o='PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE' '
BEGIN {n=split(o,h)}
NR==1 {print o; for(i=1;i<=n;i++) a[$i]=i; next}
{for(i=1;i<=n;i++) printf("%s%s", $(a[h[i]]), i==n?"\n":OFS)}
' foo.csv
-| PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE
-| PRD-0989,Estract,BATCH01,Static
-| PRD-0990,Xstract,BATCH02,Static
-| PRD-0991,8stract,BATCH03,Static
-| PRD-0992,errs,BATCH04,Static
-| PRD-0993,dgerrr,BATCH05,Static
-| PRD-0994,Narros,BATCH06,Static
-| PRD-0995,sef3r,BATCH07,Static
- Set comma as input and output field separators (
-F, -v OFS=,
). - Pass desired output order as variable
o
(-v o='...'
). - In
BEGIN
block split variableo
on input fields separator, store the result in arrayh
and the number of columns in variablen
. - For the first line store in array
a
the fields as keys and the field numbers as values (a[$i]=i
). Go to next line (next
). - For the other lines print the
n
first fields in desired order ($(a[h[i]])
), followed by the output field separator except the last that is followed by a newline.
If the desired output order is stored in the first line of another file, say bar.csv
:
awk -F, -v OFS=, '
NR==1 {o=$0; n=split(o,h); next}
NR!= FNR && FNR==1 {print o; for(i=1;i<=n;i++) a[$i]=i; next}
{for(i=1;i<=n;i++) printf("%s%s", $(a[h[i]]), i==n?"\n":OFS)}
' bar.csv foo.csv
NR==1
is true only for the first line of the first file.NR!= FNR && FNR==1
is true only for the first line of the second file.
Answered By - Renaud Pacalet Answer Checked By - Timothy Miller (WPSolving Admin)