Friday, January 19, 2024

[SOLVED] Rearranging column/values in csv file based on input column order

January 19, 2024 awk, bash, shell

Issue

i need your help in unix/awk to order the data in csv file (comma separated file) to certain order given in another file. my knowledge is limited on this.

example: File to rearrange.

PRODUCTCODE SITE    BATCHID LV1P_DESCRIPTION
PRD-0989    Static  BATCH01 Estract 
PRD-0990    Static  BATCH02 Xstract 
PRD-0991    Static  BATCH03 8stract 
PRD-0992    Static  BATCH04 errs
PRD-0993    Static  BATCH05 dgerrr
PRD-0994    Static  BATCH06 Narros
PRD-0995    Static  BATCH07 sef3r

and if pass input column order as [This could be stored in another file]

PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE

then my output file should come as below

PRODUCTCODE LV1P_DESCRIPTION    BATCHID SITE
PRD-0989    Estract     BATCH01 Static
PRD-0990    Xstract     BATCH02 Static
PRD-0991    8stract     BATCH03 Static
PRD-0992    errs    BATCH04 Static
PRD-0993    dgerrr  BATCH05 Static
PRD-0994    Narros  BATCH06 Static
PRD-0995    sef3r   BATCH07 Static

Solution

If:

your CSV is simple (no quoted fields containing commas, one line per record...),
and each record has exactly the same number of fields,
and you specify the desired output order with exactly the same number of fields,

you can try (with any POSIX awk, tested with GNU awk and the awk that comes with macOS, output prefixed with -| ):

awk -F, -v OFS=, -v o='PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE' '
  BEGIN {n=split(o,h)}
  NR==1 {print o; for(i=1;i<=n;i++) a[$i]=i; next}
  {for(i=1;i<=n;i++) printf("%s%s", $(a[h[i]]), i==n?"\n":OFS)}
' foo.csv
-| PRODUCTCODE,LV1P_DESCRIPTION,BATCHID,SITE
-| PRD-0989,Estract,BATCH01,Static
-| PRD-0990,Xstract,BATCH02,Static
-| PRD-0991,8stract,BATCH03,Static
-| PRD-0992,errs,BATCH04,Static
-| PRD-0993,dgerrr,BATCH05,Static
-| PRD-0994,Narros,BATCH06,Static
-| PRD-0995,sef3r,BATCH07,Static

Set comma as input and output field separators (-F, -v OFS=,).
Pass desired output order as variable o (-v o='...').
In BEGIN block split variable o on input fields separator, store the result in array h and the number of columns in variable n.
For the first line store in array a the fields as keys and the field numbers as values (a[$i]=i). Go to next line (next).
For the other lines print the n first fields in desired order ($(a[h[i]])), followed by the output field separator except the last that is followed by a newline.

If the desired output order is stored in the first line of another file, say bar.csv:

awk -F, -v OFS=, '
  NR==1 {o=$0; n=split(o,h); next}
  NR!= FNR && FNR==1 {print o; for(i=1;i<=n;i++) a[$i]=i; next}
  {for(i=1;i<=n;i++) printf("%s%s", $(a[h[i]]), i==n?"\n":OFS)}
' bar.csv foo.csv

NR==1 is true only for the first line of the first file.
NR!= FNR && FNR==1 is true only for the first line of the second file.

Answered By - Renaud Pacalet

Answer Checked By - Timothy Miller (WPSolving Admin)

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Friday, January 19, 2024

[SOLVED] Rearranging column/values in csv file based on input column order

Issue

Solution

Popular Posts

Labels