Sunday, February 27, 2022

[SOLVED] How can I append new lines in a CSV file and modify them in Unix

Issue

I am new to Unix in general and starting to learn shell scripting. I am working with a CSV file with the below sample rows (it's a large CSV file with 4 entries for each item):

Table 1
Item ID   Time                  Available   Location 
0001      02/02/2021  08:00     Y           NJ
0001      02/02/2021  09:00     N           UT
0001      02/02/2021  10:00     Y           AZ
0001      02/02/2021  11:00     Y           CA
0002      02/02/2021  08:00     Y           NJ
0002      02/02/2021  09:00     N           UT
0002      02/02/2021  10:00     Y           AZ
0002      02/02/2021  11:00     Y           CA      

I have another CSV with a bunch of item IDs as follows:

Table 2
Item ID   Item_Name   Item_Aux_ID    Item_Aux_name
1001      IT_1        3323           IT_Aux_1
1002      IT_2        3325           IT_Aux_2
1003      IT_3        3328           IT_Aux_3
1010      IT_4        3333           IT_Aux_4

I would like to create new entries in the first CSV file (one entry for each Item in the second CSV file). Each new entry should be the same as the first row of the Table1 with the Item ID replaced appropriately. The expected output would be:

Table 1
Item ID   Time                  Available   Location 
0001      02/02/2021  08:00     Y           NJ
0001      02/02/2021  09:00     N           UT
0001      02/02/2021  10:00     Y           AZ
0001      02/02/2021  11:00     Y           CA
0002      02/02/2021  08:00     Y           NJ
0002      02/02/2021  09:00     N           UT
0002      02/02/2021  10:00     Y           AZ
0002      02/02/2021  11:00     Y           CA  
1001      02/02/2021  08:00     Y           NJ
1002      02/02/2021  08:00     Y           NJ
1003      02/02/2021  08:00     Y           NJ
1010      02/02/2021  08:00     Y           NJ 

How do I write a script to achieve the above in Unix? Thanks in advance.


Solution

One awk idea:

awk '
NR==3 { # 1st file: skip 1st two lines (the header rows) then ...

        copyline=$0            # make a copy of the 3rd line and ...
        nextfile               # skip to the next file
      }

FNR>2 { # 2nd file: skip 1st two lines (the header rows) and ...
        # replace the 1st field of variable "copyline" with 1st field of current input line and ...
        # print the modified "copyline" to stdout
                                
        print gensub(/^[^[:space:]]*/,$1,1,copyline)
      }
' file1.csv file2.csv

Comments removed:

awk '
NR==3 { copyline=$0; nextfile }
FNR>2 { print gensub(/^[^[:space:]]*/,$1,1,copyline) }
' file1.csv file2.csv

Collapsed further into a one-liner:

awk 'NR==3{copyline=$0;nextfile}FNR>2{print gensub(/^[^[:space:]]*/,$1,1,copyline)}' file1.csv file2.csv

This generates:

1001      02/02/2021  08:00     Y           NJ
1002      02/02/2021  08:00     Y           NJ
1003      02/02/2021  08:00     Y           NJ
1010      02/02/2021  08:00     Y           NJ

Once OP is satisfied with the output, and assuming the desire is to append the output to the first file, then ...

# change this:

' file1.csv file2.csv

# to this:

' file1.csv file2.csv >> file1.csv


Answered By - markp-fuso
Answer Checked By - Clifford M. (WPSolving Volunteer)