Issue
I dynamically iterate through a csv file and select columns that fit the criteria I need. My CSV is separated by commas. I save these indexes to an array that looks like
echo "${cols_needed[@]}"
1 3 4 7 8
I then need to write these columns to a new file and I've tried the following cut and awk commands, however, as the array is dynamically created, I cant seem to find the right commands that can select them all at once. I have tried cut, awk and paste commands.
awk -v fields=${cols_needed[@]} 'BEGIN{ n = split(fields,f) }
{ for (i=1; i<=n; ++i) printf "%s%s", $f[i], (i<n?OFS:ORS) }' test.csv
This throws an error as it cannot split the fields unless I hard code them (even then, it can only do 2), split on spaces.
fields="1 2’
I have tried to dynamically create -f parameters, but can only do so with one variable in a loop like so
for item in "${cols_needed[@]}";
do
cat test.csv | cut -f$item
done
which outputs one column at a time.
And I have tried to dynamically create it with commas - input as 1,3,4,7...
cat test.csv | cut -f${cols_needed[@]};
which also does not work!
Any help is appreciated! I understand awk does not work like bash and we cannot pass variables around in the same way. I feel like I'm going around in circles a bit! Thanks in advance.
Solution
Your first approach is ok, just:
- change
-v fields=${cols_needed[@]}
to-v fields="${cols_needed[*]}"
, to pass the array as a single shell word - add
FS=OFS=","
to BEGIN, after splitting (you want to split on spaces, before FS is changed to,
) - ie.
BEGIN {n = split(fields, f); FS=OFS=","}
Also, if there are no commas embedded in quoted csv fields, you can use cut
:
IFS=,; cut -d, -f "${cols_needed[*]}" test.csv
If there are embedded commas, you can use gawk
's FPAT
, to only split fields on unquoted commas.
Here's an example using that.
# prepend $ to each number
for i in "${cols_needed[@]}"; do
fields[j++]="\$$i"
done
IFS=,
gawk -v FPAT='([^,]+)|(\"[^\"]+\")' -v OFS=, "{print ${fields[*]}}"
Injecting shell code in to an awk command is generally not great practice, but it's ok here IMO.
Answered By - dan