Issue
I have a problem similar to this one: How to split a single file into multiple text files in Unix. However, I would like the output files to be named sequentially rather than after the value in the column.
I have a very big tab-delimited text file:
A 2
A 4
B 4
B 6
A 1
C 5
I want to split this file into several text files in Unix based on the first column, but without naming the files after the column value, as the solution suggested in the link above does:
awk '{print $0 >> ($1 ".txt"); close($1 ".txt")}'
The output should be:
tmp1.txt
A 2
A 4
A 1
tmp2.txt
B 4
B 6
tmp3.txt
C 5
and so on for each value in the first column.
Solution
This writes the lines for each distinct value in the first column to their own sequentially named tmp file, numbered in the order the values first appear.
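awk 'BEGIN { i = 1 }
# Return the sequence number for col1, assigning the next one on first sight.
function seq(col1) {
    if (!(col1 in map))
        map[col1] = i++
    return map[col1]
}
{
    out = "tmp" seq($1) ".txt"
    print $0 >> out    # append, since the file is reopened for every line
    close(out)         # keep only one output file open at a time
}'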
After running with your sample input:
$ head tmp*
==> tmp1.txt <==
A 2
A 4
A 1
==> tmp2.txt <==
B 4
B 6
==> tmp3.txt <==
C 5
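Because the command above always appends with >>, running it a second time in the same directory adds the lines to the existing tmp files again. A minimal sketch of one way around that, assuming the data lives in a hypothetical file called input.tsv and really is tab-separated (hence -F'\t', which the original command does not use): truncate each output file the first time its key is seen, then append as before.

```shell
#!/bin/sh
# Recreate the sample data; input.tsv is a hypothetical name used for illustration.
printf 'A\t2\nA\t4\nB\t4\nB\t6\nA\t1\nC\t5\n' > input.tsv

awk -F'\t' '
!($1 in map) {                      # first time this key appears
    map[$1] = "tmp" (++n) ".txt"    # assign the next sequential file name
    printf "" > map[$1]             # create or truncate that file
    close(map[$1])
}
{
    print >> map[$1]                # append the current line
    close(map[$1])                  # keep only one output file open at a time
}' input.tsv
```

The map array now stores the file name itself, so the name is built once per key rather than once per line. Stale files from an earlier run with more distinct keys (say, an old tmp4.txt) are still left in place and would need to be removed separately.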
Answered By - Mark Plotnick
Answer Checked By - David Goodson (WPSolving Volunteer)