Issue
I have dates in a DD/MM/YYYY format, with the caveat that for days <10 dates are single digits.
name,date,time
DEF,1/02/2021,06:00
HIJ,31/01/2021,07:50
ABC,1/04/2021,05:50
I only want to sort by the date column. I find this challenging since the days part of the date value is variable in length.
Solution
There isn't an easy way to sort the date field ([d]d/mm/yyyy
) directly within a comma-separated field (sort
natively understands US mm/dd/yyyy
style dates but not this international alternative). However, picking out that date field and reordering the fields as yyyy-mm-dd
makes it much easier, and sort
can handle this as a standard alphanumeric sort. The process of temporarily pulling out the primary sort key is known as a Schwartzian transform.
Here you would sort the result, then throw away this first (temporary) field to regain the original unchanged data.
awk -F'[,/]' '{printf "%04d-%02d-%02d\t%s\n", $4,$3,$2, $0}' datafile | sort | cut -f2-
name,date,time
HIJ,31/01/2021,07:50
ABC,1/02/2021,05:50
DEF,1/02/2021,06:00
You can see the result of the intermediate process if you strip back either before or after the sort
command.
Answered By - roaima Answer Checked By - Gilberto Lyons (WPSolving Admin)