Saturday, April 9, 2022

[SOLVED] Grep total amount of specific elements based on date

Issue

Is there a way in Linux to filter multiple files containing a lot of data with a single command, without writing a script?

For this example, I want to know how many males appear per date. One complication is that a specific date (January 3rd) appears in two separate files:

file1

Jan  1 john male=yes
Jan  1 james male=yes
Jan  2 kate male=no 
Jan  3 jonathan male=yes

file2

Jan  3 alice male=no
Jan  4 john male=yes 
Jan  4 jonathan male=yes
Jan  4 alice male=no

I want the total number of males for each date across all files. If there are no males for a specific date, that date should produce no output.

Jan  1 2 
Jan  3 1
Jan  4 2

The only way I can think of is to count the males for one specific date at a time, but this does not scale: in real-world cases there could be many more files, and manually entering every date would be a waste of time. Any help would be appreciated, thank you!

localhost:~# cat file1 file2 | grep "male=yes" | grep "Jan  1" | wc -l
2

Solution

grep -h 'male=yes' file? | \
    cut -c-6 | \
    awk '{c[$0] += 1} END {for(i in c){printf "%6s %4d\n", i, c[i]}}'

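grep prints the lines containing male=yes (-h suppresses the file name prefix that grep would otherwise add when given several files), cut keeps only the first 6 characters of each line (the date), and awk counts how often each date occurs and prints every date with its count at the end.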

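Given your files, the output will be as follows (the order of the lines may differ, because awk does not guarantee the iteration order of for (i in c)):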

Jan  1    2
Jan  3    1
Jan  4    2
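
The same counting can also be done in awk alone, which may be easier to adapt. The following is only a sketch, not part of the original answer: it assumes the date always occupies the first 6 characters and that male=yes is the last field on the line. The temporary directory and the result variable are hypothetical names used for illustration, and the sample data is copied from the question.

```shell
# Recreate the sample files from the question in a throwaway directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'Jan  1 john male=yes' 'Jan  1 james male=yes' \
              'Jan  2 kate male=no'  'Jan  3 jonathan male=yes' > "$tmpdir/file1"
printf '%s\n' 'Jan  3 alice male=no' 'Jan  4 john male=yes' \
              'Jan  4 jonathan male=yes' 'Jan  4 alice male=no' > "$tmpdir/file2"

# For every line whose last field is exactly male=yes, increment a counter
# keyed by the first 6 characters (the date). Print all counters at the end.
# sort makes the line order stable, since awk's for-in order is unspecified.
result=$(awk '$NF == "male=yes" { c[substr($0, 1, 6)]++ }
              END { for (d in c) printf "%6s %4d\n", d, c[d] }' \
         "$tmpdir"/file? | sort)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Comparing the last field exactly, rather than searching for the substring, avoids counting a field that merely ends in male=yes. Note that sort orders the lines as plain text, so it keeps the days of one month in order here but would not sort different months chronologically.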


Answered By - dgw
Answer Checked By - Timothy Miller (WPSolving Admin)