Issue
I'm working on a Bash script to parse Postgres error logs and pull out log entries between certain dates/times. The complicating factor is that entries can be multiline, and only the first line includes the time stamp. Log entries look like this:
YYYY-MM-DD hh:mm:ss UTC:<IP address>(<port>):<user>@<host>:[pid]:<msgtype>:<message>
where <message>
can be 1 or more lines. <msgtype>
is ERROR
, STATEMENT
, DETAIL
, etc. Almost any <msgtype>
can have any number of lines.
Awk will be processing multiple files on the command line, and while the lines in the file are in timestamp order, the files aren't necessarily so. Currently I'm just doing a simple compare (awk '{if ($1 >= "$first") { print } }'
)* where $first
is set to the beginning time stamp. Adding a check for a $last
is trivial, the problem is getting those lines that don't start with a timestamp, and only those following a matching one.
Can someone point me in the right direction for this?
*It just accorred to me that this will only compare the date and not the time, so can someone help with this part as well? Can I do awk '{if ( ($1" "$2) >= "$first") { print } }'
?
ETA: sample log entry:
2023-11-07 07:01:25 UTC::@:[605]:ERROR: could not connect to the publisher: connection to server at "<ip addr>", port 5432 failed: FATAL: pg_hba.conf rejects replication connection for host "<ip addr>", user "<userid>", SSL on
connection to server at "<ip addr>", port 5432 failed: FATAL: pg_hba.conf rejects replication connection for host "<ip addr>", user "<userid>", SSL off
Solution
It's possible that this might be what you're trying to do but without useful sample input/output that demonstrates all your requirements, it's a guess based on multiple assumption:
$ awk -v beg='2023-11-07 05:00:00' -v end='2024-12-01 07:00:00' '
match($0,/^[0-9]{4}([-: ][0-9]{2}){5}/) { cur = substr($0,RSTART,RLENGTH) }
(beg <= cur) && (cur <= end)
' file
2023-11-07 07:01:25 UTC::@:[605]:ERROR: could not connect to the publisher: connection to server at "<ip addr>", port 5432 failed: FATAL: pg_hba.conf rejects replication connection for host "<ip addr>", user "<userid>", SSL on
connection to server at "<ip addr>", port 5432 failed: FATAL: pg_hba.conf rejects replication connection for host "<ip addr>", user "<userid>", SSL off
You may or may not want to tighten the date+time regexp I'm using depending on what else can exist at the start of a line in your log file.
Answered By - Ed Morton Answer Checked By - Senaida (WPSolving Volunteer)