Friday, October 28, 2022

[SOLVED] Check if file contains same text in consecutive lines

Issue

I want to check, using bash, whether a log file has any instance where two or more consecutive lines contain the same text. The text will be specified. The timestamp and any other text after the third field should be ignored in the comparison.

e.g. grep ... "error" /tmp/file.txt

This file should match:

2020-01-01 05:05 text1
2020-01-01 05:07 error
2020-01-01 05:15 error
2020-01-01 05:25 error
2020-01-01 05:45 text2

This one should not:

2020-01-01 05:05 text1
2020-01-01 05:15 error
2020-01-01 05:25 text2
2020-01-01 05:45 error
2020-01-01 05:05 text3

Any ideas using grep, sed or awk? Ideally I'd like an exit value of 0 for a match and 1 for no match.


Solution

Looks like uniq does everything you need.

-d, --repeated
only print duplicate lines, one for each group

-s, --skip-chars=N
avoid comparing the first N characters

So this should work for you (17 is the length of the timestamp prefix, including the space that follows it):

uniq --skip-chars=17 -d /tmp/file.txt
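Note that uniq only prints the repeated lines; it exits 0 whether or not it found any, and it does not look for one particular text. To get the exit value asked for in the question, the output can be piped through grep -q. The following is a minimal sketch, not part of the original answer: the function name has_repeat is made up for illustration, and it assumes every line starts with the same fixed-width 17-character timestamp prefix and has no trailing whitespace.

```shell
#!/usr/bin/env bash
# has_repeat TEXT FILE   (hypothetical helper name)
# Exit 0 if FILE has two or more consecutive lines whose content after the
# first 17 characters is exactly TEXT; exit 1 otherwise.
has_repeat() {
    uniq -s 17 -d "$2" |   # one line per run of repeats (-s is the short form of --skip-chars)
        cut -c18- |        # drop the 17-character timestamp prefix
        grep -qxF -- "$1"  # -x whole line, -F literal text, -q status only
}
```

It could then be used as: has_repeat error /tmp/file.txt && echo "match"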

Tested on my machine:

$ cat in.txt 
2020-01-01 05:05 text1
2020-01-01 05:07 error
2020-01-01 05:15 error
2020-01-01 05:25 error
2020-01-01 05:45 text2

$ uniq --skip-chars=17 -d in.txt 
2020-01-01 05:07 error
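Since the question also asks to ignore anything after the third field and mentions awk, here is an alternative sketch that compares only the third whitespace-separated field and sets the exit status directly. This is an illustration under those assumptions rather than the answer's method; the function name has_consecutive and the awk variable names are hypothetical.

```shell
#!/usr/bin/env bash
# has_consecutive TEXT FILE   (hypothetical helper name)
# Exit 0 if the third field equals TEXT on two or more consecutive lines
# of FILE; exit 1 otherwise.
has_consecutive() {
    awk -v want="$1" '
        # Count how many lines in a row have TEXT as field 3.
        # Appending "" forces a string comparison rather than a numeric one.
        { run = (($3 "") == want) ? run + 1 : 0 }
        run >= 2 { found = 1; exit }   # stop reading at the first repeat
        END { exit !found }            # 0 = match, 1 = no match
    ' "$2"
}
```

Usage would look like: has_consecutive error /tmp/file.txt; echo $?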


Answered By - Marek R
Answer Checked By - Gilberto Lyons (WPSolving Admin)