Issue
I am on Ubuntu, in a Bash shell, trying to use grep to find all occurrences of the substring engineBreakdown() inside a log file with the .tra extension, say my_log_16.tra, and save the results to a file, say results_16.txt. So I run
cat /path/to/my_log_16.tra | grep "engineBreakdown()" > results_16.txt
and when I run less results_16.txt I see that some lines containing the substring were saved, but not all the lines I expected. In fact, when I search my_log_16.tra manually for engineBreakdown(), I find other lines containing the substring that were not saved into results_16.txt. So it seems that my command only saves the first occurrences of the substring.
I think grep may be failing because my_log_16.tra is a very large file (about 100 MB). If this is the cause, is there a more reliable way to find all occurrences of a substring in a very big file?
version and alias of grep
$ grep --version
grep (GNU grep) 2.25
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see <http://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
$ type -a grep
grep is aliased to `grep --color=auto'
grep is /bin/grep
Example of lines from my_log_16.tra
lines correctly detected and saved into results_16.txt
[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.846 Rservice:75] engineBreakdown()
[I 2022-10-16 07:26:35.848 Rservice:75] engineBreakdown()
a piece of the file where the substring appears, but it is not saved into results_16.txt
[I 2022-10-16 11:32:48.039 web:2064] 200 GET /static/ui-src/default/img/Customer.png?v=0.9702853857687699 (127.0.0.1) 10.49ms
[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:50.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.123 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-16 11:32:55.128 Rservice:75] engineBreakdown()
[I 2022-10-16 11:32:55.134 Rservice:75] engineBreakdown()
another piece of the file where the substring appears, but it is not saved into results_16.txt
[I 2022-10-17 04:00:35.127 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:35.138 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:39.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:39.220 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:39.228 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.233 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.237 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:39.243 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:40.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:40.128 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:40.133 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:44.206 websocketclient:62] Connection : url::ws://127.0.0.1:9999/request
[I 2022-10-17 04:00:44.221 websocketclient:62] Connection : url::ws://127.0.0.1:9999/auxiliary
[I 2022-10-17 04:00:44.227 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.232 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.234 channels:75] _on_connection_error, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:44.237 channels:82] _on_connection_close, host=127.0.0.1, port=9999
[I 2022-10-17 04:00:45.122 websocketclient:62] Connection : url::ws://localhost:3333/ws
[I 2022-10-17 04:00:45.126 Rservice:75] engineBreakdown()
[I 2022-10-17 04:00:45.128 Rservice:75] engineBreakdown()
update 1
I also tried
grep "engineBreakdown()" /path/to/my_log_16.tra > results_16.txt
but the result is the same.
update 2
It was suggested that double quotes might not handle the parentheses properly, so I removed the parentheses from the search string and also tried single quotes instead of double quotes:
grep "engineBreakdown" /path/to/my_log_16.tra > results_16.txt
grep 'engineBreakdown' /path/to/my_log_16.tra > results_16.txt
but the result is the same.
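A note on the quoting question: in grep's default (basic) regular expression syntax, unescaped parentheses are ordinary characters, and either kind of quote keeps the shell from interpreting them, so quoting is unlikely to explain the missing lines. To rule out pattern interpretation entirely, the -F option makes grep treat the pattern as a fixed string. The sketch below uses a small made-up sample file; the variable names sample and fixed_count are only illustrative.

```shell
# Hypothetical sample data standing in for my_log_16.tra
sample=$(mktemp)
printf '%s\n' \
  '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()' \
  '[I 2022-10-16 11:32:50.122 websocketclient:62] Connection : url::ws://localhost:3333/ws' \
  '[I 2022-10-16 11:32:50.125 Rservice:75] engineBreakdown()' > "$sample"

# -F: treat the pattern as a fixed string, so ( and ) are matched literally
grep -F 'engineBreakdown()' "$sample"

# -c: count matching lines, useful for comparing against a manual search
fixed_count=$(grep -c -F 'engineBreakdown()' "$sample")
echo "matching lines: $fixed_count"
```

Comparing the count from -c with the number of hits found manually is a quick way to confirm whether lines are really being dropped.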
Solution
You can check whether this awk approach helps.
Data
$ cat file
engineBreakdown()
engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
engineBreakdown()
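$ awk -v var="engineBreakdown()" '
# var is used as a regular expression; process only lines that match it
$0~var{
  printf "%s", NR          # print the line number without a newline
  for(i=1;i<=NF;i++){      # count the whitespace-separated fields that match
    if($i~var){x++}
  }
  print " # matches: " x
  x=0                      # reset the counter for the next line
}' file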
1 # matches: 1
2 # matches: 4
3 # matches: 1
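The program above counts whitespace-separated fields that match, and it uses var as a regular expression, so the parentheses act as regex metacharacters rather than literal text. If occurrences may sit next to each other without spaces, or if a strictly literal match is wanted, a variant based on awk's index() and substr() functions could look like the following sketch. The sample input and the names sample, s, rest and occurrences are made up for illustration.

```shell
# Hypothetical sample input; the second line has two occurrences with no space between them
sample=$(mktemp)
printf '%s\n' \
  'engineBreakdown()' \
  'engineBreakdown()engineBreakdown() other text' \
  'no match here' > "$sample"

occurrences=$(awk -v s='engineBreakdown()' '
{
    n = 0
    rest = $0
    # index() does a literal substring search, so ( and ) need no escaping
    while ((p = index(rest, s)) > 0) {
        n++
        rest = substr(rest, p + length(s))   # continue after this occurrence
    }
    if (n > 0) print NR " # matches: " n
}' "$sample")
echo "$occurrences"
```

Here each line is scanned repeatedly from the end of the previous hit, so the count reflects occurrences rather than matching fields.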
To just print the matching lines (like grep), without counting the matches per line, use:
$ awk -v var="engineBreakdown()" '$0~var{ print }' file
engineBreakdown()
engineBreakdown() engineBreakdown() engineBreakdown() engineBreakdown()
engineBreakdown()
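The question does not establish why grep stopped early. One explanation that would fit the symptoms, offered here purely as an assumption, is that the log contains non-text bytes such as NUL. In that case GNU grep treats the input as binary and can suppress matching lines in favor of a "Binary file ... matches" message; the -a (--text) option tells it to process the file as text. The sketch below builds a small made-up file to illustrate this; demo and text_count are hypothetical names.

```shell
# Hypothetical demo file: a text line, a line holding a NUL byte, another text line
demo=$(mktemp)
{
  printf '%s\n' '[I 2022-10-16 07:26:35.449 Rservice:75] engineBreakdown()'
  printf 'garbage\000bytes\n'
  printf '%s\n' '[I 2022-10-16 11:32:49.778 Rservice:75] engineBreakdown()'
} > "$demo"

# Default mode: GNU grep may report a binary-file match instead of listing every line
grep 'engineBreakdown' "$demo"

# -a (--text): process the file as text, so matches after the NUL byte are printed too
grep -a 'engineBreakdown' "$demo"
text_count=$(grep -a 'engineBreakdown' "$demo" | wc -l)
echo "lines found with -a: $text_count"
```

If the real results file ends with a "Binary file" message, or grep prints one on standard error, that would support this assumption; otherwise the cause lies elsewhere.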
Answered By - Andre Wildberg
Answer Checked By - Willingham (WPSolving Volunteer)