Issue
I am trying to parse an Apache access.log and extract the user agent. A line looks like this:
54.183.192.175 - - [27/Nov/2015:16:52:37 +0000] "GET / HTTP/1.0" 200 329 "-" "Mozilla/5.0 (Windows NT 6.3; rv:36.0 Gecko/20100101 Firefox/36.0"
I went to the reg101 site and ended up with the expression .*".*".*".*"(.*)"
which on the site matches the user agent perfectly.
Then I tried to use that regex in a grep command, and it simply does not return anything.
I tried single quotes and escaping the double quotes, without success. Could someone point out how I should do it?
grep -o '.*".*".*".*"(.*)"' access.log -- no results at all
grep -o .*\".*\".*\".*\"(.*)\" access.log -- error: bash: syntax error near unexpected token `('
Solution
To extract the string in the last pair of double quotes, awk would be the simplest solution:
awk -F '"' '{print $(NF-1)}' httpd.log
Mozilla/5.0 (Windows NT 6.3; rv:36.0 Gecko/20100101 Firefox/36.0
How it works:
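- By using -F '"' we use " as the field separator.
- $(NF-1) gets the second-to-last field. Each line ends with ", so the last field is empty and the user agent sits in the field just before it.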
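The field splitting can be checked on the sample line from the question without touching a log file. This is only a small sketch: the variable names line, nf and ua are made up for illustration, and it assumes every log line ends with the closing quote of the user agent, as the sample does.

```shell
# Sample line copied from the question (variable names here are illustrative).
line='54.183.192.175 - - [27/Nov/2015:16:52:37 +0000] "GET / HTTP/1.0" 200 329 "-" "Mozilla/5.0 (Windows NT 6.3; rv:36.0 Gecko/20100101 Firefox/36.0"'

# Number of fields awk sees when splitting on the double quote.
nf=$(printf '%s\n' "$line" | awk -F '"' '{print NF}')

# The line ends with a quote, so the last field is empty and
# the user agent is the field before it, $(NF-1).
ua=$(printf '%s\n' "$line" | awk -F '"' '{print $(NF-1)}')

printf 'fields: %s\nuser agent: %s\n' "$nf" "$ua"
```

The sample line contains six double quotes, so awk reports seven fields, the seventh being the empty string after the final quote. If a log line did not end with a quote, $(NF-1) would point at a different field, so it may be worth spot-checking a few real lines first.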
Answered By - anubhava