12

I am currently running a command like this to get the most requested content:

grep "17\/Jul\/2011" other_vhosts_access.log | awk '{print $8}' | sort | uniq -c | sort -nr

I now want to see the user agent strings, but the problem is that they contain several spaces, so awk's default whitespace splitting breaks them apart. Here is a typical log file line. The UA is the last section delimited by quotation marks:

example.com:80 [ip] - - [17/Jul/2011:23:59:59 +0100] "GET [url] HTTP/1.1" 200 6449 "[referer]" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30"

Is there a better tool than awk for this?

2 Answers

24

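If that format is consistent and the field really is wrapped in double quotes, you can use either awk or cut with " as the field delimiter. Splitting the sample line on " puts the request in field 2, the referer in field 4, and the user agent in field 6: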

awk -F\" '{print $6}'

or:

cut -d\" -f 6
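To tie this back to the pipeline in the question, here is a minimal sketch that swaps the quote-delimited awk call in for `awk '{print $8}'`. The temporary sample log, its IPs, URLs, and referers, and the `top_agents` variable are placeholders invented for illustration; only the line layout follows the example in the question.

```shell
# Hypothetical sample log; IPs, URLs and referers are made-up placeholders.
log=$(mktemp)
cat > "$log" <<'EOF'
example.com:80 10.0.0.1 - - [17/Jul/2011:23:59:59 +0100] "GET /a HTTP/1.1" 200 6449 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30"
example.com:80 10.0.0.2 - - [17/Jul/2011:23:59:58 +0100] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.122 Safari/534.30"
example.com:80 10.0.0.3 - - [17/Jul/2011:12:00:00 +0100] "GET /a HTTP/1.1" 200 6449 "-" "curl/7.21.0"
example.com:80 10.0.0.4 - - [18/Jul/2011:00:00:01 +0100] "GET /a HTTP/1.1" 200 6449 "-" "curl/7.21.0"
EOF

# Filter one day, split each line on double quotes, take field 6 (the UA),
# then count identical strings and list the most frequent first.
top_agents=$(grep "17/Jul/2011" "$log" | awk -F'"' '{print $6}' | sort | uniq -c | sort -nr)
printf '%s\n' "$top_agents"

rm -f "$log"
```

This relies on every line having exactly the same number of double quotes before the user agent. A quote embedded in the request or referer would shift the field numbers, so it is worth spot-checking the output against a few raw lines.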

3

perl -ne 'if(/"([^"]+)"$/){$ua{$1}++;} END{for(keys %ua){print "$ua{$_} $_\n"}}' \
  access_log
