I have this command which outputs 2 columns separated by ⎟. The first column is the number of occurrences, the second is the IP address. The whole thing is sorted by ascending number of occurrences.
awk '{ips[$1]++} END {for (ip in ips) { printf "%5s %-1s %-3s\n", ips[ip], "⎟", ip}}' "${ACCESSLOG}" | sort -nk1
19 ⎟ 76.20.221.34
19 ⎟ 76.9.214.2
22 ⎟ 105.152.107.118
26 ⎟ 24.185.179.32
26 ⎟ 42.117.198.229
26 ⎟ 83.216.242.69
etc.
Now I would like to add a third column. In the bash shell, if you run, for instance:
host 72.80.99.43
you'll get:
43.99.80.72.in-addr.arpa domain name pointer pool-72-80-99-43.nycmny.fios.verizon.net.
So for every IP appearing in the list, I want to show its associated host in the third column, and I want to do that from within awk: calling host from awk and passing it ip as the parameter. Ideally, it would skip all the standard stuff and show only the hostname, like so: nycmny.fios.verizon.net.
So my final command would look like this:
awk '{ips[$1]++} END {for (ip in ips) { printf "%5s %-1s %-3s %20s\n", ips[ip], "⎟", ip, system( "host " ip )}}' "${ACCESSLOG}" | sort -nk1
Thanks
... host for each IP, but it will still be slow (that's the nature of DNS). Beware, however, that you'll put a lot of strain on your DNS server no matter how you do it, so it might be a good idea to talk to your sysadmin before doing this on a regular basis.

... httpd, they can resolve logs as long as each line in the file to process begins with an IP address (followed by a blank). Apache comes with a utility named logresolve that does exactly that.
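
For the awk side of the question, here is a minimal sketch of one way to do it, not a definitive implementation. In awk, system() returns the command's exit status rather than its output, so the text has to be read through a pipe with getline instead. The function name ip_hosts is made up for illustration, and the sketch assumes host prints a line containing "domain name pointer" followed by the name, as in the example output in the question. IPs with no reverse record get a "-".

```shell
#!/bin/sh
# ip_hosts is a hypothetical helper name; its only argument is the access log.
ip_hosts() {
    awk '
        { ips[$1]++ }                          # count occurrences of the first field
        END {
            for (ip in ips) {
                name = "-"                     # shown when no hostname is found
                # Only pass things that look like an IP address to the shell,
                # since the first field comes straight from the log.
                if (ip ~ /^[0-9a-fA-F.:]+$/) {
                    cmd = "host " ip " 2>/dev/null"
                    # Read the output of host line by line through a pipe.
                    while ((cmd | getline line) > 0) {
                        if (line ~ /domain name pointer/) {
                            n = split(line, parts, " ")
                            name = parts[n]    # the last word is the hostname
                        }
                    }
                    close(cmd)                 # release the pipe before the next IP
                }
                printf "%5s %-1s %-15s %-1s %s\n", ips[ip], "⎟", ip, "⎟", name
            }
        }
    ' "$1" | sort -nk1
}
```

It would be called as ip_hosts "${ACCESSLOG}". Because the lookups happen in the END block, host runs once per unique IP rather than once per log line, but each lookup still waits on DNS, so the run time grows with the number of distinct addresses.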