1

I have the following log on my server. In column 1, an entry can be a single IP address (117.199.183.116) or several IP addresses (115.248.95.5, 115.112.231.105). Since a space is the delimiter between the entries in a line, using cut -d " " -f 1,10 to separate columns gives different results for line 1 and line 2. Can anyone tell me how to get the correct result for both kinds of line?

117.199.183.116 - [11/Dec/2013:23:00:29 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 2841 36 TLSv1 DHE-RSA-SEED-SHA
115.248.95.5, 115.112.231.105 - [11/Dec/2013:23:00:29 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 3142 36 TLSv1 DHE-RSA-SEED-SHA
182.243.43.29 - [11/Dec/2013:23:00:29 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 3124 36 TLSv1 DHE-RSA-SEED-SHA
182.127.213.39 - [11/Dec/2013:23:00:29 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 2933 36 TLSv1 DHE-RSA-SEED-SHA

The expected output is:

117.199.183.116 36
115.248.95.5, 115.112.231.105 36
182.243.43.29 36
182.127.213.39 36

To be more exact, the log entries look something like this:

222.86.58.126 - [17/Dec/2013:08:21:40 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 1505 36 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"i TLSv1.2 DHE-RSA-SEED-SHA
218.95.69.175, 22.234.234.12 - [17/Dec/2013:08:21:40 -0600] "GET /promotions/getConfig/ HTTP/1.1" 200 1477 36 "http://www.duba.com/static/js/storage/storage.swf?v=2&fun=swfStorage._init" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"i TLSv1.2 DHE-RSA-SEED-SHA 

So is there a more generic solution that lets us pick any two columns?

2
  • Do you need only the IP address, or are you going to modify the other fields as well? Commented Dec 13, 2013 at 7:15
  • What is your expected output? Commented Dec 13, 2013 at 7:16

4 Answers

2

This awk should work:

awk '{s=$0; sub(/ -.*$/, "", s); k=0;
     for (i=5; i<=NF-3; i++) if ($i ~ /^HTTP\//) {k=i; break} print s, $(k+3)}' file.log
117.199.183.116 36
115.248.95.5, 115.112.231.105 36
182.243.43.29 36
182.127.213.39 36

2 Comments

What if some of the columns after 36 contain spaces, i.e. one entry has DHE-RSA-SEED SHA instead of DHE-RSA-SEED-SHA? Then $(NF-2) will also fail. Is there a generic solution for that?
Check the updated code now, it should work with any number of fields in the log lines.
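
Regarding the concern above about fields that contain spaces, here is a minimal sketch of one possible approach, not the answer's own code. It assumes the first " - " always ends the IP list, and that everything after it is either a [bracketed] group, a "quoted" string (optionally followed by trailing characters such as the i seen in the longer sample), or a plain run of non-space characters. The name pick_field and the token numbering are made up for illustration.

```shell
# pick_field N [FILE...]   (hypothetical helper, not part of any tool)
# Prints the IP list and the N-th token that follows the first " - ".
# A token is a [bracketed] group, a "quoted" string, or a run of non-spaces,
# so spaces inside brackets or quotes do not shift the numbering.
pick_field() {
    want=$1
    shift
    awk -v want="$want" '
    {
        pos = index($0, " - ")            # the first " - " ends the IP list
        if (pos == 0) next                # assumption: skip lines without it
        ips  = substr($0, 1, pos - 1)
        rest = substr($0, pos + 3)
        sub(/^ +/, "", rest)              # tolerate extra spaces after the dash
        n = 0
        # Peel one token at a time off the front of the remainder.
        while (match(rest, /^(\[[^]]*\]|"[^"]*"[^ ]*|[^ ]+) */)) {
            tok = substr(rest, 1, RLENGTH)
            sub(/ +$/, "", tok)           # drop the trailing separator spaces
            n++
            if (n == want) { print ips, tok; break }
            rest = substr(rest, RLENGTH + 1)
        }
    }' "$@"
}
```

With this numbering the timestamp is token 1, the request string is token 2, the status is token 3, and the 36 column in the sample lines is token 5, so pick_field 5 file.log would print the IP list followed by that column. Lines that have fewer than N tokens print nothing.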
1

If you want to take all of the IP addresses from a single line, try this:

  sed 's/^\([0-9., ]*\) - .*/\1/'


0

If you want to get the IP address(es) and the last all-digit column (36 in the sample), try this one:

 sed -r 's/(.*) - .* ([0-9]+) .*/\1 \2/'

1 Comment

This is just an example. I need to be able to pick any column after the first column.
0

From what I understand of your log format, each line starts with one or more IP addresses, followed by a '-', then a timestamp and some other data. I would advise you to cut using '-' as the delimiter and take field 1 as your IP address(es). You can keep the rest as another string, which you can then split on spaces if required.
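
The idea above could be sketched roughly as follows. This is only an illustration under a few assumptions: it splits on the first " - " with shell parameter expansion rather than calling cut with '-' as the delimiter (so the hyphen in -0600 is not treated as a separator), it uses cut only to split the remainder on spaces, and the function name ip_and_column is made up for the example.

```shell
# ip_and_column N   (hypothetical helper); reads log lines on stdin.
# Prints the IP list plus the N-th space-separated field of the remainder.
ip_and_column() {
    col=$1
    while IFS= read -r line; do
        case $line in
            *" - "*) ;;            # has the separator, carry on
            *) continue ;;         # assumption: ignore lines without it
        esac
        ips=${line%% - *}          # everything before the first " - "
        rest=${line#* - }          # everything after the first " - "
        value=$(printf '%s\n' "$rest" | cut -d ' ' -f "$col")
        printf '%s %s\n' "$ips" "$value"
    done
}
```

For the sample lines, ip_and_column 8 < file.log should give the IP list followed by the 36 column, because the remainder is still split on single spaces and the timestamp and request each span several fields. For the same reason, a quoted field that contains spaces will shift the numbering of the fields after it.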

1 Comment

If you look at the expected output, you'll see that there's an extra field 36 that needs to be parsed out from the log line. So your solution almost works, but not quite.
