Python: Selecting a section of a string

Question

I have a list of strings that I pulled from a text file. I need to read each line and "select" two specific parts. Here is an example line from the text file (firewall report):

2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 duration 0:00:00 bytes 2093 TCP FINs

I need to save the IP address that comes after "Workstations:" as a workstation IP, and likewise save the address that comes after "Servers:" as a server IP.

I imagine the best technique would be to create two lists, one for workstation IPs and one for server IPs, and read each line and write the IPs to their respective lists.

But in order to do that I need to select them, which I might do like this:

workstationIPs = []
serverIPs = []
for line in report:
    workstationIPs.append(line[a:b])
    serverIPs.append(line[c:d])

With 'a' being the start of the workstation IP and 'b' being the end (and 'c' and 'd' relating to server IPs).

However, not all the lines are the same length, so that method of selection won't work. Does anyone have any ideas on how to extract those two strings from the line?

(PS: This is my first question, so please let me know of any errors and I can resubmit it. Thanks!)

What about using regular expressions (import re)? Alternatively, you can split the line into parts with the string's split method and then extract the elements you want. A regex is probably the best option, though.
– tuxtimo, Commented Oct 28, 2015 at 23:27

deltab · Accepted Answer · 2015-10-28 23:46:52Z

1

You can use str.partition to split the string up and get the parts you want:

workstation_ip = line.partition('Workstations:')[2].partition('/')[0]
server_ip = line.partition('Servers:')[2].partition('/')[0]

To avoid repetition, make a function:

def between(line, preceding, following):
    return line.partition(preceding)[2].partition(following)[0]
...
workstation_ip = between(line, 'Workstations:', '/')
server_ip = between(line, 'Servers:', '/')
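As a rough sketch of how the helper might be applied to a whole report: str.partition returns empty strings when the separator is not found, so an empty result can be used to skip lines that lack either marker. The report list and the second sample line here are made up for illustration.

```python
def between(line, preceding, following):
    # Same helper as above: text after `preceding`, up to `following`.
    return line.partition(preceding)[2].partition(following)[0]

# Hypothetical sample data standing in for the lines read from the report file.
report = [
    "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
    "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
    "Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 "
    "duration 0:00:00 bytes 2093 TCP FINs",
    "a line without either marker",
]

workstation_ips = []
server_ips = []
for line in report:
    workstation_ip = between(line, 'Workstations:', '/')
    server_ip = between(line, 'Servers:', '/')
    # partition() yields empty strings when the separator is missing,
    # so an empty result means the marker was not on this line.
    if workstation_ip and server_ip:
        workstation_ips.append(workstation_ip)
        server_ips.append(server_ip)

print(workstation_ips)
print(server_ips)
```

Skipping unmatched lines is only one possible choice; depending on the report, logging or counting them may be more useful.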

edited Oct 28, 2015 at 23:46

answered Oct 28, 2015 at 23:35


R Nar · Accepted Answer · 2015-10-28 23:34:02Z

1

Use a regex:

import re
workstationIPs = []
serverIPs = []
for line in report:
    workstationIPs.append(re.search(r'Workstations:((?:\d{1,3}\.){3}\d{1,3})',line).group(1))
    serverIPs.append(re.search(r'Servers:((?:\d{1,3}\.){3}\d{1,3})',line).group(1))

Example:

>>> s = '2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs'
>>> re.search(r'Workstations:((?:\d{1,3}\.){3}\d{1,3})',s).group(1)
'192.168.2.85'
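One caveat: re.search returns None when a line does not match, so calling .group(1) directly raises AttributeError on such lines. A minimal sketch of a more defensive variant follows, assuming both markers appear on the same line in this order; the names PATTERN and extract_ips and the group names are illustrative, not part of the answer above.

```python
import re

# One compiled pattern capturing both addresses; group names are illustrative.
PATTERN = re.compile(
    r'Workstations:(?P<workstation>(?:\d{1,3}\.){3}\d{1,3})'
    r'.*?Servers:(?P<server>(?:\d{1,3}\.){3}\d{1,3})'
)

def extract_ips(line):
    """Return (workstation_ip, server_ip), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group('workstation'), match.group('server')

sample = (
    "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
    "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
    "Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 "
    "duration 0:00:00 bytes 2093 TCP FINs"
)
print(extract_ips(sample))
print(extract_ips("a line with no addresses"))
```

Compiling the pattern once also avoids re-parsing it for every line of a large report.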

answered Oct 28, 2015 at 23:34


philngo · Accepted Answer · 2015-10-28 23:34:39Z

0

If the number of whitespace-separated fields is consistent, you could try this, which splits on whitespace, removes the asterisks, and takes the content between the first colon and the slash (so the port is dropped):

workstationIPs = []
serverIPs = []
for line in report:
    items = line.split()
    workstationIPs.append(items[14].strip('*').split(':')[1].split('/')[0])
    serverIPs.append(items[16].strip('*').split(':')[1].split('/')[0])
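If the field positions cannot be relied on, one possible variation is to look the token up by its label instead of by index. This is only a sketch; ip_after_label is a hypothetical helper name, and it assumes the label and address always sit in a single whitespace-delimited token.

```python
def ip_after_label(line, label):
    """Return the IP from the first token starting with `label`, or None."""
    for token in line.split():
        token = token.strip('*')  # tolerate the ** markers used in the examples
        if token.startswith(label):
            # Token looks like "Workstations:192.168.2.85/1440".
            return token[len(label):].split('/')[0]
    return None

line = (
    "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
    "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
    "Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 "
    "duration 0:00:00 bytes 2093 TCP FINs"
)
print(ip_after_label(line, 'Workstations:'))
print(ip_after_label(line, 'Servers:'))
```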

answered Oct 28, 2015 at 23:34


heinst · Accepted Answer · 2015-10-28 23:41:51Z

This is one way you could do it, using split and a list comprehension:

# Named `line` rather than `str` so the built-in str type is not shadowed.
line = "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs"
# Matching on the label without the asterisks also works for lines that lack the ** markers.
workstationIPs = [item.split(':')[1].replace("**", "").split("/")[0] for item in line.split(' ') if "Workstations:" in item]
serverIPs = [item.split(':')[1].replace("**", "").split("/")[0] for item in line.split(' ') if "Servers:" in item]
print(workstationIPs)
print(serverIPs)

Or with a regex and a list comprehension:

import re
line = "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs"
# The pattern matches four dot-separated runs of digits inside the selected token.
workstationIPs = [re.findall(r'[0-9]+(?:\.[0-9]+){3}', item)[0] for item in line.split(' ') if "Workstations:" in item]
serverIPs = [re.findall(r'[0-9]+(?:\.[0-9]+){3}', item)[0] for item in line.split(' ') if "Servers:" in item]
print(workstationIPs)
print(serverIPs)

Both yield:

['192.168.2.85']
['192.168.1.6']
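Note that a pattern such as [0-9]+(?:\.[0-9]+){3} only checks the shape of the text, so it would also accept something like 999.999.999.999. If that matters, the extracted strings could be validated afterwards. Here is a small sketch assuming Python 3.3 or later, where the standard library ipaddress module is available; is_valid_ipv4 is a hypothetical helper name.

```python
import ipaddress

def is_valid_ipv4(text):
    """Return True if `text` parses as an IPv4 address, else False."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False
    return True

candidates = ['192.168.2.85', '192.168.1.6', '999.999.999.999']
valid = [ip for ip in candidates if is_valid_ipv4(ip)]
print(valid)
```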
