Given the following string (or similar strings, some of which may contain more than one IP address):

from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03

I wish to extract the first, and only the first, IP address in Python. A first attempt with something like ([0-9]{2,}\.){3}([0-9]{2,}){1}, when tried out on nregex.com, looks almost OK: it matches the IP address, but it also matches another substring that roughly resembles an IP address (170.2015.06.03.14.12.03). When the same pattern is passed to re.compile/re.findall, though, the result is:

[(u'243.', u'70'), (u'06.', u'03')]

So clearly the regex is no good. How can I improve it so that it's neater and catches all IPv4 addresses, and how can I make it match only the first one?

Many thanks.

  • Will the IP addresses always be within square brackets? Commented Jun 4, 2015 at 21:15
  • @Mr.Bultitude Yes, for the purposes of this exercise I'm only checking "Received: from" headers, and from what I can tell, for these, the IP address is always contained in []. Commented Jun 4, 2015 at 21:47

2 Answers


Use re.search with the following pattern:

>>> s = 'from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03'
>>> import re
>>> re.search(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', s).group()
'208.83.243.70'
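
One caveat: re.search returns None when nothing matches, so calling .group() directly raises AttributeError on a string that contains no address. A minimal sketch of a guarded version follows; first_ipv4 is a hypothetical helper name used for illustration, not something from the answer above.

```python
import re

# Four dot-separated runs of 1 to 3 digits.
# Note: this does not check that each number is <= 255.
IPV4_PATTERN = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')


def first_ipv4(text):
    """Return the first IPv4-looking substring in text, or None if there is none."""
    match = IPV4_PATTERN.search(text)
    return match.group() if match else None


s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

print(first_ipv4(s))                  # 208.83.243.70
print(first_ipv4('no address here'))  # None
```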

1 Comment

It may be prudent to ensure that the IP within the brackets is captured explicitly, if that's the OP's desire: re.search(r'\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]', s).group(1)

The regex you want is r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'. This matches four 1- to 3-digit numbers separated by dots.

If the IP address always comes before any other numbers in the string, you can avoid matching those by using a function that returns only the first match, such as re.search (not re.find, which does not exist in the re module). In contrast, re.findall will return both 208.83.243.70 and 015.06.03.14.
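
The difference between the two functions, and the reason the question's pattern returned tuples, can be sketched as follows. When a pattern contains capturing groups, re.findall returns the groups rather than the whole match; writing the repeated part as a non-capturing group (?:...) avoids that. The variable names here are illustrative only.

```python
import re

s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

# The pattern from the question has two capturing groups, so findall
# returns a tuple of group contents for each match, not the full match.
question_pattern = r'([0-9]{2,}\.){3}([0-9]{2,}){1}'
group_tuples = re.findall(question_pattern, s)
print(group_tuples)  # [('243.', '70'), ('06.', '03')]

# Same idea as the answer's pattern, written with a non-capturing group,
# so findall returns whole matches.
ipv4_like = r'(?:\d{1,3}\.){3}\d{1,3}'
all_matches = re.findall(ipv4_like, s)
print(all_matches)   # ['208.83.243.70', '015.06.03.14']

# re.search stops at the first match in the string.
first_match = re.search(ipv4_like, s).group()
print(first_match)   # 208.83.243.70
```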

Are you OK with using the brackets to single out the IP address? If so, you can change the regex to r'\[(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\]'. It would be safer that way.
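
Even with the brackets, \d{1,3} accepts values such as 999, so the pattern alone does not guarantee a valid IPv4 address. One possible sketch, assuming Python 3 (where the standard-library ipaddress module is available), checks each bracketed candidate and returns the first valid one. The helper name first_bracketed_ipv4 is hypothetical.

```python
import ipaddress
import re

# An IPv4-looking string enclosed in square brackets; group 1 is the address.
BRACKETED_IPV4 = re.compile(r'\[(\d{1,3}(?:\.\d{1,3}){3})\]')


def first_bracketed_ipv4(header):
    """Return the first bracketed, valid IPv4 address in header, or None."""
    for match in BRACKETED_IPV4.finditer(header):
        candidate = match.group(1)
        try:
            # Raises a ValueError subclass if any octet is out of range.
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        return candidate
    return None


s = ('from mail2.oknotify2.com (mail2.oknotify2.com. [208.83.243.70]) '
     'by mx.google.com with ESMTP id dp5si2596299pdb.170.2015.06.03.14.12.03')

print(first_bracketed_ipv4(s))  # 208.83.243.70
```

Whether the extra validation is worth it depends on how much the "Received: from" headers can be trusted; for well-formed headers the bracketed regex alone may be enough.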

1 Comment

thanks. I don't see re having a function find. Were you referring to something else?
