2

I have a text file with entries like this:

 Interface01 :
     adress
        192.168.0.1
next-interface:
 interface02:
     adress
        10.123.123.214
next-interface:
 interface01 :
     adress
        172.123.456.123

I'd like to parse it and get only the IP address corresponding to Interface01

I tried many things with Python's re.findall but couldn't get anything to match:

 i = open(f, 'r', encoding='UTF-8')
 txt = i.read()
 interface = re.findall(r'Interface01 :\s*(.adress*)n',txt,re.DOTALL)

but nothing works.

The expected result is 192.168.0.1.

5
  • What's that "n" at the end of the regex? Commented Mar 22, 2018 at 13:47
  • 1
    Try Interface01\s*:\s*adress\s+(.*). Remove re.DOTALL. Use just re.search to get the first match. See ideone.com/QoE1uF. Can there be more IPs per interface? Commented Mar 22, 2018 at 13:48
  • Thanks for your help. @Maroun, the n is for the beginning of "new-interface". @Wiktor, OK, I'll try it now and let you know. Commented Mar 22, 2018 at 13:52
  • Thank you very much @wiktor this solves my problem Commented Mar 22, 2018 at 13:57
  • @ricardo I posted an answer with explanation. Commented Mar 22, 2018 at 14:01

6 Answers 6

3

You may use

Interface01\s*:\s*adress\s+(.*)

See the regex demo. In Python, use re.search to get the first match since you only want to extract 1 IP address.

Pattern details:

  • Interface01 - a literal substring
  • \s*:\s* - a : enclosed with 0+ whitespaces
  • adress - a literal substring
  • \s+ - 1+ whitespaces
  • (.*) - Group 1: any 0+ chars other than line break chars.

Python demo:

import re
reg = r"Interface01\s*:\s*adress\s+(.*)"

with open('filename') as f:
    m = re.search(reg, f.read())
    if m:
        print(m.group(1))

# => 192.168.0.1
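To try the pattern without a file on disk, here is a minimal self-contained sketch that runs it against the sample text from the question as an inline string. The helper name `find_address` and the use of `re.escape` to parameterize the interface name are illustrative assumptions, not part of the answer above.

```python
import re

# Sample text copied from the question (note the "adress" spelling).
SAMPLE = """\
 Interface01 :
     adress
        192.168.0.1
next-interface:
 interface02:
     adress
        10.123.123.214
next-interface:
 interface01 :
     adress
        172.123.456.123
"""


def find_address(text, name):
    """Return the first address listed under `name`, or None (hypothetical helper)."""
    # re.escape keeps any special characters in the name literal.
    pattern = re.escape(name) + r"\s*:\s*adress\s+(.*)"
    m = re.search(pattern, text)
    return m.group(1) if m else None


print(find_address(SAMPLE, "Interface01"))
```

The match is case-sensitive, so `"Interface01"` and `"interface01"` select different entries in this sample.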

2

How about creating a pattern that said "Interface01", then skip all chars that are not digits, then get the digits and dots?

re.findall(r'Interface01[^0-9]+([0-9.]+)', text)

Result:

['192.168.0.1']

Update

Thanks to @zipa, here is the updated regex:

re.findall(r'[iI]nterface01[^0-9]+([0-9.]+)', text)

Result:

['192.168.0.1', '172.123.456.123']
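Because `[0-9.]+` accepts any run of digits and dots, the second match above is not a valid IPv4 address (456 is out of range). If that matters, one optional follow-up is to filter the matches with the standard-library `ipaddress` module. The sketch below assumes the sample text from the question; the helper name `is_valid_ipv4` is hypothetical.

```python
import ipaddress
import re

# Sample text copied from the question.
text = """\
 Interface01 :
     adress
        192.168.0.1
next-interface:
 interface02:
     adress
        10.123.123.214
next-interface:
 interface01 :
     adress
        172.123.456.123
"""


def is_valid_ipv4(candidate):
    """Return True if `candidate` parses as an IPv4 address (hypothetical helper)."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False


candidates = re.findall(r'[iI]nterface01[^0-9]+([0-9.]+)', text)
valid = [c for c in candidates if is_valid_ipv4(c)]

print(candidates)
print(valid)
```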

1 Comment

Small suggestion: change Interface into [Ii]nterface

0
interface = re.findall(r'Interface01 :\s*.adress\s*(.*?)$',txt,re.S|re.M)        
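This answer comes without an explanation, so here is a small runnable sketch of how the flags appear to interact, assuming the sample text from the question. With `re.M`, `$` can match at the end of every line; with `re.S`, `.` may also match a newline. The lazy `(.*?)` therefore stops at the first line end it reaches.

```python
import re

# Sample text copied from the question.
txt = """\
 Interface01 :
     adress
        192.168.0.1
next-interface:
 interface02:
     adress
        10.123.123.214
next-interface:
 interface01 :
     adress
        172.123.456.123
"""

# re.S (DOTALL): "." also matches "\n".
# re.M (MULTILINE): "$" matches before each newline, not only at the end of the text.
# The lazy group (.*?) expands one character at a time until "$" can match,
# which happens at the end of the line holding the address.
interface = re.findall(r'Interface01 :\s*.adress\s*(.*?)$', txt, re.S | re.M)
print(interface)
```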

0

You could try something like this:

interface = re.findall(r'Interface01 :\n +adress\n +(\d+\.\d+\.\d+\.\d+)', txt)
# ['192.168.0.1']

0

To get a single match, it's better to use the re.search() function:

import re

with open('filename') as f:
    pat = r'Interface01 :\s*\S+\s*((?:[0-9]{1,3}\.){3}[0-9]{1,3})'
    result = re.search(pat, f.read()).group(1)

print(result)

The output:

192.168.0.1

0

You can use Interface01 :\n.*?\n(.*) and strip the leading whitespace from the captured group.
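A minimal sketch of that pattern in use, assuming the sample text from the question and that the header line ends right after the colon. The group captures the whole third line, indentation included, so the result is passed through `str.strip()`.

```python
import re

# Sample text copied from the question.
text = """\
 Interface01 :
     adress
        192.168.0.1
next-interface:
 interface02:
     adress
        10.123.123.214
"""

# "Interface01 :\n" matches the header line, ".*?\n" skips the "adress" line,
# and "(.*)" captures the rest of the following line (without re.DOTALL,
# "." does not cross the newline).
m = re.search(r'Interface01 :\n.*?\n(.*)', text)
address = m.group(1).strip() if m else None
print(address)
```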
