I am trying to parse the /etc/network/interfaces config file in Ubuntu, so I need to split a string into a list of strings, where each string begins with one of the given keywords.

According to the manual:

The file consists of zero or more "iface", "mapping", "auto", "allow-" and "source" stanzas.

So if the file contains:

auto lo eth0
allow-hotplug eth1

iface eth0-home inet static
    address 192.168.1.1
    netmask 255.255.255.0

I would like to get this list:

['auto lo eth0', 'allow-hotplug eth1', 'iface eth0-home inet static\n address...']

Right now I have a function like this:

import re

def get_sections(text):
    start_indexes = [s.start() for s in re.finditer('auto|iface|source|mapping|allow-', text)]
    start_indexes.reverse()
    end_idx = len(text)  # slice up to the end of the text so the last character is kept
    res = []
    for i in start_indexes:
        res.append(text[i: end_idx].strip())
        end_idx = i
    res.reverse()  # restore file order once, after the loop
    return res

But it isn't nice...

  • Alternatively, you could use something like confparse, which apparently supports network interface files. Commented Jan 18, 2012 at 12:40
  • You can simplify this code quite a bit by extracting the slices directly from the start_indexes. Commented Jan 18, 2012 at 13:24

2 Answers

You can do it in a single regex:

>>> import re
>>> reobj = re.compile(r"(?:auto|allow-|iface)(?:(?!(?:auto|allow-|iface)).)*(?<!\s)", re.DOTALL)
>>> result = reobj.findall(subject)
>>> result
['auto lo eth0', 'allow-hotplug eth1', 'iface eth0-home inet static\n    address 192.168.1.1\n    netmask 255.255.255.0']

Explanation:

(?:auto|allow-|iface)   # Match one of the search terms
(?:                     # Try to match...
 (?!                    #  (as long as we're not at the start of
  (?:auto|allow-|iface) #  the next search term):
 )                      #  
 .                      # any character.
)*                      # Do this any number of times.
(?<!\s)                 # Assert that the match doesn't end in whitespace
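
For anyone who wants to try this end to end, here is a minimal sketch that compiles the annotated pattern with re.VERBOSE, so the comments can stay inside the regex. The sample text is copied from the question; the names subject, KEYWORDS and stanza_re are only illustrative.

```python
import re

# Sample input taken from the question.
subject = """auto lo eth0
allow-hotplug eth1

iface eth0-home inet static
    address 192.168.1.1
    netmask 255.255.255.0
"""

# Illustrative helper: the same three search terms used in the answer.
KEYWORDS = r"(?:auto|allow-|iface)"

stanza_re = re.compile(
    rf"""
    {KEYWORDS}            # match one of the search terms
    (?:                   # then repeatedly...
      (?!{KEYWORDS})      #   unless the next search term starts here,
      .                   #   consume any character, newlines included
    )*
    (?<!\s)               # the match must not end in whitespace
    """,
    re.DOTALL | re.VERBOSE,
)

stanzas = stanza_re.findall(subject)
print(stanzas)
```

Keep in mind that the search terms are matched anywhere in the text, not only at the start of a line, so an option value that happens to contain "auto" would also start a new stanza.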

Of course you can also map the results into a list of tuples as requested in your comment:

>>> reobj = re.compile(r"(auto|allow-|iface)\s*((?:(?!(?:auto|allow-|iface)).)*)(?<!\s)", re.DOTALL)
>>> result = [tuple(match.groups()) for match in reobj.finditer(subject)]
>>> result
[('auto', 'lo eth0'), ('allow-', 'hotplug eth1'), ('iface', 'eth0-home inet static\n    address 192.168.1.1\n    netmask 255.255.255.0')]
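
If you would rather have the keyword paired with the complete stanza (keyword included), one possible variation is to capture only the keyword and read the full match from group(0). This is a sketch under the same assumptions as above: the sample text comes from the question, and pair_re and pairs are made-up names.

```python
import re

# Sample input taken from the question.
subject = """auto lo eth0
allow-hotplug eth1

iface eth0-home inet static
    address 192.168.1.1
    netmask 255.255.255.0
"""

# Group 1 captures just the keyword; the rest of the stanza is matched
# but not captured, so group(0) still holds the whole stanza.
pair_re = re.compile(
    r"(auto|allow-|iface)(?:(?!auto|allow-|iface).)*(?<!\s)",
    re.DOTALL,
)

pairs = [(m.group(1), m.group(0)) for m in pair_re.finditer(subject)]
print(pairs)
```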

1 Comment

At first it seems a little complicated, but it is short and much better than my version. Would it be hard to get a list of (group, string) pairs, e.g. [('auto', 'auto lo eth0'), ('iface', 'iface eth0 inet static'), ...]?

You were very close to a clean solution when you computed the start indices. With those, you can add a single line to extract the required slices:

import re
indicies = [s.start() for s in re.finditer(
            'auto|iface|source|mapping|allow-', text)]
# Note: str.__getslice__ exists only in Python 2; it was removed in Python 3.
answer = map(text.__getslice__, indicies, indicies[1:] + [len(text)])
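
Since __getslice__ is not available in Python 3, here is a rough equivalent of the same idea using zip and ordinary slicing. The sample text is the one from the question; starts, ends and sections are illustrative names, and the .strip() call is an addition (borrowed from the question's own function) to drop trailing newlines.

```python
import re

# Sample input taken from the question.
text = """auto lo eth0
allow-hotplug eth1

iface eth0-home inet static
    address 192.168.1.1
    netmask 255.255.255.0
"""

# Start offset of every keyword occurrence.
starts = [m.start() for m in re.finditer(r"auto|iface|source|mapping|allow-", text)]

# Each stanza ends where the next one starts; the last runs to the end of the text.
ends = starts[1:] + [len(text)]

sections = [text[i:j].strip() for i, j in zip(starts, ends)]
print(sections)
```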

2 Comments

This is nice too, but it needs a little fix for me: map(text.__getslice__, indicies, indicies[1:] + [len(text)])
@marcinpz Okay, edited to match your requirements. I think this is much cleaner than creating a giant, hairy regex.
