Input:

A->(B, 1), (C, 2), (AKSDFSDF, 1231231) ...

Expected output:

[('A', 1, 2, 1231231)]

Cannot seem to get it to work. My code:

import re

pattern = r"([a-zA-Z]+)->(.*)"
r = re.compile(pattern)

print(r.findall("A->(B, 1), (C, 2), (AKSDFSDF, 1231231)"))
# [('A', '(B, 1), (C, 2), (AKSDFSDF, 1231231)')]

That's close enough, but surely it's possible to extract exactly what I want?

I would have thought this would work, but it doesn't:

pattern = r"([a-zA-Z]+)->([\([a-zA-Z]+,([0-9]+)\)]*)"

That returns an empty list (i.e. []), while this one matches but leaves every group after the first empty:

pattern = r"([a-zA-Z]+)->((\([a-zA-Z]+,([0-9]+)\))*)"
# [('A', '', '', '')]

Any idea?

1 Answer

You can use a positive lookahead assertion to match each word that starts at a word boundary (\b) and is immediately followed by - or ):

import re

s = 'A->(B, 1), (C, 2), (AKSDFSDF, 1231231)'
pattern = re.compile(r'\b\w+(?=-|\))')
print(pattern.findall(s))
# ['A', '1', '2', '1231231']

Try it out: https://repl.it/DqSe/0
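
If you need exactly the shape from the question (one tuple per line, with the numbers as ints), a two-step approach is one option. This is only a sketch: `parse_line` is a hypothetical helper name, and it assumes every pair looks like `(NAME, NUMBER)` with optional whitespace after the comma. Two details explain why the single-pattern attempts in the question fall short: they do not allow for the space after the comma, and Python's `re` keeps only the last repetition of a repeated capturing group, so `(...)*` cannot return every number on its own.

```python
import re

def parse_line(line):
    """Return (key, n1, n2, ...) for a line like 'A->(B, 1), (C, 2)'."""
    # Step 1: split the key from the rest of the line at the arrow.
    head = re.match(r"([a-zA-Z]+)->(.*)", line)
    if head is None:
        return None
    key, rest = head.groups()
    # Step 2: pull the number out of every "(NAME, NUMBER)" pair.
    # \s* allows the space after the comma that appears in the input.
    numbers = re.findall(r"\([a-zA-Z]+,\s*([0-9]+)\)", rest)
    return (key,) + tuple(int(n) for n in numbers)

result = [parse_line("A->(B, 1), (C, 2), (AKSDFSDF, 1231231)")]
print(result)
# [('A', 1, 2, 1231231)]
```

The lookahead version above is shorter; this one is stricter about the pair format and converts the numbers for you.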


2 Comments

@emihir0 Don't forget to accept if the answer helped :)
yes, sorry about that, you answered so quickly that the system wouldn't let me accept the answer yet and then I went off =). Just out of curiosity, why is the \b needed there? For my example it works even without it.
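
On the \b question, a quick check suggests it makes no difference to the result for this input. `\w+` is greedy and the lookahead only accepts `-` or `)`, which are not word characters, so each match already begins at the start of a word and ends at its end. The \b mostly states that intent explicitly and stops the engine from retrying in the middle of a word. The variable names below are just for illustration.

```python
import re

s = 'A->(B, 1), (C, 2), (AKSDFSDF, 1231231)'

# Same pattern with and without the leading word boundary.
with_boundary = re.findall(r'\b\w+(?=-|\))', s)
without_boundary = re.findall(r'\w+(?=-|\))', s)

print(with_boundary == without_boundary)
# True
```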
