If I have a string s = "Name: John, Name: Abby, Name: Kate". How do I extract everything in between Name: and ,. So I'd want to have an array a = John, Abby, Kate
Thanks!
No need for a regex:
>>> s = "Name: John, Name: Abby, Name: Kate"
>>> [x[len('Name: '):] for x in s.split(', ')]
['John', 'Abby', 'Kate']
Or even:
>>> prefix = 'Name: '
>>> s[len(prefix):].split(', ' + prefix)
['John', 'Abby', 'Kate']
Now if you still think a regex is more appropriate:
>>> import re
>>> re.findall('Name:\s+([^,]*)', s)
['John', 'Abby', 'Kate']
The interesting question is how you would choose among the many ways to do this in Python. The answer using "split" is nice if you're confident that the format will be exact. If you would like some protection from minor format changes, a regular expression might be useful. You should think through what parts of the format are most likely to be stable, and capture those in your regular expression, while leaving flexibility for the others. Here is an example that assumes that the names are alphabetic, and that the word "Name" and the colon are stable:
import re
s = "Name: John, Name: Abby, Name: Kate"
names = [i.group(1) for i in re.finditer("Name:\s+([A-Za-z]*)", s)]
print names
You might instead want to allow for hyphens or other characters inside a name; you can do so by changing the text inside [A-Za-z].
A good page about Python regular expressions with lots of examples is http://docs.python.org/howto/regex.html.
re.findall("Name:\s+([A-Za-z]*)", s)Few more ways to do it
>>> s
'Name: John, Name: Abby, Name: Kate'
Method 1:
>>> [x.strip() for x in s.split("Name:")[1:]]
['John,', 'Abby,', 'Kate']
Method 2:
>>> [x.rsplit(":",1)[-1].strip() for x in s.split(",")]
['John', 'Abby', 'Kate']
Method 3:
>>> [x.strip() for x in re.findall(":([^,]*)",s)]
['John', 'Abby', 'Kate']
Method 4:
>>> [x.strip() for x in s.replace('Name:','').split(',')]
['John', 'Abby', 'Kate']
Also note, how I always consistently applied strip which makes sense if their can be multiple spaces between 'Name:' token and the actual Name.
Method 2 and 3 can be used in a more generalized way.