Hello, I am new to regex and I'm starting out with Python. I'm stuck on extracting all the words from an English sentence. So far I have:

import re

shop="hello seattle what have you got"
regex = r'(\w*) '
list1=re.findall(regex,shop)
print(list1)

This gives the output:

['hello', 'seattle', 'what', 'have', 'you']

If I replace the regex with

regex = r'(\w*)\W*'

then the output is:

['hello', 'seattle', 'what', 'have', 'you', 'got', '']

whereas I want this output:

['hello', 'seattle', 'what', 'have', 'you', 'got']

Please point out where I am going wrong.

1 Answer

Use the word boundary \b:

import re

shop="hello seattle what have you got"
regex = r'\b\w+\b'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']

Or simply \w+ is enough, because \w+ is greedy and each match already starts and ends at a word boundary:

import re

shop="hello seattle what have you got"
regex = r'\w+'
list1=re.findall(regex,shop)
print(list1)

Output: ['hello', 'seattle', 'what', 'have', 'you', 'got']
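
As for where the two original patterns go wrong, here is a small side-by-side sketch. The variable names with_space, with_star and with_plus are only illustrative and are not from the question.

```python
import re

shop = "hello seattle what have you got"

# r'(\w*) ' demands a literal space after every word.
# The last word, 'got', has no trailing space, so it is never matched.
with_space = re.findall(r'(\w*) ', shop)
print(with_space)   # ['hello', 'seattle', 'what', 'have', 'you']

# r'(\w*)\W*' lets both parts match zero characters.
# After 'got' is consumed, the pattern still matches the empty
# string at the very end of the input, which adds the trailing ''.
with_star = re.findall(r'(\w*)\W*', shop)
print(with_star)    # ['hello', 'seattle', 'what', 'have', 'you', 'got', '']

# r'\w+' needs at least one word character, so it can never produce
# an empty match, and it does not depend on what follows the word.
with_plus = re.findall(r'\w+', shop)
print(with_plus)    # ['hello', 'seattle', 'what', 'have', 'you', 'got']
```

In short, changing * to + is what removes the empty string, and dropping the trailing space (or \W*) from the pattern is what lets the last word be found.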