python findall regex expression

Question

I have a long string and I need to find the words that contain the character 'd' followed, somewhere later in the same word, by the character 'e'.

import re

l=[" xkn59438","yhdck2","eihd39d9","chdsye847","hedle3455","xjhd53e","45da","de37dp"]
b=' '.join(l)
runs1=re.findall(r"\b\w?d.*e\w?\b",b)
print(runs1)

My reasoning: \b is the word boundary, followed by any characters (\w?), and so on. But I get an empty list.

Your strings are only words, why do you need a word boundary anyway?
– cs95, Commented Jun 11, 2018 at 16:17
Why not apply a regex to each word in the list individually? Why join them into a massive string?
– Aran-Fey, Commented Jun 11, 2018 at 16:18
@Aran-Fey I already did this successfully with re.search in a for loop; I'm just trying to understand how to use re.findall.
– david007killer, Commented Jun 11, 2018 at 16:26
Why are you joining it? This makes your solution that much more complicated, and searching one big string with a complex regex may end up being worse than searching smaller strings with a simpler regex.
– cs95, Commented Jun 11, 2018 at 16:30
@coldspeed Yeah, I know. I just wanted to understand how to use re.findall, and couldn't grasp why my expression doesn't work. I have already done the same thing with a for loop and re.search on the smaller strings.
– david007killer, Commented Jun 11, 2018 at 16:32

cs95 · Accepted Answer · 2018-06-11 16:39:00Z

1

You can massively simplify your solution by applying a regex-based search to each string individually.

>>> import re
>>> p = re.compile('d.*e')
>>> list(filter(p.search, l))
['chdsye847', 'hedle3455', 'xjhd53e', 'de37dp']

Or,

>>> [x for x in l if p.search(x)]
['chdsye847', 'hedle3455', 'xjhd53e', 'de37dp']

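Why didn't re.findall work? You were searching one large string, and the greedy .* in the middle also matches spaces, so it ran across word boundaries. In addition, \w? matches at most one character, so your pattern only allows 'd' as the first or second character of a word and 'e' as the last or second-to-last. The fix would've been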

>>> re.findall(r"\b\S*d\S*e\S*", ' '.join(l))
['chdsye847', 'hedle3455', 'xjhd53e', 'de37dp']

Here \S matches any character that is not whitespace, so a match cannot run past the end of a word.
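A minimal sketch of the three patterns side by side, run against the joined string from the question. The names `words`, `joined`, `original`, `widened`, and `fixed` are chosen here for illustration, and the `widened` pattern is not from the question; it is only an intermediate step to show what the greedy `.*` does once `\w?` is relaxed to `\w*`.

```python
import re

words = [" xkn59438", "yhdck2", "eihd39d9", "chdsye847",
         "hedle3455", "xjhd53e", "45da", "de37dp"]
joined = " ".join(words)

# Pattern from the question: \w? allows at most one word character
# before 'd' and at most one after 'e', so nothing qualifies.
original = re.findall(r"\b\w?d.*e\w?\b", joined)
print(original)

# Illustrative intermediate step: with \w* the pattern matches, but the
# greedy .* also consumes spaces, so one match swallows several words.
widened = re.findall(r"\b\w*d.*e\w*\b", joined)
print(widened)

# Pattern from the answer: \S cannot cross whitespace, so each match
# stays inside a single word.
fixed = re.findall(r"\b\S*d\S*e\S*", joined)
print(fixed)
```

The second result is a single string running from the first word that contains 'd' to the last word that contains 'e', which is why restricting the character class (rather than only making the quantifier lazy) is the more direct fix here.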

edited Jun 11, 2018 at 16:39

answered Jun 11, 2018 at 16:18

cs95

Aaditya Ura · Accepted Answer · 2018-06-11 16:32:19Z

0

You can filter the list:

import re
l=[" xkn59438","yhdck2","eihd39d9","chdsye847","hedle3455","xjhd53e","45da","de37dp"]

pattern = r'd.*?e'

print(list(filter(lambda x:re.search(pattern,x),l)))

output:

['chdsye847', 'hedle3455', 'xjhd53e', 'de37dp']
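As a rough check of how the lazy `d.*?e` compares with the greedy `d.*e` when used with `re.search` as a yes/no filter, the sketch below runs both over the same list. The names `greedy`, `lazy`, and the sample word `"dxeye"` are made up for illustration and do not come from the question.

```python
import re

words = [" xkn59438", "yhdck2", "eihd39d9", "chdsye847",
         "hedle3455", "xjhd53e", "45da", "de37dp"]

greedy = re.compile(r"d.*e")
lazy = re.compile(r"d.*?e")

# As a filter, only "is there a match at all?" matters, so both
# patterns select the same words.
greedy_hits = [w for w in words if greedy.search(w)]
lazy_hits = [w for w in words if lazy.search(w)]
print(greedy_hits == lazy_hits)

# The matched text can still differ: greedy extends to the last 'e',
# lazy stops at the first 'e' after the 'd'.
sample = "dxeye"  # hypothetical word with two 'e' characters
greedy_span = greedy.search(sample).group()
lazy_span = lazy.search(sample).group()
print(greedy_span, lazy_span)
```

So the choice between the two quantifiers only matters if the matched substring itself is used, not when the pattern is just a membership test.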

answered Jun 11, 2018 at 16:32

Aaditya Ura

2 Comments

cs95 Over a year ago

This is essentially the same as my answer, but uglier and less efficient. Making the quantifier non-greedy makes no difference on strings this small.

Aaditya Ura Over a year ago

Thank you for the comment, bro. I just tried my approach because I wanted to help, but if you don't like it I can delete it :)

score 0 · Accepted Answer · 2018-06-11 16:54:32Z

0

Something like this, maybe:

\b\w*d\w*e\w*

Note that you can probably remove the word boundary here, because the leading \w* already starts each match at the beginning of a word.

That gives the equivalent \w*d\w*e\w*
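A short sketch comparing the pattern with and without the leading `\b` on the joined string from the question. The variable names and the `"d-e"` token are hypothetical; the token is included only to show where `\w` and the `\S` used in the other answer behave differently.

```python
import re

words = [" xkn59438", "yhdck2", "eihd39d9", "chdsye847",
         "hedle3455", "xjhd53e", "45da", "de37dp"]
joined = " ".join(words)

# findall scans left to right and \w* is greedy, so a match that can
# start at the beginning of a word does so even without \b.
with_boundary = re.findall(r"\b\w*d\w*e\w*", joined)
without_boundary = re.findall(r"\w*d\w*e\w*", joined)
print(with_boundary == without_boundary)

# \w excludes punctuation while \S does not, so the two answers differ
# on a token such as "d-e" (a made-up input, not from the question).
token = "d-e"
word_chars_only = re.findall(r"\b\w*d\w*e\w*", token)
non_space = re.findall(r"\b\S*d\S*e\S*", token)
print(word_chars_only, non_space)
```

Which character class is appropriate depends on whether punctuation should count as part of a word; for the alphanumeric strings in the question both give the same result.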

edited Jun 11, 2018 at 16:54

answered Jun 11, 2018 at 16:21

user557597

5 Comments

david007killer Over a year ago

Thank you. Can you please explain why my expression didn't do the same?

cs95 Over a year ago

@david007killer Because it was wrong? And why are you insisting on joining the strings? Is there some secret requirement here that you've decided not to mention in your question?

user557597 Over a year ago

Your regex has the part .*, which matches non-word characters (including spaces) and is also greedy, whereas this regex limits the match to word characters only. This would be considered a pure answer, unencumbered by whether it's a string or a variable.

david007killer Over a year ago

@coldspeed I have answered your question above, and I'm sorry I didn't properly clarify my intentions.

david007killer Over a year ago

@sln Thank you.
