re.findall(r'[\w]+@+[\w.]','blahh ggg@google.com yipee')
returns ['ggg@g']
Why doesn't it return ['ggg@google.com'] or at least ['ggg@google']?
\w+@+[\w.]+
          ^
You left out the quantifier on the final character class, so it matches only one character after the @.
It should be:
`re.findall(r'[\w]+@+[\w.]+','blahh ggg@google.com yipee')`
Also, if there can be only one @, you can drop the quantifier after it, which gives \w+@[\w.]+
Output: ['ggg@google.com']
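A quick side-by-side check of the original and corrected patterns. This is a minimal sketch that assumes the input string is 'blahh ggg@google.com yipee' (the address is inferred from the rest of the thread); the variable names are only illustrative.

```python
import re

# Assumed input; the address is reconstructed from the question's context.
text = 'blahh ggg@google.com yipee'

# Original pattern: the last character class has no quantifier,
# so exactly one character is consumed after the '@'.
broken = re.findall(r'[\w]+@+[\w.]', text)

# Corrected pattern: '+' on the last class, and a single literal '@'.
fixed = re.findall(r'\w+@[\w.]+', text)

print(broken)  # ['ggg@g']
print(fixed)   # ['ggg@google.com']
```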
Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
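To make "as many times as possible, giving back as needed" concrete, here is a small sketch using the word google from the question. The capture group is added only so the consumed text can be inspected.

```python
import re

# Greedy: \w+ takes as many word characters as it can.
whole = re.match(r'\w+', 'google').group()
print(whole)  # google

# Giving back: \w+ first consumes all of 'google', then releases the
# final 'e' so that the literal 'e' in the pattern can still match.
m = re.match(r'(\w+)e', 'google')
captured = m.group(1)
print(captured)  # googl
```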
In [\w]+@+[\w.], you are only checking for a single character after the @.
That's why it matches just the g after the @ and stops.
To match a run of characters after the @, you have to add * or + to the last character class.
* = zero or more occurrences, e.g. ggg@google.com, ggg@
+ = one or more occurrences, e.g. ggg@g, ggg@google.com
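The difference between the two quantifiers can be checked against those example strings. This sketch uses re.fullmatch so that each sample must match in its entirety; the list name samples is just for illustration.

```python
import re

samples = ['ggg@', 'ggg@g', 'ggg@google.com']

# '*' accepts zero characters after the '@', so 'ggg@' matches too.
star = [bool(re.fullmatch(r'\w+@[\w.]*', s)) for s in samples]

# '+' needs at least one character after the '@'.
plus = [bool(re.fullmatch(r'\w+@[\w.]+', s)) for s in samples]

print(star)  # [True, True, True]
print(plus)  # [False, True, True]
```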
Let's break down re.findall(r'[\w]+@+[\w.]','blahh ggg@google.com yipee'):
First, [\w] matches a single word character (a letter, digit or underscore), so in this string it can match every character except the spaces, the "@" and the ".".
Then [\w]+ matches one or more successive word characters, which leaves us with blahh, ggg, google, com and yipee.
Now [\w]+@ requires a "@" immediately after such a run, and only ggg is followed by one, so only ggg@ is matched.
Next, [\w]+@+ matches "@" one or more times; since there is only one "@" after ggg, the match stays the same, i.e. ggg@.
Finally, [\w]+@+[\w.] allows exactly one more character after that, either a word character or a literal "."; ggg@ is followed by g, so the match becomes ggg@g.
So we get ['ggg@g'] as the result.
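The same breakdown can be reproduced by running findall on each prefix of the pattern. A minimal sketch, again assuming the input is 'blahh ggg@google.com yipee'; the dict name steps is hypothetical.

```python
import re

# Assumed input; the address is reconstructed from the question's context.
text = 'blahh ggg@google.com yipee'

# Grow the pattern one piece at a time and record what each stage matches.
patterns = [r'[\w]+', r'[\w]+@', r'[\w]+@+', r'[\w]+@+[\w.]']
steps = {p: re.findall(p, text) for p in patterns}

for p in patterns:
    print(p, steps[p])
# [\w]+ ['blahh', 'ggg', 'google', 'com', 'yipee']
# [\w]+@ ['ggg@']
# [\w]+@+ ['ggg@']
# [\w]+@+[\w.] ['ggg@g']
```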
To get ['ggg@google.com'], try this:
re.findall(r'\w+@\w+\.\w+','blahh ggg@google.com yipee')
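For comparison, here is how this stricter pattern behaves next to \w+@[\w.]+. It is only a sketch: both sample strings are hypothetical, chosen to show the trade-off, and neither pattern is a full email validator.

```python
import re

strict = r'\w+@\w+\.\w+'   # exactly one dot-separated part after the domain name
loose = r'\w+@[\w.]+'      # any run of word characters and dots

# Hypothetical sentence that ends with a period right after the address.
sentence = 'write to ggg@google.com.'
strict_sentence = re.findall(strict, sentence)
loose_sentence = re.findall(loose, sentence)
print(strict_sentence)  # ['ggg@google.com']
print(loose_sentence)   # ['ggg@google.com.']  (trailing period included)

# Hypothetical address with a subdomain: the strict pattern stops early.
subdomain = 'ggg@mail.google.com'
strict_subdomain = re.findall(strict, subdomain)
print(strict_subdomain)  # ['ggg@mail.google']
```

The strict form avoids swallowing sentence punctuation but truncates hosts with more than one dot, so which one fits depends on the input you expect.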