Trying to extract ONLY last name using regex from list

Question

I'm having a problem extracting the last name from each entry in a list.

import re

# Named `names` rather than `list` to avoid shadowing the built-in type
names = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']

for item in names:
    print(item)
    print(re.findall(r'(\s(.*))', item))

But the output looks like this:

Cristiano Ronaldo
[(' Ronaldo', 'Ronaldo')]
L. Messi
[(' Messi', 'Messi')]
M. Neuer
[(' Neuer', 'Neuer')]
L. Suarez
[(' Suarez', 'Suarez')]
De Gea
[(' Gea', 'Gea')]
Z. Ibrahimovic
[(' Ibrahimovic', 'Ibrahimovic')]
G. Bale
[(' Bale', 'Bale')]
J. Boateng
[(' Boateng', 'Boateng')]
R. Lewandowski
[(' Lewandowski', 'Lewandowski')]

I am curious as to why the last names were returned twice; I only want to get back the last names once.

Can any of you kind folks help? Thank you!

You have 2 nested groups: one that includes the space and one that doesn't. Also, your regex wouldn't handle names that include a middle name. Why not split the string and return the last element?
– Iain Shelvington, Commented Dec 23, 2019 at 7:41
You are capturing two groups. I would do it like this: \w+$
– yabberth, Commented Dec 23, 2019 at 7:41

Rakesh · Accepted Answer · 2019-12-23 07:41:41Z

3

Use str.split() with negative indexing to take the last word of each name.

Example:

lst = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']

for item in lst:
    print(item)
    print(item.split()[-1])

Output:

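Cristiano Ronaldo
Ronaldo
L. Messi
Messi
M. Neuer
Neuer
L. Suarez
Suarez
De Gea
Gea
Z. Ibrahimovic
Ibrahimovic
G. Bale
Bale
J. Boateng
Boateng
R. Lewandowski
Lewandowski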
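
If only the last names are needed, the same idea can be written as a list comprehension. This is a minimal sketch, not part of the original answer: it uses str.rsplit() with maxsplit=1 so the string is split at most once from the right, and the three-part name at the end is a made-up example to show that extra middle parts are ignored.

```python
names = ['Cristiano Ronaldo', 'L. Messi', 'De Gea']

# rsplit(None, 1) splits on whitespace at most once, starting from the right,
# so only the final word is separated from the rest of the name.
last_names = [name.rsplit(None, 1)[-1] for name in names]
print(last_names)  # ['Ronaldo', 'Messi', 'Gea']

# A hypothetical three-part name (not in the original list) still yields
# only the final word.
three_part = 'Ana Maria Silva'.rsplit(None, 1)[-1]
print(three_part)  # Silva
```

Note that this treats the last whitespace-separated word as the last name, which is an assumption that holds for the list in the question but not for every naming convention.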

FlorianGD · Accepted Answer · 2019-12-23 07:44:18Z

3

You create two groups with the two pairs of parentheses. Remove the outer pair and you will get only the last name:

import re

names = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']
for item in names:
    print(item)
    # A single capture group, so findall returns just the text after the first whitespace
    print(re.findall(r'\s(.*)', item))
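
The difference comes from how re.findall() shapes its result depending on the number of capture groups in the pattern. The following short sketch (variable names are only illustrative) compares the variants on one name from the question:

```python
import re

name = 'Cristiano Ronaldo'

# No capture group: findall returns the whole match, leading space included.
no_group = re.findall(r'\s.*', name)

# One capture group: findall returns only that group's text.
one_group = re.findall(r'\s(.*)', name)

# Two capture groups: findall returns one tuple per match, one item per group.
two_groups = re.findall(r'(\s(.*))', name)

# Making the outer group non-capturing with (?:...) leaves a single capture group.
non_capturing = re.findall(r'(?:\s(.*))', name)

print(no_group)       # [' Ronaldo']
print(one_group)      # ['Ronaldo']
print(two_groups)     # [(' Ronaldo', 'Ronaldo')]
print(non_capturing)  # ['Ronaldo']
```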

Toto · Accepted Answer · 2019-12-23 10:53:42Z

1

\S matches any character that is not whitespace, so \S+$ matches the last run of non-whitespace characters in the string.

import re

names = ['Cristiano Ronaldo', 'L. Messi', 'M. Neuer', 'L. Suarez', 'De Gea', 'Z. Ibrahimovic', 'G. Bale', 'J. Boateng', 'R. Lewandowski']

for item in names:
    print(item)
    print(re.findall(r'\S+$', item))  # match one or more non-whitespace characters before the end of the string

Output:

Cristiano Ronaldo
['Ronaldo']
L. Messi
['Messi']
M. Neuer
['Neuer']
L. Suarez
['Suarez']
De Gea
['Gea']
Z. Ibrahimovic
['Ibrahimovic']
G. Bale
['Bale']
J. Boateng
['Boateng']
R. Lewandowski
['Lewandowski']
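
Since each name has at most one such match, re.search() can return the last name as a plain string instead of a one-element list. A minimal sketch, assuming a helper named last_word (a hypothetical name, not from the answer above):

```python
import re

def last_word(name):
    """Return the final run of non-whitespace characters, or None if there is none."""
    match = re.search(r'\S+$', name)
    return match.group() if match else None

print(last_word('Z. Ibrahimovic'))  # Ibrahimovic
print(last_word('Ronaldo'))         # Ronaldo
print(last_word(''))                # None
```

Unlike \s(.*), this pattern also returns a result for a single-word name, because it does not require a preceding space.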

Ron Serruya · Accepted Answer · 2019-12-23 07:42:00Z

0

Check this out https://regex101.com/r/CGrruO/1

You can see that your regex produces one match containing two capture groups.
You added an extra set of (), so each match has two groups: one with the space and one without.

Changing it to \s(.*) should work.
