Pandas if statement for finding substring in string

Question

I have a dataframe with the following column:

print(df):
Name
James#4567547
Mick#5456535
Tash
Liv#5468646
Nathan
Chris

You will see that some rows contain a # and some don't. How can I keep every name but remove the # and anything after it, where present? The desired result is:

print(df):
Name
James
Mick
Tash
Liv
Nathan
Chris

I have tried:

if df['Name'].str.contains('#').any():
    df['Name'] = df['Name'].str.split('#',1)[0]

else:
    df['Name'] = df['Name']

But I get a ValueError: Length of values does not match length of index on the str.split line. Any ideas? Thanks!

David Erickson · Accepted Answer · 2020-06-06 01:24:16Z

str.split() fits this well: split each value on # and take the first element of every resulting list with .str[0].

df['Name'] = df['Name'].str.split('#').str[0]
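To see why the attempt in the question fails while this line works, here is a minimal sketch that rebuilds the sample column from the question. Indexing the split result with a plain [0] looks up row label 0 and returns a single two-element list, which cannot be assigned to a six-row column; .str[0] instead takes the first element of every row's list. The variable names (parts, first_row, broken, raised) are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"Name": ["James#4567547", "Mick#5456535", "Tash",
              "Liv#5468646", "Nathan", "Chris"]}
)

# Each row becomes a list: ['James', '4567547'], ..., ['Tash'], ...
parts = df["Name"].str.split("#")

# Plain [0] is a label lookup on the Series: it returns row 0's list only.
first_row = parts[0]

# Assigning that 2-element list to a 6-row column is what raises the error.
broken = df.copy()
raised = False
try:
    broken["Name"] = first_row
except ValueError:
    raised = True

# .str[0] is element-wise: the first item of every row's list.
df["Name"] = parts.str[0]
print(df)
```

Rows without a # split into a one-element list, so .str[0] returns the name unchanged and no if/else check is needed.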

wwnde · Accepted Answer · 2020-06-06 01:38:58Z

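Another way: use a regex with the .str.replace() method to strip the # and everything after it. Pass regex=True explicitly, because since pandas 2.0 the pattern is treated as a literal string by default.

df['Name'] = df['Name'].str.replace(r'#.*$', '', regex=True)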

    Name
0   James
1   Mick
2   Tash
3   Liv
4   Nathan
5   Chris
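As a rough sketch of the regex route, the snippet below applies a pattern anchored on the # to the question's sample values and compares it with a character-class pattern such as [#\d+]. The character class removes every #, digit, or + wherever it occurs, so it also alters names that legitimately contain digits. The value "R2D2#123" is a made-up example added for illustration and does not come from the question.

```python
import pandas as pd

names = pd.Series(
    ["James#4567547", "Mick#5456535", "Tash",
     "Liv#5468646", "Nathan", "Chris"],
    name="Name",
)

# Remove the '#' and everything after it; regex=True is explicit because
# pandas 2.0+ treats the pattern as a literal string by default.
stripped = names.str.replace(r"#.*$", "", regex=True)

# Hypothetical name containing digits, to contrast the two patterns.
tricky = pd.Series(["R2D2#123"])
anchored = tricky.str.replace(r"#.*$", "", regex=True)    # keeps 'R2D2'
char_class = tricky.str.replace(r"[#\d+]", "", regex=True)  # strips all digits

print(stripped.tolist())
print(anchored[0], char_class[0])
```

For the data shown in the question both patterns give the same result; the difference only matters if a name itself can contain digits or a + sign.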
