I have a dataframe with the following column:

print(df):
Name
James#4567547
Mick#5456535
Tash
Liv#5468646
Nathan
Chris

You will see that some rows contain a # and some don't. How can I keep every name but remove the # and anything after it, where present? The desired result:

print(df):
Name
James
Mick
Tash
Liv
Nathan
Chris

I have tried:

if df['Name'].str.contains('#').any():
    df['Name'] = df['Name'].str.split('#',1)[0]

else:
    df['Name'] = df['Name']

But I am getting ValueError: Length of values does not match length of index at the str.split line. Any ideas? Thanks!

2 Answers

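str.split() works here. In your attempt, the [0] after str.split() selects the first row of the resulting Series (the list ['James', '4567547']) rather than the first piece of every row, so pandas tries to assign a 2-element list to a 6-row column and raises the ValueError. Use the .str accessor again to take the first piece of each row: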

df['Name'] = df['Name'].str.split('#').str[0]
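For anyone who wants to try this, here is a minimal self-contained sketch that rebuilds the column from the question and applies the split. The variable name split_lists is only for illustration, and n=1 is an optional addition that stops splitting after the first #.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame(
    {"Name": ["James#4567547", "Mick#5456535", "Tash",
              "Liv#5468646", "Nathan", "Chris"]}
)

# Each row becomes a list of pieces; n=1 splits on the first '#' only
split_lists = df["Name"].str.split("#", n=1)

# Plain [0] returns the FIRST ROW (one list), which is what caused the ValueError
first_row = split_lists[0]

# .str[0] takes the first piece of EVERY row, giving one value per row
df["Name"] = split_lists.str[0]

print(first_row)
print(df)
```

Rows without a # are left unchanged, because splitting them yields a one-element list whose first piece is the whole name.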

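Another way: use a regex that matches the # and everything after it, and remove the match with the .str.replace() method. Pass regex=True explicitly, since pandas 2.0 and later treat the pattern as a literal string by default.

df['Name'] = df['Name'].str.replace(r'#.*', '', regex=True)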

    Name
0   James
1   Mick
2   Tash
3   Liv
4   Nathan
5   Chris
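One thing to watch with the regex route is how the pattern is written. A character class such as [#\d+] deletes every #, digit and + wherever it appears, whereas #.* removes only the # and what follows it. The sketch below compares the two; the name "R2D2#123" is a made-up edge case that is not in the question's data.

```python
import pandas as pd

# "R2D2#123" is a hypothetical name containing digits before the '#'
names = pd.Series(["James#4567547", "Tash", "R2D2#123"])

# Character class: strips '#', digits and '+' anywhere in the string
char_class = names.str.replace(r"[#\d+]", "", regex=True)

# Suffix pattern: strips the first '#' and everything after it
suffix_only = names.str.replace(r"#.*", "", regex=True)

print(char_class.tolist())
print(suffix_only.tolist())
```

For the names in the question both patterns give the same result; they differ only when a name itself contains digits or a +.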
