When I look at the values in a column of my dataframe, I can see that, due to user data entry errors, the same category has been entered in inconsistent ways.

For my dataframe I use this code:

df['column_name'].value_counts()

output:

 Targeted    523534
 targeted    1
 story       25425
 story       2
 multiple    2524543

For story, I guess there is a space?

I am trying to replace targeted with Targeted.

df['column_name'].replace("targeted","Targeted")

But nothing happens; I still get the same value counts.

  • Did you try df['column_name'].replace("targeted","Targeted").value_counts()? Commented Feb 8, 2017 at 19:08

1 Answer

Yes, it seems there is leading or trailing whitespace.

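You need str.strip first, and then Series.replace or Series.str.replace. Assign the result back to the column, because these methods return a new Series rather than modifying the original: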

df['column_name'] = df['column_name'].str.strip().replace("targeted","Targeted")

df['column_name'] = df['column_name'].str.strip().str.replace("targeted","Targeted")
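To see why the unassigned replace in the question appears to do nothing, and what stripping changes, here is a minimal sketch. The sample values are made up to mimic the question's column; they are not the asker's real data.

```python
import pandas as pd

# Hypothetical sample data mimicking the question's column (values are made up)
df = pd.DataFrame({
    "column_name": ["Targeted", "targeted", "story", "story ", " multiple", "multiple"]
})

# replace returns a new Series; without assignment, df stays as it was
df["column_name"].replace("targeted", "Targeted")
unchanged = df["column_name"].value_counts()

# Strip surrounding whitespace, then map the lowercase variant to the canonical label
cleaned = df["column_name"].str.strip().replace("targeted", "Targeted")
counts = cleaned.value_counts()
print(counts)
```

Here `unchanged` still holds six distinct labels, while `counts` holds three, because the whitespace and case variants have been merged.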

Another possible solution is to convert all characters to lowercase:

df['column_name'] = df['column_name'].str.strip().str.lower()
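A short sketch of the lowercase approach, again on made-up sample values. Note that it merges the variants under "targeted" rather than "Targeted", so it only fits if lowercase labels are acceptable.

```python
import pandas as pd

# Hypothetical sample values (made up for illustration)
s = pd.Series(["Targeted", "targeted", "story", "story ", "multiple"])

# Strip whitespace and lowercase everything so the variants collapse to one label
normalized = s.str.strip().str.lower()
lower_counts = normalized.value_counts()
print(lower_counts)
```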

1 Comment

Thank you, this option worked best: df['column_name'] = df['column_name'].str.strip().replace("targeted","Targeted")
