I have a dataframe, and a list of strings that I want to remove from a column in that dataframe. But when I use the replace function those characters remain. Can someone please explain why this is the case?

bad_chars = ['?', '!', ',', ';', "'", '|', '-', '--', '(', ')', 
             '[', ']', '{', '}', ':', '&', '\n']

And to replace:

df2['page'] = df2['page'].replace(bad_chars, '')

When I print out df2:

for index, row in df2.iterrows():
    print( row['project'] + '\t' + '(' + row['page'] + ',' + str(row['viewCount']) + ')' + '\n'  )

en (The_Voice_(U.S._season_14),613)

2 Answers

One way is to escape your characters using re, then use pd.Series.str.replace.

import pandas as pd
import re

bad_chars = ['?', '!', ',', ';', "'", '|', '-', '--', '(', ')', 
             '[', ']', '{', '}', ':', '&', '\n']

df = pd.DataFrame({'page': ['hello?', 'problems|here', 'nothingwronghere', 'nobrackets[]']})

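# regex=True is passed explicitly: the default for str.replace differs between pandas versions
df['page'] = df['page'].str.replace('|'.join([re.escape(s) for s in bad_chars]), '', regex=True)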

print(df)

#                page
# 0             hello
# 1      problemshere
# 2  nothingwronghere
# 3        nobrackets
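
As for why the original call left the characters in place: when given a list, `Series.replace` compares each entry against the whole cell value, so only cells that are exactly equal to one of the bad strings get replaced. Below is a minimal sketch of the difference, assuming a made-up Series `s` rather than the question's actual data.

```python
import re
import pandas as pd

bad_chars = ['?', '!', ',', ';', "'", '|', '-', '--', '(', ')',
             '[', ']', '{', '}', ':', '&', '\n']

# Hypothetical sample data, not taken from the question's dataframe.
s = pd.Series(['hello?', '?', 'The_Voice_(U.S._season_14)'])

# Series.replace matches whole values: only the cell that is exactly '?' changes.
whole_value = s.replace(bad_chars, '')

# Series.str.replace works on substrings inside each cell.
pattern = '|'.join(re.escape(c) for c in bad_chars)
substrings = s.str.replace(pattern, '', regex=True)

print(whole_value.tolist())
print(substrings.tolist())
```

The first result still contains 'hello?' and the brackets, while the second has them stripped.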

1 Comment

Thank you very much jpp, that does it perfectly.

Use .str.replace, and pass your strings as a single, pipe-separated string. You can use re.escape() to escape regex characters in that string, as suggested by @jpp. I tweaked his suggestion a bit by avoiding iteration:

import re 
df2['page'] = df2['page'].str.replace(re.escape('|'.join(bad_chars)), '')
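
One caveat is that the order of escaping and joining matters. Here is a minimal sketch using only the standard `re` module, with a short made-up list `chars` standing in for `bad_chars`: escaping the joined string also escapes the `|` separators, whereas escaping each item first (here with `map`, so there is still no explicit loop) keeps them as alternation.

```python
import re

# Hypothetical short list, standing in for bad_chars.
chars = ['?', '|', '(']

# Escaping after joining also escapes the '|' separators, so the pattern
# only matches the literal text '?|||('.
joined_then_escaped = re.escape('|'.join(chars))

# Escaping each item first keeps the separators as regex alternation.
escaped_then_joined = '|'.join(map(re.escape, chars))

text = 'why? (a|b)'
after_joined = re.sub(joined_then_escaped, '', text)
after_each = re.sub(escaped_then_joined, '', text)

print(after_joined)
print(after_each)
```

With the first pattern the text comes back unchanged. With the second, every listed character is removed.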

7 Comments

Cheers mcard, when I do that, I get the error : TypeError: unhashable type: 'list' If I replace the list variable with a literal example of an element in that list it works though. Is there a way to replace more than one string at the same time?
This does not work in the general case, e.g. what about regex characters such as | in your bad character list?
They need to be escaped with a '\'.
So is the only way to manually go through the list of bad characters one by one to check for escapes? I'm sure there's a better way...
Thanks folks for taking a look, I'm accepting jpp's answer as it does exactly what I'm looking for in a generic way.
