Pandas DataFrame not updating column values

Question

This is the result of print(df['Title']).

I am using a regex to replace unnecessary characters:

import re
import pandas as pd

def remove_punctuations(text):
    return re.sub(r']!@-#$%^&*(){};:,./<>?\|`~=_+',' ',text)

df1 = pd.read_csv(file2)
print(df1["Title"])
df1['Title'] = df1['Title'].apply(remove_punctuations)
print(df1["Title"])

The column values are not changed. What am I doing wrong? Could anyone point it out? Regards,

Tim Biegeleisen · Accepted Answer · 2020-12-26 14:30:14Z

1

You should be enclosing the special characters inside a character class, which is denoted by [...] square brackets:

def remove_punctuations(text):
    return re.sub(r'\s*[\[\]!@#$%^&*(){};:,./<>?\|`~=_+-]\s*', ' ', text).strip()

Note that this pattern replaces each special character, together with any whitespace around it, with a single space. For the edge cases where a special character starts or ends the input, strip() removes the resulting leading or trailing space.
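To make the behavior above concrete, here is a minimal sketch that applies the same character class to a small DataFrame. The sample titles and the PUNCT name are assumptions made up for illustration; they are not from the question's CSV file.

```python
import re

import pandas as pd

# Same character class as in the answer, compiled once for reuse.
PUNCT = re.compile(r'\s*[\[\]!@#$%^&*(){};:,./<>?\|`~=_+-]\s*')

def remove_punctuations(text):
    # Replace each special character (plus surrounding whitespace) with one
    # space, then trim a leading or trailing space left at the edges.
    return PUNCT.sub(' ', text).strip()

# Hypothetical sample data standing in for the real "Title" column.
df1 = pd.DataFrame({'Title': ['Hello, World!', '[Draft] Report: Q4', 'a - b']})
df1['Title'] = df1['Title'].apply(remove_punctuations)
print(df1['Title'].tolist())
```

Two special characters directly next to each other are each replaced separately, so runs of punctuation can still leave more than one space between words.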

2 Comments

Tiger Strom Over a year ago

What if I want to replace square brackets too? How can I do that?

Tim Biegeleisen Over a year ago

My answer should already be replacing square brackets. Look closely.
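As a quick check of the point about square brackets, the following sketch runs the answer's pattern on a string containing them. The escaped \[ and \] inside the character class are what match the brackets; the sample string is an assumption for illustration.

```python
import re

# The answer's pattern; \[ and \] inside the class match literal brackets.
pattern = r'\s*[\[\]!@#$%^&*(){};:,./<>?\|`~=_+-]\s*'

# Hypothetical input containing square brackets.
cleaned = re.sub(pattern, ' ', 'Report [draft] v2').strip()
print(cleaned)
```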

Tony Ng · Accepted Answer · 2020-12-26 14:33:22Z

1

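Your regular expression is looking for one exact sequence of the punctuation characters ]!@-#$%^&*(){};:,./<>?\|`~=_+ (several of which are also interpreted as regex metacharacters) rather than for any single one of them, so in practice nothing is substituted with a blank " ".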

Replace your function with:

def remove_punctuations(text):
    return re.sub(r'[^\w\s]',' ',text)

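This matches any character that is neither a word character (letter, digit, or underscore) nor whitespace, that is, any punctuation character. Note that underscores are left in place, since \w includes them.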
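The same negated character class can also be applied to the whole column at once with the pandas string accessor, instead of calling apply() with a Python function. This is a sketch under assumptions: the sample titles are invented, and the extra whitespace-collapsing step is an addition that the answer does not include.

```python
import pandas as pd

# Hypothetical sample data standing in for the real "Title" column.
df1 = pd.DataFrame({'Title': ['Hello, World!', 'snake_case: ok?']})

# Replace every character that is neither a word character nor whitespace.
titles = df1['Title'].str.replace(r'[^\w\s]', ' ', regex=True)

# Optional clean-up (not part of the answer): collapse runs of whitespace
# and trim the ends.
df1['Title'] = titles.str.replace(r'\s+', ' ', regex=True).str.strip()
print(df1['Title'].tolist())
```

The underscore in the second title is kept, which is the main practical difference from the explicit character list in the other answer.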
