I'm having a hard time updating a string value in a subset of a Pandas data frame.

I am able to modify the whole action column using a regular expression with:

df['action'] = df.action.str.replace(r'([^a-z0-9\._]{2,})', '', regex=True)

However, if the string contains a specific word, I don't want to modify it, so I tried to only update a subset like this:

df[df['action'].str.contains('TIME')==False]['action'] = df[df['action'].str.contains('TIME')==False].action.str.replace('([^a-z0-9\._]{2,})','')

and also using .loc like:

df.loc('action',df.action.str.contains('TIME')==False) = df.loc('action',df.action.str.contains('TIME')==False).action.str.replace('([^a-z0-9\._]{2,})','')

but in both cases, nothing gets updated. Is there a better way to achieve this?

  • Could you provide a sample dataframe, sample input and desired output? Commented Apr 15, 2020 at 17:47

2 Answers

You can do it with loc, but you had the arguments the wrong way around: the row indexer comes first and the column second, and loc is indexed with [] rather than called with ().

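mask_time = ~df['action'].str.contains('TIME')  # same as df.action.str.contains('TIME')==False
# regex=True is required on recent pandas versions, where str.replace defaults to literal matching
df.loc[mask_time, 'action'] = df.loc[mask_time, 'action'].str.replace(r'([^a-z0-9\._]{2,})', '', regex=True)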

Example:

#dummy df
df = pd.DataFrame({'action': ['TIME 1', 'ABC 2']})
print (df)
   action
0  TIME 1
1   ABC 2

The result after applying the method above:

   action
0  TIME 1
1       2
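
For reference, here is a self-contained sketch of the approach above, using the same dummy frame. It assumes a recent pandas version, where str.replace treats the pattern as a literal string unless regex=True is passed; the variable name pattern is only for illustration.

```python
import pandas as pd

# Same dummy frame as in the example above
df = pd.DataFrame({'action': ['TIME 1', 'ABC 2']})

# Runs of two or more characters that are not a-z, 0-9, '.' or '_'
pattern = r'[^a-z0-9._]{2,}'

# True for the rows that do NOT contain 'TIME'
mask_time = ~df['action'].str.contains('TIME')

# Row indexer first, column second, square brackets
df.loc[mask_time, 'action'] = (
    df.loc[mask_time, 'action'].str.replace(pattern, '', regex=True)
)

print(df)
```

The chained form df[mask]['action'] = ... from the question assigns into a temporary copy rather than into df itself, which is why nothing appeared to change there.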

1 Comment

That worked, also helped me understand a little better how .loc works. Thanks!
Try this, it should work. I found it here:

df.loc[df.action.str.contains('TIME')==False, 'action'] = df.action.str.replace(r'([^a-z0-9\._]{2,})', '', regex=True)
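
This version computes the replacement for every row and relies on pandas aligning the right-hand side on the index, so only the masked rows are overwritten. A small sketch of that behaviour, using a hypothetical three-row frame and assuming a pandas version that needs regex=True:

```python
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({'action': ['TIME 1', 'ABC 2', 'XY z_9']})

mask = ~df['action'].str.contains('TIME')

# The replacement is computed for ALL rows, including the 'TIME' row
cleaned = df['action'].str.replace(r'[^a-z0-9._]{2,}', '', regex=True)

# Only rows selected by the mask receive values; the rest are left untouched
df.loc[mask, 'action'] = cleaned

print(df)
```

The result should match the masked right-hand side used in the other answer; the difference is that the regex here also runs on rows that are then discarded.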
