Update and replace values in columns based on conditions in Python

Question

I wish to update and replace values based on the dates within my dataframe, while removing data in other specific columns.

Data

id  date        location    status  value1  value2
CC  1/1/2022    ny          new     12      1
CC  4/1/2022    ny          new     1       1
CC  7/1/2022    ny          new     1       1
CC  10/1/2022   ny          new     1       2
CC  1/1/2023    ny          ok      1       2
CC  4/1/2023    ny          ok      1       2
CC  7/1/2023    ny          ok      1       3
CC  10/1/2023   ny          ok      1       3
BB  1/1/2022    ca          new     1       3
BB  4/1/2022    ca          new     1       3
BB  7/1/2022    ca          new     1       3
BB  10/1/2022   ca          new     12      3
BB  1/1/2023    ca          new     2       3
BB  4/1/2023    ca          new     2       3
BB  7/1/2023    ca          new     2       3
BB  10/1/2023   ca          new     2       3

Desired

id  date        location    status  value1  value2
CC  1/1/2022    ny          open        
CC  4/1/2022    ny          open        
CC  7/1/2022    ny          open        
CC  10/1/2022   ny          new     1       2
CC  1/1/2023    ny          ok      1       2
CC  4/1/2023    ny          ok      1       2
CC  7/1/2023    ny          ok      1       3
CC  10/1/2023   ny          ok      1       3
BB  1/1/2022    ca          new     1       3
BB  4/1/2022    ca          new     1       3
BB  7/1/2022    ca          new     1       3
BB  10/1/2022   ca          new     12      3
BB  1/1/2023    ca          new     2       3
BB  4/1/2023    ca          new     2       3
BB  7/1/2023    ca          new     2       3
BB  10/1/2023   ca          new     2       3

Doing

df.loc[(df.id == 'cc') & (df.date <= '07/01/2022'), 'status']= 'open'

This labels all of the dates as open and does not remove the values in the other columns. Any suggestion is appreciated. Thank you for any suggestions.

7shoe · Accepted Answer · 2022-07-27 22:44:14Z

2

Unfortunately, snapping a cell out of existence does not seem to work with Pandas. Similarly, Pandas expects a value for each cell of every column when setting up a dataframe.

Therefore, nan (not a number) seems to be the exact placeholder appropriate for your case. In turn, consider, importing numpy as np and adding the line to set the respective entry to np.nan

df.loc[(df.id == 'cc') & (df.date <= '07/01/2022'), 'value1']= np.nan

Fortunately,

df.fillna("")

prints the Pandas frame without showing those annoying NAN entries but leaves the cells 'empty' as you seem to desire.

In addition, NumPy enables the use of aggregate functions to ignore nan values such as np.nanmean() as can be found here to not break computation on such tables.

answered Jul 27, 2022 at 22:44

7shoe

1,5261 gold badge8 silver badges13 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Lynn Over a year ago

ok thank you - is this also replacing the 'new' value with 'open'? I didn't see this in the script

7shoe Over a year ago

No, that's why wrote "adding the line" meaning [to your existing code].

Collectives™ on Stack Overflow

Update and replace values in columns based on conditions in Python

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related