
I have a pandas dataframe that looks like this:

            val_1   val_2   Flag
Date                       
2018-08-27  221.0  121.0     0
2018-08-28  222.0  122.0     1
2018-08-29  223.0  123.0     0
2018-08-30  224.0  124.0     2
2018-08-31  225.0  125.0     0

I want to replace the Flag column values with values from other columns, depending on the Flag value. Namely, if Flag is 1, replace it with val_1 from the same row, and if Flag is 2, replace it with val_2. The output I am looking for would look like this:

            val_1   val_2   Flag
Date                       
2018-08-27  221.0  121.0     0
2018-08-28  222.0  122.0     222.0
2018-08-29  223.0  123.0     0
2018-08-30  224.0  124.0     124.0
2018-08-31  225.0  125.0     0

I know that I can use .loc like this: df.loc[df['Flag'] == 1, ['Flag']] = ... but I don't know what goes on the right-hand side.

3 Answers


Another way is to use np.where, which takes the form numpy.where(condition, yes, no).

In this case I use a nested np.where, so that

np.where(Flag == 1, take val_1, x), where x is another np.where(Flag == 2, take val_2, keep Flag)

import numpy as np

df['Flag'] = np.where(df['Flag'] == 1, df['val_1'], np.where(df['Flag'] == 2, df['val_2'], df['Flag']))
df
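For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the nested np.where. The frame construction is my own reconstruction of the question's data, not something the asker posted.

```python
import numpy as np
import pandas as pd

# Reconstruction of the sample data from the question (Date as the index).
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.Index(
        ["2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", "2018-08-31"],
        name="Date",
    ),
)

# The outer where handles Flag == 1; the inner where handles Flag == 2
# and otherwise falls back to the original Flag value.
df["Flag"] = np.where(
    df["Flag"] == 1,
    df["val_1"],
    np.where(df["Flag"] == 2, df["val_2"], df["Flag"]),
)
print(df)
```

Because val_1 and val_2 are floats, the Flag column comes back as float as well (0.0, 222.0, 0.0, 124.0, 0.0).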


2 Comments

np.select is better than two nested np.where
Thanks @Quang Hoang. Learning from the best, of course; absolutely humbled. I now know!

There are a few ways you could do this. First, your initial code is very close; you just need to finish the assignment:

df.loc[df['Flag'] == 1, 'Flag'] = df['val_1']
print(df)
         Date  val_1  val_2   Flag
0  2018-08-27  221.0  121.0    0.0
1  2018-08-28  222.0  122.0  222.0
2  2018-08-29  223.0  123.0    0.0
3  2018-08-30  224.0  124.0    2.0
4  2018-08-31  225.0  125.0    0.0

What you're doing here is filtering your dataframe and replacing the values where the condition matches, in this instance where Flag is equal to one.
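To cover the Flag == 2 case with .loc as well, you would repeat the assignment with a second mask. A minimal sketch, assuming the sample data from the question (the frame construction and the names is_one and is_two are mine):

```python
import pandas as pd

# Reconstruction of the sample data from the question.
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.Index(
        ["2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", "2018-08-31"],
        name="Date",
    ),
)

# Build both masks before assigning anything, so the second test still
# sees the original flag values rather than the freshly written ones.
is_one = df["Flag"] == 1
is_two = df["Flag"] == 2

# Flag starts out as an integer column; cast it so it can hold the floats.
df["Flag"] = df["Flag"].astype(float)

df.loc[is_one, "Flag"] = df.loc[is_one, "val_1"]
df.loc[is_two, "Flag"] = df.loc[is_two, "val_2"]
print(df)
```

Computing the masks first matters if a replacement value could itself equal 1 or 2; with this data it would not, but it is a cheap safeguard.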

Since you're making multiple assignments, let's use np.select:

import numpy as np
conditions = [df['Flag'].eq(1),
             df['Flag'].eq(2)]


choices = [df['val_1'],df['val_2']]

df['Flag'] = np.select(conditions,choices,default=df['Flag'])

What this does is evaluate each condition in turn and take the choice for the first one that matches, leaving the default as the original column. You can add more conditions, and combine OR statements with a | (pipe) separator, wrapping each comparison in its own parentheses, i.e. [(df['Flag'] == 1) | (df['Flag'] == 2)]

         Date  val_1  val_2   Flag
0  2018-08-27  221.0  121.0    0.0
1  2018-08-28  222.0  122.0  222.0
2  2018-08-29  223.0  123.0    0.0
3  2018-08-30  224.0  124.0  124.0
4  2018-08-31  225.0  125.0    0.0
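Here is a small sketch of the OR case. The rule itself (Flag 1 or 2 both take val_1) is hypothetical and only there to show the parentheses; the frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd

# Reconstruction of the sample data from the question.
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    }
)

# Hypothetical rule for illustration: Flag 1 or 2 both take val_1.
# Each comparison needs its own parentheses because | binds tighter than ==.
either = (df["Flag"] == 1) | (df["Flag"] == 2)

combined = np.select([either], [df["val_1"]], default=df["Flag"])
print(combined)
```

For a plain "is one of these values" test, df['Flag'].isin([1, 2]) would give the same mask without any precedence concerns.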

1 Comment

For now I am using the simple one-liner solution @wwnde, but I very much like your choices for more complex conditions.

Try this:

new_vals = df.lookup(df.index, df.columns[df.Flag-1])

df['Flag'] = df.Flag.mask(df.Flag>0, new_vals)

Note: as commented by @Erfan, this would also work:

df['Flag'] = df.lookup(df.index, df.columns[df.Flag-1])

Output:

            val_1  val_2  Flag
Date                          
2018-08-27  221.0  121.0     0
2018-08-28  222.0  122.0   222
2018-08-29  223.0  123.0     0
2018-08-30  224.0  124.0   124
2018-08-31  225.0  125.0     0
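Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the code above only runs on older versions. A rough equivalent using plain NumPy indexing might look like the sketch below. It relies on the same assumption as the answer: the columns are ordered val_1, val_2, Flag, so Flag - 1 is a column position and -1 (from Flag == 0) points back at Flag itself.

```python
import numpy as np
import pandas as pd

# Reconstruction of the sample data from the question.
df = pd.DataFrame(
    {
        "val_1": [221.0, 222.0, 223.0, 224.0, 225.0],
        "val_2": [121.0, 122.0, 123.0, 124.0, 125.0],
        "Flag": [0, 1, 0, 2, 0],
    },
    index=pd.Index(
        ["2018-08-27", "2018-08-28", "2018-08-29", "2018-08-30", "2018-08-31"],
        name="Date",
    ),
)

values = df.to_numpy()           # 2-D float array, columns in frame order
rows = np.arange(len(df))        # one row position per record
cols = df["Flag"].to_numpy() - 1 # 0 -> -1 (Flag), 1 -> 0 (val_1), 2 -> 1 (val_2)

# Pick one cell per row: values[i, cols[i]].
df["Flag"] = values[rows, cols]
print(df)
```

If the column order is not guaranteed, the np.select approach is safer because it names the columns explicitly.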

2 Comments

The mask is not needed right? The lookup is sufficient.
@Erfan apparently, you are correct. That's just because Flag is the -1-th column :-).
