Using pandas 1.42

I have a DataFrame with 5 columns: A, B, C, D, E.

I need to assign the values from columns D and E to columns A and B in the rows where column C is True. I want to do this in one line using .loc.

Example:

A  B      C  D   E
1  4   True  7  10
2  5  False  8  11
3  6   True  9  12

Expected result:

A   B      C  D   E
7  10   True  7  10
2   5  False  8  11
9  12   True  9  12
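
What I tried:

import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3],
     'B': [4, 5, 6],
     'C': [True, False, True],
     'D': [7, 8, 9],
     'E': [10, 11, 12]}
)

# Intended: copy D/E into A/B where C is True (this yields NaN instead)
df.loc[df['C'], ['A', 'B']] = df[['D', 'E']]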

Actual result:

  A    B      C  D   E
NaN  NaN   True  7  10
  2    5  False  8  11
NaN  NaN   True  9  12

A workaround I found:

df.loc[df['C'], ['A', 'B']] = (df.D[df.C], df.E[df.C])

It seems pandas does not pick up the values to assign when they come as a DataFrame, but it does when they are packed as a tuple of Series. Am I getting the syntax wrong, or is this a bug in pandas?

1 Answer

Use boolean indexing on both sides, and bypass index alignment by converting the right side to a NumPy array with to_numpy:

m = df['C']
df.loc[m, ['A', 'B']] = df.loc[m, ['D', 'E']].to_numpy()

Or change the column names with set_axis:

df.loc[df['C'], ['A', 'B']] = df[['D', 'E']].set_axis(['A', 'B'], axis=1)

Output:

   A   B      C  D   E
0  7  10   True  7  10
1  2   5  False  8  11
2  9  12   True  9  12
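
For reference, here is a self-contained sketch that runs both approaches side by side on fresh copies of the example data. The make_df helper and the names df1 and df2 are only there for illustration and are not part of the original question.

```python
import pandas as pd


def make_df():
    # Hypothetical helper: rebuilds the example frame from the question.
    return pd.DataFrame(
        {'A': [1, 2, 3],
         'B': [4, 5, 6],
         'C': [True, False, True],
         'D': [7, 8, 9],
         'E': [10, 11, 12]}
    )


# Approach 1: drop the labels on the right side, so values are
# assigned by position and no alignment takes place.
df1 = make_df()
m = df1['C']
df1.loc[m, ['A', 'B']] = df1.loc[m, ['D', 'E']].to_numpy()

# Approach 2: keep the labels, but rename D/E to A/B so that
# alignment finds matching columns on the right side.
df2 = make_df()
df2.loc[df2['C'], ['A', 'B']] = df2[['D', 'E']].set_axis(['A', 'B'], axis=1)

print(df1)
print(df2)
```

Both frames should end up with the same values in A and B. The first approach relies on the right side having the same shape as the selection, while the second relies on the row and column labels matching.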

5 Comments

to_numpy() is a great workaround, but it doesn't seem to answer my question. Your second solution clarifies it for me. So when assigning from a DataFrame, pandas looks for the values at cells with the same labels. I was getting NaNs because it was trying to assign to (0, A) from (0, A), but the right side only had a value at (0, D). Once you renamed the columns, pandas could find the value. Is that right?
@Eliy if your question is "why does this happen?", the answer is: because of the index alignment performed before assignment. My to_numpy approach is a workaround to avoid this. Regarding the second approach, can you provide a reproducible example for which it fails? Do you have duplicated indices?
Unfortunately, I couldn't recreate it with the example data. But I did the same kind of operation: I set the axis of the right-side DataFrame to have the same column index as the left-side DataFrame. I am getting the "cannot reindex on an axis with duplicate labels" error.
Could you explain what index alignment before assignment means? Is it what I described in my comment?
Yes, exactly. In short, you want to assign to columns A/B on the left side of the =, so pandas reindexes the DataFrame on the right side to have the same index/columns. In your case this leads to a DataFrame of NaNs, as columns A/B are missing on the right.
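
The alignment step described in the last comment can be imitated by hand with reindex. This is only a rough sketch of the idea, assuming the example data from the question; the actual internal code path in pandas may differ, and the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [1, 2, 3],
     'B': [4, 5, 6],
     'C': [True, False, True],
     'D': [7, 8, 9],
     'E': [10, 11, 12]}
)

# Labels selected on the left side of the assignment.
rows = df.index[df['C']]
target_cols = ['A', 'B']

# Right side as written in the question: columns are D and E.
rhs = df[['D', 'E']]

# Reindexing to the target labels finds no A/B columns, so every cell is NaN.
aligned = rhs.reindex(index=rows, columns=target_cols)
print(aligned)

# After renaming D/E to A/B, the same reindex finds the values.
renamed = rhs.set_axis(target_cols, axis=1)
aligned_renamed = renamed.reindex(index=rows, columns=target_cols)
print(aligned_renamed)
```

The first printout is all NaN, which matches what ended up in columns A and B in the question. The second contains the D/E values for the rows where C is True.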
