I am querying a dataframe like the one below:

>>> df
    A,B,C
    1,1,200
    1,1,433
    1,1,67
    1,1,23
    1,2,330
    1,2,356
    1,2,56
    1,3,30

If I do part_df = df[(df['A'] == 1) & (df['B'] == 2)], I get a sub-dataframe:

>>> part_df
    A,B,C
    1,2,330
    1,2,356
    1,2,56

Now I want to make some changes to part_df, like:

part_df['C'] = 0

The changes are not reflected in the original df at all. I guess this is because of NumPy's array mechanism, where a new copy of the dataframe is produced every time. How do I query a dataframe with some conditions, make changes to the selected part as in my example, and have the values reflected back in the original dataframe in place?

  • Did you ask me a question? Something popped up in my inbox, but this post doesn't show anything anymore. (Commented May 22, 2014 at 12:09)
  • @EdChum Yeah, I had a question but managed to solve it later, so I deleted it. Thanks anyway! (Commented May 22, 2014 at 12:18)

1 Answer


You should do this instead:

In [28]:

df.loc[(df['A'] == 1) & (df['B'] == 2),'C']=0
df
Out[28]:
   A  B    C
0  1  1  200
1  1  1  433
2  1  1   67
3  1  1   23
4  1  2    0
5  1  2    0
6  1  2    0
7  1  3   30

[8 rows x 3 columns]

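Use loc with the boolean mask as the row selector and the column of interest, 'C', as the second item in the square brackets. Each comparison needs its own parentheses because & binds more tightly than ==.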
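
For anyone who wants to reproduce this end to end, here is a minimal sketch. It rebuilds the dataframe from the question, shows that editing the filtered result leaves df alone, and then applies the loc assignment. The names mask, unchanged and updated are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 1, 1, 1, 1],
    "B": [1, 1, 1, 1, 2, 2, 2, 3],
    "C": [200, 433, 67, 23, 330, 356, 56, 30],
})

# Each comparison is parenthesised because & binds more tightly than ==.
mask = (df["A"] == 1) & (df["B"] == 2)

# Boolean indexing returns a new object; .copy() makes that explicit.
part_df = df[mask].copy()
part_df["C"] = 0

# The original rows still hold their old values.
unchanged = df.loc[mask, "C"].tolist()

# Row mask and column label in a single .loc call write into df itself.
df.loc[mask, "C"] = 0
updated = df["C"].tolist()

print(unchanged)
print(updated)
```

Building the mask once and reusing it keeps the row selection identical between the check and the assignment.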
