The following is a minimal example of what I am trying to do. I have a pandas DataFrame with a MultiIndex, built as follows:

import pandas as pd
import numpy as np

arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
s = pd.DataFrame(np.random.randn(8,2), index=index)

So the DataFrame I have is

                     0         1
first second                    
bar   one    -3.174428 -0.314160
      two     0.968316  0.278967
baz   one     0.171292 -0.789257
      two     1.420621  0.100964
foo   one    -1.001074 -0.517729
      two    -0.211823  0.951422
qux   one     1.173289  0.313692
      two    -0.159855  0.149710

What I want is to set every row whose "second" index level equals 'two' to -1. What I have in mind is using .loc, something like:

s.loc[(:,'two')]

but .loc does not accept a bare ":" inside the tuple.

Could someone help here?

1 Answer

Option 1:

In [127]: s.loc[pd.IndexSlice[:, 'two'], :] = -1

In [128]: s
Out[128]:
                     0         1
first second
bar   one    -0.581647  0.225254
      two    -1.000000 -1.000000
baz   one     0.705050 -1.414695
      two    -1.000000 -1.000000
foo   one     0.359795  1.468521
      two    -1.000000 -1.000000
qux   one    -0.481149 -0.241922
      two    -1.000000 -1.000000
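
As a side note, here is a minimal sketch of what pd.IndexSlice produces. It only builds the indexer and compares it with a hand-written tuple; the variable name idx is an illustrative assumption, not part of the answer.

```python
import pandas as pd

# pd.IndexSlice lets you write ":" inside an indexer expression.
# Indexing it simply hands back the key it was given.
idx = pd.IndexSlice[:, 'two']

# The bare ":" becomes slice(None), so the result is an ordinary tuple.
print(idx)
print(idx == (slice(None), 'two'))
```

Under that reading, s.loc[pd.IndexSlice[:, 'two'], :] and s.loc[(slice(None), 'two'), :] pass the same row indexer to .loc.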

Option 2:

In [137]: s.loc[(slice(None),'two'), :] = -11

In [138]: s
Out[138]:
                      0          1
first second
bar   one      2.144487   0.024400
      two    -11.000000 -11.000000
baz   one     -0.177128  -1.088566
      two    -11.000000 -11.000000
foo   one     -0.780979   2.701814
      two    -11.000000 -11.000000
qux   one     -0.981635  -0.202875
      two    -11.000000 -11.000000
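
For anyone who wants to check both options on their own pandas version, the following is a self-contained sketch under a few assumptions: the random data is seeded so the run is repeatable, each option works on its own copy, and the names a, b, c, and mask are illustrative. The third variant (a boolean mask built with get_level_values) is not from the answer above; it is shown as an alternative that does not rely on slicing the MultiIndex.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (seeded, so the values are repeatable).
arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
          ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
rng = np.random.default_rng(0)
s = pd.DataFrame(rng.standard_normal((8, 2)), index=index)

# Option 1: pd.IndexSlice for the row indexer, ":" for all columns.
a = s.copy()
a.loc[pd.IndexSlice[:, 'two'], :] = -1

# Option 2: the same indexer written out with slice(None).
b = s.copy()
b.loc[(slice(None), 'two'), :] = -1

# Alternative (an assumption, not from the answer): boolean mask on the level.
c = s.copy()
mask = c.index.get_level_values('second') == 'two'
c.loc[mask, :] = -1

# All three approaches should agree.
print(a.equals(b) and b.equals(c))
print(a)
```

Note the trailing ", :" for the columns in the first two variants; it makes explicit that the tuple addresses the row levels rather than a (row, column) pair.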

6 Comments

Thanks for the reply, but the first method does not work. I don't know which version you are using, but the latest version raises a KeyError. The second method works, but the syntax seems complicated. If there are no better options, I will use the second method.
@user3821012, I'm using pandas 0.22.0, and I prefer the second option as it is much clearer (for me)...
As the results of the first method show, it does not return the desired result: the values for "two" do not change.
Also, if you use "s.loc[(None, 'two'), :]", it returns an error.
@user3821012, I don't get any error message... Please post your desired data set.