Add column to pandas multiindex dataframe

Question

I have a pandas dataframe that looks like this:

import pandas as pd
import numpy as np

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
      np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])]
df = pd.DataFrame(np.random.randn(8,4),index=arrays,columns=['A','B','C','D'])

I want to add a column E such that df.loc[(slice(None),'one'),'E'] = 1 and df.loc[(slice(None),'two'),'E'] = 2, and I want to do this without iterating over ['one', 'two']. I tried the following:

df.loc[(slice(None),slice('one','two')),'E'] = pd.Series([1,2],index=['one','two'])

but it just adds a column E with NaN. What's the right way to do this?

BENY · Accepted Answer · 2019-04-02 03:51:05Z

2

Here is one way reindex

df.loc[:,'E']=pd.Series([1,2],index=['one','two']).reindex(df.index.get_level_values(1)).values
df
                A         B         C         D  E
bar one -0.856175 -0.383711 -0.646510  0.110204  1
    two  1.640114  0.099713  0.406629  0.774960  2
baz one  0.097198 -0.814920  0.234416 -0.057340  1
    two -0.155276  0.788130  0.761469  0.770709  2
foo one  1.593564 -1.048519 -1.194868  0.191314  1
    two -0.755624  0.678036 -0.899805  1.070639  2
qux one -0.560672  0.317915 -0.858048  0.418655  1
    two  1.198208  0.662354 -1.353606 -0.184258  2

answered Apr 2, 2019 at 3:51

BENY

324k22 gold badges176 silver badges250 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

cs95 · Accepted Answer · 2019-04-02 03:55:47Z

1

Methinks this is a good use case for Index.map:

df['E'] = df.index.get_level_values(1).map({'one':1, 'two':2})
df

                A         B         C         D  E
bar one  0.956122 -0.705841  1.192686 -0.237942  1
    two  1.155288  0.438166  1.122328 -0.997020  2
baz one -0.106794  1.451429 -0.618037 -2.037201  1
    two -1.942589 -2.506441 -2.114164 -0.411639  2
foo one  1.278528 -0.442229  0.323527 -0.109991  1
    two  0.008549 -0.168199 -0.174180  0.461164  2
qux one -1.175983  1.010127  0.920018 -0.195057  1
    two  0.805393 -0.701344 -0.537223  0.156264  2

answered Apr 2, 2019 at 3:55

cs95

406k106 gold badges744 silver badges797 bronze badges

Comments

U13-Forward · Accepted Answer · 2019-04-02 04:07:56Z

0

You can just get it from df.index.labels:

df['E'] = df.index.labels[1] + 1
print(df)

Output:

                A         B         C         D  E
bar one  0.746123  1.264906  0.169694 -0.180074  1
    two -1.439730 -0.100075  0.929750  0.511201  2
baz one  0.833037  1.547624 -1.116807  0.425093  1
    two  0.969887 -0.705240 -2.100482  0.728977  2
foo one -0.977623 -0.800136 -0.361394  0.396451  1
    two  1.158378 -1.892137 -0.987366 -0.081511  2
qux one  0.155531  0.275015  0.571397 -0.663358  1
    two  0.710313 -0.255876  0.420092 -0.116537  2

Thanks to coldspeed, if you want different values (i.e x and y), use:

df['E'] = pd.Series(df.index.labels[1]).map({0: 'x', 1: 'y'}).tolist()
print(df)

edited Apr 2, 2019 at 4:07

answered Apr 2, 2019 at 3:54

U13-Forward

71.8k15 gold badges100 silver badges125 bronze badges

Collectives™ on Stack Overflow

Add column to pandas multiindex dataframe

3 Answers 3

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related