python pandas multi-level indexing - adding new columns

Question

I have been using Panels in place of DataFrames with multi-level indexing because they seem to be faster for large datasets, but I'm now transitioning to MultiIndex columns. With a Panel, I can do this easily:

import numpy as np
import pandas as pd
# Note: pd.Panel was deprecated in pandas 0.20 and removed in 0.25
pan = pd.Panel(np.random.randn(3,5,2),items=['p1','p2','p3'],minor_axis=['a','b'])

Then add a new item:

pan['p4'] = pd.DataFrame(np.random.randn(10,2),columns=['a','b'])

But with MultiIndex columns:

cols = [['p1','p2','p3'],['a','b']]
idx = pd.MultiIndex.from_product(cols)
df_midx = pd.DataFrame(np.random.randn(10,6),columns=idx)

This returns an error:

df_midx['p4'] = pd.DataFrame(np.random.randn(10,2),columns=['a','b'])

ValueError: Wrong number of items passed 2, placement implies 1
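
The error seems to come from `'p4'` being treated as a single column key while the right-hand side has two columns (the exact message varies between pandas versions). A minimal sketch of one workaround, assuming it is acceptable to assign each sub-column separately by its full `(top, sub)` tuple key; the names `new` and `sub` are illustrative and not from the question:

```python
import numpy as np
import pandas as pd

cols = [['p1', 'p2', 'p3'], ['a', 'b']]
df_midx = pd.DataFrame(np.random.randn(10, 6),
                       columns=pd.MultiIndex.from_product(cols))
new = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Address every new column by its complete two-level key, one at a time.
for sub in new.columns:
    df_midx[('p4', sub)] = new[sub]

print(df_midx.shape)  # (10, 8)
```

This modifies `df_midx` in place; for many sub-columns a single concatenation is likely to be tidier.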

Bharath M Shetty · Accepted Answer · 2017-12-30 02:07:29Z


You can use pd.concat, since you are trying to assign a DataFrame and the target has multi-level columns:

Step 1: Give the new DataFrame multi-level columns

samp = pd.DataFrame(np.random.randn(10,2),columns=['a','b'])
p4 = pd.concat([samp], keys=['p4'],axis=1)

Or:

new_idx = pd.MultiIndex.from_product([['p4'],['a','b']])
p4 = pd.DataFrame(np.random.randn(10,2),columns=new_idx)

Step 2: Concatenate both DataFrames

ndf = pd.concat([df_midx, p4], axis=1)

        p1                  p2                  p3                  p4  \
          a         b         a         b         a         b         a   
0 -0.345972 -0.091595  1.524982 -1.181117 -1.288529 -1.295967  0.199311   
1 -0.398007  0.805862  0.109550  0.449695  0.342036  0.516858  1.128231   
2 -1.141256  0.614402  1.512875 -1.469454  0.637108 -0.413336 -1.483573   
3 -0.018409  0.842007  0.170275  1.731468  0.022853 -1.665722 -1.174225   
4 -0.407416  0.635482 -0.486413  0.090096  0.489290 -1.704067 -2.228681   
5  0.283725 -1.314413  0.382782 -1.139884  0.607638 -1.682241  1.479211   
6  0.369212  0.378822 -0.714765 -0.796454  0.840744  1.399895 -1.204143   
7  1.214798 -0.134845  1.274823 -0.319794  1.658468  1.442076 -2.118546   
8  0.305107 -1.649617 -0.424912  1.520576 -1.285289  0.476907 -1.104102   
9  1.175882 -1.677547 -0.842787 -0.585976  0.046749 -0.369360 -1.339593   


          b  
0 -0.438747  
1  0.395792  
2  0.561690  
3 -0.739772  
4  0.745308  
5  0.734140  
6  0.112849  
7  0.314292  
8  2.363909  
9 -1.741678
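
For convenience, the two steps can be put together as a self-contained sketch. It assumes a pandas version where `pd.np` is no longer available, so NumPy is imported directly; the checks at the end (and the name `a_only`) are illustrative additions, not part of the original answer:

```python
import numpy as np
import pandas as pd

cols = [['p1', 'p2', 'p3'], ['a', 'b']]
df_midx = pd.DataFrame(np.random.randn(10, 6),
                       columns=pd.MultiIndex.from_product(cols))
samp = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Step 1: wrap the flat columns under a new top-level key 'p4'.
p4 = pd.concat([samp], keys=['p4'], axis=1)

# Step 2: append the new block column-wise.
ndf = pd.concat([df_midx, p4], axis=1)

print(ndf.shape)  # (10, 8)

# Selecting the top-level key returns the sub-frame with columns 'a' and 'b'.
print(ndf['p4'].columns.tolist())

# Cross-section: the 'a' sub-column of every top-level item.
a_only = ndf.xs('a', axis=1, level=1)
print(a_only.columns.tolist())
```

Unlike item assignment, `pd.concat` returns a new DataFrame and leaves `df_midx` unchanged.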

edited Dec 30, 2017 at 2:07

answered Dec 29, 2017 at 16:10


1 Comment

MJS Over a year ago

Thanks, Dark. What is the purpose of p4 = pd.concat([samp], keys=['p4'], axis=1)? Is there no way to combine it with the ndf step?
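
On the comment's question: `keys=['p4']` adds an outer column level labelled `'p4'` above the existing `'a'` and `'b'` columns, so the new frame has the same two-level column layout as `df_midx`. A hedged sketch of folding both steps into one expression, assuming a dict passed to `pd.concat` is acceptable (its keys are then used as the outer level):

```python
import numpy as np
import pandas as pd

cols = [['p1', 'p2', 'p3'], ['a', 'b']]
df_midx = pd.DataFrame(np.random.randn(10, 6),
                       columns=pd.MultiIndex.from_product(cols))
samp = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# The inner concat wraps samp under the top-level key 'p4';
# the outer concat appends it to the existing frame.
ndf = pd.concat([df_midx, pd.concat({'p4': samp}, axis=1)], axis=1)

print(ndf.columns.tolist()[-2:])  # [('p4', 'a'), ('p4', 'b')]
```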
