I have been using Panels instead of DataFrames with multi-level indexing because they seem to be faster for large datasets, but I'm now transitioning to MultiIndex columns. With a Panel, I can do this easily:

import numpy as np
import pandas as pd
pan = pd.Panel(np.random.randn(3, 5, 2), items=['p1', 'p2', 'p3'], minor_axis=['a', 'b'])

Then add a new item:

pan['p4'] = pd.DataFrame(np.random.randn(10,2),columns=['a','b'])

But with a MultiIndex:

cols = [['p1','p2','p3'],['a','b']]
idx = pd.MultiIndex.from_product(cols)
df_midx = pd.DataFrame(np.random.randn(10,6),columns=idx)

This raises an error:

df_midx['p4'] = pd.DataFrame(np.random.randn(10,2),columns=['a','b'])

ValueError: Wrong number of items passed 2, placement implies 1

1 Answer

Because you are trying to assign a whole dataframe and your columns are multi-level, you can use concat instead:

Step 1: Give the new dataframe multi-level columns

samp = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])
p4 = pd.concat([samp], keys=['p4'], axis=1)

Or:

new_idx = pd.MultiIndex.from_product([['p4'],['a','b']])
p4 = pd.DataFrame(np.random.randn(10, 2), columns=new_idx)
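
To see what keys=['p4'] is doing in the concat call, here is a small sketch comparing the two options. It assumes numpy is imported as np; the names p4_concat, p4_explicit, same_columns and same_values are only for illustration. Both options should end up with the same two-level columns, ('p4', 'a') and ('p4', 'b').

```python
import numpy as np
import pandas as pd

samp = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Option 1: concat a one-element list; keys= adds 'p4' as an outer column level
p4_concat = pd.concat([samp], keys=['p4'], axis=1)

# Option 2: build the two-level column index by hand and reuse the same values
new_idx = pd.MultiIndex.from_product([['p4'], ['a', 'b']])
p4_explicit = pd.DataFrame(samp.to_numpy(), columns=new_idx)

# Both frames carry identical column labels and identical data
same_columns = p4_concat.columns.equals(new_idx)
same_values = np.array_equal(p4_concat.to_numpy(), p4_explicit.to_numpy())
print(list(p4_concat.columns))
print(same_columns, same_values)
```

So the inner concat is not really joining anything; it is just a compact way to wrap an existing frame under a new top-level label.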

Step 2: Concatenate the two dataframes

ndf = pd.concat([df_midx, p4], axis=1)

        p1                  p2                  p3                  p4  \
          a         b         a         b         a         b         a   
0 -0.345972 -0.091595  1.524982 -1.181117 -1.288529 -1.295967  0.199311   
1 -0.398007  0.805862  0.109550  0.449695  0.342036  0.516858  1.128231   
2 -1.141256  0.614402  1.512875 -1.469454  0.637108 -0.413336 -1.483573   
3 -0.018409  0.842007  0.170275  1.731468  0.022853 -1.665722 -1.174225   
4 -0.407416  0.635482 -0.486413  0.090096  0.489290 -1.704067 -2.228681   
5  0.283725 -1.314413  0.382782 -1.139884  0.607638 -1.682241  1.479211   
6  0.369212  0.378822 -0.714765 -0.796454  0.840744  1.399895 -1.204143   
7  1.214798 -0.134845  1.274823 -0.319794  1.658468  1.442076 -2.118546   
8  0.305107 -1.649617 -0.424912  1.520576 -1.285289  0.476907 -1.104102   
9  1.175882 -1.677547 -0.842787 -0.585976  0.046749 -0.369360 -1.339593   


          b  
0 -0.438747  
1  0.395792  
2  0.561690  
3 -0.739772  
4  0.745308  
5  0.734140  
6  0.112849  
7  0.314292  
8  2.363909  
9 -1.741678  
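
If you would rather not keep the intermediate p4 around, the two steps can probably be folded into one expression. The sketch below is self-contained and also shows a possible alternative that assigns each new column by its full (outer, inner) tuple key. The names ndf_alt and inner are only for illustration, and I have only reasoned about this for the small example from the question, so treat it as a starting point rather than the one right way.

```python
import numpy as np
import pandas as pd

cols = [['p1', 'p2', 'p3'], ['a', 'b']]
df_midx = pd.DataFrame(np.random.randn(10, 6),
                       columns=pd.MultiIndex.from_product(cols))
samp = pd.DataFrame(np.random.randn(10, 2), columns=['a', 'b'])

# Both steps in one expression: wrap samp under 'p4', then append column-wise
ndf = pd.concat([df_midx, pd.concat([samp], keys=['p4'], axis=1)], axis=1)

# Alternative: add the columns one at a time, keyed by (outer, inner) tuples
ndf_alt = df_midx.copy()
for inner in samp.columns:
    ndf_alt[('p4', inner)] = samp[inner]

print(ndf.shape)
print(ndf_alt.columns.equals(ndf.columns))
```

The tuple-key loop modifies a copy in place and avoids building a second frame, while the nested concat stays a single expression; for a handful of columns either should be fine.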

1 Comment

Thanks, Dark. What is the purpose of p4 = pd.concat([samp], keys=['p4'], axis=1)? Is there no way to combine it with the ndf step?
