0

I would like to convert this:

          W         X         Y         Z       tdate tdim  stringdate
0  0.335737 -0.380237 -0.473996 -0.000448  2015-11-20    c  2015-11-20
1 -0.092511  1.149210 -0.834495 -0.585204  2015-05-25    c  2015-05-25
2  1.425946  1.293430 -0.545142  0.287451  2015-12-17    b  2015-12-17
3 -0.199412  1.011220  0.400355  0.470182  2015-03-21    c  2015-03-21
4  2.265234 -0.314174  0.684593 -1.360268  2015-12-23    b  2015-12-23
5 -0.192996  0.150177 -0.684978  0.464223  2015-11-19    b  2015-11-19
6 -1.089473 -1.375843 -0.901250 -0.775122  2015-07-25    c  2015-07-25
7 -0.572807 -2.484145 -2.339342 -0.263439  2015-05-31    a  2015-05-31

To this (with actual data instead of zero placeholder):

date    2015-11-20    2015-05-25    2015-12-17    2015-03-21    2015-12-23     \
metrics          W  X          W  X          W  X          W  X          W  X   
tdim                                                                            
a                0  0          0  0          0  0          0  0          0  0   
b                0  0          0  0          0  0          0  0          0  0   
c                0  0          0  0          0  0          0  0          0  0   

date    2015-11-19    2015-07-25    2015-05-31     
metrics          W  X          W  X          W  X  
tdim                                               
a                0  0          0  0          0  0  
b                0  0          0  0          0  0  
c                0  0          0  0          0  0  

I've skimmed over Pandas MultiIndexes and created one myself. That's how I arrived at the latter DataFrame, but I did not see an example in that documentation where he reindexes an existing DataFrame to have multiple column levels.

I have looked at this Pandas: add a column to a multiindex column dataframe and pandas dataframe select columns in multiindex. These are insightful but not specifically helpful for this case.

I have not attempted a brute force solution (i.e. manually populating data using loops).

I am using Python 2.7.10 and pandas 0.16.1.

1 Answer 1

2

I'm assuming you want to use all ['W', 'X', 'Y', 'Z'] columns, and only one of the date columns. If so, the below should get you there - if you first apply set_index and then unstack, pandas creates the MultiIndex automatically, which you can then swap by level and sort as you wish:

df.columns.name = 'metrics'    
df = df.set_index(['tdate', 'tdim']).drop('stringdate', axis=1).unstack('tdate')
df = df.swaplevel(0,1,axis=1).sort_index(axis=1)

     tdate    11/19/15                                11/20/15                      \
metrics         W         X         Y         Z         W         X         Y   
tdim                                                                            
a             NaN       NaN       NaN       NaN       NaN       NaN       NaN   
b       -0.192996  0.150177 -0.684978  0.464223       NaN       NaN       NaN   
c             NaN       NaN       NaN       NaN  0.335737 -0.380237 -0.473996   

tdate             12/17/15             ...       5/25/15            5/31/15  \
metrics        Z         W        X    ...             Y        Z         W   
tdim                                   ...                                    
a            NaN       NaN      NaN    ...           NaN      NaN -0.572807   
b            NaN  1.425946  1.29343    ...           NaN      NaN       NaN   
c        -1/1/04       NaN      NaN    ...     -0.834495  -1/1/04       NaN   

tdate                                    7/25/15                                
metrics         X         Y          Z         W         X        Y          Z  
tdim                                                                            
a       -2.484145 -2.339342  -0.263439       NaN       NaN      NaN        NaN  
b             NaN       NaN        NaN       NaN       NaN      NaN        NaN  
c             NaN       NaN        NaN -1.089473 -1.375843 -0.90125  -0.775122 
Sign up to request clarification or add additional context in comments.

1 Comment

This works great. I obviously need to look into indexing DataFrame objects a bit more.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.