MultiIndex and columns in a DataFrame

Question

When creating a DataFrame with MultiIndex columns, it does not seem possible to select a single column and keep the MultiIndex. Instead, the result has a plain Index:

import pandas as pd
import numpy as np

dates = np.asarray(pd.date_range('1/1/2000', periods=8))
_metaInfo = pd.MultiIndex.from_tuples([('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')], names=['parameter','unit'])

df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=_metaInfo)
print(df.get('AA').columns)
# Index([[m]], dtype=object)

The 'parameter' level is missing from the result. Is this a bug, and is there a workaround?

Do you mean to say it doesn't have a name attribute (of 'AA')?
– Andy Hayden, Commented Nov 26, 2012 at 23:14
No, you lose a level of the MultiIndex (in this case the name).
– user1515250, Commented Nov 27, 2012 at 10:17

Rutger Kassies · Accepted Answer · 2012-11-27 08:10:53Z

1

I have struggled with this as well. The opposite problem, adding an extra level to a single-level Index (so that it matches a MultiIndex), also keeps me busy.
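For the opposite direction, one possible approach is to rebuild the columns as a MultiIndex from tuples. This is only a sketch: the `flat` DataFrame and the `units` mapping are hypothetical and not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical single-level DataFrame and a made-up name -> unit mapping.
flat = pd.DataFrame(np.random.randn(3, 2), columns=['AA', 'BB'])
units = {'AA': '[m]', 'BB': '[m]'}

# Work on a copy and replace its columns with a two-level MultiIndex.
leveled = flat.copy()
leveled.columns = pd.MultiIndex.from_tuples(
    [(name, units[name]) for name in flat.columns],
    names=['parameter', 'unit'])

print(leveled.columns.tolist())
```

The data is untouched; only the column labels change, so the result lines up with a DataFrame that already has ('parameter', 'unit') columns.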

I sometimes use this to keep the index intact:

print(df.T[[('AA', '[m]') == col for col in df.columns]].T)

parameter         AA
unit             [m]
2000-01-01  0.972434
2000-01-02 -0.581852
2000-01-03 -0.784172
2000-01-04 -0.843441
2000-01-05 -1.030200
2000-01-06 -0.864225
2000-01-07 -0.530056
2000-01-08 -0.651367

But that's not the most flexible solution when your Index is more complex, although it works in this example.
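In more recent pandas versions, two other selections also seem to keep both column levels. The sketch below assumes a pandas release where `DataFrame.xs` accepts `drop_level`; the names `cols`, `by_key` and `by_level` are illustrative only.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
cols = pd.MultiIndex.from_tuples(
    [('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')],
    names=['parameter', 'unit'])
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=cols)

# A list of full column keys returns a DataFrame and keeps both levels.
by_key = df.loc[:, [('AA', '[m]')]]

# xs with drop_level=False selects on one level without dropping it.
by_level = df.xs('AA', axis=1, level='parameter', drop_level=False)

print(by_key.columns.names)
print(by_level.columns.tolist())
```

The `xs` form is the more flexible of the two when the columns have more levels, because only the level being matched has to be named.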


2 Comments

user1515250 Over a year ago

There seems to be an inconsistency between MultiIndex rows and MultiIndex columns. Using

dates = np.asarray(pd.date_range('1/1/2000', periods=4))
_metaInfo = pd.MultiIndex.from_tuples([('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')], names=['parameter','unit'])
df = pd.DataFrame(np.random.randn(4, 4), index=_metaInfo, columns=dates)

user1515250 Over a year ago

Using the transpose allows you to select rows with df["AA":"AA"], which returns a DataFrame that still has a MultiIndex (no information is lost). However, df.xs("AA", axis=1) returns a DataFrame with a single-level Index (thus losing information). In addition, when I define a DataFrame with a single-level Index and columns AA and BB, df[df["AA"]>0] gives me all the rows of columns AA and BB where the element in AA is greater than 0.0. If I do the same with a DataFrame that has MultiIndex columns, I get a crash.
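A likely reason for the filtering problem is that, with MultiIndex columns, df["AA"] returns a DataFrame rather than a Series, so the comparison does not produce a one-dimensional row mask. The following is a minimal sketch of one way around it, using the column layout from the question; `mask`, `positive` and `sliced` are illustrative names.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
cols = pd.MultiIndex.from_tuples(
    [('AA', '[m]'), ('BB', '[m]'), ('CC', '[s]'), ('DD', '[s]')],
    names=['parameter', 'unit'])
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=cols)

# The full tuple key yields a Series, which works as a boolean row mask.
mask = df[('AA', '[m]')] > 0
positive = df[mask]

# A label slice on the columns keeps both levels, much like
# df["AA":"AA"] does for a MultiIndex on the rows.
sliced = df.loc[:, 'AA':'AA']

print(positive.columns.nlevels)
print(sliced.columns.tolist())
```

The label slice relies on the column MultiIndex being sorted, which is the case for the AA, BB, CC, DD layout used here.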
