How to create pd.MultiIndex for existing pd.DataFrame object, using Pandas and Python 3?

Question

I'm looking for something equivalent to pd.read_table(path/to/file, index_col=[0,1]) for an existing pd.DataFrame.

I frequently encounter pd.DataFrames that have the following format:

import numpy as np
import pandas as pd

# Index Data
iters = 3*[1] + 3*[2] + 3*[3]
clusters = 3*[1,2,3]

# Recreate DataFrame
DF_A = pd.DataFrame([iters, clusters], index = ["iteration", "cluster"]).T
DF_B = pd.DataFrame(np.random.RandomState(0).normal(size=(100,9)), index = ["attr_%d"%_ for _ in range(100)]).T
DF_concat = pd.concat([DF_A, DF_B], axis=1).set_index("iteration", drop=True)
DF_concat.head()

If I were loading these from a file, I would just pass index_col=[0,1] as described above. How can I convert the existing pd.Index of a pd.DataFrame into a pd.MultiIndex so that iteration is the outer index level and cluster is the inner index level?

I tried the following but the assignments got messed up. There should only be 3 per iteration for the simple example I made:

DF_B.index = pd.MultiIndex(levels=[DF_concat["cluster"].index.tolist(), DF_concat["cluster"].tolist()], labels=[DF_concat["cluster"].index.tolist(), DF_concat["cluster"].tolist()], names=["iteration", "cluster"])
DF_B
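For reference, the bare pd.MultiIndex constructor expects levels to hold the unique values of each level and labels (renamed codes in pandas 0.24) to hold integer positions into those levels, so passing the raw values to both is a likely cause of the scrambled assignments. A minimal sketch of building the same index from raw values with pd.MultiIndex.from_arrays, reusing the iters and clusters lists from above:

```python
import pandas as pd

# Same index data as in the example above.
iters = 3 * [1] + 3 * [2] + 3 * [3]
clusters = 3 * [1, 2, 3]

# from_arrays takes one array of actual values per level and derives
# the unique levels and the integer codes itself.
index = pd.MultiIndex.from_arrays([iters, clusters], names=["iteration", "cluster"])

print(index.levels[0].tolist())  # unique values of the outer level
print(index.tolist()[:3])        # first three (iteration, cluster) pairs
```

The resulting index has 9 entries and could be assigned to DF_B.index, since DF_B also has 9 rows.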

burhan · Accepted Answer · 2016-11-21 19:17:42Z

How about this? Pass the existing index together with the column name to set_index:

DF_concat.set_index([DF_concat.index, 'cluster'])
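A self-contained sketch of the line above, rebuilding the question's DF_concat and checking that each iteration ends up with exactly 3 clusters. The names DF_multi and DF_alt are only illustrative, and the append=True variant is an alternative not mentioned in the original answer:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question: "iteration" is the index,
# "cluster" is still an ordinary column.
iters = 3 * [1] + 3 * [2] + 3 * [3]
clusters = 3 * [1, 2, 3]
DF_A = pd.DataFrame([iters, clusters], index=["iteration", "cluster"]).T
DF_B = pd.DataFrame(
    np.random.RandomState(0).normal(size=(100, 9)),
    index=["attr_%d" % i for i in range(100)],
).T
DF_concat = pd.concat([DF_A, DF_B], axis=1).set_index("iteration", drop=True)

# set_index returns a new DataFrame, so keep the result.
DF_multi = DF_concat.set_index([DF_concat.index, "cluster"])

# Alternative: append the column to the index that is already there.
DF_alt = DF_concat.set_index("cluster", append=True)

print(list(DF_multi.index.names))
print(DF_multi.groupby(level="iteration").size().tolist())
```

Both variants leave DF_concat itself unchanged unless the result is assigned back to it.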

1 Comment

O.rka Over a year ago

I didn't know you could call the index while you're setting it. Thanks!
