Add multi-index to pandas dataframe and keep current index

Question

I am trying to merge time-course data from different participants. I am iteratively extracting a dataframe per participant and concatenating them at the end of the loop. Before I concatenate, I would like to add the ID of my participants to an additional index.

This seems REALLY straightforward, but I was unable to find anything on this issue :(

I would like to turn this

    col
0     1
1   1.1
2   NaN

Into:

          col
ID    0     1
      1   1.1
      2   NaN

I know I could make a new index like:

multindex = [np.array([ID] * len(data)), np.arange(len(data))]

But that's inelegant without end, and - seeing as I am measuring with high frequency over half an hour - would even get kind of slow :/
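For a single frame, the hand-built index above could probably be replaced by `pd.MultiIndex.from_product`, which pairs one ID with every existing index label. This is only a minimal sketch: the frame `data`, the ID value `"p01"`, and the level names `"ID"` and `"sample"` are made-up placeholders, not taken from the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for one participant's frame and ID.
data = pd.DataFrame({"col": [1.0, 1.1, np.nan]})
ID = "p01"

# Pair the single ID with every existing index label; the old index
# survives as the inner level of the new MultiIndex.
data.index = pd.MultiIndex.from_product(
    [[ID], data.index], names=["ID", "sample"]
)
print(data)
```

The original row labels are kept untouched as the second level, which is the shape asked for above.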

I would like to mention that I have recently found my question to be a duplicate of this other question. However, mine apparently has more upvotes and better answers; “prepend” doesn't seem to draw as many hits.

It might be better to just add the ID as a column, then add it to the index when you concat?
– Andy Hayden, Commented Nov 20, 2013 at 0:59
Andy, that is indeed the answer given in the duplicate as well. Seems to me like a better answer than the one accepted here.
– egpbos, Commented Oct 8, 2014 at 5:49
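The column-first approach suggested in the comments could look roughly like the sketch below. The participant IDs (`"p01"`, `"p02"`), the sample values, and the level name `"sample"` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical per-participant frames, keyed by participant ID.
frames = {
    "p01": pd.DataFrame({"col": [1.0, 1.1, np.nan]}),
    "p02": pd.DataFrame({"col": [2.0, np.nan]}),
}

# Tag each frame with its participant ID as an ordinary column.
tagged = [df.assign(ID=pid) for pid, df in frames.items()]

# Concatenate, then append ID to the index and swap it in front of
# the original row index.
combined = pd.concat(tagged)
combined = combined.set_index("ID", append=True).swaplevel(0, 1)
combined.index.names = ["ID", "sample"]
print(combined)
```

Keeping the ID as a plain column until the end means the frames can be concatenated repeatedly, and the index is only built once.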

HYRY · Accepted Answer · 2014-10-08 10:54:34Z

14

Maybe you can use the keys argument of concat:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.rand(3, 2))
df2 = pd.DataFrame(np.random.rand(4, 2))
df3 = pd.DataFrame(np.random.rand(5, 2))

print(pd.concat([df1, df2, df3], keys=["A", "B", "C"]))

output:

            0         1
A 0  0.863774  0.794880
  1  0.578503  0.418619
  2  0.215317  0.146167
B 0  0.655829  0.116917
  1  0.862316  0.812847
  2  0.500126  0.689218
  3  0.653439  0.270427
C 0  0.825213  0.882963
  1  0.579436  0.332047
  2  0.456948  0.718893
  3  0.795074  0.826773
  4  0.049676  0.697471
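For the loop described in the question, one possible variation is to collect the frames in a dict and hand the dict to `concat`, which then uses the dict keys as the outer index level. This is a sketch under assumptions: `load_participant` is a hypothetical placeholder for whatever produces one participant's frame, and the level names `"ID"` and `"sample"` are invented.

```python
import numpy as np
import pandas as pd

def load_participant(pid):
    """Hypothetical loader standing in for the real per-participant extraction."""
    return pd.DataFrame(np.random.rand(3, 2))

# Collect one frame per participant inside the loop.
frames = {}
for pid in ["A", "B", "C"]:
    frames[pid] = load_participant(pid)

# A dict passed to concat supplies the keys; names labels the index levels.
combined = pd.concat(frames, names=["ID", "sample"])

# Selecting on the outer level returns that participant's original frame.
participant_b = combined.loc["B"]
print(combined)
```

Concatenating once after the loop, rather than inside it, also avoids copying the accumulated data on every iteration.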

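If you want to append other dataframes later:

# df is the keyed result of the concat above
df = pd.concat([df1, df2, df3], keys=["A", "B", "C"])
df4 = pd.DataFrame(np.random.rand(6, 2))
# wrap df4 in its own single-key concat so it gains the "D" level first
df = pd.concat([df, pd.concat([df4], keys=["D"])])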

edited Oct 8, 2014 at 10:54

answered Nov 20, 2013 at 1:39


1 Comment

egpbos Over a year ago

Though the keys can be useful at times, I don't think this actually answers the question. This way another concat cannot be made afterwards, which the question seemed to imply would be necessary. So, e.g. I don't see how you could easily append df4 = pd.DataFrame(np.random.rand(3,2)) with a new key "D" after having done the above. This answer seems more suited to this purpose.
