Add a Multi-level Index on existing dataframe

Question

I am working with pandas and I have an existing dataframe with 6 columns, with one level of index that looks like this:

No	a	b	c	d	e	f
1	34	43	29	78	29	68
2	29	28	57	39	10	37

and I want to add a second level of index so that it will look like this:

lvl1	1	1	2	2	3	3
lvl2	a	b	c	d	e	f
1	34	43	29	78	29	68
2	29	28	57	39	10	37

How do I go about this using MultiIndex?

Where should the values of level 1 and level 2 of the index come from?
– Learning is a mess, Commented Feb 24, 2022 at 15:22
Then what you want is a multi-level column, not a multi-level index. In the end, dataframes are meant to be transposable (swapping index <-> columns), so it's not much different.
– Learning is a mess, Commented Feb 24, 2022 at 16:31

Learning is a mess · Accepted Answer · 2022-02-24 17:20:42Z

2

Not sure how/where you want to pick the index values from, so let me share a vanilla and easy to generalize way of having a multi-indexed dataframe:

import numpy as np
import pandas as pd

# dummy data: 5 rows x 10 columns
df = pd.DataFrame(data=np.arange(50).reshape(-1, 10))
df.index = pd.MultiIndex.from_tuples((i, i) for i in range(len(df)))
df
# = +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
#   |        |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |
#   |--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
#   | (0, 0) |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |
#   | (1, 1) |  10 |  11 |  12 |  13 |  14 |  15 |  16 |  17 |  18 |  19 |
#   | (2, 2) |  20 |  21 |  22 |  23 |  24 |  25 |  26 |  27 |  28 |  29 |
#   | (3, 3) |  30 |  31 |  32 |  33 |  34 |  35 |  36 |  37 |  38 |  39 |
#   | (4, 4) |  40 |  41 |  42 |  43 |  44 |  45 |  46 |  47 |  48 |  49 |
#   +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

Based on your comment you could try:

import numpy as np
import pandas as pd

# creating dummy data: 10 rows x 6 columns
df = pd.DataFrame(data=np.arange(60).reshape(-1, 6))
# creating a MultiIndex for the columns from (level_0_value, level_1_value) tuples;
# i // 2 + 1 groups consecutive columns in pairs under the labels 1, 1, 2, 2, 3, 3
new_columns = pd.MultiIndex.from_tuples((i // 2 + 1, column_name) for i, column_name in enumerate(df))
# replacing the dataframe's columns with the newly created ones
df.columns = new_columns
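
Applied to the table from the question, the same idea might look like the sketch below. The level names lvl1 and lvl2 are taken from the desired output in the question; building the frame from a dict and using "No" as the row index are assumptions made only for illustration.

```python
import pandas as pd

# The table from the question, with "No" as the row index (an assumption).
df = pd.DataFrame(
    {"a": [34, 29], "b": [43, 28], "c": [29, 57],
     "d": [78, 39], "e": [29, 10], "f": [68, 37]},
    index=pd.Index([1, 2], name="No"),
)

# Group consecutive columns in pairs under the top-level labels 1, 1, 2, 2, 3, 3.
df.columns = pd.MultiIndex.from_tuples(
    [(i // 2 + 1, name) for i, name in enumerate(df.columns)],
    names=["lvl1", "lvl2"],
)

print(df)
# Selecting a top-level label returns only its sub-columns (here a and b).
print(df[1])
```

Existing columns a through f are kept as the second level, so a single value can still be reached with a tuple such as df.loc[1, (2, "d")].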

edited Feb 24, 2022 at 17:20

answered Feb 24, 2022 at 15:32

Learning is a mess


5 Comments

obj Over a year ago

In my original table, I have columns labeled 'a' through 'f'. What I want, in effect, is for columns 'a' and 'b' to be sub-columns under the new column '1', and columns 'c' and 'd' to be sub-columns under the new column '2'. The same goes for columns 'e' and 'f': they will be sub-columns under the new column '3'.

Learning is a mess Over a year ago

Noted, check my edit. I have added an example that works on columns the way you seem to want.

obj Over a year ago

Sorry to trouble you, I'm a bit new to pandas. Can you add some comments to explain what is going on?

Learning is a mess Over a year ago

I have broken it down a bit more and added some explanations.

Learning is a mess Over a year ago

reshape is not part of the solution; I just needed to create 2D dummy data to set up an example. I could have used data=np.random.rand(10,6) to get the same result (in terms of columns and index).

ListenSoftware Louise Ai Agent · Accepted Answer · 2022-02-26 13:08:52Z

Create the MultiIndex from tuples for two index levels, where level 0 is 1, 2, 3 and level 1 is a, b, c, d, e, f. Next, extract A as the list of values from the No 1 row and B as the list of values from the No 2 row. Then create dataframe df2 using the lst_1 and lst_2 values for A and B, and set its index to the multi-level index.

from io import StringIO

import numpy as np
import pandas as pd

data="""No  a   b   c   d   e   f
1   34  43  29  78  29  68
2   29  28  57  39  10  37
"""
# raw string so that \s is passed through as a regex, not treated as a string escape
df = pd.read_csv(StringIO(data), sep=r"\s+")
print(df.columns)

# (level_0_value, level_1_value) pairs for the two-level index
lst=[(1,'a'),(1,'b'),
     (2,'c'),(2,'d'),
     (3,'e'),(3,'f')
    ]

index=pd.MultiIndex.from_tuples(lst,names=['ID1','ID2'])

# every column except the row label "No" holds data
exclude=["No"]
columns=[x for x in df.columns if x not in exclude]

# flatten the values of the No 1 row and the No 2 row into 1D arrays
lst_1=np.array(df[df['No']==1][columns].unstack())
lst_2=np.array(df[df['No']==2][columns].unstack())

print(lst_1)
print(lst_2)
df2=pd.DataFrame({'A':lst_1,'B':lst_2},index=index)

print(df2)

output:

          A   B
ID1 ID2        
1   a    34  29
    b    43  28
2   c    29  57
    d    78  39
3   e    29  10
    f    68  37

fp=df2.pivot_table(columns=['ID1','ID2'])

print(fp)

output:

ID1   1       2       3    
ID2   a   b   c   d   e   f
A    34  43  29  78  29  68
B    29  28  57  39  10  37
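
As a hedged side note, the same wide layout can also be reached by transposing df2, since a transpose turns the row MultiIndex into a column MultiIndex without aggregating anything. The sketch below rebuilds df2 from the values shown above so that it runs on its own; the name wide is only illustrative.

```python
import pandas as pd

# Rebuild df2 from the printed output above so the example is self-contained.
index = pd.MultiIndex.from_tuples(
    [(1, "a"), (1, "b"), (2, "c"), (2, "d"), (3, "e"), (3, "f")],
    names=["ID1", "ID2"],
)
df2 = pd.DataFrame(
    {"A": [34, 43, 29, 78, 29, 68], "B": [29, 28, 57, 39, 10, 37]},
    index=index,
)

# Transposing swaps rows and columns: ID1/ID2 become the two column levels,
# and A and B become the row labels.
wide = df2.T
print(wide)
```

This matches the earlier comment that index and columns are interchangeable through transposition.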
