2

I am woring with pandas and I have an existing dataframe with 6 columns, with one level of index that looks like this:

No a b c d e f
1 34 43 29 78 29 68
2 29 28 57 39 10 37

and I want to add a second level of index so that it will look like this:

lvl1 1 1 2 2 3 3
lvl2 a b c d e f
1 34 43 29 78 29 68
2 29 28 57 39 10 37

please how do I go about this using MultiIndex?

4
  • 2
    How do you want the values of level1 and level2 of index to be taken from? Commented Feb 24, 2022 at 15:22
  • I want to input them my self above the current columns Commented Feb 24, 2022 at 16:20
  • Then what you want is a multilevel column, not a multilevel index. In the end dataframe are meant to be transposable (swapping index <-> columns) so it's not much different. Commented Feb 24, 2022 at 16:31
  • I figured it the design. see my answer Commented Feb 24, 2022 at 19:32

2 Answers 2

2

Not sure how/where you want to pick the index values from, so let me share a vanilla and easy to generalize way of having a multi-indexed dataframe:

df = pd.DataFrame(data=np.arange(50).reshape(-1,10))
df.index = pd.MultiIndex.from_tuples((i,i) for i in range(len(df)))
df
# = +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
#   |        |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |
#   |--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----|
#   | (0, 0) |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |
#   | (1, 1) |  10 |  11 |  12 |  13 |  14 |  15 |  16 |  17 |  18 |  19 |
#   | (2, 2) |  20 |  21 |  22 |  23 |  24 |  25 |  26 |  27 |  28 |  29 |
#   | (3, 3) |  30 |  31 |  32 |  33 |  34 |  35 |  36 |  37 |  38 |  39 |
#   | (4, 4) |  40 |  41 |  42 |  43 |  44 |  45 |  46 |  47 |  48 |  49 |
#   +--------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

Based on your comment you could try:

# creating dummy data
df = pd.DataFrame(data=np.arange(60).reshape(-1, 6))
# creating Multi Index column, from a tuple of (level_0_value, level_1_value) entries
new_columns = pd.MultiIndex.from_tuples((i//2 + 1,column_name) for i, column_name in enumerate(df))
# replacing dataframe columns with the newly created ones
df.columns = new_columns
Sign up to request clarification or add additional context in comments.

5 Comments

From my original table, iIhave columns labeled from 'a' through to 'f'. What I want in effect is that columns 'a', and 'b', will be sub-columns under the new column '1', while columns 'c', and 'd', will be sub columns under the column '2'. The same thing for columns 'e', and 'f', the will be sub columns under the new column '3'.
Noted, check my edit, I have added an example working on columns the way you seem to want to
Please sorry to trouble you, I'm a bit new to pandas. Can you add some comments to explain what is going on?
Have broken it down a bit more + added some explanations
reshape is not part of the solution, I just need to create 2d padding data to set up an example. I could have used data=np.random.rand(10,6) to get the same result (in terms of columns and index).
0

create the tuple multi index for two index levels where level 0 is 1 and 2 and level 1 is a,b,c,d,e,f. Next extract A as a list of No 1 values and B as a list of No 2 values. Create the multi index and then create dataframe df2 using the lst_1 and lst_2 values for A and B and set the index to the multi-level index.

data="""No  a   b   c   d   e   f
1   34  43  29  78  29  68
2   29  28  57  39  10  37
"""
df = pd.read_csv(StringIO(data), sep="\s+").reset_index()
df.reset_index(inplace=True)
print(df.columns)
lst=[(1,'a'),(1,'b'),
     (2,'c'),(2,'d'),
     (3,'e'),(3,'f')
    ]

index=pd.MultiIndex.from_tuples(lst,names=['ID1','ID2'])

exclude=["No","level_0","index"]
columns=[x for x in df.columns if x not in exclude]

lst_1=np.array(df[df['No']==1][columns].unstack())
lst_2=np.array(df[df['No']==2][columns].unstack())

print(lst_1)
print(lst_2)
df2=pd.DataFrame({'A':lst_1,'B':lst_2},index=index)

print(df2)

output:

          A   B
ID1 ID2        
1   a    34  29
    b    43  28
2   c    29  57
    d    78  39
3   e    29  10
    f    68  37

fp=df2.pivot_table(columns=['ID1','ID2'])

print(fp)

output

    ID1   1       2       3    
ID2   a   b   c   d   e   f
A    34  43  29  78  29  68
B    29  28  57  39  10  37

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.