
I have a data frame with MultiIndex columns: the first level has the labels "bar" and "baz", and under each of them the second level has "one" and "two". I now want to add a column "three" under both "bar" and "baz".

Is there an elegant way to do so?

For example:

import pandas as pd
import numpy as np

arrays = [["bar", "bar", "baz", "baz"],
          ["one", "two", "one", "two"]]

tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=index)
In [38]: df
Out[38]: 
first        bar                 baz          
second       one       two       one       two
A       0.357392 -1.880279  0.099014  1.354570
B       0.474572  0.442074 -1.173530 -1.362059
C      -0.980140 -0.173440 -1.490654 -0.539123

And I want something like this:

first        bar                           baz                    
second       one       two     three       one       two     three
A      -0.096890  0.012150       nan -0.749569 -0.965033       nan
B      -0.854206  0.118473       nan  0.263058 -0.025849       nan
C      -0.688007 -0.258569       nan  0.127305 -0.955044       nan

2 Answers


I don't know how Pythonic it is, but there are two ways to do this: direct assignment and insert.

  1. Assignment (the new columns are appended at the right end)
df[('bar','three')] = np.nan
df[('baz','three')] = np.nan
  2. insert (the first argument is the position of the new column)
df.insert(2, ('bar','three'), np.nan)
df.insert(5, ('baz','three'), np.nan)
first        bar                        baz
second       one       two  three       one       two  three
A      -0.973338 -0.233507    NaN  0.777288 -2.282688    NaN
B      -0.377486  0.080627    NaN  0.401302  0.355696    NaN
C       0.481056  0.651335    NaN  0.161145  1.001937    NaN
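
The positions 2 and 5 passed to insert are specific to this frame. As a rough sketch (the names out, mask and pos are only illustrative, and it assumes the columns of each first-level label are already adjacent), the position can be computed per group instead of hard-coded:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

out = df.copy()
# The labels are collected once, before any column is inserted.
for first in out.columns.get_level_values("first").unique():
    # Boolean mask of the columns that currently belong to this group.
    mask = out.columns.get_level_values("first") == first
    # Insert right after the last column of the group.
    pos = int(np.flatnonzero(mask)[-1]) + 1
    out.insert(pos, (first, "three"), np.nan)

print(out.columns.tolist())
```

Working on a copy leaves df untouched, since insert modifies the frame in place.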


Here is a more general answer, for when you don't necessarily know the level 0 labels in advance and want to do this for every one of them:

First, we create the NaN matrix we want to inject. It has len(df) rows and one column per level 0 label in the data frame. We then wrap it in a data frame with the same index as the original and MultiIndex columns built from levels[0] of the original columns, paired with 'three' on the second level.

a = np.full((len(df), len(df.columns.levels[0])), np.nan)

# reuse the level names so they survive the concat below
inject_df = pd.DataFrame(a, index=df.index, columns=pd.MultiIndex.from_product([df.columns.levels[0], ['three']], names=df.columns.names))
inject_df

first    bar   baz
second three three
A        NaN   NaN
B        NaN   NaN
C        NaN   NaN

Finally, we concatenate the injected data frame with the original one and sort the columns so that those sharing a level 0 label sit next to each other.

result = pd.concat([df, inject_df], axis=1).sort_index(level=0, axis=1)
result

first        bar                        baz
second       one  three       two       one  three       two
A      -0.995944    NaN -0.437629 -0.629472    NaN  1.919711
B      -0.402886    NaN  0.262420  0.117202    NaN -1.234542
C       1.281046    NaN -1.058977  0.447767    NaN  2.374122
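
Because sort_index orders the second level alphabetically, 'three' ends up between 'one' and 'two'. If the order from the question (one, two, three) matters, one alternative, which is not what this answer does, is to reindex the columns against an explicit target MultiIndex. This is a minimal sketch: the names firsts and target are illustrative, and it assumes every group should end up with exactly the labels one, two and three.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question.
columns = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.random.randn(3, 4), index=["A", "B", "C"], columns=columns)

# Level 0 labels that are actually present, in order of appearance.
firsts = df.columns.get_level_values("first").unique()

# Every (first, second) pair we want, in the order we want it.
target = pd.MultiIndex.from_product(
    [firsts, ["one", "two", "three"]], names=list(df.columns.names)
)

# Columns missing from df are created and filled with NaN.
result = df.reindex(columns=target)
print(result)
```

reindex returns a new frame, so df itself keeps its original four columns.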
