Replace column of pandas multi-index DataFrame with another DataFrame

Question

I have a pandas DataFrame like this:

import pandas as pd
import numpy as np

data1 = np.repeat(np.array(range(3), ndmin=2), 3, axis=0)
columns1 = pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')])
df1 = pd.DataFrame(data1, columns=columns1)
print(df1)

  foo    bar
    a  b   c
0   0  1   2
1   0  1   2
2   0  1   2

And another one like this:

data2 = np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0)
columns2 = ['d', 'e']
df2 = pd.DataFrame(data2, columns=columns2)
print(df2)

   d  e
0  3  4
1  3  4
2  3  4

Now, I would like to replace 'bar' of df1 with df2, but the regular syntax of single-level indexing doesn't seem to work:

df1['bar'] = df2
print(df1)

  foo    bar
    a  b   c
0   0  1 NaN
1   0  1 NaN
2   0  1 NaN

When what I would like to get is:

  foo    bar
    a  b   d  e
0   0  1   3  4
1   0  1   3  4
2   0  1   3  4

I'm not sure whether I'm missing something in the syntax or whether this is related to the issues described here and here. Could someone explain why this doesn't work and how to get the desired outcome?

I'm using Python 2.7 and pandas 0.24, in case it makes a difference.

vbs · Accepted Answer · 2020-01-03 12:22:02Z

For lack of a better alternative, I'm currently doing this:

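# Give df2 a two-level column index with 'bar' as the top level
df2.columns = pd.MultiIndex.from_product([['bar'], df2.columns])
# Drop the existing 'bar' group from df1, then attach the relabelled df2
df1.drop(columns='bar', level=0, inplace=True)
df1 = df1.join(df2)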

This gives the desired result. Be careful if the order of columns matters, though: the join appends the new columns at the end, so 'bar' moves to the last position unless it was already there.
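If the position of 'bar' matters, one option is to rebuild the frame group by group instead of dropping and joining. The following is a minimal sketch and not part of the approach above: `replace_top_level` is a hypothetical helper name, and it assumes that both frames share the same row index and that the columns have exactly two levels.

```python
import numpy as np
import pandas as pd

# Same example frames as in the question
df1 = pd.DataFrame(
    np.repeat(np.array(range(3), ndmin=2), 3, axis=0),
    columns=pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')]),
)
df2 = pd.DataFrame(
    np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0),
    columns=['d', 'e'],
)


def replace_top_level(df, key, new):
    """Hypothetical helper: return a new frame in which the columns under
    the top-level label `key` are replaced by the columns of `new`,
    keeping the group in its original position."""
    # Top-level labels in order of first appearance
    tops = list(df.columns.get_level_values(0).unique())
    # df[top] yields the sub-frame with the top level dropped
    pieces = [new if top == key else df[top] for top in tops]
    # keys= puts each top-level label back as the outer column level
    return pd.concat(pieces, axis=1, keys=tops)


result = replace_top_level(df1, 'bar', df2)
print(result)

# Replacing the first group instead shows that its position is preserved
swapped = replace_top_level(df1, 'foo', df2)
print(swapped)
```

Unlike the drop-and-join version, this leaves `df1` and `df2` unmodified and returns a new frame.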

Having read further into the GitHub issues mentioned in the question, I think the reason the approach in the question doesn't work is indeed an inconsistency in the pandas API that hasn't been fixed yet.
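One possible way to picture the NaN result, offered as an assumption rather than something the linked issues are known to confirm: when the key selects a sub-frame, pandas appears to align the right-hand side to the second-level labels already present under 'bar' (here only 'c'). df2 has no column 'c', so the aligned values come out as NaN. The sketch below imitates that alignment with `reindex` and then shows per-column assignment with full tuple keys, which avoids it. The behaviour of `df1['bar'] = df2` itself may differ between pandas versions, so it is deliberately not executed here.

```python
import numpy as np
import pandas as pd

# Same example frames as in the question
df1 = pd.DataFrame(
    np.repeat(np.array(range(3), ndmin=2), 3, axis=0),
    columns=pd.MultiIndex.from_tuples([('foo', 'a'), ('foo', 'b'), ('bar', 'c')]),
)
df2 = pd.DataFrame(
    np.repeat(np.array(range(3, 5), ndmin=2), 3, axis=0),
    columns=['d', 'e'],
)

# Assumed analogue of the alignment step: request from df2 the labels that
# currently sit under 'bar'. df2 has no column 'c', so every value is NaN.
aligned = df2.reindex(columns=df1['bar'].columns)
print(aligned)

# Assigning one column at a time with full (top, sub) tuple keys creates
# new columns instead of aligning against the existing ones.
result = df1.copy()
for name in df2.columns:
    result[('bar', name)] = df2[name]

# Remove the old column under 'bar'
result = result.drop(columns=[('bar', 'c')])
print(result)
```

New columns are appended at the end here as well, so the same caveat about column order applies.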

2 Comments

killezio Over a year ago

Thank you for that alternative. Can you link the issue?

vbs Over a year ago

You're welcome. The issues are linked on the original post.
