Create new nested column within dataframe

Question

I have the following DataFrame with two-level (MultiIndex) columns:

import pandas as pd

df1 = pd.DataFrame({'data': [1,2,3]})
df2 = pd.DataFrame({'data': [4,5,6]})
df = pd.concat([df1,df2], keys=['hello','world'], axis=1)

What is the "proper" way to create a new nested column (say, df['world']['data']*2) under the hello column? I have tried df['hello']['new_col'] = df['world']['data']*2, but this does not add the column to df.

jezrael · Accepted Answer · 2020-05-14 06:14:36Z


Use tuples to select and assign columns in a MultiIndex:

df[('hello','new_col')] = df[('world','data')]*2
print(df)
  hello world   hello
   data  data new_col
0     1     4       8
1     2     5      10
2     3     6      12

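Selecting with df['world']['data'] is not recommended because it is chained indexing, which may operate on a copy rather than on the original DataFrame (see the pandas documentation on returning a view versus a copy).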
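As a minimal sketch of the difference, using the same df as in the question: the chained form first evaluates df['hello'], which produces a separate DataFrame, and the assignment then lands on that temporary object. The tuple form is a single assignment on df itself. The warning suppression and the final sort_index call are additions for illustration, not part of the original answer; the exact warning pandas emits for the chained form depends on the installed version.

```python
import warnings

import pandas as pd

df1 = pd.DataFrame({'data': [1, 2, 3]})
df2 = pd.DataFrame({'data': [4, 5, 6]})
df = pd.concat([df1, df2], keys=['hello', 'world'], axis=1)

# Chained form: df['hello'] builds a new object, and the new column is
# written to that temporary object instead of df. pandas usually warns
# here; the warning is silenced only to keep the demo output clean.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df['hello']['new_col'] = df['world']['data'] * 2

chained_worked = ('hello', 'new_col') in df.columns
print(chained_worked)

# Tuple form: one assignment directly on df, keyed by the full column label.
df[('hello', 'new_col')] = df[('world', 'data')] * 2

# Optional: sort the column labels so both 'hello' columns sit together.
df = df.sort_index(axis=1)
print(df)
```

Sorting the columns is optional. Without it, the new column is appended at the end, which is why hello appears twice in the printed header above.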

edited May 14, 2020 at 6:14

answered May 14, 2020 at 6:07


1 Comment

user10400458 Over a year ago

Thanks for this, can you also explain the difference between ['hello']['new_col'] and [('hello','new_col')] when assigning columns to a df?
