I have the following

import pandas as pd

df1 = pd.DataFrame({'data': [1, 2, 3]})
df2 = pd.DataFrame({'data': [4, 5, 6]})
df = pd.concat([df1, df2], keys=['hello', 'world'], axis=1)

What is the "proper" way to create a new nested column (say, df['world']['data']*2) under the hello column? I have tried df['hello']['new_col'] = df['world']['data']*2, but this does not seem to work.

1 Answer

Use tuples to select and set columns in a MultiIndex:

df[('hello','new_col')] = df[('world','data')]*2
print(df)
  hello world   hello
   data  data new_col
0     1     4       8
1     2     5      10
2     3     6      12
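
In the output above, the new column is appended at the end, so hello appears twice in the top level. If you want new_col displayed next to hello's existing column, one option is to sort the column index. This is a minimal sketch that rebuilds the question's df; sorting is an optional extra step, not something the assignment requires.

```python
import pandas as pd

df1 = pd.DataFrame({'data': [1, 2, 3]})
df2 = pd.DataFrame({'data': [4, 5, 6]})
df = pd.concat([df1, df2], keys=['hello', 'world'], axis=1)

# Add the nested column with a tuple key, as in the answer.
df[('hello', 'new_col')] = df[('world', 'data')] * 2

# Sort the column MultiIndex lexicographically so that both
# 'hello' columns sit next to each other.
df_sorted = df.sort_index(axis=1)
print(df_sorted)
```

Note that sort_index returns a new DataFrame and orders every level alphabetically, which may not be the order you want for other column names.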

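Selecting with df['world']['data'] is not recommended because it is chained indexing: the first [] returns an intermediate object, and an assignment made through that object may never reach df. See the pandas documentation on chained indexing.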
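
To make the difference concrete, here is a small sketch comparing the two assignment forms on the question's df. The variable names chained_worked and tuple_worked are only for illustration, and the warnings filter is there because pandas may emit a chained-assignment warning whose exact type depends on the pandas version.

```python
import warnings

import pandas as pd

df1 = pd.DataFrame({'data': [1, 2, 3]})
df2 = pd.DataFrame({'data': [4, 5, 6]})
df = pd.concat([df1, df2], keys=['hello', 'world'], axis=1)

# Chained form: df['hello'] produces an intermediate DataFrame, and the
# new column is added to that intermediate object, not to df.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas may warn about chained assignment
    df['hello']['new_col'] = df['world']['data'] * 2
chained_worked = ('hello', 'new_col') in df.columns

# Tuple form: a single __setitem__ call on df itself, so df is modified.
df[('hello', 'new_col')] = df[('world', 'data')] * 2
tuple_worked = ('hello', 'new_col') in df.columns

print(chained_worked, tuple_worked)
```

The chained form is two separate operations (a lookup, then an assignment on the result), while the tuple form is one operation on df, which is why only the latter adds the column.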

1 Comment

Thanks for this, can you also explain the difference between ['hello']['new_col'] and [('hello','new_col')] when assigning columns to a df?