How can I join values in columns with the same name in MultiIndex pandas DataFrame?

import pandas as pd

data = [['1','1','2','3','4'],['2','5','6','7','8']]
df = pd.DataFrame(data, columns=['id','A','B','A','B'])
df = df.set_index('id')
df.columns = pd.MultiIndex.from_tuples([('result','A'),('result','B'),('student','A'),('student','B')])

df
   result    student   
        A  B       A  B
id                     
1       1  2       3  4
2       5  6       7  8

Desired results:

        A       B
id                     
1       "1 3"   "2 4"
2       "5 7"   "6 8"
Comment: you can try swaplevel (Oct 9, 2017 at 14:47)

2 Answers

I am not completely sure what you are asking. If you have two separate dataframes then you should be able to just use pd.concat.

pd.concat([df1, df2], axis=1)

If you have one dataframe then just drop the top level of the index.

df.columns = df.columns.droplevel(0)

1 Comment

pd.concat([df['result'],df['student']], axis=1) or df.columns = df.columns.droplevel(0) result in:

    A  B  A  B
id            
1   1  2  3  4
2   5  6  7  8

New answer:

To join values by the second level of the column MultiIndex, use groupby with agg:

# select the top-level column groups named in the list
df = df[['result','student']]
df1 = df.astype(str).groupby(level=1, axis=1).agg(' '.join)
print (df1)
      A    B
id          
1   1 3  2 4
2   5 7  6 8
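
On newer pandas releases, groupby with axis=1 is deprecated (as far as I know, since pandas 2.1), so the line above may emit a warning or fail there. A minimal sketch of the same idea, assuming that transposing first is acceptable; the name joined is only illustrative:

```python
import pandas as pd

# Rebuild the DataFrame from the question.
data = [['1', '1', '2', '3', '4'], ['2', '5', '6', '7', '8']]
df = pd.DataFrame(data, columns=['id', 'A', 'B', 'A', 'B']).set_index('id')
df.columns = pd.MultiIndex.from_tuples(
    [('result', 'A'), ('result', 'B'), ('student', 'A'), ('student', 'B')]
)

# Transpose so the column MultiIndex becomes the row index, group the rows
# by its second level (A / B), join the strings in each group, and
# transpose back so that id is the index again.
joined = df.astype(str).T.groupby(level=1).agg(' '.join).T
print(joined)
```

The values within each group keep their original order, so the result value comes before the student value in each joined string.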

Old answer:

You can use sort_index to sort the columns by the second level and then droplevel to remove the first level of the MultiIndex.

Note that this produces duplicate column names.

print (df)
   result    student    col   
        A  B       A  B   A  B
id                            
1       1  2       3  4   6  7
2       5  6       7  8   2  1

# select the top-level column groups named in the list
df = df[['result','student']]
print (df)
   result    student   
        A  B       A  B
id                     
1       1  2       3  4
2       5  6       7  8

df = df.sort_index(axis=1, level=1)
df.columns = df.columns.droplevel(0)
print (df)
    A  A  B  B
id            
1   1  3  2  4
2   5  7  6  8

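A better option is to create unique column names by flattening the MultiIndex with map and join (starting again from the DataFrame that still has MultiIndex columns):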

df = df.sort_index(axis=1, level=1)
df.columns = df.columns.map('_'.join)
print (df)
    result_A  student_A  result_B  student_B
id                                          
1          1          3         2          4
2          5          7         6          8

# alternative: starting again from the MultiIndex df, concat the two groups and sort the columns
df = pd.concat([df['result'],df['student']], axis=1).sort_index(axis=1)
print (df)
    A  A  B  B
id            
1   1  3  2  4
2   5  7  6  8
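
If the goal is the joined strings from the question rather than side-by-side columns, the same two selections, df['result'] and df['student'], can be added as strings instead of concatenated. This is only a sketch, assuming both groups share exactly the same second-level names (A and B here); the name joined is illustrative:

```python
import pandas as pd

# Rebuild the DataFrame from the question.
data = [['1', '1', '2', '3', '4'], ['2', '5', '6', '7', '8']]
df = pd.DataFrame(data, columns=['id', 'A', 'B', 'A', 'B']).set_index('id')
df.columns = pd.MultiIndex.from_tuples(
    [('result', 'A'), ('result', 'B'), ('student', 'A'), ('student', 'B')]
)

# Selecting a top-level key returns a DataFrame with columns A and B.
# Adding the two frames aligns them on index and column labels, so each
# cell becomes "<result value> <student value>".
joined = df['result'].astype(str) + ' ' + df['student'].astype(str)
print(joined)
```

This avoids duplicate column names, but unlike the groupby approach it has to be extended by hand when there are more than two top-level groups.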

2 Comments

I want to join the values based on the column name.
Please check the last edit. Is it not a problem to get duplicate column names?
