
I'm very new to pandas, and this has probably been answered somewhere, but I can't find exactly what I'm looking for. Assume my dataset has this structure:

Animal | Age | Color | Length
Cat    | 1   | Brown | 50cm
Cat    | 2   | White | 60cm
Cat    | 3   | Brown | 55cm
Dog    | 1   | White | 99cm
Dog    | 2   | White | 129cm
Dog    | 3   | White | 105cm

How can I most easily transform it into the format below, where each animal's rows are laid out side by side as extra columns instead of being stacked vertically?

Animal | Age_1 | Color_1 | Length_1 | Age_2 | Color_2 | Length_2 | Age_3 | Color_3 | Length_3
Cat    | 1     | Brown   | 50cm     | 2     | White   | 60cm     | 3     | Brown   | 55cm
Dog    | 1     | White   | 99cm     | 2     | White   | 129cm    | 3     | White   | 105cm

These may not be the best example labels, but hopefully they get the point across. I'd greatly appreciate links to existing answers too.

  • @jezrael Question 10: How to pivot by two columns. and there's another question on renaming. Commented May 19, 2020 at 13:28
  • @jezrael Why? both contents are provided in the dupe! But sure... Commented May 19, 2020 at 13:29
  • @QuangHoang - It is not a 100% dupe, so I think it should not be closed. Commented May 19, 2020 at 13:31
  • @jezrael Pivoting by set_index and unstack is also mentioned in that question/answer. But then again, that's just my opinion, you already said you disagreed. Please do not tag me again on this. Commented May 19, 2020 at 13:33

1 Answer


Create a MultiIndex with GroupBy.cumcount and DataFrame.set_index, reshape with DataFrame.unstack, sort the columns by the second level of the MultiIndex, then flatten the column names with f-strings and convert the index back to a column:

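# Number the rows within each Animal (1, 2, 3, ...) and use that counter as a second index level
df1 = (df.set_index(['Animal', df.groupby('Animal').cumcount().add(1)])
         .unstack()
         .sort_index(axis=1, level=1))
# Flatten each (column, counter) pair into a single name such as Age_1
df1.columns = [f'{a}_{b}' for a, b in df1.columns]
# Move Animal from the index back to a regular column
df1 = df1.reset_index()
print(df1)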
  Animal  Age_1 Color_1 Length_1  Age_2 Color_2 Length_2  Age_3 Color_3  \
0    Cat      1   Brown     50cm      2   White     60cm      3   Brown   
1    Dog      1   White     99cm      2   White    129cm      3   White   

  Length_3  
0     55cm  
1    105cm  
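To try the steps above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, and the names `position` and `wide` are only illustrative. It also shows the per-animal counter that cumcount produces before the reshape.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "Animal": ["Cat", "Cat", "Cat", "Dog", "Dog", "Dog"],
    "Age": [1, 2, 3, 1, 2, 3],
    "Color": ["Brown", "White", "Brown", "White", "White", "White"],
    "Length": ["50cm", "60cm", "55cm", "99cm", "129cm", "105cm"],
})

# Running counter within each animal, starting at 1.
position = df.groupby("Animal").cumcount().add(1)

wide = (
    df.set_index(["Animal", position])  # index levels: Animal, counter
      .unstack()                        # counter moves into the columns
      .sort_index(axis=1, level=1)      # group columns by counter value
)

# Columns are now (name, counter) tuples; join them into flat names.
wide.columns = [f"{name}_{pos}" for name, pos in wide.columns]
wide = wide.reset_index()

print(position.tolist())
print(wide.columns.tolist())
```

If the animals had different numbers of rows, the shorter ones would presumably end up with missing values in the trailing columns, since unstack fills gaps with NaN.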

9 Comments

I've never found a nice way to flatten a MultiIndex into a single level within a method chain. Any suggestions?
@MarkWang - Ideally rename would support this, but it processes the column names of each level separately...
The same goes for expanding a single-level header into a MultiIndex header. That used to work with rename in older versions of pandas; now it just gives you tuples.
Awesome, though MultiIndex.to_flat_index only gives you tuples. But I guess you can chain it with df.rename(columns='{0[0]}_{0[1]}'.format) to flatten the tuples.
Try the following?
index = pd.MultiIndex.from_product([['foo', 'bar'], ['baz', 'qux']], names=['a', 'b'])
pd.Series([1, 2, 3, 3], index=index.to_flat_index()).rename('{0[0]}_{0[1]}'.format)
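One way to keep the flattening step inside a method chain, as asked in these comments, is to wrap it in a small function and call it through DataFrame.pipe. This is only a sketch: `flatten_columns` is a hypothetical helper, not a pandas function, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data typed in from the question.
df = pd.DataFrame({
    "Animal": ["Cat", "Cat", "Cat", "Dog", "Dog", "Dog"],
    "Age": [1, 2, 3, 1, 2, 3],
    "Color": ["Brown", "White", "Brown", "White", "White", "White"],
    "Length": ["50cm", "60cm", "55cm", "99cm", "129cm", "105cm"],
})


def flatten_columns(frame, sep="_"):
    """Hypothetical helper: join each MultiIndex column tuple into one string."""
    flat = [sep.join(str(part) for part in col) for col in frame.columns]
    # set_axis returns a new frame with the given column labels.
    return frame.set_axis(flat, axis=1)


wide = (
    df.set_index(["Animal", df.groupby("Animal").cumcount().add(1)])
      .unstack()
      .sort_index(axis=1, level=1)
      .pipe(flatten_columns)   # flatten without breaking the chain
      .reset_index()
)

print(wide.columns.tolist())
```

The helper expects tuple column labels, so it should only be applied while the columns are still a MultiIndex.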
