7

I have a CSV that produces a Dataframe in the following format

--------------------------------------------------------------
|Date       | Fund | TradeGroup | LongShort | Alpha | Details|
--------------------------------------------------------------
|2018-05-22 |A     | TGG-A      | Long      | 3.99  | Misc   |
|2018-05-22 |A     | TGG-B      | Long      | 4.99  | Misc   |
|2018-05-22 |B     | TGG-A      | Long      | 5.99  | Misc   |
|2018-05-22 |B     | TGG-B      | Short     | 6.99  | Misc   |
|2018-05-22 |C     | TGG-A      | Long      | 1.99  | Misc   |
|2018-05-22 |C     | TGG-B      | Long      | 5.29  | Misc   |
--------------------------------------------------------------

What I would like to Do is, Group TradeGroups together and convert Fund to columns. So, the final dataframe should look like this:

  --------------------------------------------------------
  |TradeGroup| Date      | A         | B         | C     |
  --------------------------------------------------------
  | TGG-A    |2018-05-22 | 3.99      | 5.99      | 1.99  |
  | TGG-B    |2018-05-22 | 4.99      | 6.99      | 5.29  | 
  --------------------------------------------------------

Also, I don't really care about the LongShort Column and Details Column. So, it's okay if they're dropped. Thanks!! I have tried df.pivot() but it doesn't give the required format

1
  • 2
    Try df.set_index(['Date','TradeGroup','Fund']).unstack(level=2)['Alpha'] Commented May 22, 2018 at 22:50

2 Answers 2

7

Using pd.pivot_table:

res = df.pivot_table(index=['Date', 'TradeGroup'], columns='Fund',
                     values='Alpha', aggfunc='first').reset_index()

print(res)

Fund        Date TradeGroup     A     B     C
0     2018-05-22      TGG-A  3.99  5.99  1.99
1     2018-05-22      TGG-B  4.99  6.99  5.29
Sign up to request clarification or add additional context in comments.

Comments

4

Looks like you are attempting to unstack a column from a multiindex.

Try this:

import pandas as pd

data = '''\
Date        Fund  TradeGroup  LongShort  Alpha  Details
2018-05-22 A      TGG-A       Long       3.99   Misc   
2018-05-22 A      TGG-B       Long       4.99   Misc   
2018-05-22 B      TGG-A       Long       5.99   Misc   
2018-05-22 B      TGG-B       Short      6.99   Misc   
2018-05-22 C      TGG-A       Long       1.99   Misc   
2018-05-22 C      TGG-B       Long       5.29   Misc'''

fileobj = pd.compat.StringIO(data)

df = pd.read_csv(fileobj, sep='\s+')

dfout = df.set_index(['TradeGroup','Date','Fund']).unstack()['Alpha']
print(dfout)

Returns:

Fund                      A     B     C
TradeGroup Date                        
TGG-A      2018-05-22  3.99  5.99  1.99
TGG-B      2018-05-22  4.99  6.99  5.29

If you want you can also apply a .reset_index() after and you get:

Fund TradeGroup        Date     A     B     C
0         TGG-A  2018-05-22  3.99  5.99  1.99
1         TGG-B  2018-05-22  4.99  6.99  5.29

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.