How to Transpose a table with different columns using python

Question

Here is the initial table as shown below.

Cust ID	Jan Transaction Fee	Jan Transaction Fee	Jan Product Fee	Jan Product Fee	Feb Transaction Fee	Feb Transaction Fee	Feb Product Fee	Feb Product Fee
	HKD	USD	HKD	USD	HKD	USD	HKD	USD
100103	100	20	21	24	215	55	253	25
100104	200	30	31	34	315	65	353	35

I would like to convert the table from above to the below expected result.

Cust ID	Period	Type	FX	Price
100103	202201	Transaction Fee	HKD	100
100103	202201	Transaction Fee	USD	20
100103	202201	Product Fee	HKD	21
100103	202201	Product Fee	USD	24
100103	202202	Transaction Fee	HKD	215
100103	202202	Transaction Fee	USD	55
100103	202202	Product Fee	HKD	253
100103	202202	Product Fee	USD	25
100104	202201	Transaction Fee	HKD	200
100104	202201	Transaction Fee	USD	30
100104	202201	Product Fee	HKD	31
100104	202201	Product Fee	USD	34
100104	202202	Transaction Fee	HKD	315
100104	202202	Transaction Fee	USD	65
100104	202202	Product Fee	HKD	353
100104	202202	Product Fee	USD	35

My coding on import data is below

import pandas as pd 


test=pd.DataFrame({'Cust ID':['','100103','100104'],'Jan Transaction Fee':['HKD',100,200],'Jan Transaction Fee.1':['USD',20,30],\
                  'Jan Product Fee':['HKD',21,31],'Jan Product Fee.1':['USD',24,34],
                  'Feb Transaction Fee':['HKD',215,315],'Feb Transaction Fee.1':['USD',55,65],
                  'Feb Product Fee':['HKD',253,353],'Feb Product Fee.1':['USD',25,35]})

test

is there a way to do the expected result with using Python?

@mozway If there is a way to do it without '.1', then ignore the'.1' column name. Thanks so much for your help — Tony Tang
– Tony Tang, Commented Apr 6, 2022 at 12:49
Looks like a multiindex. Pulled from an excel file? How was the data read? — sammywemmy
– sammywemmy, Commented Apr 6, 2022 at 13:01
@sammywemmy Yes, the dataset is pulled from an excel file, but it is confidential. So, i just manipulated the data first and showed a few customer ID as an example — Tony Tang
– Tony Tang, Commented Apr 6, 2022 at 15:09

mozway · Accepted Answer · 2022-04-06 12:53:07Z

This is a complex reshape.

NB. I ignored the '.1', removed using test.columns = test.columns.map(lambda s: s.strip('.1')).

df = (test
 .T.set_index(0, append=True).T
 .set_index([('Cust ID', '')])
 .stack()
 .rename_axis(index=['Cust ID', 'FX'], columns='Type')
 .stack()
 .reset_index(name='Price')
)

df[['Period', 'Type']] = df['Type'].str.split(n=1, expand=True)
df['Period'] = pd.to_datetime('2022 '+df['Period']).dt.strftime('%Y%m')

output:

  Cust ID   FX             Type Price  Period
0  100103  HKD  Transaction Fee   100  202201
1  100103  USD      Product Fee    25  202202
2  100103  USD  Transaction Fee    55  202202
3  100103  USD      Product Fee    24  202201
4  100103  USD  Transaction Fee    20  202201
5  100104  HKD  Transaction Fee   200  202201
6  100104  USD      Product Fee    35  202202
7  100104  USD  Transaction Fee    65  202202
8  100104  USD      Product Fee    34  202201
9  100104  USD  Transaction Fee    30  202201

Collectives™ on Stack Overflow

How to Transpose a table with different columns using python

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related