0

I have a pandas DataFrame in the following long format, where each observation is the expenditure on a given date, for each port, for each transport type.

import pandas as pd
import io  

df_s = """freight_type; port; 2020-01-01; 2020-01-02; 2020-01-03; 2020-01-04; 2020-01-05
rail; qd; 51204; 1024151; 100510; 715011; 1004151
vessel; qd; 81294; 1024151; 100510; 715011; 1004151
truck; qd; 59434; 756151; 100510; 715011; 1004151 
rail; ts; 199952; 521151; 100510; 715011; 1004151
vessel; ts; 515254; 224151; 100510; 715011; 1004151
truck; ts; 512552; 104151; 100510; 715011; 1004151
"""

df = pd.read_csv(io.StringIO(df_s), sep="; ")
df.head(6)
>>> 
freight_type port 2020-01-01 2020-01-02 2020-01-03 2020-01-04 2020-01-05
0   rail     qd   51204      1024151    100510     715011     1004151
1   vessel   qd   81294      1024151    100510     715011     1004151
2   truck    qd   59434      756151     100510     715011     1004151
3   rail     ts   199952     521151     100510     715011     1004151
4   vessel   ts   515254     224151     100510     715011     1004151
5   truck    ts   512552     104151     100510     715011     1004151

I would like to filter by port and then transpose the dataset in a way that I can set the date as an index, and create an individual column for each transport_type column.

Desired output:

port_qd_df:

date         rail_exp   vessel_exp  truck_exp
'2020-01-01'   51204    81294       59434       
'2020-01-01'   1024151  1024151     756151  
'2020-01-01'   100510   100510      100510

I have tried using pd.wide_to_long, however, I am struggling with its implementation/documentation as to how to get it to reach my solution.

Any help very greatly appreciated.

1 Answer 1

1

You can create a transposed pivot table:

df_pivot = pd.pivot_table(df, index=['port','freight_type']).T

Result:

port               qd                         ts                  
freight_type     rail    truck   vessel     rail    truck   vessel
2020-01-01      51204    59434    81294   199952   512552   515254
2020-01-02    1024151   756151  1024151   521151   104151   224151
2020-01-03     100510   100510   100510   100510   100510   100510
2020-01-04     715011   715011   715011   715011   715011   715011
2020-01-05    1004151  1004151  1004151  1004151  1004151  1004151

You can then extract separate dataframes by taking a cross-section:

port_qd_df = df_pivot.xs('qd', axis=1)
Sign up to request clarification or add additional context in comments.

1 Comment

This is perfect, thank you. I did not think to use pivot table.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.