I'm new to pandas and need help. I have the following two simple reports:

$ cat test_report1
ID;TYPE;VAL
1;USD;5
2;EUR;10
3;PLN;3
$ cat test_report2
ID;TYPE;VAL
1;USD;5
2;EUR;10
3;PLN;1

Then I use concat to join the two reports side by side on the unique ID index:

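import pandas as pd

# Read both semicolon-separated reports
A = pd.read_csv('test_report1', delimiter=';', index_col=False)
B = pd.read_csv('test_report2', delimiter=';', index_col=False)
# Align on ID and label each report's columns as PRE / POST
C = pd.concat([A.set_index('ID'), B.set_index('ID')], axis=1, keys=['PRE', 'POST'])
print(C)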

This gives me the following output:

          PRE     POST
         TYPE VAL TYPE VAL
ID
1         USD   5  USD   5
2         EUR  10  EUR  10
3         PLN   3  PLN   1

This is pretty good, but I would rather have:

         STATE TYPE VAL
ID
1         PRE  USD  5
          POST USD  5
2         PRE  EUR  10
          POST EUR  10
3         PRE  PLN  3
          POST PLN  1

Ideally, I would then get a diff that keeps values only for the IDs that changed, like this:

         STATE TYPE VAL
ID
1         PRE  NaN  NaN
          POST NaN  NaN
2         PRE  NaN  NaN
          POST NaN  NaN
3         PRE  PLN  3
          POST PLN  1

I know this is doable, but I have been stuck for three days trying to find a solution.

1 Answer

Use DataFrame.rename_axis with DataFrame.stack, then sort the levels of the resulting MultiIndex (df here is the concatenated DataFrame, C in the question):

df = (df.rename_axis(['STATE', None], axis=1)
        .stack(0)
        .sort_index(level=[0, 1], ascending=[True, False])
        )
print(df)

         TYPE  VAL
ID STATE          
1  PRE    USD    5
   POST   USD    5
2  PRE    EUR   10
   POST   EUR   10
3  PRE    PLN    3
   POST   PLN    1
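
The reshaping above does not cover the diff part of the question. One possible way to get it, as a rough sketch rather than the only approach: flag the IDs where any column differs between the two reports, keep only those rows of the stacked frame, and reindex back to the full index so the unchanged IDs come back as NaN. The sketch assumes both reports contain the same IDs and columns; the inline strings stand in for the two files, and the names `changed`, `stacked`, `keep` and `diff` are made up for this example.

```python
import io
import pandas as pd

# Inline copies of the two reports from the question (stand-ins for the files)
report1 = "ID;TYPE;VAL\n1;USD;5\n2;EUR;10\n3;PLN;3\n"
report2 = "ID;TYPE;VAL\n1;USD;5\n2;EUR;10\n3;PLN;1\n"

A = pd.read_csv(io.StringIO(report1), sep=';').set_index('ID')
B = pd.read_csv(io.StringIO(report2), sep=';').set_index('ID')

# True for every ID where at least one column differs between the reports
# (assumption: both reports share the same IDs and columns)
changed = A.ne(B).any(axis=1)

# Same reshaping as in the answer: wide PRE/POST columns -> (ID, STATE) rows
C = pd.concat([A, B], axis=1, keys=['PRE', 'POST'])
stacked = (C.rename_axis(['STATE', None], axis=1)
            .stack(0)
            .sort_index(level=[0, 1], ascending=[True, False]))

# Boolean array marking the stacked rows that belong to a changed ID
keep = stacked.index.get_level_values('ID').isin(changed[changed].index)

# Keep the changed rows, then reindex so dropped rows return as all-NaN
diff = stacked[keep].reindex(stacked.index)
print(diff)
```

With the sample data only ID 3 should keep its values. Note that VAL becomes a float column once NaN is introduced. If the unchanged IDs are not needed at all, `stacked[keep]` on its own may be enough.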