I have a DataFrame and I want to turn only some of its rows into columns.

This is what I have now.

   Entity   Name        Date  Value
0     111  Name1  2018-03-31    100
1     111  Name2  2018-02-28    200
2     222  Name3  2018-02-28   1000
3     333  Name1  2018-01-31   2000

I want each distinct Date to become its own column, filled with the matching Value. Something like this:

   Entity   Name  2018-01-31  2018-02-28  2018-03-31
0     111  Name1         NaN         NaN       100.0
1     111  Name2         NaN       200.0         NaN
2     222  Name3         NaN      1000.0         NaN
3     333  Name1      2000.0         NaN         NaN

The same Name can appear under two different Entity values. Here is an updated dataset.

Code:

import pandas as pd
import datetime

data1 = {
    'Entity': [111, 111, 222, 333],
    'Name': ['Name1', 'Name2', 'Name3', 'Name1'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 2, 28),
             datetime.date(2018, 2, 28), datetime.date(2018, 1, 31)],
    'Value': [100, 200, 1000, 2000],
}
df1 = pd.DataFrame(data1, columns=['Entity', 'Name', 'Date', 'Value'])

How do I achieve this? Any pointers? Thanks all.

2 Answers

Based on your update, you need pivot_table with two index columns, so that rows are identified by the (Entity, Name) pair rather than by Name alone:

v = df1.pivot_table(
    index=['Entity', 'Name'],
    columns='Date',
    values='Value'
).reset_index()
# Clear the leftover index/columns names so the header prints cleanly
v.index.name = v.columns.name = None

v
   Entity   Name  2018-01-31  2018-02-28  2018-03-31
0     111  Name1         NaN         NaN       100.0
1     111  Name2         NaN       200.0         NaN
2     222  Name3         NaN      1000.0         NaN
3     333  Name1      2000.0         NaN         NaN
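
One thing worth knowing about this approach: pivot_table aggregates, and its default aggfunc is the mean. If an (Entity, Name, Date) combination ever occurs more than once, the values are averaged silently. The sketch below uses a made-up frame (df_dup is hypothetical, not the asker's data) to show the default and how to choose a different aggregation explicitly.

```python
import datetime
import pandas as pd

# Hypothetical data: the first two rows share Entity, Name and Date.
df_dup = pd.DataFrame({
    'Entity': [111, 111, 222],
    'Name': ['Name1', 'Name1', 'Name3'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 3, 31),
             datetime.date(2018, 2, 28)],
    'Value': [100, 300, 1000],
})

# Default behaviour: duplicates are combined with the mean.
by_mean = df_dup.pivot_table(
    index=['Entity', 'Name'], columns='Date', values='Value'
)

# Explicit aggregation: duplicates are summed instead.
by_sum = df_dup.pivot_table(
    index=['Entity', 'Name'], columns='Date', values='Value', aggfunc='sum'
)

# Row 0 is (111, 'Name1'); column 1 is 2018-03-31.
print(by_mean.iloc[0, 1])  # mean of 100 and 300
print(by_sum.iloc[0, 1])   # sum of 100 and 300
```

With the data in the question every combination is unique, so the choice of aggfunc does not change the result there.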

5 Comments

Exactly what I was looking for. Thanks for helpful comments.
This fails when the same name appears under two different entities. I have updated the question.
@COLDSPEED I just updated the question with new test data, where the new entity 333 has the name Name1. The error was "Reindexing only valid with uniquely valued Index objects", which makes sense.
@ProgSky Added pivot_table solution, that should work regardless. Didn't know you needed a multiIndex. Should be quite efficient. Cheers
@ProgSky You're welcome. Please consider accepting an answer here if it was helpful.
4

Using unstack:

df1.set_index(['Entity','Name','Date']).Value.unstack().reset_index()

Date  Entity   Name  2018-01-31 00:00:00  2018-02-28 00:00:00  2018-03-31 00:00:00
0        111  Name1                  NaN                  NaN                100.0
1        111  Name2                  NaN                200.0                  NaN
2        222  Name3                  NaN               1000.0                  NaN
3        333  Name1               2000.0                  NaN                  NaN
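
The output above still carries the "Date" columns name, and its labels print with a time component. If plain date strings are preferred as headers, one option is to format the labels before resetting the index. This is only a sketch built on the question's df1; the names wide and result are made up for illustration.

```python
import datetime
import pandas as pd

df1 = pd.DataFrame({
    'Entity': [111, 111, 222, 333],
    'Name': ['Name1', 'Name2', 'Name3', 'Name1'],
    'Date': [datetime.date(2018, 3, 31), datetime.date(2018, 2, 28),
             datetime.date(2018, 2, 28), datetime.date(2018, 1, 31)],
    'Value': [100, 200, 1000, 2000],
})

# Same reshape as above, kept in an intermediate variable.
wide = df1.set_index(['Entity', 'Name', 'Date'])['Value'].unstack()

# Replace the date labels with 'YYYY-MM-DD' strings; assigning a plain
# list also drops the "Date" name from the columns.
wide.columns = [d.strftime('%Y-%m-%d') for d in wide.columns]

result = wide.reset_index()
print(result)
```

Note that unstack raises a ValueError if an (Entity, Name, Date) combination occurs more than once, whereas pivot_table aggregates such duplicates.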

4 Comments

Thanks for the help. Second time it's happened. First time was here, this morning
@cᴏʟᴅsᴘᴇᴇᴅ sign , seems like it happens again and again, and they just do not take our suggestion.
Thanks @Wen, I am learning a lot of Python pandas from you :)
@ProgSky to tell the truth , I am learning those from Jez, pir and cold :-)
