I have a dataframe like this:

         date  number  name      div     a     b     c     d     e     f  ...     k     l     m     n     o     p     q     r     s     t
0  2008-01-01     150     A   get_on   379   287   371   876   965  1389  ...  2520  3078  3495  3055  2952  2726  3307  2584  1059   264
1  2008-01-01     150     A  get_off   145   707   689  1037  1170  1376  ...  1955  2304  2203  2128  1747  1593  1078   744   406   558
2  2008-01-01     151     B   get_on   131   131   101   152   191   202  ...   892   900  1154  1706  1444  1267   928   531   233   974
3  2008-01-01     151     B  get_off    35   158   203   393   375   460  ...  1157  1153  1303  1190   830   454   284   141   107   185
4  2008-01-01     152     C   get_on  1287   867   400   330   345   338  ...  1867  2269  2777  2834  2646  2784  2920  2290   802  1559
5  2008-01-01     152     C  get_off    74   147   261   473   597   698  ...  2161  2298  2360  1997  1217   799   461   271   134   210

and I want to reshape it into this:

date        number  name    div    a    
2008-01-01  150     A   get_on  379 
2008-01-01  150     A   get_on  287 
2008-01-01  150     A   get_on  371 
2008-01-01  150     A   get_on  876 
2008-01-01  150     A   get_on  965 
2008-01-01  150     A   get_on  1389
....

2008-01-01  152     C   get_off 2161
2008-01-01  152     C   get_off 2298
2008-01-01  152     C   get_off 2360
2008-01-01  152     C   get_off 1997
2008-01-01  152     C   get_off 1217
2008-01-01  152     C   get_off 799
2008-01-01  152     C   get_off 461
2008-01-01  152     C   get_off 271
2008-01-01  152     C   get_off 134
2008-01-01  152     C   get_off 210

I tried the melt method like this:

df.melt(id_vars=df.columns.tolist()[0:4], value_name='a').drop('variable', 1)

but the 'b'~'t' columns seem to be deleted. I want the values of columns 'b'~'t' to be placed under column 'a'.

It's not working on my dataframe.

How can I get the result shown above?

number is the train number

name is the train name

div is either get_on or get_off

The dataset is at https://drive.google.com/open?id=1Upb5PgymkPB5TXuta_sg6SijwzUuEkfl

1 Answer

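Use DataFrame.sort_values after melt. melt stacks the value columns one after another (all of 'a', then all of 'b', and so on), so the 'b'~'t' values are not deleted; they just end up further down in column 'a'. Sorting by the id columns brings the rows of each train back together:

df = df.melt(id_vars=df.columns[:4], value_name='a').drop(columns='variable')

df = df.sort_values(['date', 'number', 'div'], ascending=[True, True, False])
print(df.head())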
          date  number name     div    a
0   2008-01-01     150    A  get_on  379
6   2008-01-01     150    A  get_on  287
12  2008-01-01     150    A  get_on  371
18  2008-01-01     150    A  get_on  876
24  2008-01-01     150    A  get_on  965

print(df.tail())
          date  number name      div    a
71  2008-01-01     152    C  get_off  799
77  2008-01-01     152    C  get_off  461
83  2008-01-01     152    C  get_off  271
89  2008-01-01     152    C  get_off  134
95  2008-01-01     152    C  get_off  210