I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'device_id' : ['0','0','1','1','2','2'],
                   'p_food'    : [0.2,0.1,0.3,0.5,0.1,0.7],
                   'p_phone'   : [0.8,0.9,0.7,0.5,0.9,0.3]
                  })
print(df)

output:

  device_id  p_food  p_phone
0         0     0.2      0.8
1         0     0.1      0.9
2         1     0.3      0.7
3         1     0.5      0.5
4         2     0.1      0.9
5         2     0.7      0.3

How can I transform it into the following wide format, with one row per device_id?

df2 = pd.DataFrame({'device_id' : ['0','1','2'],
                   'p_food_1'    : [0.2,0.3,0.1],
                   'p_food_2'    : [0.1,0.5,0.7],
                   'p_phone_1'   : [0.8,0.7,0.9],
                   'p_phone_2'   : [0.9,0.5,0.3]
                  })
print(df2)

Output:

  device_id  p_food_1  p_food_2  p_phone_1  p_phone_2
0         0       0.2       0.1        0.8        0.9
1         1       0.3       0.5        0.7        0.5
2         2       0.1       0.7        0.9        0.3

I tried to do this with groupby, apply, and agg,
but I still can't get this result.

Update
My final code:

df.drop_duplicates('device_id', keep='first').merge(df.drop_duplicates('device_id', keep='last'),on='device_id')

I appreciate su79eu7k's and A-Za-z's time and effort.
Words are not enough to express my gratitude.

  • Possible duplicate of Long to wide data. Pandas Commented May 2, 2017 at 2:56
  • Thank you for providing another answer. Commented May 2, 2017 at 3:23

2 Answers

If you are still looking for an answer using groupby:

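# Select the columns with a list; newer pandas versions reject a tuple of column names here
df = df.groupby('device_id')[['p_food', 'p_phone']].apply(lambda x: pd.DataFrame(x.values)).unstack().reset_index()
# Drop the outer column level, then name the columns explicitly
df.columns = df.columns.droplevel()
df.columns = ['device_id','p_food_1', 'p_food_2', 'p_phone_1','p_phone_2']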

You get

    device_id   p_food_1    p_food_2    p_phone_1   p_phone_2
0   0           0.2         0.1         0.8         0.9
1   1           0.3         0.5         0.7         0.5
2   2           0.1         0.7         0.9         0.3
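
If the number of rows per device is not always two, the same reshaping can probably be done without hard-coding the column names. This is only a sketch, not the method used above: it numbers the rows within each device with cumcount and then pivots. The names n and wide are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'device_id': ['0', '0', '1', '1', '2', '2'],
                   'p_food':    [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
                   'p_phone':   [0.8, 0.9, 0.7, 0.5, 0.9, 0.3]})

# Number the rows within each device (1, 2, ...); 'n' is a helper column
numbered = df.assign(n=df.groupby('device_id').cumcount() + 1)

# One row per device, one column per (value name, row number) pair
wide = numbered.pivot(index='device_id', columns='n',
                      values=['p_food', 'p_phone'])

# Flatten the two-level column index into names like 'p_food_1'
wide.columns = [f'{name}_{n}' for name, n in wide.columns]
wide = wide.reset_index()
print(wide)
```

A device with fewer rows than the others would get NaN in the missing columns.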
1 Comment

Good job! Thank you for your help!
2
df_m = df.drop_duplicates('device_id', keep='first')\
         .merge(df, on='device_id')\
         .drop_duplicates('device_id', keep='last')\
         [['device_id', 'p_food_x', 'p_food_y', 'p_phone_x', 'p_phone_y']]\
         .reset_index(drop=True)

print(df_m)

  device_id  p_food_x  p_food_y  p_phone_x  p_phone_y
0         0       0.2       0.1        0.8        0.9
1         1       0.3       0.5        0.7        0.5
2         2       0.1       0.7        0.9        0.3
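
To get the _1/_2 column names from the question instead of _x/_y, the suffixes argument of merge should be enough. A minimal sketch, assuming every device_id has exactly two rows (with more, the middle rows are silently dropped); the names first, last and wide are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'device_id': ['0', '0', '1', '1', '2', '2'],
                   'p_food':    [0.2, 0.1, 0.3, 0.5, 0.1, 0.7],
                   'p_phone':   [0.8, 0.9, 0.7, 0.5, 0.9, 0.3]})

# First and last row of each device
first = df.drop_duplicates('device_id', keep='first')
last = df.drop_duplicates('device_id', keep='last')

# suffixes replaces the default '_x'/'_y' on the overlapping columns
wide = first.merge(last, on='device_id', suffixes=('_1', '_2'))

# Reorder so the two p_food columns come before the two p_phone columns
wide = wide[['device_id', 'p_food_1', 'p_food_2', 'p_phone_1', 'p_phone_2']]
print(wide)
```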
