This is a small extract of some mock data I am using. It's from what I am calling the "primary" DF. It contains multiple customer account keys, each of which can have multiple devices, and each device could access wifi on a number of days.
Customer Account Key   Device Ref   Date         Data Used (mb)
ABC123                 Dev1         03/06/2018   100
ABC123                 Dev2         03/06/2018   500
ABC123                 Dev3         03/06/2018   250
ABC123                 Dev1         04/06/2018   600
ABC123                 Dev2         04/06/2018   1000
ABC123                 Dev3         04/06/2018   350
I would like to summarise this data in a second DF, which would look like this:
Customer_Account_Key   Total_Devices   Total_Days   Total_Data_Used
ABC123                 3               2            2800
So far I have managed to create a second DF which has only one row for each of the unique customer account keys:
df_users['Customer Account Key'] = df_data['Customer Account Key'].unique()
But I am really struggling to extract summary information from the main DF based on each of the customer account keys in my new DF.
I have played around with groupby and df.loc but I am just not getting anywhere. I am new to Python, so I'm not sure if these are the wrong approach or if I'm just not using them correctly.
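A minimal sketch of one way the summary might be built with groupby, using the extract above as the input. It assumes pandas 0.25 or later (needed for named aggregation), that the columns are named exactly as in the extract, and that counting distinct values with nunique is what "Total_Devices" and "Total_Days" should mean. The output column names are taken from the desired table; nothing else here is confirmed as the right approach.

```python
import pandas as pd

# Mock data rebuilt from the extract above (dates kept as plain strings)
df_data = pd.DataFrame({
    "Customer Account Key": ["ABC123"] * 6,
    "Device Ref": ["Dev1", "Dev2", "Dev3", "Dev1", "Dev2", "Dev3"],
    "Date": ["03/06/2018"] * 3 + ["04/06/2018"] * 3,
    "Data Used (mb)": [100, 500, 250, 600, 1000, 350],
})

# One row per customer: distinct devices, distinct days, summed data usage
df_users = (
    df_data.groupby("Customer Account Key")
    .agg(
        Total_Devices=("Device Ref", "nunique"),
        Total_Days=("Date", "nunique"),
        Total_Data_Used=("Data Used (mb)", "sum"),
    )
    .reset_index()
    .rename(columns={"Customer Account Key": "Customer_Account_Key"})
)

print(df_users)
```

With this approach the grouping itself produces one row per customer account key, so a separate DF of unique keys would not need to be created first.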
Any pointers?
Thanks