I have a pandas DataFrame with the following columns:
order_id latitude
0 519 19.119677
1 519 19.119677
2 520 19.042117
3 520 19.042117
4 520 19.042117
5 521 19.138245
6 523 19.117662
7 523 19.117662
8 523 19.117662
9 523 19.117662
10 523 19.117662
11 524 19.137793
12 525 19.119372
13 526 0.000000
14 526 0.000000
15 526 0.000000
16 527 19.133430
17 528 0.000000
18 529 19.118284
19 530 0.000000
20 531 19.114269
21 531 19.114269
22 532 19.136292
23 533 19.119075
24 533 19.119075
25 533 19.119075
26 534 19.119677
27 535 19.119677
28 535 19.119677
29 535 19.119677
The order_id values are repeated. I want the unique order_id values, which I can get with:
unique_order_id = pd.unique(tsp_data['order_id'])
array(['519', '520', '521', '523', '524', '525', '526', '527', '528',
'529', '530', '531', '532', '533', '534', '535'], dtype=object)
This returns the correct unique values, which I store in the unique_order_id variable. Now I want only the corresponding latitude value for each unique order_id.
I am doing something like this:
tsp_data['latitude'][tsp_data['order_id'].isin(unique_order_id)]
But it returns all 30 rows. Where am I going wrong? Please help.
df.drop_duplicates()?
df.groupby('order_id').first().reset_index()
With isin you're testing for membership, so it will return essentially all the rows anyway, as there exist rows for each order_id.
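The suggestions in the comments can be sketched roughly as follows. This is a minimal illustration and not a definitive answer: the small tsp_data frame is a stand-in built from a few rows of the question, and the names mask, unique_rows and first_per_order are made up for the example. It assumes that each order_id always carries the same latitude, as in the data shown, so keeping the first row per order_id is enough.

```python
import pandas as pd

# Stand-in for tsp_data: a few of the rows shown in the question.
tsp_data = pd.DataFrame({
    "order_id": ["519", "519", "520", "520", "520", "521", "526", "526"],
    "latitude": [19.119677, 19.119677, 19.042117, 19.042117,
                 19.042117, 19.138245, 0.0, 0.0],
})

# isin() only tests membership. Every order_id is in its own list of
# unique values, so the mask is True for every row and nothing is filtered.
mask = tsp_data["order_id"].isin(pd.unique(tsp_data["order_id"]))

# Option 1: keep the first row seen for each order_id.
unique_rows = tsp_data.drop_duplicates(subset="order_id")

# Option 2: group by order_id and take the first latitude in each group.
first_per_order = tsp_data.groupby("order_id").first().reset_index()

print(unique_rows)
print(first_per_order)
```

Both options give one row per order_id. drop_duplicates keeps the original index labels and row order, while groupby sorts by order_id and, after reset_index(), returns a fresh 0-based index.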