I have a pandas DataFrame in which one column holds a 2D array (a list of value pairs) for each row. It looks something like this:
Name 2DValueList
item 1 [ [ 0.0, 1.0 ], [ 0.0, 6.0 ], [ 0.0, 2.0 ] ]
item 2 [ [ 0.0, 2.0 ], [ 0.0, 1.0 ], [ 0.0, 1.0 ] ]
item 3 [ [ 0.0, 1.0 ], [ 0.0, 3.0 ], [ 0.0, 5.0 ], [ 0.0, 1.0 ] ]
item 4
item 5 [ [ 0.0, 4.0 ], [ 0.0, 1.0 ], [ 0.0, 2.0 ] ]
The first value in each pair isn't relevant to this question, so I've set them all to 0. I'm only interested in the second values. Also note that the number of pairs varies from row to row, and a row can be empty.
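For reproducibility, the sample above can be built roughly as follows. This is a minimal sketch: it assumes the empty cell for item 4 is an empty list (in the real data it might be NaN or None instead), and `pair_counts` is just an illustrative name.

```python
import pandas as pd

# Assumption: the empty "item 4" cell is represented as an empty list.
df = pd.DataFrame({
    "Name": ["item 1", "item 2", "item 3", "item 4", "item 5"],
    "2DValueList": [
        [[0.0, 1.0], [0.0, 6.0], [0.0, 2.0]],
        [[0.0, 2.0], [0.0, 1.0], [0.0, 1.0]],
        [[0.0, 1.0], [0.0, 3.0], [0.0, 5.0], [0.0, 1.0]],
        [],
        [[0.0, 4.0], [0.0, 1.0], [0.0, 2.0]],
    ],
})

# Number of pairs per row, showing that the length varies and can be zero.
pair_counts = df["2DValueList"].map(len)
print(pair_counts.tolist())
```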
I want to build a new DataFrame that contains only the top n pairs from each array, ranked by largest second value.
For the top 2 elements, it would look like this:
Name 2DValueList
item 1 [ [ 0.0, 6.0 ], [ 0.0, 2.0 ] ]
item 2 [ [ 0.0, 2.0 ], [ 0.0, 1.0 ] ]
item 3 [ [ 0.0, 5.0 ], [ 0.0, 3.0 ] ]
item 4
item 5 [ [ 0.0, 4.0 ], [ 0.0, 2.0 ] ]
I would use pandas' nlargest, but I'm not sure how to apply it to a column whose cells are 2D arrays.
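To pin down the expected result, the transformation can be sketched row by row with the standard library's `heapq.nlargest`, keyed on the second value of each pair. This is only an illustration of the desired output, not necessarily an efficient approach at the real data size. The helper name `top_n_pairs` is hypothetical, and the sketch assumes each cell is a Python list of `[first, second]` pairs, with empty or missing cells passed through unchanged.

```python
import heapq

import pandas as pd


def top_n_pairs(pairs, n):
    """Hypothetical helper: return the n pairs with the largest second value."""
    # Leave missing or empty cells untouched (assumption: they are not lists,
    # or are empty lists).
    if not isinstance(pairs, (list, tuple)) or len(pairs) == 0:
        return pairs
    # heapq.nlargest returns the results ordered from largest to smallest key.
    return heapq.nlargest(n, pairs, key=lambda pair: pair[1])


df = pd.DataFrame({
    "Name": ["item 1", "item 2", "item 3", "item 4", "item 5"],
    "2DValueList": [
        [[0.0, 1.0], [0.0, 6.0], [0.0, 2.0]],
        [[0.0, 2.0], [0.0, 1.0], [0.0, 1.0]],
        [[0.0, 1.0], [0.0, 3.0], [0.0, 5.0], [0.0, 1.0]],
        [],
        [[0.0, 4.0], [0.0, 1.0], [0.0, 2.0]],
    ],
})

# Build the new DataFrame without modifying the original.
top2 = df.copy()
top2["2DValueList"] = df["2DValueList"].map(lambda pairs: top_n_pairs(pairs, 2))
print(top2)
```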
In reality, each 2D array holds thousands of value pairs and there are tens of thousands of rows. I'm also open to more versatile ways of storing this data.
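One alternative layout that might fit is a "long" table with one row per pair, so that ordinary pandas sorting and grouping apply. The sketch below is an assumption about what such a layout could look like, not a tested recommendation for the real data size. The column names `first` and `second` are made up for illustration, and empty cells are assumed to be empty lists (an item with no pairs simply has no rows in the long table).

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["item 1", "item 2", "item 3", "item 4", "item 5"],
    "2DValueList": [
        [[0.0, 1.0], [0.0, 6.0], [0.0, 2.0]],
        [[0.0, 2.0], [0.0, 1.0], [0.0, 1.0]],
        [[0.0, 1.0], [0.0, 3.0], [0.0, 5.0], [0.0, 1.0]],
        [],
        [[0.0, 4.0], [0.0, 1.0], [0.0, 2.0]],
    ],
})

# Flatten to one row per (Name, first, second) pair.
rows = [
    (name, first, second)
    for name, pairs in zip(df["Name"], df["2DValueList"])
    for first, second in pairs
]
long_df = pd.DataFrame(rows, columns=["Name", "first", "second"])

# Sort so the largest second values come first within each Name,
# then keep the first 2 rows of every group.
top2 = (
    long_df.sort_values(["Name", "second"], ascending=[True, False])
    .groupby("Name")
    .head(2)
    .reset_index(drop=True)
)
print(top2)
```

In this layout, item 4 does not appear in the result at all, which may or may not be acceptable depending on how empty rows need to be reported.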