1

To illustrate my question, consider the following Pandas DataFrame:

import pandas as pd
df = pd.DataFrame({'player': ['Bob', 'Jane', 'Alice'],
                   'hand': [['two','ace'], ['queen','king'], ['three','five']]})

I would like to sort each hand array. I've tried using lambdas and looping through df with iterrows, but I couldn't get either to work.

BONUS: The reason I want it sorted is so I can do a groupby on that column to identify all players who have the same hand. Perhaps there is a more direct way of doing it.

  • Your data is not tabular. I doubt that a dataframe is the appropriate abstraction for your data if you wish to keep it in its current shape. Commented Dec 2, 2021 at 1:37
  • Kindly post your expected output dataframe. Commented Dec 2, 2021 at 2:08
  • @orlp What other abstraction would make it easy to do a groupby on the subarray? Commented Dec 2, 2021 at 12:40

4 Answers

2

You can apply(sorted):

df['hand'] = df['hand'].apply(sorted)

Output:

  player           hand
0    Bob     [ace, two]
1   Jane  [king, queen]
2  Alice  [five, three]

This won't allow you to group as lists are not hashable.
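As a minimal sketch of why the list column cannot serve as a group key: Python refuses to hash a list, while a tuple with the same contents hashes fine. The two-row frame and the variable names below are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, already sorted as in the output above.
df = pd.DataFrame({'player': ['Bob', 'Jane'],
                   'hand': [['ace', 'two'], ['king', 'queen']]})

# Lists are mutable, so hash() raises TypeError for them.
try:
    hash(df.loc[0, 'hand'])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# A tuple holding the same cards can be hashed.
tuple_hash_ok = isinstance(hash(tuple(df.loc[0, 'hand'])), int)

print(list_is_hashable, tuple_hash_ok)
```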

If your goal is to group or compare, and the cards are unique, you could also use a frozenset:

df['hand'] = df['hand'].apply(frozenset)
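A small sketch of what the frozenset key does and does not preserve. The players Dan and Eve and their hands are invented for this example, and passing sort=False is my own choice so that pandas is not asked to order the frozenset keys.

```python
import pandas as pd

# Hypothetical data: Bob and Dan hold the same cards in a different order,
# Eve holds a pair of aces.
df = pd.DataFrame({'player': ['Bob', 'Dan', 'Eve'],
                   'hand': [['two', 'ace'], ['ace', 'two'], ['ace', 'ace']]})

keys = df['hand'].apply(frozenset)

# Order is ignored, so Bob's and Dan's keys compare equal.
same_hand = keys.iloc[0] == keys.iloc[1]

# Duplicates collapse: Eve's two aces become a one-element set.
eve_size = len(keys.iloc[2])

# Group the players by the frozenset key, in order of first appearance.
players = df.groupby(keys, sort=False)['player'].apply(list)
print(players.tolist())
```

Because the duplicate ace is lost, this key only suits hands whose cards are all distinct.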

Or, if you want to account for duplicated cards (e.g., ace+ace), sort and convert to a tuple:

df['hand'] = df['hand'].apply(lambda x: tuple(sorted(x)))

Output:

  player           hand
0    Bob     (ace, two)
1   Jane  (king, queen)
2  Alice  (five, three)

Then you can groupby hand to list the players with the same hand:

df.groupby('hand')['player'].apply(list)

Output:

hand
(ace, two)         [Bob]
(five, three)    [Alice]
(king, queen)     [Jane]
Name: player, dtype: object
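To get at the BONUS part of the question, here is a rough sketch that keeps only the hands held by more than one player. The fourth player, Dan, is added purely for illustration, and the names by_hand and shared are my own.

```python
import pandas as pd

# The question's data plus a hypothetical fourth player who shares Bob's hand.
df = pd.DataFrame({'player': ['Bob', 'Jane', 'Alice', 'Dan'],
                   'hand': [['two', 'ace'], ['queen', 'king'],
                            ['three', 'five'], ['ace', 'two']]})

# Normalize each hand to a sorted tuple so equal hands compare equal.
df['hand'] = df['hand'].apply(lambda x: tuple(sorted(x)))

# List the players for every distinct hand.
by_hand = df.groupby('hand')['player'].apply(list)

# Keep only the hands held by at least two players.
shared = by_hand[by_hand.apply(len) > 1]
print(shared.tolist())
```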

2

I would use explode. For your next step, you can then group by hand and aggregate the players:

df.explode('hand').groupby('hand').player.agg(list)
hand
ace        [Bob]
five     [Alice]
king      [Jane]
queen     [Jane]
three    [Alice]
two        [Bob]
Name: player, dtype: object

1 Comment

This will get us who has each card, but how would you tell who all have the same hand?

0

I think sorted is one of the best options here, and it is what the question itself asks for.

>>> df['hand'] = [tuple(sorted(x)) for x in df['hand']]
>>> df
  player           hand
0    Bob     (ace, two)
1   Jane  (king, queen)
2  Alice  (five, three)

0

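np.sort() can be faster than sorted() on large arrays, though for short lists of strings like these the list-to-array conversion may cancel out any gain. If you want to try it:

import numpy as np
df['hand'] = df['hand'].apply(np.sort)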
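One thing to keep in mind with this variant: np.sort returns a NumPy array for each hand, and arrays, like lists, are not hashable. A possible workaround, sketched here with made-up data and a made-up second player, is to wrap the result in a tuple before grouping.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two players holding the same cards in a different order.
df = pd.DataFrame({'player': ['Bob', 'Dan'],
                   'hand': [['two', 'ace'], ['ace', 'two']]})

# np.sort converts the list to an ndarray and returns a sorted copy.
first = np.sort(df['hand'].iloc[0])
is_array = isinstance(first, np.ndarray)

# Converting each sorted array to a tuple makes it usable as a group key.
df['hand'] = df['hand'].apply(lambda h: tuple(np.sort(h)))
players = df.groupby('hand')['player'].apply(list)
print(players.tolist())
```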
