How to create a list of lists using unique values in a dataframe column?

Question

I have a dataframe as below, where one ticket has multiple items associated with it.

| ticket_no | items |
|-----------|-------|
| 1         | Item1 |
| 1         | Item2 |
| 2         | Item3 |
| 2         | Item4 |
| 3         | Item5 |
| 3         | Item6 |
| 3         | Item7 |
| 3         | Item8 |

I need the output as below.

[[Item1, Item2],[Item3, Item4], [Item5, Item6, Item7, Item8]]

I have tried the code below. It works, but it is terribly slow.

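import pandas as pd

data = pd.read_csv('data.csv')
item_list = []
# Filters the whole frame once per unique ticket, which is slow for many tickets
for ticket_no in data['ticket_no'].unique():
    temp_data = list(data[data['ticket_no'] == ticket_no]['items'])
    # Skip tickets that have only a single item
    if len(temp_data) > 1:
        item_list.append(temp_data)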

Is there a faster way of doing this?

jezrael · Accepted Answer · 2019-12-03 07:05:13Z

Use DataFrame.groupby and apply list to each group to get a Series of lists, then convert that Series with tolist() to get nested lists:

item_list = data.groupby('ticket_no')['items'].apply(list).tolist()
print(item_list)
[['Item1', 'Item2'], ['Item3', 'Item4'], ['Item5', 'Item6', 'Item7', 'Item8']]
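One detail worth checking against the original loop: groupby sorts the group keys by default, whereas iterating over data['ticket_no'].unique() visits tickets in order of first appearance. The sketch below uses a small made-up frame (the values are assumptions for illustration, not the asker's data) to show both orderings; passing sort=False should reproduce the loop's order.

```python
import pandas as pd

# Hypothetical sample data with tickets deliberately out of sorted order.
data = pd.DataFrame({
    "ticket_no": [2, 2, 1, 1, 3],
    "items": ["Item3", "Item4", "Item1", "Item2", "Item5"],
})

# Default behaviour: groups are returned sorted by ticket_no.
sorted_lists = data.groupby("ticket_no")["items"].apply(list).tolist()

# sort=False: groups keep the order in which each ticket_no first appears,
# matching iteration over data['ticket_no'].unique().
appearance_lists = (
    data.groupby("ticket_no", sort=False)["items"].apply(list).tolist()
)

print(sorted_lists)
print(appearance_lists)
```

If the order of the nested lists does not matter, the default is fine; sort=False also skips the key sort.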


3 Comments

Sachin Prabhu Over a year ago

Thank you. That works. Also, if the nested lists have only one item, I need to discard it. Can do it by looping over it but is there any other way?

jezrael Over a year ago

@SachinPrabhu - The simplest way is to filter before applying the solution: df = df[df['ticket_no'].duplicated(keep=False)]

Sachin Prabhu Over a year ago

Worked for me. Thanks.
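The filter suggested in the comments can be combined with the groupby answer as follows. This is a minimal sketch on a made-up frame (the sample values are assumptions); it uses the name data from the question rather than df.

```python
import pandas as pd

# Hypothetical sample data: ticket 2 has only one item and should be dropped.
data = pd.DataFrame({
    "ticket_no": [1, 1, 2, 3, 3, 3],
    "items": ["Item1", "Item2", "Item3", "Item4", "Item5", "Item6"],
})

# duplicated(keep=False) marks every row whose ticket_no occurs more than once,
# so indexing with it keeps only tickets that have at least two items.
multi = data[data["ticket_no"].duplicated(keep=False)]

# Group the remaining rows and collect each ticket's items into a list.
item_list = multi.groupby("ticket_no")["items"].apply(list).tolist()

print(item_list)
```

Filtering first means the single-item groups are never built, so no loop over the result is needed afterwards.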
