Creating dummy variables from a string column in pandas

Question

I have a pandas DataFrame, shown below, and my goal is to turn the MATCHUP column into several dummy columns.

INDICATOR MATCHUP 
1         [   "APPLE",   "GRAPE" ]
1         [   "APPLE",   "GRAPE" ]
0         [   "GRAPE",   "BANANA" ]
0         [   "PEAR",   "ORANGE" ]
1         [   "ORANGE",   "APPLE" ]

Here is the same data as a dict:

{'INDICATOR': [1, 1, 0, 0, 1],
 'MATCHUP': ['[   "APPLE",   "GRAPE" ]',
  '[   "APPLE",   "GRAPE" ]',
  '[   "GRAPE",   "BANANA" ]',
  '[   "PEAR",   "ORANGE" ]',
  '[   "ORANGE",   "APPLE" ]']}

Given this DataFrame, I would like to create dummy variables that indicate whether each value appears in MATCHUP.

Final outcome:

INDICATOR MATCHUP                    APPLE GRAPE BANANA PEAR ORANGE
1         [   "APPLE",   "GRAPE" ]   1     1     0      0    0 
1         [   "APPLE",   "GRAPE" ]   1     1     0      0    0
0         [   "GRAPE",   "BANANA" ]  0     1     1      0    0
0         [   "PEAR",   "ORANGE" ]   0     0     0      1    1
1         [   "ORANGE",   "APPLE" ]  1     0     0      0    1

Is there a way to accomplish this using pandas? I attempted another approach, but I think the spacing in the MATCHUP column makes that method unviable.

BENY · Accepted Answer · 2022-05-17 21:19:00Z

3

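Use explode with str.get_dummies. Because MATCHUP holds strings rather than lists, first convert each string back to a list with ast.literal_eval: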

import ast
df = df.join(df['MATCHUP'].map(ast.literal_eval).explode().str.get_dummies().groupby(level=0).sum())
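To make the one-liner easier to follow, here is a minimal sketch that splits it into steps, using the data from the question. The intermediate names (parsed, exploded, dummies, result) are only illustrative and are not part of the original answer.

```python
import ast

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "INDICATOR": [1, 1, 0, 0, 1],
    "MATCHUP": [
        '[   "APPLE",   "GRAPE" ]',
        '[   "APPLE",   "GRAPE" ]',
        '[   "GRAPE",   "BANANA" ]',
        '[   "PEAR",   "ORANGE" ]',
        '[   "ORANGE",   "APPLE" ]',
    ],
})

# Step 1: turn each string into a real Python list.
parsed = df["MATCHUP"].map(ast.literal_eval)

# Step 2: one row per list element; the original index label is repeated.
exploded = parsed.explode()

# Step 3: one 0/1 column per distinct value, then collapse the repeated
# index labels back to one row per original row.
dummies = exploded.str.get_dummies().groupby(level=0).sum()

# Step 4: attach the dummy columns to the original frame (aligned on index).
result = df.join(dummies)
print(result)
```

Note that str.get_dummies orders the new columns alphabetically (APPLE, BANANA, GRAPE, ORANGE, PEAR), not in the order shown in the question's desired output. If the same value could appear twice within one list, sum() would yield 2; replacing it with max() would keep the columns strictly 0/1.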

edited May 17, 2022 at 21:19

answered May 17, 2022 at 21:04


2 Comments

John Thomas Over a year ago

Unfortunately this did not work. It did create the dummy variables, but it did not split MATCHUP: instead of APPLE as one variable and GRAPE as another, it returned [ "APPLE", "GRAPE" ] as a single variable.

BENY Over a year ago

@JohnThomas Check the update; you first need to convert the string back to a list.
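The exchange above can be illustrated with a small sketch, using one cell value from the question. The variable names (raw, as_list, unparsed_columns) are hypothetical and chosen only for this illustration.

```python
import ast

import pandas as pd

# One cell of MATCHUP exactly as stored in the question: a string, not a list.
raw = '[   "APPLE",   "GRAPE" ]'

# ast.literal_eval evaluates the text as a Python literal; the extra
# spaces inside the brackets are harmless.
as_list = ast.literal_eval(raw)
print(type(raw).__name__, type(as_list).__name__)  # str list
print(as_list)  # ['APPLE', 'GRAPE']

# Without parsing, explode has nothing to split, so get_dummies treats
# the whole string as a single category.
unparsed_columns = list(pd.Series([raw]).explode().str.get_dummies().columns)
print(unparsed_columns)
```

This matches the behaviour described in the first comment: the entire bracketed string becomes one dummy column until the values are converted to lists.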
