I have a column in a pandas DataFrame for genres. Each value is a string with the genres separated by commas.

>>> df['genres_omdb']
0                      Crime, Drama
1        Adventure, Family, Fantasy
2                    Drama, Mystery
3         Horror, Mystery, Thriller
5         Action, Adventure, Sci-Fi
6                    Drama, Romance
8                             Drama
9      Animation, Adventure, Comedy
10     Animation, Adventure, Comedy
11                    Drama, Sci-Fi
12                            Drama
13              Drama, Romance, War
14            Comedy, Drama, Family
16         Comedy, Musical, Romance

Originally I split it into three columns and ran get_dummies on each of them. This produced repeated columns for the same genre (e.g. genre1_Adventure and genre2_Adventure).

Then I tried collecting every unique genre, creating a column for each one, and manually iterating through the rows, setting the value to 1 if the genre is in that row's list.

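# Collect every distinct genre across the three split columns
all_genres = set()
genre1_keys = df['genre1'].value_counts().keys()
genre2_keys = df['genre2'].value_counts().keys()
genre3_keys = df['genre3'].value_counts().keys()
for genre in genre1_keys:
  all_genres.add(genre.strip())
for genre in genre2_keys:
  all_genres.add(genre.strip())
for genre in genre3_keys:
  all_genres.add(genre.strip())
# Start with a zero-filled indicator column per genre
for genre in all_genres:
  df[genre] = 0
# Flag each genre listed in the row
for i, row in df.iterrows():
  genres = row['genres_omdb'].split(',')
  for genre in genres:
    genre = genre.strip()
    # iterrows() yields copies, so write through df.loc rather than row
    df.loc[i, genre] = 1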

It's very messy and I'm sure there is a better way to do this. Any help cleaning up this code would be appreciated.

1 Answer

I think you just need str.get_dummies. Pass the full delimiter, comma plus space, so the column names don't pick up a leading space:

df['genres_omdb'].str.get_dummies(sep=', ')
Out[115]: 
    Action  Adventure  Animation  Comedy  Crime  Drama  Family  Fantasy  \
0        0          0          0       0      1      1       0        0   
1        0          1          0       0      0      0       1        1   
2        0          0          0       0      0      1       0        0   
3        0          0          0       0      0      0       0        0   
5        1          1          0       0      0      0       0        0   
6        0          0          0       0      0      1       0        0   
8        0          0          0       0      0      1       0        0   
9        0          1          1       1      0      0       0        0   
10       0          1          1       1      0      0       0        0   
11       0          0          0       0      0      1       0        0   
12       0          0          0       0      0      1       0        0   
13       0          0          0       0      0      1       0        0   
14       0          0          0       1      0      1       1        0   
16       0          0          0       1      0      0       0        0   
    Horror  Musical  Mystery  Romance  Sci-Fi  Thriller  War  
0        0        0        0        0       0         0    0  
1        0        0        0        0       0         0    0  
2        0        0        1        0       0         0    0  
3        1        0        1        0       0         1    0  
5        0        0        0        0       1         0    0  
6        0        0        0        1       0         0    0  
8        0        0        0        0       0         0    0  
9        0        0        0        0       0         0    0  
10       0        0        0        0       0         0    0  
11       0        0        0        0       1         0    0  
12       0        0        0        0       0         0    0  
13       0        0        0        1       0         0    1  
14       0        0        0        0       0         0    0  
16       0        1        0        1       0         0    0  
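
If the spacing around the commas isn't consistent from row to row, one variation is to normalise the delimiter before calling str.get_dummies and then join the indicator columns back onto the original frame. This is only a minimal sketch: the three sample rows and the names cleaned, dummies and result are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample rows; the last one deliberately has no space after the comma
df = pd.DataFrame(
    {"genres_omdb": ["Crime, Drama", "Adventure, Family, Fantasy", "Drama,Romance"]}
)

# Replace each comma (and any whitespace around it) with a bare "|",
# then trim whitespace left at either end of the string
cleaned = (
    df["genres_omdb"]
    .str.replace(r"\s*,\s*", "|", regex=True)
    .str.strip()
)

# "|" is the default separator for str.get_dummies; it is spelled out for clarity
dummies = cleaned.str.get_dummies(sep="|")

# Attach the indicator columns to the original frame (indexes are aligned)
result = df.join(dummies)
print(result)
```

Joining on the index keeps the original genres_omdb column next to the new 0/1 columns, which makes it easy to spot-check a few rows.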
2 Comments

Wow, so simple. Thank you
@user3736114 yw~ happy coding
