Adding a column to pandas.DataFrame with randomly assigned values from a given set

Question

I have a dataset like:

df["movie"] 
A
B
C
D

How can I add another column, df["genre"], with values randomly assigned from a given list?

genres = ["action", "drama", "comedy"]

so that my df would look like:

movie  genre
  A    action
  B    drama
  C    drama
  D    comedy
    ...

I've tried:

def addGenreColumn():
   for line in data:
       data["genre"] = random.choice(['action', 'comedy', 'drama'])
addGenreColumn()

but it assigns only one value from the list to every row, so the column ends up as all 'action' or all 'comedy'. What is the proper way to handle this?

MrNobody33 · Accepted Answer · 2020-07-03 08:22:35Z

1

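Your loop assigns a single scalar to the whole column on each iteration, so every row ends up with the last value drawn. You could instead use a list comprehension that draws one value per movie: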

import random
import pandas as pd

data = pd.DataFrame({'movie':['A','B','C','D']})

def addGenre():
    data["genre"] = [random.choice(['action', 'comedy', 'drama']) for movie in data.movie]
    
addGenre()

print(data)

Output:

  movie   genre
0     A   drama
1     B  action
2     C  comedy
3     D  action
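As a possible alternative to the list comprehension, the standard library's random.choices can draw all values in one call. This is only a sketch under the same setup as above: the genres list comes from the question, and the output is random, so it will differ between runs.

```python
import random

import pandas as pd

genres = ["action", "drama", "comedy"]
data = pd.DataFrame({"movie": ["A", "B", "C", "D"]})

# random.choices samples with replacement and returns a list of k items,
# so k=len(data) yields exactly one genre per row.
data["genre"] = random.choices(genres, k=len(data))

print(data)
```

Because sampling is with replacement, the same genre can appear on several rows, and some genres may not appear at all.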


HHYurdagul · Accepted Answer · 2020-07-03 08:29:00Z

1

You could use numpy.random.choice:

import numpy

data["genre"] = numpy.random.choice(genres, data["movie"].shape)

This draws one value from the genres list per row (with replacement) and returns an array with the same shape as your first column, so it can be assigned directly to the new column.
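For reference, here is a self-contained sketch of the same idea. It swaps in NumPy's Generator API (np.random.default_rng) in place of the numpy.random.choice call above, and the seed value is an arbitrary assumption added only so that repeated runs give the same assignment.

```python
import numpy as np
import pandas as pd

genres = ["action", "drama", "comedy"]
data = pd.DataFrame({"movie": ["A", "B", "C", "D"]})

# Seeded generator: the seed (0) is an arbitrary choice for repeatability.
rng = np.random.default_rng(seed=0)

# Draw one genre per row, with replacement; size matches the number of rows.
data["genre"] = rng.choice(genres, size=len(data))

print(data)
```

Dropping the seed argument gives a different assignment on each run, which is closer to what the question asks for.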
