
I am working with Python in BigQuery and have a large dataframe df (about 7 million rows). I also have a list day_list that holds some dates (say, all days in a given month).

I am trying to create an additional column "rand_day" in df that holds a random value from day_list in each row.

I tried a loop and an apply function, but both are too slow on a dataset this large.

My first attempt was the loop:

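from random import sample

df["rand_day"] = ""

# Pick one random day per row and write it with .loc (one write per row, so slow)
for i in df["row_nr"]:
  rand_day = sample(day_list, 1)[0]
  df.loc[i, "rand_day"] = rand_day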

My second attempt used apply, first defining a function and then calling it:

def random_day():
  # Return one random element of day_list
  rand_day = sample(day_list, 1)[0]
  return rand_day

# axis=1 calls the function once per row
df["rand_day"] = df.apply(lambda row: random_day(), axis=1)

Any tips on this? Thank you

1 Answer

Use numpy.random.choice, and if necessary convert the dates with pd.to_datetime first:

import numpy as np
import pandas as pd

df = pd.DataFrame({
        'A':list('abcdef'),
        'B':[4,5,4,5,5,4],
})

day_list = pd.to_datetime(['2015-01-02','2016-05-05','2015-08-09'])
#alternative
#day_list = pd.DatetimeIndex(['2015-01-02','2016-05-05','2015-08-09'])

# Draw one random date per row in a single vectorized call
df["rand_day"] = np.random.choice(day_list, size=len(df))
print(df)
   A  B   rand_day
0  a  4 2016-05-05
1  b  5 2016-05-05
2  c  4 2015-08-09
3  d  5 2015-01-02
4  e  5 2015-08-09
5  f  4 2015-08-09

2 Comments

I have a follow-up question to the above, @jezrael: how can I assign values from a list to a dataframe according to a given distribution? The above randomly assigns the elements of a list, but say I have a list of values [50, 40, 30, 20, 10]. Is there a way to assign the value 50 to x% of my df, 40 to y%, 30 to z%, and so on, or to assign them following a normal distribution across len(df)?
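For the weighted case asked about here, one possible sketch uses the p argument of choice, which takes one probability per value. The frame and the weights below are made up for illustration. The weights must sum to 1, and the resulting shares are only approximate because each row is drawn independently.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; any DataFrame works the same way
df = pd.DataFrame({"A": range(10_000)})

values = [50, 40, 30, 20, 10]
# Illustrative weights (assumed, not from the question); they must sum to 1
weights = [0.40, 0.30, 0.15, 0.10, 0.05]

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
df["value"] = rng.choice(values, size=len(df), p=weights)

# Observed share of each value, which should be close to the weights
shares = df["value"].value_counts(normalize=True)
print(shares)
```

If exact percentages are needed instead of approximate ones, a different approach would be required, such as repeating each value a fixed number of times and shuffling.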
A small note: the NumPy docs now recommend using numpy.random.Generator.choice instead of numpy.random.choice.
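As a rough sketch of what that could look like for the answer's example, assuming the same sample frame and dates, a Generator is created with np.random.default_rng and its choice method is called the same way. The seed value is arbitrary and only there to make runs repeatable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": list("abcdef"),
    "B": [4, 5, 4, 5, 5, 4],
})
day_list = pd.to_datetime(["2015-01-02", "2016-05-05", "2015-08-09"])

# Create a Generator; the seed (42) is an arbitrary choice for repeatability
rng = np.random.default_rng(42)

# Draw one date per row from the underlying datetime64 array
df["rand_day"] = rng.choice(day_list.to_numpy(), size=len(df))
print(df)
```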
