I have two dataframes and have a code to extract some data from one of the dataframes and add to the other dataframe:
sales= pd.read_excel("data.xlsx", sheet_name = 'sales', header = 0)
born= pd.read_excel("data.xlsx", sheet_name = 'born', header = 0)
bornuni = born.number.unique()
for babies in bornuni:
datafame = born[born["id"]==number]
for i, r in sales.iterrows():
if r["number"] == babies:
sales.loc[i,'ini_weight'] = datafame["weight"].iloc[0]
sales.loc[i,'ini_date'] = datafame["date of birth"].iloc[0]
else:
pass
this is pretty inefficient with bigger data sets so I want to parallelize this code but I don´t have a clue how to do it. Any help would be great. Here is a link to a mock dataset.