I'm trying to add a column to a pandas DataFrame whose values are, on average, equal to those of an existing column, but which can deviate on each row by a few decimal points. Ideally the deviation would follow a normal distribution, but I'm not sure how to do this.
I've tried some simple code like the one below:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(1,99,size=(100000, 1)), columns=["GOD_prob"])
df["GOD_prob"] = df["GOD_prob"] / 100
df["GOD_odd"] = 1 / df["GOD_prob"]
df["market_prob"] = ((df["GOD_prob"] * 100 ) + np.random.randint(-10,10, len(df))) / 100
df["market_price"] = 1 / df["market_prob"]
The problem I'm having is that, for values in df["GOD_prob"] under 0.10, I can get negative values for df["market_prob"], and I don't want this, as these columns stand for probabilities.
Afterwards I'd like to create another column which deviates from df["GOD_prob"] by 5% on average, but I'm not really sure how to do this either.
Thanks for helping!
For normally distributed values you can use df[col] = np.random.normal(mean, std, size=len(df)). Alternatively, there is np.random.gamma, but you'll have to do some maths to figure out what shape and scale should be.
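As a rough sketch of the maths for shape and scale: a gamma distribution has mean = shape * scale and variance = shape * scale**2, so for a target mean m and standard deviation s you get shape = (m / s)**2 and scale = s**2 / m. The example below assumes that "5% on average" means a relative standard deviation of 5% (rel_std is a name made up for this sketch), and the clipping bounds are arbitrary choices to keep the results valid probabilities.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Same setup as in the question: probabilities between 0.01 and 0.98.
df = pd.DataFrame({"GOD_prob": rng.integers(1, 99, size=100_000) / 100})
p = df["GOD_prob"].to_numpy()

rel_std = 0.05  # assumption: noise std is 5% of each row's probability

# Option 1: normal noise centred on GOD_prob, one draw per row.
# Clipping (bounds chosen arbitrarily here) guards against values outside (0, 1).
noisy = rng.normal(loc=p, scale=rel_std * p)
df["market_prob_normal"] = np.clip(noisy, 0.001, 0.999)

# Option 2: gamma noise, which is always positive.
# mean = shape * scale and std = sqrt(shape) * scale, so with std = rel_std * mean:
shape = 1 / rel_std**2   # constant, because the relative std is constant
scale = p * rel_std**2   # varies per row so that shape * scale == p
# Gamma draws can exceed 1, so cap them from above.
df["market_prob_gamma"] = np.minimum(rng.gamma(shape, scale), 0.999)

df["market_price"] = 1 / df["market_prob_gamma"]
```

Because the noise scales with the probability itself, small values of GOD_prob get proportionally small noise, which is what keeps the normal version from going negative in practice. If you want a fixed absolute spread instead, the gamma version is the safer of the two.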