I need to make this code run faster by vectorizing it:

final1 = pd.DataFrame()
for index, row in demo1.iterrows():
    a = np.random.choice([0, 1], size=1000, p=[1 - row['prob'], row['prob']])
    b = a * row['syb'] * (1 + row['percentage_change_syb'] / 100)
    final1 = final1.append(pd.DataFrame(b).T)
  • Please provide the dataset you are looping over. Commented Jan 7, 2019 at 8:02
  • demo1 = pd.DataFrame({"syb" : [298, 388, 267, 746, 645], "prob" : [0.84, 0.46, 0.68, 0.35, 0.95], "percentage_change_syb" : [1.29, 3.45, 20, 14.9, 12.5]}) Commented Jan 8, 2019 at 9:33
  • Above is the data frame Commented Jan 8, 2019 at 9:34

1 Answer

Since you did not supply data to test against, the following code is untested, but it should work:

def computation(prob, syb, percentage_change_syb):
    a = np.random.choice([0, 1], size=1000, p=[1 - prob, prob])
    b = a * syb * (1 + percentage_change_syb / 100)
    return b.T

final1 = computation(demo1['prob'].values, demo1['syb'].values, demo1['percentage_change_syb'].values)

For more information on the choice of operating on NumPy arrays I recommend this article.


2 Comments

demo1 = pd.DataFrame({"syb" : [298, 388, 267, 746, 645], "prob" : [0.84, 0.46, 0.68, 0.35, 0.95], "percentage_change_syb" : [1.29, 3.45, 20, 14.9, 12.5]})
Above is the data frame. I tried your code, and it raises an error: ValueError: object too deep for desired array
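
The error reported here is consistent with how `np.random.choice` works: its `p` argument must be a 1-D array with one probability per candidate value, so passing `[1 - prob, prob]` with `prob` as a whole column produces a 2-D `p` and fails. One possible workaround, sketched below, is to skip `choice` entirely: draw uniform numbers and compare them against each row's probability, which yields 0/1 values with the same distribution as the original loop (though not the same random stream). The names `rng`, `n_draws`, `scale`, and the seed `0` are assumptions made for this illustration and are not from the question.

```python
import numpy as np
import pandas as pd

demo1 = pd.DataFrame({
    "syb": [298, 388, 267, 746, 645],
    "prob": [0.84, 0.46, 0.68, 0.35, 0.95],
    "percentage_change_syb": [1.29, 3.45, 20, 14.9, 12.5],
})

# Assumption: a seeded generator, used here only to make the sketch reproducible.
rng = np.random.default_rng(0)
n_draws = 1000  # same sample size as in the question

# Column vector of probabilities, shape (n_rows, 1), so it broadcasts per row.
prob = demo1["prob"].to_numpy()[:, None]

# A uniform draw in [0, 1) is below prob with probability prob,
# so this gives 1 with probability prob and 0 otherwise.
a = (rng.random((len(demo1), n_draws)) < prob).astype(int)

# Per-row multiplier: syb * (1 + percentage_change_syb / 100), shape (n_rows, 1).
scale = (demo1["syb"] * (1 + demo1["percentage_change_syb"] / 100)).to_numpy()[:, None]

# One row per row of demo1, one column per draw.
final1 = pd.DataFrame(a * scale, index=demo1.index)
```

The result has one row per row of `demo1` and 1000 columns, like the frame built in the loop, except that the index follows `demo1` instead of repeating 0 for every appended row.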
