Speeding up row-by-row loop with if-condition in Python

Question

I have a dataset of 6 million rows with the columns symbol, timeStamp, open price and close price. I run the following loop, which takes a very long time even though it is very simple (if the open price is NaN, take the close price from the previous row):

for i in range(0,len(price2)):
    print(i)
    if np.isnan(price3.iloc[i,2]):
        price3.iloc[i,2]=price3.iloc[i-1,3]

How can I speed this loop up? As far as I know, I could switch to apply(), but how do I include the if-condition in it?

miradulo · Accepted Answer · 2018-05-12 14:29:09Z

3

Instead of the for loop, you can use pandas.Series.fillna with the shifted Series for the close price.

price3['open price'] = price3['open price'].fillna(price3['close price'].shift(1))

This is vectorized and so should be far faster than your for loop.

Note that I am assuming price2 and price3 have the same length, so you could just as well iterate over price3 in your loop.
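As a minimal sketch of how this behaves, here is the same idea on a small made-up frame. The sample values are hypothetical, and the column names 'open price' and 'close price' are assumed to match your data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are assumptions taken from the question.
price3 = pd.DataFrame({
    "symbol": ["A", "A", "A", "A"],
    "timeStamp": pd.date_range("2018-05-01", periods=4, freq="D"),
    "open price": [10.0, np.nan, 12.0, np.nan],
    "close price": [10.5, 11.5, 12.5, 13.5],
})

# shift(1) moves every close price down one row, so each row
# lines up with the close price of the row before it.
prev_close = price3["close price"].shift(1)

# fillna only touches rows where the open price is missing.
price3["open price"] = price3["open price"].fillna(prev_close)

print(price3["open price"].tolist())  # [10.0, 10.5, 12.0, 12.5]
```

One difference from the loop: for the first row, shift(1) has no previous value, so a missing open price there stays NaN, whereas iloc[i-1] with i = 0 wraps around and reads the last row of the frame.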
