I have over 500,000 rows in my dataframe and a number of similar 'for' loops, which cause my code to take over an hour to complete. Is there a more efficient way of writing the following 'for' loop so that it runs a lot faster?
col_26 = []
col_27 = []
col_28 = []
for ind in df.index:
    if df['A_factor'][ind] > df['B_factor'][ind]:
        col_26.append('Yes')
        col_27.append('No')
        col_28.append(df['A_value'][ind])
    elif df['A_factor'][ind] < df['B_factor'][ind]:
        col_26.append('No')
        col_27.append('Yes')
        col_28.append(df['B_value'][ind])
    else:
        col_26.append('')
        col_27.append('')
        col_28.append(float('nan'))
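A for loop over 500,000 items runs in less than a second, so it is not the for loop itself that causes the trouble. The cost more likely comes from the df['column'][ind] lookups repeated inside it, each of which goes through pandas indexing.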
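One way to avoid the per-row lookups is to compare whole columns at once. The following is a minimal sketch, not a drop-in replacement: it assumes the column names from the question, that the factor and value columns are numeric, and that the results should be stored as new columns named col_26, col_27 and col_28 (an assumption, since the question only builds lists). The three-row sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real 500,000-row dataframe.
df = pd.DataFrame({
    "A_factor": [3.0, 1.0, 2.0],
    "B_factor": [1.0, 4.0, 2.0],
    "A_value": [10.0, 20.0, 30.0],
    "B_value": [11.0, 21.0, 31.0],
})

# Evaluate both comparisons once for the whole column.
a_gt = (df["A_factor"] > df["B_factor"]).to_numpy()
b_gt = (df["A_factor"] < df["B_factor"]).to_numpy()
conditions = [a_gt, b_gt]

# np.select takes the choice for the first condition that is True in each row,
# and the default where neither holds (equal factors or NaN).
df["col_26"] = np.select(conditions, ["Yes", "No"], default="")
df["col_27"] = np.select(conditions, ["No", "Yes"], default="")
df["col_28"] = np.select(
    conditions,
    [df["A_value"].to_numpy(), df["B_value"].to_numpy()],
    default=np.nan,
)
```

Rows where neither comparison is true fall through to the default, which mirrors the else branch of the original loop. If plain lists are needed instead of columns, calling .tolist() on each result should give them.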