
Below is my DataFrame:

import pandas as pd

df = pd.DataFrame({
    'Sr. No': [1, 2, 3, 4, 5, 6],
    'val1': [2, 3, 2, 4, 1, 2],
})

I want an output column val2 as shown in the figures below. Row 1 of val2 is the same as row 1 of val1, but row 2 and every row below it are calculated using a formula, as shown.

[Three images showing the expected val2 output and the formula; not reproduced here]

3 Comments

  • Please show us what you have tried so far. Commented Nov 8, 2020 at 11:25
  • Just write a for loop and do the calculation in plain Python. Commented Nov 8, 2020 at 11:33
  • No, that will be inefficient; I have over 30k lines. Something along the lines of shift might be more relevant, but I can't get my head around it. Commented Nov 8, 2020 at 11:35

1 Answer

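Every row depends on the previous one; C4, for instance, depends on the result calculated for C3. A plain shift on val1 is therefore not enough, because each step needs the val2 value that was just computed. What we can do instead is loop over the underlying NumPy arrays directly.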

sr_no_vals = df['Sr. No'].values
val1_vals = df['val1'].values
# The first row of val2 is simply the first row of val1.
val2_vals = [val1_vals[0]]

# Every later row is built from the previously computed val2.
for i in range(1, len(sr_no_vals)):
    calculated_value = (((1 + val2_vals[i - 1]) ** sr_no_vals[i - 1]) * (1 + val1_vals[i])) ** (1 / sr_no_vals[i])
    val2_vals.append(calculated_value)

df['val2'] = val2_vals
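
Written out, the loop above appears to implement the following recurrence. The symbols are introduced here only for illustration and are not in the original post: n_i stands for Sr. No, v_i for val1 and w_i for val2 in row i.

```latex
w_1 = v_1, \qquad
w_i = \Bigl( (1 + w_{i-1})^{\,n_{i-1}} \, (1 + v_i) \Bigr)^{1 / n_i}
\quad \text{for } i \ge 2
```

For the sample data, row 2 works out as w_2 = ((1 + 2)^1 (1 + 3))^{1/2} = \sqrt{12}, roughly 3.46.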

Because the loop works on NumPy arrays, we can also use a just-in-time compiler such as Numba to speed it up considerably on large data.

import numba
import numpy as np

@numba.jit(nopython=True)
def calc_val2(val1_vals, sr_no_vals):
    # Preallocate a float array so the result has one fixed dtype.
    val2_vals = np.empty(len(val1_vals), dtype=np.float64)
    val2_vals[0] = val1_vals[0]
    for i in range(1, len(sr_no_vals)):
        val2_vals[i] = (((1 + val2_vals[i - 1]) ** sr_no_vals[i - 1]) * (1 + val1_vals[i])) ** (1 / sr_no_vals[i])
    return val2_vals

df['val2'] = calc_val2(val1_vals, sr_no_vals)
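
One caveat for the 30k-row case mentioned in the question: if Sr. No really runs into the thousands, the intermediate power (1 + val2) ** Sr. No may overflow a 64-bit float. Below is a minimal sketch of the same recurrence carried out on logarithms, which keeps the intermediate numbers small. The function name calc_val2_log is hypothetical, and the sketch assumes that 1 + val1 and 1 + val2 stay positive and that Sr. No is never zero.

```python
import numpy as np


def calc_val2_log(val1_vals, sr_no_vals):
    """Same recurrence as the loop above, evaluated in log space.

    Assumption (not from the original post): 1 + val1 and 1 + val2 are
    always positive and Sr. No is non-zero, so every logarithm and
    division below is defined.
    """
    val1_vals = np.asarray(val1_vals, dtype=np.float64)
    sr_no_vals = np.asarray(sr_no_vals, dtype=np.float64)

    val2_vals = np.empty(len(val1_vals), dtype=np.float64)
    val2_vals[0] = val1_vals[0]  # first row is copied from val1

    for i in range(1, len(val1_vals)):
        # log(val2[i]) = (n[i-1] * log(1 + val2[i-1]) + log(1 + val1[i])) / n[i]
        log_val2 = (
            sr_no_vals[i - 1] * np.log1p(val2_vals[i - 1])
            + np.log1p(val1_vals[i])
        ) / sr_no_vals[i]
        val2_vals[i] = np.exp(log_val2)

    return val2_vals


# Sample data from the question.
sample_val2 = calc_val2_log([2, 3, 2, 4, 1, 2], [1, 2, 3, 4, 5, 6])
```

With the DataFrame from the question this could be called as df['val2'] = calc_val2_log(df['val1'].values, df['Sr. No'].values). On small inputs it should agree with the direct formula up to floating-point rounding; the difference only matters once the direct power becomes too large to represent.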

3 Comments

Thanks, that was very helpful. I usually try to avoid loops, but hopefully Numba will speed up the operation. The only edit to your code: the calculated value should be calculated_value = (((1 + val2_vals[i - 1]) ** sr_no_vals[i - 1]) * (1 + val1_vals[i])) ** (1 / sr_no_vals[i])
Thanks for the comment, I fixed the formula. You are right that it is usually good practice to avoid for loops. I wish I knew of a way to make this work without one.
I would be keen to hear if there is a faster way. Thanks again!
