I have the following code working:

import numpy as np
import pandas as pd
colum1 = [0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05,0.05]
colum2 = [1,2,3,4,5,6,7,8,9,10,11,12]
colum3 = [0.85,0.80,0.80,0.80,0.85,0.0,0.0,0.0,0.0,0.0,0.0,0.0]
colum4 = [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05, 455.40, 319.03, 190.86 , 97.07, 26.96 , 0.00]
df = pd.DataFrame({
    'colum1' : colum1,
    'colum2' : colum2,
    'colum3' : colum3,
    'colum4' : colum4,
})

df['result'] = 0
for i in range(len(colum2)):
    df['result'] = np.where(
        df['colum2'] <= 5,
        np.where(
            df['colum2'] == 1,
            df['colum4'],
            np.where(
                ( df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3'])) )>0,
                ( df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3'])) ),
                0
            )
        ),
        np.where(
            ( df['colum4'] - (df['result'].shift(1) * df['colum1']) )>0,
            ( df['colum4'] - (df['result'].shift(1) * df['colum1']) ),
            0
        )
    )

and I need to perform the same operation without resorting to a for loop. This would be very helpful, since I am working with thousands of records and the loop is very slow.

My expected result is the following:

    colum1  colum2  colum3   colum4       result
0     0.05       1    0.85  1743.85  1743.850000
1     0.05       2    0.80  1485.58  1415.826000
2     0.05       3    0.80  1250.07  1193.436960
3     0.05       4    0.80  1021.83   974.092522
4     0.05       5    0.85   818.96   777.561068
5     0.05       6    0.00   628.05   589.171947
6     0.05       7    0.00   455.40   425.941403
7     0.05       8    0.00   319.03   297.732930
8     0.05       9    0.00   190.86   175.973354
9     0.05      10    0.00    97.07    88.271332
10    0.05      11    0.00    26.96    22.546433
11    0.05      12    0.00     0.00     0.000000
  • Why are you using a loop in the first place? Seems like the code will work if you just remove it and bring the loop body to an outer level of indentation. Commented Oct 9, 2018 at 22:58
  • The biggest problem is in the .shift(1) :( Commented Oct 9, 2018 at 23:02
  • How confident are you in your expected result? I can only reproduce the first several rows; after that they differ. Commented Oct 10, 2018 at 21:36
  • @user3483203 Sorry, I just corrected it. Commented Oct 10, 2018 at 22:02

1 Answer

The first step is to remove the loop over the index and replace the tests for values greater than 0 with np.maximum. This works because, for our purposes, np.where(a > 0, a, 0) is equivalent to np.maximum(0, a). The two differ only where a is NaN: np.where yields 0 there, while np.maximum propagates the NaN.
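
As a quick sanity check of that equivalence, here is a minimal sketch on a made-up array (the values and the names a, clipped_where, and clipped_max are illustrative and not taken from the question):

```python
import numpy as np

# Illustrative values only: a mix of negative, zero, and positive numbers.
a = np.array([-2.5, 0.0, 3.0, 7.25])

# Original style: keep a where it is positive, otherwise use 0.
clipped_where = np.where(a > 0, a, 0)

# Shorter form: element-wise maximum of 0 and a.
clipped_max = np.maximum(0, a)

# Both approaches clip negative values to 0 and leave the rest unchanged.
print(np.array_equal(clipped_where, clipped_max))
```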

At the same time, define the longer expressions separately to make your code more readable:

s1 = df['colum4'] - (df['result'].shift(1) * (df['colum1'] * df['colum3']))
s2 = df['colum4'] - (df['result'].shift(1) * df['colum1'])

df['result'] = np.where(df['colum2'] <= 5,
                        np.where(df['colum2'] == 1, df['colum4'],
                                 np.maximum(0, s1)),
                        np.maximum(0, s2))

The next step is to use np.select to remove the nested np.where statements:

m1 = df['colum2'] <= 5
m2 = df['colum2'] == 1

conds = [m1 & m2, m1 & ~m2]
choices = [df['colum4'], np.maximum(0, s1)]

df['result'] = np.select(conds, choices, np.maximum(0, s2))

This version will be more manageable.
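
One caveat worth spelling out: each value of result depends on the previous row's result, so a single vectorised pass over the columns will generally not reproduce the expected output shown in the question. A minimal sketch, assuming the recurrence implied by that expected output (result[i] = max(0, colum4[i] - result[i-1] * rate[i]), restarting wherever colum2 == 1), walks the rows once over plain NumPy arrays. It still contains a loop, but it visits each row a single time instead of recomputing the whole column on every iteration. The names c1, c2, c3, c4, rate, and out are illustrative:

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'colum1': [0.05] * 12,
    'colum2': list(range(1, 13)),
    'colum3': [0.85, 0.80, 0.80, 0.80, 0.85] + [0.0] * 7,
    'colum4': [1743.85, 1485.58, 1250.07, 1021.83, 818.96, 628.05,
               455.40, 319.03, 190.86, 97.07, 26.96, 0.00],
})

# Pull the columns out as NumPy arrays; scalar indexing is much
# cheaper on arrays than on a DataFrame.
c1 = df['colum1'].to_numpy()
c2 = df['colum2'].to_numpy()
c3 = df['colum3'].to_numpy()
c4 = df['colum4'].to_numpy()

# Per-row multiplier: colum1 * colum3 while colum2 <= 5, colum1 afterwards.
rate = np.where(c2 <= 5, c1 * c3, c1)

out = np.zeros(len(df))
for i in range(len(df)):
    if c2[i] == 1:
        # Rows with colum2 == 1 start a new sequence from colum4.
        out[i] = c4[i]
    elif i > 0:
        # Every other row depends on the previous row's result, clipped at 0.
        out[i] = max(0.0, c4[i] - out[i - 1] * rate[i])
    # Assumption: a first row with colum2 != 1 has no previous value and stays 0.

df['result'] = out
print(df)
```

For n rows this does one pass of n steps, whereas the loop in the question recomputes all n rows n times.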

4 Comments

Thanks, it is a more manageable version, but it does not achieve my goal of eliminating the for loop while still obtaining the desired result.
The last solution has no explicit for loop, isn't that what you want?
I just edited the question to add the desired result, but when I remove the loop the result is different.
SO isn't (just) a code writing service. I'm happy to clarify logic, concepts, performance type queries if you have any. But "get these numbers" in itself is not something either I, or other users, will find interesting.
