I have a sequential, grouped table in pandas.

I am trying to create a running sum within each group, with the condition that the running sum can never be negative. The loop below does this, but it does not reset between groups: as you can see, for user_id 77558 the subtotal continues from user_id 223. How do I fix this?

import pandas as pd

user_id = [4705,4705,4705,4705,4705,223,223,223,223,223,223,223,77558,77558,77558,77558,77558,77558,77558,77558,77558,77558]
transaction_c= [1,2,3,4,5,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,9,10]
Credit_Debit = [75,-125,47,75,-122,50,50,100,-200,35,50,-15,100,27,27,-54,1000,-220,-220,-220,-220,1000,]

df = pd.DataFrame(list(zip(user_id,transaction_c,Credit_Debit)), columns =['user_id','transaction', 'Credit_Debit'])

#summary function 

lastvalue = 0
newtotal = []
for row in df['Credit_Debit']:
    thisvalue =  row + lastvalue
    if thisvalue < 0:
        thisvalue = 0
    newtotal.append( thisvalue )
    lastvalue = thisvalue
    
df['Balance']=pd.Series(newtotal, index=df.index)

output


+---------+-------------+--------------+----------+------------------+
| user_id | transaction | Credit_Debit | Balance  | Desired Balance  |
+---------+-------------+--------------+----------+------------------+
|    4705 |           1 |           75 |       75 |               75 |
|    4705 |           2 |         -125 |        0 |                0 |
|    4705 |           3 |           47 |       47 |               47 |
|    4705 |           4 |           75 |      122 |              122 |
|    4705 |           5 |         -122 |        0 |                0 |
|     223 |           1 |           50 |       50 |               50 |
|     223 |           2 |           50 |      100 |              100 |
|     223 |           3 |          100 |      200 |              200 |
|     223 |           4 |         -200 |        0 |                0 |
|     223 |           5 |           35 |       35 |               35 |
|     223 |           6 |           50 |       85 |               85 |
|     223 |           7 |          -15 |       70 |               70 |
|   77558 |           1 |          100 |      170 |              100 |
|   77558 |           2 |           27 |      197 |              127 |
|   77558 |           3 |           27 |      224 |              154 |
|   77558 |           4 |          -54 |      170 |              100 |
|   77558 |           5 |         1000 |     1170 |             1100 |
|   77558 |           6 |         -220 |      950 |              880 |
|   77558 |           7 |         -220 |      730 |              660 |
|   77558 |           8 |         -220 |      510 |              440 |
|   77558 |           9 |         -220 |      290 |              220 |
|   77558 |          10 |         1000 |     1290 |             1220 |
+---------+-------------+--------------+----------+------------------+

I'd appreciate any help in resolving this. Thanks :)

2 Answers

You can first group the DataFrame with groupby and then apply your function (I named it "change") to each group separately, so the running total restarts at zero for every user_id.

import pandas as pd

user_id = [4705,4705,4705,4705,4705,223,223,223,223,223,223,223,77558,77558,77558,77558,77558,77558,77558,77558,77558,77558]
transaction_c= [1,2,3,4,5,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,9,10]
Credit_Debit = [75,-125,47,75,-122,50,50,100,-200,35,50,-15,100,27,27,-54,1000,-220,-220,-220,-220,1000,]

df = pd.DataFrame(list(zip(user_id,transaction_c,Credit_Debit)), columns =['user_id','transaction', 'Credit_Debit'])

#summary function 

def change(df):
    lastvalue = 0
    newtotal = []
    for row in df['Credit_Debit']:
        thisvalue =  row + lastvalue
        if thisvalue < 0:
            thisvalue = 0
        newtotal.append( thisvalue )
        lastvalue = thisvalue
    return pd.Series(newtotal)
    
df['Balance']= df.groupby('user_id',sort=False).apply(change).reset_index(drop=True)

print(df)

Output:

    user_id  transaction  Credit_Debit  Balance
0      4705            1            75       75
1      4705            2          -125        0
2      4705            3            47       47
3      4705            4            75      122
4      4705            5          -122        0
5       223            1            50       50
6       223            2            50      100
7       223            3           100      200
8       223            4          -200        0
9       223            5            35       35
10      223            6            50       85
11      223            7           -15       70
12    77558            1           100      100
13    77558            2            27      127
14    77558            3            27      154
15    77558            4           -54      100
16    77558            5          1000     1100
17    77558            6          -220      880
18    77558            7          -220      660
19    77558            8          -220      440
20    77558            9          -220      220
21    77558           10          1000     1220
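
A possible variation, sketched under the assumption that the data matches the question: groupby(...).transform returns values aligned with the original index, so no reset_index is needed and a user's rows do not have to be contiguous. The helper name clamped_cumsum is made up for this illustration (it is not a pandas function), and itertools.accumulate with initial= assumes Python 3.8 or later.

```python
from itertools import accumulate

import pandas as pd

user_id = [4705] * 5 + [223] * 7 + [77558] * 10
credit_debit = [75, -125, 47, 75, -122, 50, 50, 100, -200, 35, 50, -15,
                100, 27, 27, -54, 1000, -220, -220, -220, -220, 1000]
df = pd.DataFrame({'user_id': user_id, 'Credit_Debit': credit_debit})


def clamped_cumsum(values):
    # Hypothetical helper: running sum that is floored at 0 after every step.
    running = accumulate(values, lambda total, x: max(total + x, 0), initial=0)
    # Drop the leading 0 contributed by `initial` and keep the group's index.
    return pd.Series(list(running)[1:], index=values.index)


# transform calls the helper once per user and aligns the result with df.index
df['Balance'] = (
    df.groupby('user_id', sort=False)['Credit_Debit'].transform(clamped_cumsum)
)
print(df)
```

The per-group loop is still pure Python, so this is about alignment and readability rather than speed.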

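What's happening in your code is that the last balance of one user is carried over and added to the first transaction of the next user. You can check this by changing Credit_Debit[11] to -85: user 223 then ends at 0 and you get your desired output. So you have to reset the running total for every unique ID.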

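LastTrans = 0
NewTotal = []
for i, col in zip(df['transaction'], df['Credit_Debit']):
    # transaction == 1 marks the first row of a new user: restart from zero
    if i == 1:
        LastTrans = 0
    # clamp at zero so the balance never goes negative, including on the first row
    ThisValue = max(col + LastTrans, 0)
    NewTotal.append(ThisValue)
    LastTrans = ThisValue

df['Balance'] = NewTotal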
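
The check on transaction == 1 relies on every user's numbering starting at 1. If that cannot be guaranteed, a similar loop can reset whenever user_id changes from the previous row. This is only a sketch on made-up toy data; the names balances, running and last_user are introduced here for illustration, and it assumes each user's rows are contiguous.

```python
import pandas as pd

# Toy data, invented for this example
df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'Credit_Debit': [10, -30, 5, -4, 7],
})

balances = []
last_user = None
running = 0
for user, amount in zip(df['user_id'], df['Credit_Debit']):
    # A different user_id than the previous row means a new group: restart at 0
    if user != last_user:
        running = 0
        last_user = user
    # Floor the running total at 0
    running = max(running + amount, 0)
    balances.append(running)

df['Balance'] = balances
print(df)
```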

I hope this helps...

1 Comment

@AniruddhaDas I am trying to figure out. I'll post if I get the answer
