I have the following script:

import pandas as pd

df = pd.DataFrame()
df["Stake"]=[0.25,0.15,0.26,0.30,0.10,0.40,0.32,0.11,0.20,0.25]
df["Odds"]=[2.5,4.0,1.75,2.2,1.85,3.2,1.5,1.2,2.15,1.65]
df["Ftr"]=["H","D","A","H","H","A","D","H","H","A"]
df["Ind"]=[1,2,2,1,3,3,3,1,2,2]

which results in:

    Stake   Odds    Ftr Ind
0   0.25    2.50    H   1
1   0.15    4.00    D   2
2   0.26    1.75    A   2
3   0.30    2.20    H   1
4   0.10    1.85    H   3
5   0.40    3.20    A   3
6   0.32    1.50    D   3
7   0.11    1.20    H   1
8   0.20    2.15    H   2
9   0.25    1.65    A   2

I want to create two additional columns, "Start Balance" and "End Balance". "Start Balance" at index 0 is equal to 1000. "End Balance" is always equal to either:

"Start Balance" - "Stake" * "Start Balance" + "Stake" * "Start Balance" * "Odds" if column "Ftr" = "H".

or,

"Start Balance" - "Stake" * "Start Balance" if column "Ftr" is anything other than "H".

Then the "Start Balance" of the next index becomes the "End Balance" of the preceding index. For example, "End Balance" at index 0 becomes "Start Balance" at index 1.

To make things a bit more complicated, "Start Balance" must respect one more condition. If the "Ind" column is different from 1, for example 2, then the "Start Balance" for both rows (indices 1 and 2) is equal to the "End Balance" at index 0. Likewise, where "Ind" is 3, all of those indices (4, 5, 6) should have "Start Balance" equal to the "End Balance" at index 3. The expected result is:

    Stake   Odds    Ftr Ind Start Balance   End Balance
0   0.25    2.5      H   1     1000.0          1375.0
1   0.15     4       D   2     1375.0          1168.8
2   0.26    1.75     A   2     1375.0          1017.5
3   0.3     2.2      H   1     1017.5          1383.8
4   0.1     1.85     H   3     1383.8          1501.4
5   0.4     3.2      A   3     1383.8           830.3
6   0.32    1.5      D   3     1383.8           941.0
7   0.11    1.2      H   1      941.0           961.7
8   0.2     2.15     H   2      961.7          1182.9
9   0.25    1.65     A   2      961.7           721.3

I have not tried anything since I truly don't know how to approach so many conditions :). Cheers

1 Answer

I can't think of a vectorized way to do what you want, so a for loop is the only solution I can offer:

# A temp dataframe to keep track of the End Balance by Ind
# It's empty to start
tmp = pd.DataFrame(columns=['index', 'End Balance']).rename_axis('ind')

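for index, row in df.iterrows():
    stake, odds, ind = row['Stake'], row['Odds'], row['Ind']

    if index == 0:
        # The first row always starts from the initial balance
        start_balance = 1000
    elif ind == 1:
        # Ind == 1: carry over the End Balance of the previous row
        start_balance = df.loc[index - 1, 'End Balance']
    else:
        # Otherwise: End Balance of the most recent row with a different Ind
        start_balance = tmp.query('ind != @ind').sort_values('index')['End Balance'].iloc[-1]

    # A win ("H") pays stake * odds; any other result loses the stake
    end_balance = start_balance * (1 - stake + stake * odds) if row['Ftr'] == 'H' else start_balance * (1 - stake)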

    # Keep track of when the current Ind last occurs
    tmp.loc[ind, ['index', 'End Balance']] = [index, end_balance]

    df.loc[index, 'Start Balance'] = start_balance
    df.loc[index, 'End Balance'] = end_balance

Result:

   Stake  Odds Ftr  Ind  Start Balance  End Balance
0   0.25  2.50   H    1    1000.000000  1375.000000
1   0.15  4.00   D    2    1375.000000  1168.750000
2   0.26  1.75   A    2    1375.000000  1017.500000
3   0.30  2.20   H    1    1017.500000  1383.800000
4   0.10  1.85   H    3    1383.800000  1501.423000
5   0.40  3.20   A    3    1383.800000   830.280000
6   0.32  1.50   D    3    1383.800000   940.984000
7   0.11  1.20   H    1     940.984000   961.685648
8   0.20  2.15   H    2     961.685648  1182.873347
9   0.25  1.65   A    2     961.685648   721.264236
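
As a possible alternative to the loop, the same rule can be sketched without iterating over rows. This is only an illustration, not part of the original answer. It assumes the rule as restated in the comments (a row with Ind == 1 starts from the previous row's End Balance; any other row starts from the End Balance of the last row with a different Ind), which amounts to saying that every contiguous run of equal non-1 Ind values shares one Start Balance. The names `factor`, `new_run`, `run_id`, `last_factor` and `run_start` are made up for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    "Stake": [0.25, 0.15, 0.26, 0.30, 0.10, 0.40, 0.32, 0.11, 0.20, 0.25],
    "Odds": [2.5, 4.0, 1.75, 2.2, 1.85, 3.2, 1.5, 1.2, 2.15, 1.65],
    "Ftr": ["H", "D", "A", "H", "H", "A", "D", "H", "H", "A"],
    "Ind": [1, 2, 2, 1, 3, 3, 3, 1, 2, 2],
})

# Per-row growth factor, so that End Balance = Start Balance * factor.
# The winnings term only counts when Ftr == "H".
win = df["Ftr"].eq("H")
factor = 1 - df["Stake"] + df["Stake"] * df["Odds"] * win

# A new run starts whenever Ind is 1 or Ind differs from the previous row.
new_run = df["Ind"].eq(1) | df["Ind"].ne(df["Ind"].shift())
run_id = new_run.cumsum()

# The balance carried into the next run is the End Balance of the run's
# last row, so only the last factor of each run compounds.
last_factor = factor.groupby(run_id).last()
run_start = 1000 * last_factor.cumprod().shift(fill_value=1.0)

df["Start Balance"] = run_id.map(run_start)
df["End Balance"] = df["Start Balance"] * factor
print(df.round(1))
```

On the sample data this should give the same two columns as the loop. Whether it is preferable depends on the data: the loop is easier to adapt if the rule changes, while this version avoids `iterrows` on large frames.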

6 Comments

In fact, this is not a mistake but one of the conditions that makes me struggle with this. If the "Ind" column is different from 1, for example 2, then the "Start Balance" for both rows where 2 appears should be equal to the "End Balance" in the row before the 2s. In the example, indices 1 and 2 should have as "Start Balance" the "End Balance" from index 0. Likewise, where the "Ind" column is 3, all rows with 3s should have "Start Balance" equal to the "End Balance" in the row preceding the 3s (from the table, indices 4, 5, 6 should have "Start Balance" equal to the "End Balance" at index 3).
Your solution is very close, but not quite there yet :). Thanks anyway.
Let me see if I understand your question correctly: if Ind == 1, Start Balance = End Balance of previous row. If Ind == 2, Start Balance = End Balance of the last row with Ind != 2. If Ind == 3, Start Balance = End Balance of the last row with Ind != 3? Is that what you meant?
Yes, that is exactly what I mean.
It only requires some small modifications in the loop. See my edited answer