I have a dataframe called df that looks like:

            dept          ratio higher  lower
      date  
01/01/1979     B    0.522576565      2      1
01/01/1979     A    0.940614079      2      2
01/01/1979     C    0.873957946      0      1
01/01/1979     B    0.087828824      0      2
01/01/1979     A    0.39754345       1      2
01/01/1979     A    0.475491609      1      2
01/01/1979     B    0.140605283      0      2
01/01/1979     A    0.071007362      0      2
01/01/1979     B    0.480720923      2      2
01/01/1979     A    0.673142643      1      2
01/01/1979     C    0.73554271       0      0

I would like to create a new column called compared. For each row, I want to count the number of values in the dept column that match that row's dept value, and then subtract 1. If the result is greater than or equal to 1, the compared column should hold the result of the following:

`compared` row value = (higher - lower) / (count of dept column values that match the row's dept value - 1)

If the count minus 1 is 0, then 0 should be returned to the compared column.

For example, for the first row in df the dept value is B. There are 4 values of B in the dept column, and 4-1 = 3 is greater than or equal to 1. Therefore in the new compared column I would like the higher column value (2) minus the lower column value (1), which equals 1, divided by 4-1

or

(2-1)/(4-1) = 0.333333333

so my desired output would look like:

            dept          ratio higher  lower      compared
date    
01/01/1979     B    0.522576565      2      1   0.333333333
01/01/1979     A    0.940614079      2      2   0.000000000
01/01/1979     C    0.873957946      0      1  -1.000000000
01/01/1979     B    0.087828824      0      2  -0.666666667
01/01/1979     A    0.39754345       1      2  -0.250000000
01/01/1979     A    0.475491609      1      2  -0.250000000
01/01/1979     B    0.140605283      0      2  -0.666666667
01/01/1979     A    0.071007362      0      2  -0.500000000
01/01/1979     B    0.480720923      2      2   0.000000000
01/01/1979     A    0.673142643      1      2  -0.250000000
01/01/1979     C    0.73554271       0      0   0.000000000
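
For a single row, the rule described above can be sketched as a small function. This is only an illustration of the arithmetic: the name `compared_value` and its parameters are hypothetical and do not appear in the original code.

```python
def compared_value(higher, lower, dept_count):
    """Apply the rule to one row.

    dept_count is the number of rows whose dept equals this row's dept
    (including the row itself). Hypothetical helper for illustration only.
    """
    others = dept_count - 1  # rows sharing the dept, excluding this one
    if others >= 1:
        return (higher - lower) / others
    return 0.0


# First row of the sample: dept B occurs 4 times, higher=2, lower=1
print(compared_value(2, 1, 4))
```

A dept that occurs only once gives `others == 0`, so the function returns 0 instead of dividing by zero.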

I have some code but it's really slow:

    minDept = 1
    for staticidx, row in df.iterrows():
        dept = row['dept']
        # deptPivot holds the number of rows per dept
        deptCount = deptPivot.loc[dept, "date"]  # if error then zero
        # read the values from the current row
        higher = row["higher"]
        lower = row["lower"]

        if deptCount > minDept:
            df.loc[staticidx, "compared"] = (higher - lower) / (deptCount - 1)
        else:
            df.loc[staticidx, "compared"] = 0

Is there a faster way that I can do this?

  • Unless there is a good reason not to do so, variable and function names should follow the lower_case_with_underscores style. Commented May 3, 2020 at 0:39

1 Answer

It's rather straight-forward:

    counts = df.groupby('dept')['dept'].transform('count') - 1

    df['compared'] = (df['higher'] - df['lower']) / counts

    # to avoid possible division by zero warning
    # also to match `counts>0` condition
    # use this instead
    # df.loc[counts > 0, 'compared'] = df['higher'].sub(df['lower']).loc[counts > 0] / counts[counts > 0]

Output:

           dept     ratio  higher  lower  compared
date                                              
01/01/1979    B  0.522577       2      1  0.333333
01/01/1979    A  0.940614       2      2  0.000000
01/01/1979    C  0.873958       0      1 -1.000000
01/01/1979    B  0.087829       0      2 -0.666667
01/01/1979    A  0.397543       1      2 -0.250000
01/01/1979    A  0.475492       1      2 -0.250000
01/01/1979    B  0.140605       0      2 -0.666667
01/01/1979    A  0.071007       0      2 -0.500000
01/01/1979    B  0.480721       2      2  0.000000
01/01/1979    A  0.673143       1      2 -0.250000
01/01/1979    C  0.735543       0      0  0.000000
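
The question also asks for 0 when a dept occurs only once, which the sample data never exercises. As far as I can tell, the plain division produces `inf` or `NaN` in that case, and the commented-out `.loc` variant leaves `NaN` in the rows it skips. Below is a minimal, self-contained sketch that covers that case too. The helper name `add_compared` is hypothetical, and the sample frame is rebuilt from the question's data without the `ratio` column.

```python
import pandas as pd


def add_compared(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with a `compared` column (hypothetical helper)."""
    out = frame.copy()
    # Number of *other* rows that share each row's dept.
    others = out.groupby("dept")["dept"].transform("count") - 1
    diff = out["higher"] - out["lower"]
    # Mask non-positive counts to NaN so nothing is divided by zero,
    # then replace the resulting NaN with the required 0.
    out["compared"] = (diff / others.where(others > 0)).fillna(0.0)
    return out


# Sample data from the question (ratio column omitted, it is not used).
df = pd.DataFrame(
    {
        "dept": list("BACBAABABAC"),
        "higher": [2, 2, 0, 0, 1, 1, 0, 0, 2, 1, 0],
        "lower": [1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 0],
    },
    index=pd.Index(["01/01/1979"] * 11, name="date"),
)

result = add_compared(df)
print(result)
```

One caveat of this sketch: `fillna(0.0)` also turns a missing `higher` or `lower` value into 0, so if the real data contains missing values they may need separate handling.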