I have a dataframe called df that looks like:

               dept        ratio  higher  lower
    date
    01/01/1979    B  0.522576565       2      1
    01/01/1979    A  0.940614079       2      2
    01/01/1979    C  0.873957946       0      1
    01/01/1979    B  0.087828824       0      2
    01/01/1979    A   0.39754345       1      2
    01/01/1979    A  0.475491609       1      2
    01/01/1979    B  0.140605283       0      2
    01/01/1979    A  0.071007362       0      2
    01/01/1979    B  0.480720923       2      2
    01/01/1979    A  0.673142643       1      2
    01/01/1979    C   0.73554271       0      0

I would like to create a new column called `compared`. For each row, I want to count the number of rows in the dept column that match that row's dept value, and then subtract 1 (in other words, the number of other rows in the same dept). If that count is greater than or equal to 1, the `compared` column should hold the result of the following:
`compared` row value = (higher - lower) / (count of rows in the dept column that match the row's dept value - 1)
If that count (after subtracting 1) is 0, then 0 should be returned to the `compared` column.
For example, for the first row in df the dept value is B. There are 4 values of B in the dept column, and 4 - 1 = 3 is greater than or equal to 1. Therefore, in the new `compared` column I would like the higher value (2) minus the lower value (1), which equals 1, divided by 4 - 1,
or
(2-1)/(4-1) = 0.333333333
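To double-check that arithmetic, here is a minimal sketch in plain Python (no pandas). The names `depts`, `others` and `compared_first` are made up for illustration; the dept values are copied from the sample rows in order.

```python
from collections import Counter

# dept values copied from the sample frame, in row order
depts = ["B", "A", "C", "B", "A", "A", "B", "A", "B", "A", "C"]
higher_first, lower_first = 2, 1  # first row: higher=2, lower=1

dept_counts = Counter(depts)        # number of rows per dept
others = dept_counts[depts[0]] - 1  # other rows sharing the first row's dept

# Rule as described: divide when at least one other row shares the dept, else 0
compared_first = (higher_first - lower_first) / others if others >= 1 else 0
print(dept_counts["B"], others, round(compared_first, 9))  # 4 3 0.333333333
```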
So my desired output would look like:

               dept        ratio  higher  lower      compared
    date
    01/01/1979    B  0.522576565       2      1   0.333333333
    01/01/1979    A  0.940614079       2      2   0.000000000
    01/01/1979    C  0.873957946       0      1  -1.000000000
    01/01/1979    B  0.087828824       0      2  -0.666666667
    01/01/1979    A   0.39754345       1      2  -0.250000000
    01/01/1979    A  0.475491609       1      2  -0.250000000
    01/01/1979    B  0.140605283       0      2  -0.666666667
    01/01/1979    A  0.071007362       0      2  -0.500000000
    01/01/1979    B  0.480720923       2      2   0.000000000
    01/01/1979    A  0.673142643       1      2  -0.250000000
    01/01/1979    C   0.73554271       0      0   0.000000000

I have some code, but it's really slow:

    minDept = 1
    for staticidx, row in df.iterrows():
        dept = row['dept']
        # deptPivot (not shown) is a pivot table of row counts per dept
        deptCount = deptPivot.loc[dept, "date"]  # if error then zero
        myLongs = df.loc[staticidx, "higher"]
        myShorts = df.loc[staticidx, "lower"]
        if deptCount > minDept:
            df.loc[staticidx, "compared"] = (myLongs - myShorts) / (deptCount - 1)
        else:
            df.loc[staticidx, "compared"] = 0

Is there a faster way that I can do this?
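
For what it's worth, one vectorized direction that might replace the loop is sketched below. It is only a sketch under assumptions: the frame is rebuilt by hand from the sample rows (the ratio column is left out because the rule does not use it), the per-dept counts come from `value_counts` rather than from `deptPivot`, and the names `others`, `diff` and `safe` are made up for illustration. It works positionally on NumPy arrays, so it does not depend on the date index being unique.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand (ratio omitted; it is not used by the rule)
df = pd.DataFrame(
    {
        "dept": list("BACBAABABAC"),
        "higher": [2, 2, 0, 0, 1, 1, 0, 0, 2, 1, 0],
        "lower": [1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 0],
    },
    index=pd.Index(["01/01/1979"] * 11, name="date"),
)

# For every row: how many *other* rows share its dept
others = df["dept"].map(df["dept"].value_counts()).to_numpy() - 1
diff = (df["higher"] - df["lower"]).to_numpy()

# Swap zero denominators for 1 so the division is always defined,
# then keep the quotient only where others >= 1 and use 0 elsewhere
safe = np.where(others >= 1, others, 1)
df["compared"] = np.where(others >= 1, diff / safe, 0.0)

print(df)
```

On the sample data this gives the same `compared` values as the desired output, without iterating over rows in Python.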