This is my df:
import pandas as pd
df = pd.DataFrame({'id': [1,1,1,1,2,2,3,3,3],
                   'col1': [7,6,12,1,3,6,10,11,12],
                   'col2': [1.2,0.8,0.9,1.1,2.0,1.8,0.7,0.9,1.2]})
I want to apply two functions to each group, each of which returns exactly one value per group:
def myfunc1(g):
    var1 = g['col1'].iloc[0]
    var2 = g.loc[g['col2'] > 1, 'col1'].iloc[0]
    return var1 / var2
def myfunc2(g):
    var1 = g['col1'].iloc[0]
    var2 = g.loc[g['col2'] < 1, 'col1'].iloc[0]
    return var2 - var1
If I run them this way, the code fails:
df[['new_col1','new_col2']] = df.groupby("id").apply(myfunc1,myfunc2)
However, if I run them separately (see below), everything works fine:
df['new_col1'] = df.groupby("id").apply(myfunc1)
df['new_col2'] = df.groupby("id").apply(myfunc2)
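A likely reason the combined call fails: groupby's apply takes one function, and any extra positional arguments are forwarded to that function rather than treated as further functions. The sketch below uses a small made-up frame and two hypothetical helpers (first_value, last_value) to show this; it is an illustration of the behaviour, not the original data.

```python
import pandas as pd

# Small illustrative frame (not the data from the question).
demo = pd.DataFrame({'id': [1, 1, 2], 'col1': [7, 6, 3]})

def first_value(g):
    # Hypothetical helper: accepts only the group.
    return g['col1'].iloc[0]

def last_value(g):
    # Hypothetical helper: accepts only the group.
    return g['col1'].iloc[-1]

try:
    # last_value is passed through as a second argument to first_value,
    # so each call becomes first_value(group, last_value).
    demo.groupby('id')[['col1']].apply(first_value, last_value)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```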
The expected output should have the following columns:
- id
- new_col1
- new_col2
myfunc2 throws an error because for one of the groups there is nothing for iloc[0] to return (id 2 does not have a col2 value below 1).
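One possible way to get both columns from a single apply is to wrap the two calculations in one function that returns a pd.Series, and to fall back to NaN when a filtered selection is empty. This is only a sketch: first_or_nan and both_results are hypothetical names, and returning NaN for id 2 is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 3, 3, 3],
                   'col1': [7, 6, 12, 1, 3, 6, 10, 11, 12],
                   'col2': [1.2, 0.8, 0.9, 1.1, 2.0, 1.8, 0.7, 0.9, 1.2]})

def first_or_nan(values):
    # Hypothetical helper: first element, or NaN if the selection is empty.
    return values.iloc[0] if len(values) else np.nan

def both_results(g):
    # Same logic as myfunc1 and myfunc2, computed in one pass over the group.
    var1 = g['col1'].iloc[0]
    above = first_or_nan(g.loc[g['col2'] > 1, 'col1'])
    below = first_or_nan(g.loc[g['col2'] < 1, 'col1'])
    return pd.Series({'new_col1': var1 / above, 'new_col2': below - var1})

# One row per id; selecting the columns explicitly keeps 'id' out of each group.
result = df.groupby('id')[['col1', 'col2']].apply(both_results).reset_index()
print(result)
```

If the values are needed on every original row rather than one row per id, df.merge(result, on='id') attaches them by id instead of by row index.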