I have three columns in a pandas dataframe.
df = pd.DataFrame({'month':[1,2,3,4,5,6,7,8,9,10,11,12],
'day':[10,23,16,30,1,23,3,28,29,1,4,2],
'year':[1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990]})
I want to return the water year in another column which is defined as year +1 for any day after September 28. For example September 29 1990 is water year 1991.
I have a function that works on its own
def wy(month, day, year):
if month >8 and day > 28:
year = year + 1
else:
if month>9:
year = year + 1
else:
year = year
return year
wy(9,30,2000)
But
df['wy'] = wy(df['month'],df['day'],df['year'])
returns the error
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
When I tried to apply it to the dataframe. I have looked at the docs and other questions and have tried multiple other functions such as
def wy(month, day, year):
if any(month >8 & day > 28):
year = year + 1
else:
if any(month>9):
year = year + 1
else:
year = year
return year
def wy(month, day, year):
if any((month >8 & day > 28)|month>9):
year = year + 1
return year
But I still am returning errors. I have also tried a lambda expression (not very pretty)
df['wy'] = (lambda x,y,z:z+1 if (x >8 & y > 28)|y>9 else z, df['month'],df['day'],df['year'])
Thanks for any help