2

I have three columns in a pandas dataframe.

df = pd.DataFrame({'month':[1,2,3,4,5,6,7,8,9,10,11,12],
                   'day':[10,23,16,30,1,23,3,28,29,1,4,2],
                    'year':[1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990]})

I want to return the water year in another column which is defined as year +1 for any day after September 28. For example September 29 1990 is water year 1991.

I have a function that works on its own

def wy(month, day, year):
    if month >8 and day > 28:
        year = year + 1
    else:
        if month>9:
            year = year + 1
        else:
            year = year
    return year
wy(9,30,2000)

But

df['wy'] = wy(df['month'],df['day'],df['year'])

returns the error

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

When I tried to apply it to the dataframe. I have looked at the docs and other questions and have tried multiple other functions such as

def wy(month, day, year):
    if any(month >8 & day > 28):
        year = year + 1
    else:
        if any(month>9):
            year = year + 1
        else:
            year = year
    return year


def wy(month, day, year):
    if any((month >8 & day > 28)|month>9):
        year = year + 1
    return year

But I still am returning errors. I have also tried a lambda expression (not very pretty)

df['wy'] = (lambda x,y,z:z+1 if (x >8 & y > 28)|y>9 else z, df['month'],df['day'],df['year'])

Thanks for any help

1 Answer 1

2

You can use a tuple of (month, day) and compare to that applied across the column axis, eg:

df = pd.DataFrame({'month':[1,2,3,4,5,6,7,8,9,10,11,12],
                   'day':[10,23,16,30,1,23,3,28,29,1,4,2],
                    'year':[1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990,1990]})

Using:

df.apply(lambda L: L.year + ((L.month, L.day) >= (9, 28)), axis=1)

Gives you:

0     1990
1     1990
2     1990
3     1990
4     1990
5     1990
6     1990
7     1990
8     1991
9     1991
10    1991
11    1991
dtype: int64

This works because tuples are compared element by element and month/day are natural ordering. Any true condition evaluates to what's effectively 1 and a false condition to 0 - and we add that to the year to move it into the next if required.

Sign up to request clarification or add additional context in comments.

2 Comments

this works and taught me something valuable about tuples
@JeffTilton oh - probably want > instead of >= but you've probably already noticed that :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.