I've searched for this and cannot find a solution. I need to count rows where either of two conditions is met. In my dataset, I have used "apply" with a lambda function for a single condition (< or >). However, I have a continuous data column where the count should include either a low value OR a high value. I have tried variations of the code below but keep getting this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Let's say my data (dfdata) looks like this:

Site   data  month  day   year
A      4     1      1     2021
A      17    1      2     2021
A      8     1      3     2021
A      7     1      1     2022
A      0     1      2     2022
A      2     1      3     2022
B      3     1      1     2021
B      16    1      2     2021
B      9     1      3     2021
B      2     1      1     2022
B      18    1      2     2022
B      5     1      3     2022

I've used a for loop that should give the result below by evaluating the "data" column and counting the instances where the value is < 4 OR > 15. I think that the "|" operator might do this, but I get True/False...

sites = ['A','B']
n = len(sites)
dft = pd.DataFrame(); 
for n in sites: 
    dft.loc[:,n] = dfdata[dfdata['Site']==n].groupby(["month", "day"])["data"].apply(lambda x: (x < 4) or (x > 15).sum())

The expected result:

month   day   A    B
1       1     0    2
1       2     2    2
1       3     1    0

Thanks for your help.

1 Answer

You don't have to use (and should avoid) loops in pandas. Aside from being slow, they also make your intention harder to read.

Here's one solution using pandas functions:

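dft = (
    dfdata.query("data < 4 or data > 15")   # keep only the out-of-range rows
    .groupby(["month", "day", "Site"])["data"]
    .size()                                 # count rows per group (sum() would add up the values instead)
    .unstack(fill_value=0)                  # one column per site, 0 where a site has no matches
)

The query filters for rows whose data is < 4 or > 15. The rest counts the remaining rows per group and reshapes the result so that each site becomes a column, which matches the expected table in the question.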
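
As for the error itself: Python's `or` tries to reduce each whole Series to a single True/False, which is what raises the ValueError. The element-wise operator is `|`, with parentheses around each comparison. Below is a minimal, self-contained sketch of a mask-based variant. The DataFrame construction is my reconstruction of the sample data in the question, and `out_of_range` and `counts` are names I made up for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how dfdata is built).
dfdata = pd.DataFrame({
    "Site": ["A"] * 6 + ["B"] * 6,
    "data": [4, 17, 8, 7, 0, 2, 3, 16, 9, 2, 18, 5],
    "month": [1] * 12,
    "day": [1, 2, 3] * 4,
    "year": [2021] * 3 + [2022] * 3 + [2021] * 3 + [2022] * 3,
})

# Element-wise OR: use `|` and wrap each comparison in parentheses.
# Writing `(x < 4) or (x > 15)` raises the "truth value of a Series is ambiguous" error.
out_of_range = (dfdata["data"] < 4) | (dfdata["data"] > 15)

# Summing a boolean mask counts its True values. Because no rows are filtered out,
# groups without any match stay in the result as 0.
counts = (
    out_of_range.groupby([dfdata["month"], dfdata["day"], dfdata["Site"]])
    .sum()
    .unstack(fill_value=0)
)
print(counts)
```

One design difference worth noting: filtering first drops any (month, day) combination where no site has a match, while the mask approach keeps every combination. For the sample data both give the same table.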


1 Comment

You might want to use a pivot_table with aggfunc='sum' here ;)
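
A rough sketch of what that pivot_table suggestion could look like, again on a reconstruction of the sample data from the question (the variable names `filtered`, `sums` and `counts` are mine). With `aggfunc="sum"` the cells hold the summed values of the matching rows; if the goal is the number of matching rows, as in the expected table, `aggfunc="count"` appears to be the closer fit.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about how dfdata is built).
dfdata = pd.DataFrame({
    "Site": ["A"] * 6 + ["B"] * 6,
    "data": [4, 17, 8, 7, 0, 2, 3, 16, 9, 2, 18, 5],
    "month": [1] * 12,
    "day": [1, 2, 3] * 4,
    "year": [2021] * 3 + [2022] * 3 + [2021] * 3 + [2022] * 3,
})

# Keep only the rows where data is below 4 or above 15.
filtered = dfdata.query("data < 4 or data > 15")

# aggfunc="sum" adds up the matching values per (month, day) and site.
sums = filtered.pivot_table(
    index=["month", "day"], columns="Site", values="data",
    aggfunc="sum", fill_value=0,
)

# aggfunc="count" counts the matching rows instead.
counts = filtered.pivot_table(
    index=["month", "day"], columns="Site", values="data",
    aggfunc="count", fill_value=0,
)
print(sums)
print(counts)
```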
