I want to write a query to determine the success rate for each day and each mode. I wrote a query that simply groups by date and mode, which serves my purpose, but one of my seniors wrote the following query, which also works. I just can't understand how the if clause works. Here is part of the query:

SELECT 
        dt,
        sum(if(mode='A',success,0))AS a_s,
        sum(if(mode='A',total,0))AS a_t,    
        sum(if(mode='B',success,0))AS b_s,
        sum(if(mode='B',total,0))AS b_t,
        sum(if(mode='C',success,0))AS c_s,
        sum(if(mode='C',total,0))AS c_t,
        sum(if(mode='D',success,0))AS d_s,
        sum(if(mode='D',total,0))AS d_t,
        sum(if(mode NOT in('A','B','C','D'),success,0))AS other_s,
        sum(if(mode NOT in('A','B','C','D'),total,0))AS other_t 
    FROM
        (SELECT 
            mode,
            date(addedon)AS dt,
            sum(if(status in('success','partial'),1,0))AS success,
            count(*)AS total 
        FROM `a_huge_ass_table` 
        WHERE `studentid`=159633 AND addedon>'2021-01-15' 
        GROUP BY mode,date(addedon)
        )AS t 

I can't understand how the if clause in sum(if(mode='A',success,0))AS a_s works. If the condition is true, the clause returns success? How does that work? Does adding success also somehow verify that the status is a success case? I can't find this on Google.

1 Answer

First, if() is not standard SQL; it is a MySQL function. I recommend rewriting this using case:

    sum(case when mode = 'A' then success else 0 end) as a_s,
    sum(case when mode = 'A' then total else 0 end) as a_t, 

and so on.

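Second, this query is missing the final group by dt. Without it, the query collapses everything into a single row (or fails outright when MySQL's ONLY_FULL_GROUP_BY mode is enabled) rather than returning a separate row for each dt value.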

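This is called conditional aggregation. Every row in the final result set represents a group of rows from the subquery. Within this group, some rows have mode = 'A' and some do not. For the rows with mode = 'A', the expressions above add up the values of success and total; every other row contributes 0. Note that success here is just a column from the subquery (the per-mode, per-day count of rows whose status is 'success' or 'partial'), so the if() does not re-check the status; it only decides which rows' counts go into the sum.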
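To see conditional aggregation on concrete numbers, here is a minimal sketch that runs the case form against a tiny in-memory table. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the case syntax is the same in both), and the table t and its four rows are made up for illustration; they mimic what the subquery would return, one row per (mode, dt).

```python
import sqlite3

# Hypothetical stand-in for the subquery result: one row per (mode, dt),
# with "success" and "total" already counted for that mode and day.
rows = [
    ("A", "2021-01-16", 3, 5),
    ("B", "2021-01-16", 1, 2),
    ("A", "2021-01-17", 2, 2),
    ("E", "2021-01-17", 0, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (mode TEXT, dt TEXT, success INTEGER, total INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Each CASE passes the column value through for matching rows and 0 for the
# rest, so every SUM only "sees" the rows of its own mode.
query = """
SELECT dt,
       SUM(CASE WHEN mode = 'A' THEN success ELSE 0 END) AS a_s,
       SUM(CASE WHEN mode = 'A' THEN total   ELSE 0 END) AS a_t,
       SUM(CASE WHEN mode NOT IN ('A', 'B', 'C', 'D') THEN total ELSE 0 END) AS other_t
FROM t
GROUP BY dt
ORDER BY dt
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('2021-01-16', 3, 5, 0)
# ('2021-01-17', 2, 2, 4)
```

On 2021-01-16 the mode B row contributes 0 to both a_s and a_t, which is why only the mode A values (3 and 5) show up in those columns.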

There is no need for a subquery, by the way; it just slows down the query. I would recommend writing the query as:

SELECT date(addedon) as dt,
       SUM( mode = 'A' AND status IN ('success', 'partial') ) as a_success,
       SUM( mode = 'A' ) as a_total,
       . . .
FROM `a_huge_ass_table`
WHERE studentid = 159633 AND addedon > '2021-01-15'
GROUP BY date(addedon);

Note that this uses a MySQL extension where boolean expressions are treated as integers, with "1" for true and "0" for false.
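As a rough check that summing a boolean really counts the matching rows, the sketch below compares the boolean form with the equivalent case form on the raw rows. It again assumes Python's built-in sqlite3 module, which happens to evaluate comparisons to 1 and 0 the same way MySQL does; the attempts table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical raw table: one row per attempt, no pre-aggregation.
conn.execute("CREATE TABLE attempts (mode TEXT, addedon TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO attempts VALUES (?, ?, ?)",
    [
        ("A", "2021-01-16 09:00:00", "success"),
        ("A", "2021-01-16 10:00:00", "failed"),
        ("A", "2021-01-16 11:00:00", "partial"),
        ("B", "2021-01-16 12:00:00", "success"),
        ("A", "2021-01-17 08:30:00", "failed"),
    ],
)

# Boolean form: each condition evaluates to 1 or 0, so SUM counts the matches.
boolean_form = conn.execute("""
    SELECT date(addedon) AS dt,
           SUM(mode = 'A' AND status IN ('success', 'partial')) AS a_success,
           SUM(mode = 'A') AS a_total
    FROM attempts
    GROUP BY date(addedon)
    ORDER BY dt
""").fetchall()

# Portable form: the same counts spelled out with CASE.
case_form = conn.execute("""
    SELECT date(addedon) AS dt,
           SUM(CASE WHEN mode = 'A' AND status IN ('success', 'partial')
                    THEN 1 ELSE 0 END) AS a_success,
           SUM(CASE WHEN mode = 'A' THEN 1 ELSE 0 END) AS a_total
    FROM attempts
    GROUP BY date(addedon)
    ORDER BY dt
""").fetchall()

print(boolean_form)
# [('2021-01-16', 2, 3), ('2021-01-17', 0, 1)]
print(boolean_form == case_form)
# True
```

The case form is the one to reach for if the query ever has to run on a database that does not treat booleans as integers.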

1 Comment

Yes, it needs a final group by dt; I did not post the whole thing. And yes, the subquery is redundant here; I just didn't understand how the if clause works. Thank you very much, I'll look into conditional aggregation. Also, your query with booleans looks far cleaner and faster. Thanks again.
