I have a query that converts multiple rows into a single row, and I would like to know whether there is a better technique than the one we currently use. To illustrate our case, I have taken a simple usercars relation and written queries similar to those in our application.

table: Users       PrimaryKey: UserId
---------------------------------------
| UserId | UserDetails1 | UserDetails2|
---------------------------------------
| 1      | name1        | Addr1       |
| 2      | name2        | Addr2       |
---------------------------------------

table: UserCars    Unique Constraint(UserId, CarType) 
                   index on userid, cartype
-------------------------------------------
| UserId | CarType | RedCount | BlueCount |
------------------------------------------- 
|   1    |   SUV   |    1     |    0      |
|   1    |  sedan  |    1     |    2      |
|   2    |  sedan  |    1     |    0      |
-------------------------------------------    
Consider CarType as an enum type with values SUV and sedan only 

The application needs to fetch UserDetails1, sum(RedCount), sum(BlueCount), the SUV RedCount and BlueCount, and the sedan RedCount and BlueCount for every user in a single query.

For the above example, the result should look like this:

------------------------------------------------------------------------------------------
| UserId | UserDetails1 | TotalRed | TotalBlue | SUVRed | SUVBlue | sedanRed | sedanBlue |
------------------------------------------------------------------------------------------
|   1    | name1        |    2     |     2     |   1    |    0    |    1     |     2     |
|   2    | name2        |    1     |     0     |   0    |    0    |    1     |     0     |
------------------------------------------------------------------------------------------

Currently, our query looks like this:

SELECT
    -- User information
    u.UserId, u.UserDetails1,
    -- Total counts by colour
    count_by_colour.TotalRed, count_by_colour.TotalBlue,
    -- Counts by type
    COALESCE(suv.red, 0) AS SUVRed, COALESCE(suv.blue, 0) AS SUVBlue,
    COALESCE(sedan.red, 0) AS sedanRed, COALESCE(sedan.blue, 0) AS sedanBlue
FROM Users u
JOIN (
    SELECT c.UserId, SUM(RedCount) AS TotalRed,
           SUM(BlueCount) AS TotalBlue
    FROM UserCars c
    GROUP BY c.UserId
) count_by_colour
ON (u.UserId = count_by_colour.UserId)
LEFT JOIN (
    SELECT UserId, RedCount AS red, BlueCount AS blue
    FROM UserCars
    WHERE CarType = 'SUV'
) suv
ON (u.UserId = suv.UserId)
LEFT JOIN (
    SELECT UserId, RedCount AS red, BlueCount AS blue
    FROM UserCars
    WHERE CarType = 'sedan'
) sedan
ON (u.UserId = sedan.UserId)

Though the query fetches the data as expected, I would like to know if there is a better technique. In this example I have given only two types (SUV and sedan), but our real application, which is related to marketing, has more types, and every additional type means another left join.

Note: the tables cannot be altered because other applications use them.

Thanks,
Ravi

  • While this query is a bit ugly, it is logically correct and I can't think of a way to simplify it. Commented Feb 26, 2016 at 15:11
  • Are you sure you want both mysql and postgresql? Commented Feb 26, 2016 at 15:11
  • @TimBiegeleisen I slightly modified it and commented below. It performs better than the original query. Commented Feb 27, 2016 at 17:36

3 Answers

You can use conditional aggregation:

SELECT u.UserId, u.UserDetails1, 
       SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN RedCount  END), 0) AS SUVRed,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN BlueCount  END), 0) AS SUVBlue,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN RedCount  END), 0) AS SedanRed,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN BlueCount  END), 0) AS SedanBlue
FROM Users AS u    
LEFT JOIN UserCars AS uc 
  ON u.UserId = uc.UserId
GROUP BY u.UserId, u.UserDetails1 
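
To check the query against the sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for PostgreSQL purely for convenience (that is an assumption of this sketch, not the original setup), and the ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostgreSQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        UserId INTEGER PRIMARY KEY, UserDetails1 TEXT, UserDetails2 TEXT
    );
    CREATE TABLE UserCars (
        UserId INTEGER, CarType TEXT, RedCount INTEGER, BlueCount INTEGER,
        UNIQUE (UserId, CarType)
    );
    INSERT INTO Users VALUES (1, 'name1', 'Addr1'), (2, 'name2', 'Addr2');
    INSERT INTO UserCars VALUES
        (1, 'SUV', 1, 0), (1, 'sedan', 1, 2), (2, 'sedan', 1, 0);
""")

# Conditional aggregation: each CASE keeps only the rows of one car type,
# so a single pass over UserCars yields every per-type column.
CONDITIONAL_AGGREGATION = """
SELECT u.UserId, u.UserDetails1,
       SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN RedCount END), 0) AS SUVRed,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN BlueCount END), 0) AS SUVBlue,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN RedCount END), 0) AS SedanRed,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN BlueCount END), 0) AS SedanBlue
FROM Users AS u
LEFT JOIN UserCars AS uc
  ON u.UserId = uc.UserId
GROUP BY u.UserId, u.UserDetails1
ORDER BY u.UserId
"""

rows = conn.execute(CONDITIONAL_AGGREGATION).fetchall()
for row in rows:
    print(row)
# (1, 'name1', 2, 2, 1, 0, 1, 2)
# (2, 'name2', 1, 0, 0, 0, 1, 0)
```

For user 1 this gives SUVBlue = 0 and SedanBlue = 2, which is what the UserCars rows in the question contain.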

1 Comment

Thank you very much for the suggestion. I slightly modified the query and am now using it. I have posted the modified query below.

As @Giorgos Betsos pointed out, conditional aggregation can be used to avoid the left joins in my initial query; thanks to him for suggesting that. The reason I am not accepting his answer as the answer to the original question is that grouping the joined rows by every selected column of the Users table takes more time. In the real case more columns have to be fetched from the Users table, so that grouping should be avoided.

I slightly modified his query as follows:

SELECT
 --User Information 
u.UserId, u.UserDetails1,
 --Total Counts by color
temp.TotalRed, temp.TotalBlue,
  -- Counts by type
temp.SUVRed, temp.SUVBlue, 
temp.sedanRed, temp.sedanBlue
FROM Users u
JOIN (SELECT userid,
      SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN RedCount  END), 0) AS SUVRed,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN BlueCount  END), 0) AS SUVBlue,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN RedCount  END), 0) AS SedanRed,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN BlueCount  END), 0) AS SedanBlue
FROM usercars GROUP BY userid) temp
ON (temp.userid = u.userid)

I ran both queries against the same dataset; the query plans are as follows.

For the query in Giorgos Betsos's answer:

 GroupAggregate  (cost=34407.59..41848.99 rows=99999 width=25) (actual time=477.323..644.976 rows=99999 loops=1)
   ->  Sort  (cost=34407.59..34903.09 rows=198197 width=25) (actual time=477.303..513.956 rows=199974 loops=1)
         Sort Key: u.userid, u.userdetails1
         Sort Method: external merge  Disk: 7608kB
         ->  Hash Right Join  (cost=3375.98..12227.15 rows=198197 width=25) (actual time=83.339..265.419 rows=199974 loops=1)
               Hash Cond: (uc.userid = u.userid)
               ->  Seq Scan on usercars uc  (cost=0.00..3176.51 rows=199951 width=16) (actual time=0.009..48.687 rows=199951 loops=1)
               ->  Hash  (cost=1636.99..1636.99 rows=99999 width=13) (actual time=83.137..83.137 rows=99999 loops=1)
                     Buckets: 4096  Batches: 8  Memory Usage: 570kB
                     ->  Seq Scan on users u  (cost=0.00..1636.99 rows=99999 width=13) (actual time=0.009..34.343 rows=99999 loops=1)
 Total runtime: 649.600 ms

For the modified query given above:

Hash Join  (cost=3376.40..23359.86 rows=100884 width=61) (actual time=87.938..392.103 rows=99976 loops=1)
   Hash Cond: (temp.userid = u.userid)
   ->  Subquery Scan on temp  (cost=0.42..15883.52 rows=100884 width=52) (actual time=0.064..231.107 rows=99976 loops=1)
         ->  GroupAggregate  (cost=0.42..14874.68 rows=100884 width=16) (actual time=0.063..216.605 rows=99976 loops=1)
               ->  Index Scan using user_cartype on usercars  (cost=0.42..8367.18 rows=199951 width=16) (actual time=0.036..44.917 rows=199951 loops=1)
   ->  Hash  (cost=1636.99..1636.99 rows=99999 width=13) (actual time=87.635..87.635 rows=99999 loops=1)
         Buckets: 4096  Batches: 8  Memory Usage: 570kB
         ->  Seq Scan on users u  (cost=0.00..1636.99 rows=99999 width=13) (actual time=0.008..36.204 rows=99999 loops=1)
 Total runtime: 395.397 ms
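
One thing the two plans show is that the queries are not strictly equivalent: the first returns 99999 rows and the modified one 99976. The inner JOIN drops users that have no rows in usercars, whereas the LEFT JOIN in the first query keeps them. If those users are needed, a LEFT JOIN to the derived table keeps the pre-aggregation. Below is a small sketch of the difference, using SQLite through Python's sqlite3 as a stand-in for PostgreSQL and a made-up third user (name3) who owns no cars; the helper name fetch is only for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        UserId INTEGER PRIMARY KEY, UserDetails1 TEXT, UserDetails2 TEXT
    );
    CREATE TABLE UserCars (
        UserId INTEGER, CarType TEXT, RedCount INTEGER, BlueCount INTEGER,
        UNIQUE (UserId, CarType)
    );
    -- User 3 is hypothetical: a user without any UserCars rows.
    INSERT INTO Users VALUES
        (1, 'name1', 'Addr1'), (2, 'name2', 'Addr2'), (3, 'name3', 'Addr3');
    INSERT INTO UserCars VALUES
        (1, 'SUV', 1, 0), (1, 'sedan', 1, 2), (2, 'sedan', 1, 0);
""")


def fetch(join_kind):
    """Run the pre-aggregated query with either 'JOIN' or 'LEFT JOIN'."""
    sql = f"""
        SELECT u.UserId, u.UserDetails1,
               COALESCE(temp.TotalRed, 0), COALESCE(temp.TotalBlue, 0),
               COALESCE(temp.SUVRed, 0), COALESCE(temp.SUVBlue, 0),
               COALESCE(temp.SedanRed, 0), COALESCE(temp.SedanBlue, 0)
        FROM Users u
        {join_kind} (
            SELECT UserId,
                   SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
                   SUM(CASE WHEN CarType = 'SUV' THEN RedCount END) AS SUVRed,
                   SUM(CASE WHEN CarType = 'SUV' THEN BlueCount END) AS SUVBlue,
                   SUM(CASE WHEN CarType = 'sedan' THEN RedCount END) AS SedanRed,
                   SUM(CASE WHEN CarType = 'sedan' THEN BlueCount END) AS SedanBlue
            FROM UserCars
            GROUP BY UserId
        ) temp
          ON temp.UserId = u.UserId
        ORDER BY u.UserId
    """
    return conn.execute(sql).fetchall()


inner_rows = fetch("JOIN")       # users 1 and 2 only
left_rows = fetch("LEFT JOIN")   # also user 3, with all counts 0

print(len(inner_rows), len(left_rows))
print(left_rows[-1])
```

The COALESCE calls sit in the outer SELECT here so that they cover both cases: a user with cars but none of a given type, and a user with no cars at all.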

Once again thanks to Giorgos Betsos for his suggestion.

Thanks,
Ravi

1 Comment

You could also use the crosstab function from the tablefunc extension if you're doing a pivot.

SELECT
--User Information 
u.UserId, u.UserDetails1,
 --Total Counts by color
temp.TotalRed, temp.TotalBlue,
  -- Counts by type
temp.SUVRed, temp.SUVBlue, 
temp.sedanRed, temp.sedanBlue
FROM Users u
JOIN (SELECT userid,
      SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN RedCount  END), 0) AS SUVRed,
       COALESCE(SUM(CASE WHEN CarType = 'SUV' THEN BlueCount  END), 0) AS SUVBlue,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN RedCount  END), 0) AS SedanRed,
       COALESCE(SUM(CASE WHEN CarType = 'sedan' THEN BlueCount  END), 0) AS SedanBlue
FROM usercars GROUP BY userid) temp
ON (temp.userid = u.userid)
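
The question mentions that the real application has more types. The per-type columns of a query of this shape do not have to be typed by hand; they can be generated from a list of type names. Here is a rough sketch, assuming the names come from a fixed, trusted list. The helper build_pivot_query and the CAR_TYPES list are made up for this illustration, and SQLite through Python's sqlite3 is used only so that the example runs on its own.

```python
import sqlite3

# Hypothetical list: in the real application this would hold every CarType value.
CAR_TYPES = ["SUV", "sedan"]


def build_pivot_query(car_types):
    """Build (sql, params) with one Red/Blue column pair per car type."""
    aggregates, outputs, params = [], [], []
    for car_type in car_types:
        # The type name becomes part of a column alias, so only accept
        # names that are valid identifiers.
        if not car_type.isidentifier():
            raise ValueError(f"unsafe car type name: {car_type!r}")
        for colour in ("Red", "Blue"):
            alias = f"{car_type}{colour}"
            aggregates.append(
                f"COALESCE(SUM(CASE WHEN CarType = ? THEN {colour}Count END), 0)"
                f" AS {alias}"
            )
            params.append(car_type)  # bound to the '?' in the line above
            outputs.append(f"temp.{alias}")
    output_list = ", ".join(outputs)
    aggregate_list = ",\n       ".join(aggregates)
    sql = f"""
SELECT u.UserId, u.UserDetails1, temp.TotalRed, temp.TotalBlue,
       {output_list}
FROM Users u
JOIN (SELECT UserId,
       SUM(RedCount) AS TotalRed, SUM(BlueCount) AS TotalBlue,
       {aggregate_list}
      FROM UserCars
      GROUP BY UserId) temp
  ON temp.UserId = u.UserId
ORDER BY u.UserId
"""
    return sql, params


# Sample data from the question, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        UserId INTEGER PRIMARY KEY, UserDetails1 TEXT, UserDetails2 TEXT
    );
    CREATE TABLE UserCars (
        UserId INTEGER, CarType TEXT, RedCount INTEGER, BlueCount INTEGER,
        UNIQUE (UserId, CarType)
    );
    INSERT INTO Users VALUES (1, 'name1', 'Addr1'), (2, 'name2', 'Addr2');
    INSERT INTO UserCars VALUES
        (1, 'SUV', 1, 0), (1, 'sedan', 1, 2), (2, 'sedan', 1, 0);
""")

sql, params = build_pivot_query(CAR_TYPES)
rows = conn.execute(sql, params).fetchall()
for row in rows:
    print(row)
```

The car type values are passed as bound parameters and only the column aliases are interpolated into the SQL text. The `?` placeholder is sqlite3's style; other drivers use a different marker.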

1 Comment

Please introduce/explain your answer with words. Don't just post code, as we want to understand how your answer solves the problem, however good it might be.
