This is probably a standard problem, and I've tried adapting some other answers, but so far I've been unable to resolve it.

A              B             C
+----+-------+ +----+------+ +----+------+-------+
| id | start | | id | a_id | | id | b_id | name  |
+----+-------+ +----+------+ +----+------+-------+
|  1 |     1 | |  1 |    1 | |  1 |    1 | aname |
|  2 |     2 | |  2 |    1 | |  2 |    2 | aname |
+----+-------+ |  3 |    2 | |  3 |    3 | aname |
               +----+------+ |  4 |    3 | bname |
                             +----+------+-------+
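
For anyone who wants to reproduce the sample data above, a minimal setup sketch follows. The column types (integer and text) and the foreign keys are assumptions; the question only shows the values, not the actual schema.

```sql
-- Assumed schema: types and constraints are guesses, only the values come from the tables above.
create table a (id integer primary key, start integer);
create table b (id integer primary key, a_id integer references a (id));
create table c (id integer primary key, b_id integer references b (id), name text);

insert into a (id, start) values (1, 1), (2, 2);
insert into b (id, a_id) values (1, 1), (2, 1), (3, 2);
insert into c (id, b_id, name) values
    (1, 1, 'aname'),
    (2, 2, 'aname'),
    (3, 3, 'aname'),
    (4, 3, 'bname');
```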

In English, what I'd like to accomplish is:

  1. For each c.name, select its newest entry, based on the start time in a.start.

The SQL I've tried is the following:

SELECT a.id, a.start, c.id, c.name 
FROM a
INNER JOIN (
    SELECT id, MAX(start) as start
    FROM a
    GROUP BY id
) a2 ON a.id = a2.id AND a.start = a2.start
JOIN b
ON a.id = b.a_id
JOIN c
on b.id = c.b_id
GROUP BY c.name;

It fails with errors such as:

ERROR: column "a.id" must appear in the GROUP BY clause or be used in an aggregate function
Position: 8

To be useful, I really need the ids in the result, but I cannot group on them since they are unique. Here is an example of the output I'd like for the first case above:

+------+---------+------+--------+
| a.id | a.start | c.id | c.name |
+------+---------+------+--------+
|    2 |       2 |    3 | aname  |
|    2 |       2 |    4 | bname  |
+------+---------+------+--------+

Here is a Sqlfiddle

Edit - removed second case

  • GROUP BY c.name; is not required. Commented Jul 11, 2016 at 18:43

1 Answer

Case 1

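-- distinct on (c.name) keeps only the first row of each c.name group;
-- ordering by a.start desc makes that first row the newest one
select distinct on (c.name)
    a.id, a.start, c.id, c.name
from
    a
    inner join
    b on a.id = b.a_id
    inner join
    c on b.id = c.b_id
order by c.name, a.start desc
;

 id | start | id | name
----+-------+----+-------
  2 |     2 |  3 | aname
  2 |     2 |  4 | bname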
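
distinct on is a PostgreSQL extension. As a sketch of an alternative (not the method used in this answer), the same "newest row per c.name" result can be expressed with the standard row_number() window function. The aliases a_id, c_id, rn and ranked are made up for illustration, and the extra c.id desc sort key is an assumed tie-breaker that the question does not specify.

```sql
-- Rank the joined rows within each c.name, newest a.start first,
-- then keep only the top-ranked row per name.
select a_id, start, c_id, name
from (
    select
        a.id as a_id,
        a.start,
        c.id as c_id,
        c.name,
        row_number() over (
            partition by c.name
            order by a.start desc, c.id desc  -- c.id desc is an assumed tie-breaker
        ) as rn
    from a
    inner join b on a.id = b.a_id
    inner join c on b.id = c.b_id
) ranked
where rn = 1
order by name;
```

On the sample data this returns the same two rows as the distinct on query. The window-function form is more verbose, but it also makes it easy to keep the top N rows per name by changing the filter to rn <= N.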

Case 2

select distinct on (c.name)
    a.id, a.start, c.id, c.name
from
    a
    inner join
    b on a.id = b.a_id
    inner join
    c on b.id = c.b_id
where
    -- restrict to a rows that are referenced by more than one b row
    b.a_id in (
        select a_id
        from b
        group by a_id
        having count(*) > 1
    )
order by c.name, a.start desc
;

 id | start | id | name
----+-------+----+-------
  1 |     1 |  1 | aname
2 Comments

Thanks for such a fast answer! If I need to extend the distinct to additional columns in c, I'm assuming I just append it to the distinct statement, but also inside the order by? Also, I'm guessing the performance of this is going to get bad pretty quickly as overall row count of the join goes up due to multiple sorts?
@DavidE You can add items to the ORDER BY clause after the obligatory c.name and the tie-breaker a.start. The select list is not restricted. To check performance, run EXPLAIN ANALYZE.
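
As a rough illustration of that comment, the Case 1 query can be given one more sort key and wrapped in explain analyze. The choice of c.id desc as the extra key is only an assumption for the example; any column can follow the leading c.name in the order by.

```sql
-- explain analyze actually executes the query and reports the real plan and timings.
explain analyze
select distinct on (c.name)
    a.id, a.start, c.id, c.name
from
    a
    inner join
    b on a.id = b.a_id
    inner join
    c on b.id = c.b_id
-- the distinct on expression (c.name) must come first; the rest decide which row wins
order by c.name, a.start desc, c.id desc
;
```

The plan output shows whether the sort dominates the cost, which is the concern raised in the first comment.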