I have an events table that contains various creation, completion and failure events. Each event has an ID (the primary key of the table), but also an "entity_id", which links multiple events together.

For example, when a request is created and then completed, we will have two events:

  • request #42 has been created
  • request #42 has been completed

In the above example, 42 is the entity_id of the request.

CREATE TABLE IF NOT EXISTS events (
    id SERIAL PRIMARY KEY,
    entity_id INTEGER NOT NULL,
    type VARCHAR(255) NOT NULL,
    occurred_at TIMESTAMP NOT NULL
);

INSERT INTO events (entity_id, type, occurred_at) VALUES
(1, 'created', '2019-08-08 11:20:04.791592+00'),
(1, 'completed', '2019-08-08 11:20:05.791592+00'),
(2, 'created', '2019-08-08 11:20:06.791592+00'),
(2, 'failed', '2019-08-08 11:20:07.791592+00'),
(3, 'created', '2019-08-08 11:20:08.791592+00'),
(3, 'completed', '2019-08-08 11:20:09.791592+00');

I want to create a view of that table, so that each entity_id is associated with creation and completion/failure time.

A query on that view should return the following result:

 entity_id |         created_at         |        completed_at        |         failed_at          
-----------+----------------------------+----------------------------+----------------------------
         1 | 2019-08-08 11:20:04.791592 | 2019-08-08 11:20:05.791592 | 
         2 | 2019-08-08 11:20:06.791592 |                            | 2019-08-08 11:20:07.791592
         3 | 2019-08-08 11:20:08.791592 | 2019-08-08 11:20:09.791592 |

I tried with left join, but couldn't get any good result. So far, my best attempt is this:

SELECT
    e.entity_id,
    e.occurred_at as created_at,
    (SELECT occurred_at FROM events WHERE type = 'completed' AND entity_id = e.entity_id) AS completed_at,
    (SELECT occurred_at FROM events WHERE type = 'failed' AND entity_id = e.entity_id) AS failed_at
FROM events e
WHERE e.type = 'created';

That seems pretty inelegant to me, and probably inefficient as well.

Can you suggest a better alternative? I'm using Postgres, and I'm glad to use features that are Postgres-specific.

3 Answers

You can use window functions:

SELECT e.entity_id, e.created_at, e.completed_at, e.failed_at
FROM (SELECT e.entity_id,
             e.type,  -- exposed so the outer WHERE clause can filter on it
             e.occurred_at AS created_at,
             MAX(e.occurred_at) FILTER (WHERE e.type = 'completed') OVER (PARTITION BY e.entity_id) AS completed_at,
             MAX(e.occurred_at) FILTER (WHERE e.type = 'failed') OVER (PARTITION BY e.entity_id) AS failed_at
      FROM events e
     ) e
WHERE e.type = 'created';

But aggregation is probably more appropriate:

SELECT e.entity_id,
       MAX(e.occurred_at) FILTER (WHERE type = 'created') as created_at,
       MAX(e.occurred_at) FILTER (WHERE type = 'completed') AS completed_at,
       MAX(e.occurred_at) FILTER (WHERE type = 'failed') AS failed_at
FROM events e
GROUP BY e.entity_id;
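
Since the question asks for a view, the aggregation query above could be wrapped in one. This is only a minimal sketch: the view name entity_timeline is a hypothetical choice, not something given in the question.

```sql
-- Hypothetical view name; the question does not specify one.
CREATE OR REPLACE VIEW entity_timeline AS
SELECT e.entity_id,
       MAX(e.occurred_at) FILTER (WHERE e.type = 'created')   AS created_at,
       MAX(e.occurred_at) FILTER (WHERE e.type = 'completed') AS completed_at,
       MAX(e.occurred_at) FILTER (WHERE e.type = 'failed')    AS failed_at
FROM events e
GROUP BY e.entity_id;

-- A view has no inherent row order, so sort when querying it.
SELECT *
FROM entity_timeline
ORDER BY entity_id;
```

With the sample rows from the question, this should give one row per entity_id, with NULL in whichever of completed_at or failed_at has no matching event.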

2 Comments

Nice! What's the major difference between your second suggestion and Tim's answer? I tried to EXPLAIN both queries, and they yield exactly the same result, at least on this small dataset. Is there any advantage to one over the other?
@aspyct . . . Postgres supports the standard FILTER clause and it is a bit faster. Tim's answer is what I would use in other databases.

You are looking for a pivot query:

SELECT
    entity_id,
    MAX(CASE WHEN type = 'created'   THEN occurred_at END) AS created_at,
    MAX(CASE WHEN type = 'completed' THEN occurred_at END) AS completed_at,
    MAX(CASE WHEN type = 'failed'    THEN occurred_at END) AS failed_at
FROM events
GROUP BY
    entity_id
ORDER BY
    entity_id;

Demo

1 Comment

Looks interesting, I had never seen this. Thanks for sharing the dbfiddle link, I didn't know about it and it's going to be useful!

You could try using CASE with a (fake) aggregation to reduce the rows:

SELECT
    entity_id,
    max(case when  type = 'created' then occurred_at end ) as created_at,
    max(case when  type = 'completed' then occurred_at end)  as completed_at,
    max(case when  type = 'failed' then occurred_at end ) as failed_at
FROM events 
group by entity_id;
