Is it possible to write an aggregate function in PostgreSQL that calculates a delta value by subtracting the initial value (the last value in the column) from the current value (the first value in the column)? It would apply to a structure like this:

rankings (userId, rank, timestamp)

and could be used like this:

SELECT userId, custom_agg(rank) OVER w
FROM rankings
WINDOW w AS (PARTITION BY userId ORDER BY timestamp DESC)

returning, for each userId, the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

Thanks!

Well, it's a window, not an aggregate, but yes. As far as I know, you must currently write window functions in C. (Commented Feb 27, 2014 at 8:52)

2 Answers

the rank of the newest entry (by timestamp) - rank of the oldest entry (by timestamp)

There are many ways to achieve this with existing functions. One is to use the window functions first_value() and last_value(), combined with DISTINCT or DISTINCT ON, without joins or subqueries:

SELECT DISTINCT ON (userid)
       userid, last_value(rank) OVER w - first_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);

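Note the custom window frame! With ORDER BY, the default frame only reaches from the start of the partition to the current row (including its peers), so last_value() would return the current row's value instead of the newest one.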
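To see why the frame matters, here is a minimal sketch that compares last_value() under the default frame and under the full-partition frame. The three sample rows are made up for illustration and are not from the question.

```sql
-- Hypothetical sample data: one user, three entries, ranks 10 -> 7 -> 4.
WITH rankings (userid, rank, ts) AS (
   VALUES (1, 10, timestamp '2014-02-01')
        , (1,  7, timestamp '2014-02-02')
        , (1,  4, timestamp '2014-02-03')
   )
SELECT ts
       -- Default frame: ends at the current row, so this just echoes each row's own rank.
     , last_value(rank) OVER (PARTITION BY userid ORDER BY ts) AS default_frame
       -- Full frame: every row sees the whole partition, so this is always the newest rank.
     , last_value(rank) OVER (PARTITION BY userid ORDER BY ts
                              ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS full_frame
FROM   rankings
ORDER  BY ts;
```

With this data, default_frame comes out as 10, 7, 4 while full_frame is 4 on every row.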

Or you can use basic aggregate functions - with joins and a subquery:

SELECT userid, r2.rank - r1.rank AS rank_delta
FROM  (
   SELECT userid, min(ts) AS first_ts, max(ts) AS last_ts
   FROM   rankings
   GROUP  BY 1
   ) sub
JOIN   rankings r1 USING (userid)
JOIN   rankings r2 USING (userid)
WHERE  r1.ts = first_ts
AND    r2.ts = last_ts;

Assuming unique (userid, ts), or your requirements would be ambiguous: with duplicate timestamps per user there is no single "newest" or "oldest" entry.
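For completeness, the custom aggregate the question asks about can also be written in plain SQL. What follows is only a sketch under assumptions: rank is taken to be an integer column, and the names delta_sfunc, delta_ffunc and first_minus_last are invented for this example.

```sql
-- Transition function (hypothetical name): keep the first non-null value seen
-- in slot 1 and the most recent value in slot 2.
CREATE FUNCTION delta_sfunc(acc int[], val int)
  RETURNS int[] LANGUAGE sql IMMUTABLE AS
$$ SELECT ARRAY[COALESCE(acc[1], val), val] $$;

-- Final function (hypothetical name): first value minus last value.
CREATE FUNCTION delta_ffunc(acc int[])
  RETURNS int LANGUAGE sql IMMUTABLE AS
$$ SELECT acc[1] - acc[2] $$;

CREATE AGGREGATE first_minus_last(int) (
   SFUNC     = delta_sfunc
 , STYPE     = int[]
 , FINALFUNC = delta_ffunc
);

-- Ordered newest first, so "first" is the newest rank and "last" the oldest.
SELECT userid, first_minus_last(rank ORDER BY ts DESC) AS rank_delta
FROM   rankings
GROUP  BY userid;
```

The result depends entirely on the ORDER BY inside the aggregate call. The built-in window functions above are likely simpler to maintain than a custom aggregate.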

Shichinin no samurai

... a.k.a. "7 Samurai"
The same for only the last seven rows per userid (or as many as can be found, if there are fewer) - one of the shortest ways:

SELECT DISTINCT ON (userid)
       userid, first_value(rank) OVER w - last_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
             ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING)
ORDER  BY userid, ts DESC;

Note the reversed sort order. The first row is the "newest" entry. The frame spans the current row plus (max.) 6 following rows, which are the next-older entries, so 7 rows in total. DISTINCT ON then picks only the result for the newest entry per userid.
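To check the frame arithmetic, here is a small self-contained run. The sample data is made up: user 1 has nine entries, user 2 has three, and rank is simply 10 times the day number. A frame from CURRENT ROW through 6 FOLLOWING covers seven rows, so user 1 should be evaluated over days 9 down to 3 only.

```sql
-- Hypothetical sample data generated on the fly; not from the question.
WITH rankings (userid, rank, ts) AS (
   SELECT 1, 10 * g, timestamp '2014-02-01' + g * interval '1 day'
   FROM   generate_series(1, 9) g
   UNION ALL
   SELECT 2, 10 * g, timestamp '2014-02-01' + g * interval '1 day'
   FROM   generate_series(1, 3) g
   )
SELECT DISTINCT ON (userid)
       userid, first_value(rank) OVER w - last_value(rank) OVER w AS rank_delta
FROM   rankings
WINDOW w AS (PARTITION BY userid ORDER BY ts DESC
             ROWS BETWEEN CURRENT ROW AND 6 FOLLOWING)  -- current row + 6 older rows = 7 rows
ORDER  BY userid, ts DESC;
```

This returns 60 for user 1 (90 - 30, the seventh-newest rank) and 20 for user 2 (30 - 10, since only three rows exist).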


Comments

Thanks Erwin! I tried to adapt your first solution (the one using a window) so that it considers only the newest 7 rankings, say, instead of all of them. I changed the frame to ROWS BETWEEN 7 PRECEDING AND 0 FOLLOWING, but somehow I still get all rows, not just the top 7 (ordered by timestamp). Any idea why?
@maephisto: If you adapt the frame, you get varying results per userid. My solution builds on identical results per userid. Are the "newest 7" supposed to be relative to each row or absolute for the complete table?
It should be the newest 7 entries for a userId. Only those should be taken into consideration; anything older than the 7th chronological entry has no value.
@maephisto: Do all of them have 7 or more, or can there be fewer?
It should be possible to have fewer than 7

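You can do it with JOIN and DISTINCT ON in Postgres. The GRP subquery gives you the newest rank for each userId, so just join it with rankings on userId and subtract the values. Note that this returns one delta per row of rankings: that row's rank minus the newest rank for the same user.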

SELECT rankings.userId,
       rankings.rank - GRP.rank AS delta,
       rankings.timestamp
FROM rankings
JOIN
(
    SELECT DISTINCT ON (userId) userId, rank, timestamp
    FROM rankings
    ORDER BY userId, timestamp DESC
) AS GRP ON rankings.userId = GRP.userId;
