I'm trying to define a PostgreSQL aggregate function that is aware of rows the frame clause asks for but that don't exist (because the frame extends past the start or end of the partition). Specifically, consider an aggregate function framer whose job is to return an array of the values in its frame, with any missing frame positions returned as null. So,
select
    n,
    v,
    framer(v) over (order by v rows between 2 preceding and 2 following) arr
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200)) as v (n, v)
order by v;
should return
"n" "v" "arr"
3 1600 {null,null,1600,2400,2900}
2 2400 {null,1600,2400,2900,3200}
4 2900 {1600,2400,2900,3200,8200}
1 3200 {2400,2900,3200,8200,null}
5 8200 {2900,3200,8200,null,null}
Basically I want to grab a range of values around each value, and it's important to me to know if I'm missing any to the left or to the right (or potentially both). Seems simple enough. I expected something like this to work:
create aggregate framer(anyelement) (
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);
but it returns
"n" "v" "arr"
3 1600 {1600,2400,2900}
2 2400 {1600,2400,2900,3200}
4 2900 {1600,2400,2900,3200,8200}
1 3200 {2400,2900,3200,8200}
5 8200 {2900,3200,8200}
So sfunc is only called for rows that actually exist in the frame: three times, for example, when two of the five requested rows are missing.
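A quick way to check that the frame is simply clipped at the edges of the partition is to count the rows in each frame. This is only a sketch against the same sample data; the alias t and the column name frame_rows are made up for illustration.

```sql
-- Count how many rows each frame actually contains.
-- The frame asks for 5 rows, but rows past either edge do not exist,
-- so the aggregate never sees them.
select
    n,
    v,
    count(*) over (order by v rows between 2 preceding and 2 following) as frame_rows
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200)) as t (n, v)
order by v;
-- frame_rows comes back as 3, 4, 5, 4, 3
```

The counts match the array lengths above, which suggests the state function is never invoked for the missing positions rather than being passed nulls.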
I haven't been able to think of any non-ridiculous way to capture those missing rows. It seems like there should be a simple solution, like somehow prepending/appending some sentinel nulls to the data before the aggregate runs, or maybe somehow passing in the index (and frame values) as well as the actual value to the function...
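For comparison, here is a minimal sketch of the sentinel-null idea without a custom aggregate, assuming the frame is always exactly 2 preceding and 2 following. It relies on lag and lead returning null when the requested row does not exist. The alias t and the window name w are illustrative, and the frame width is hardcoded, so it loses the nice aggregate-style call.

```sql
-- Build the 5-element array position by position.
-- lag/lead return null when the offset row is outside the partition,
-- which supplies the padding on either side.
select
    n,
    v,
    array[
        lag(v, 2)  over w,
        lag(v, 1)  over w,
        v,
        lead(v, 1) over w,
        lead(v, 2) over w
    ] as arr
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200)) as t (n, v)
window w as (order by v)
order by v;
-- first row (v = 1600): {NULL,NULL,1600,2400,2900}
```

One caveat: this cannot tell a row that is missing from a row whose v is genuinely null, so it only fits if v is never null.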
I wanted to implement this as an aggregate because it gave the nicest user-facing experience for what I want to do. Is there any better way?
FWIW, I'm on PostgreSQL 9.6.