I need to round timestamps representing end dates to the end of a month: the end of the timestamp's own month when it falls in the same month as NOW() and is at or after NOW(), and the end of the month before the timestamp's own month otherwise. In other words, I'm rounding timestamps to the end of a particular month, with some logic about which month to pick. I'm using two helper functions to make the date math a bit easier: one converts a date to the first day of its month, and one converts a date to the last day of its month:
CREATE FUNCTION first_of_month(val date) RETURNS date
    IMMUTABLE
    STRICT
    PARALLEL SAFE
    LANGUAGE sql
RETURN (DATE_TRUNC('MONTH'::text, (val)::timestamp WITH TIME ZONE))::date;

CREATE FUNCTION last_of_month(val date) RETURNS date
    IMMUTABLE
    STRICT
    PARALLEL SAFE
    LANGUAGE sql
RETURN ((DATE_TRUNC('MONTH'::text, (val)::timestamp WITH TIME ZONE) + '1 mon -1 days'::interval))::date;
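For illustration, a quick sanity check of the two helpers might look like the following. The input date is an arbitrary example (a leap-year February), not taken from the real data, and the expected values in the comments assume the helpers behave as described above.

```sql
-- Sanity check of the helpers defined above; the input date is an arbitrary example.
SELECT first_of_month(DATE '2024-02-15') AS month_start, -- expected: 2024-02-01
       last_of_month(DATE '2024-02-15')  AS month_end;   -- expected: 2024-02-29
```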
The main function I'm calling in my operation is as follows:
CREATE FUNCTION rounded_end(end_ts timestamp WITH TIME ZONE, now_ts timestamp WITH TIME ZONE) RETURNS date
    IMMUTABLE
    STRICT
    PARALLEL SAFE
    LANGUAGE sql
RETURN CASE
           WHEN ((first_of_month((now_ts)::date) = first_of_month((end_ts)::date)) AND (end_ts >= now_ts))
               THEN last_of_month((end_ts)::date)
           ELSE (first_of_month((end_ts)::date) - 1) END;
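To make the branching concrete, here is a small sketch that exercises each path of rounded_end with a fixed now_ts instead of NOW(). The sample timestamps and the label column are made up for illustration; they sit mid-month at noon UTC so that the ::date casts should not cross a month boundary under any session time zone.

```sql
-- Each row pairs a hypothetical end_ts with the same fixed now_ts (2024-02-10 12:00 UTC).
SELECT label, rounded_end(end_ts, now_ts) AS rounded
FROM (VALUES
    -- same month as now_ts and not before it: end of that month, 2024-02-29
    ('same month, after now',  TIMESTAMPTZ '2024-02-20 12:00:00+00', TIMESTAMPTZ '2024-02-10 12:00:00+00'),
    -- same month as now_ts but before it: end of the prior month, 2024-01-31
    ('same month, before now', TIMESTAMPTZ '2024-02-05 12:00:00+00', TIMESTAMPTZ '2024-02-10 12:00:00+00'),
    -- different month from now_ts: end of the month before end_ts's month, 2023-10-31
    ('different month',        TIMESTAMPTZ '2023-11-15 12:00:00+00', TIMESTAMPTZ '2024-02-10 12:00:00+00')
) AS t(label, end_ts, now_ts);
```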
All of these functions are written in SQL and marked IMMUTABLE with the goal of allowing them to be inlined, per information in this wiki.
However, I'm seeing wildly different performance results depending on whether I call rounded_end or inline it manually.
For example, this sample call which uses rounded_end takes 8-9s for me locally:
WITH timestamps AS (SELECT GENERATE_SERIES(timestamp '2014-01-10 20:00:00' +
                                           RANDOM() * (timestamp '2014-01-20 20:00:00' -
                                                       timestamp '2014-01-10 10:00:00'),
                                           timestamp '2025-01-10 20:00:00' +
                                           RANDOM() * (timestamp '2025-01-20 20:00:00' -
                                                       timestamp '2025-01-10 10:00:00'),
                                           '10 minutes') AS ts)
SELECT rounded_end(ts, NOW())
FROM timestamps;
While this sample call which manually inlines the body of rounded_end runs in under 2s:
WITH timestamps AS (SELECT GENERATE_SERIES(timestamp '2014-01-10 20:00:00' +
                                           RANDOM() * (timestamp '2014-01-20 20:00:00' -
                                                       timestamp '2014-01-10 10:00:00'),
                                           timestamp '2025-01-10 20:00:00' +
                                           RANDOM() * (timestamp '2025-01-20 20:00:00' -
                                                       timestamp '2025-01-10 10:00:00'),
                                           '10 minutes') AS ts)
SELECT CASE
           WHEN ((first_of_month((NOW())::date) = first_of_month((ts)::date)) AND (ts >= NOW()))
               THEN last_of_month((ts)::date)
           ELSE (first_of_month((ts)::date) - 1) END
FROM timestamps;
A DB Fiddle with a repro and ANALYZE timings is here: https://dbfiddle.uk/CtQxpa3S. I'm running on Postgres 15.
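One way to check for inlining without timing anything is to look at the output expressions in a verbose plan. The sketch below uses a small, made-up series rather than the full repro. If the function were inlined, the Output line of the plan would be expected to show the expanded CASE expression; if not, it still shows the call to rounded_end.

```sql
-- Inspect the planned target list; the one-day series here is only a stand-in for the real data.
EXPLAIN (VERBOSE)
SELECT rounded_end(ts, NOW())
FROM GENERATE_SERIES(TIMESTAMPTZ '2024-01-01 00:00:00+00',
                     TIMESTAMPTZ '2024-01-02 00:00:00+00',
                     INTERVAL '1 hour') AS ts;
```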
- What gives? I can see in the results of ANALYZE that my function isn't being inlined, although all dependencies in the call stack should be IMMUTABLE, unless I'm missing something. What's preventing inlining?
- For first_of_month and last_of_month, I would very happily precompute a mapping of all possible dates in the time range my data covers to what those results look like if it would be a performance optimization. Is there a strategy in Postgres for doing this, basically building a hashmap of all dates between 2014 and 2025 and what their respective first and last of months would be? I know I could write a new table and do a JOIN, but I'm wondering if there's a cheaper solution than a JOIN available out there.
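For reference, the table-plus-JOIN baseline mentioned in the second question could be sketched roughly as follows. The table name month_bounds, its column names, and the sample end_ts value are all hypothetical, and this is only the baseline a cheaper approach would be compared against, not a recommendation.

```sql
-- Hypothetical lookup table: one row per calendar day from 2014 through 2025.
CREATE TABLE month_bounds AS
SELECT d::date                 AS day,
       first_of_month(d::date) AS month_start,
       last_of_month(d::date)  AS month_end
FROM GENERATE_SERIES(TIMESTAMP '2014-01-01',
                     TIMESTAMP '2025-12-31',
                     INTERVAL '1 day') AS d;

-- A primary key gives the planner an index for the per-day lookup.
ALTER TABLE month_bounds ADD PRIMARY KEY (day);

-- Example lookup through the JOIN; the end_ts value is made up for illustration.
SELECT e.end_ts, mb.month_start, mb.month_end
FROM (VALUES (TIMESTAMPTZ '2020-06-15 12:00:00+00')) AS e(end_ts)
JOIN month_bounds AS mb ON mb.day = e.end_ts::date;
```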
Thanks much for the help!
last_of_month doing interval math, which this answer suggests is not immutable. Trying to see if there's a workaround, and still curious about strategies for precomputing first and last of month dates.
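The volatility that Postgres records for the pieces used inside the helpers can be read from the system catalogs. The queries below are a sketch of that check: provolatile is 'i' for immutable, 's' for stable, and 'v' for volatile. No result is asserted here, since the point is to see what the local server reports.

```sql
-- Volatility of the "+" operator's underlying function for timestamp(tz) + interval.
SELECT o.oprname,
       o.oprleft::regtype  AS left_type,
       o.oprright::regtype AS right_type,
       p.proname,
       p.provolatile
FROM pg_operator AS o
JOIN pg_proc AS p ON p.oid = o.oprcode
WHERE o.oprname = '+'
  AND o.oprright = 'interval'::regtype
  AND o.oprleft IN ('timestamp'::regtype, 'timestamptz'::regtype);

-- Volatility of every date_trunc overload, listed with its argument types.
SELECT p.oid::regprocedure AS signature,
       p.provolatile
FROM pg_proc AS p
WHERE p.proname = 'date_trunc';
```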