Postgres "first" aggregation function

Question

I am aggregating a table by a file ID field. Each file has a name that matches exactly one file ID (its own).

select file_key, min(fullfilepath)
from "table"
group by file_key

Because I know the structure of the table, I know that ANY fullfilepath within a group will do. Both min and max give an acceptable result, but they take a long time.

I came across this aggregate function, which returns the first value. Unfortunately, it also takes a long time, because it scans the whole table. For example, this is very slow:

select first(file_id) from "table";

What is the fastest way to do that, with or without an aggregate function?

For the first query, try select distinct on (file_key) file_key, fullfilepath from the_table order by file_key, fullfilepath - that might be faster than the group by.
– user330315, Commented Feb 14, 2017 at 13:15

Laurenz Albe · Accepted Answer · 2017-02-14 13:05:12Z

6

There is no way to make your first query with the GROUP BY clause faster, because it has to scan the whole table to find all groups.

Your second query can be made faster:

SELECT (
   SELECT file_id FROM "table"
   WHERE file_id IS NOT NULL
   LIMIT 1
);

There is no way to optimize the query as you wrote it, because the aggregate function is a black box to PostgreSQL.


2 Comments

pozs Over a year ago

Your last statement is usually true. But PostgreSQL can optimize (and use an index) when the aggregate has a defined SORTOP (which min/max have).

Laurenz Albe Over a year ago

That means that you can use the index for SELECT min(field) FROM atable, but not for SELECT min(field) FROM atable GROUP BY anotherfield. Think about it - all different values of anotherfield have to be identified, and how can an index help there? That requires a sequential or index scan over the whole table, and the table scan is usually cheaper there.
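The distinction drawn in these two comments can be sketched as follows. This is only an illustration: the column types and the index name atable_field_idx are assumptions, and the plan PostgreSQL actually picks depends on your data and statistics.

```sql
-- Hypothetical table and index, for illustration only.
CREATE TABLE atable (field integer, anotherfield integer);
CREATE INDEX atable_field_idx ON atable (field);

-- Without GROUP BY, the planner can answer min() from the index,
-- much as if the query had been written like this:
SELECT field
FROM atable
WHERE field IS NOT NULL
ORDER BY field
LIMIT 1;

-- With GROUP BY, every group has to be found, so all rows are visited.
SELECT anotherfield, min(field)
FROM atable
GROUP BY anotherfield;

-- EXPLAIN shows which plan is chosen on a given data set.
EXPLAIN SELECT min(field) FROM atable;
```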

Jasen · Accepted Answer · 2019-07-03 05:02:05Z

2

I doubt that this will help performance, but it may be useful if anyone actually wants a first aggregate.

-- coalesce isn't a function, so make an equivalent function.
create function coalesce_("anyelement","anyelement") returns "anyelement"
    language sql as $$ select coalesce( $1,$2 ) $$;

create aggregate first("anyelement") (sfunc=coalesce_, stype="anyelement");
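A possible way to call this aggregate on the question's table, as a sketch only. It assumes the first() aggregate has been created as shown and that fullfilepath has a sortable type such as text. Because the state function is coalesce, the result is the first non-null value in whatever order the rows reach the aggregate.

```sql
-- The table name must be quoted because TABLE is a reserved word.
SELECT file_key, first(fullfilepath)
FROM "table"
GROUP BY file_key;

-- The input order is otherwise unspecified; an ORDER BY inside the
-- aggregate call makes the result deterministic.
SELECT file_key, first(fullfilepath ORDER BY fullfilepath)
FROM "table"
GROUP BY file_key;
```

The second form trades some speed for a reproducible result, since the rows of each group have to be sorted before aggregation.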


Jasen · Accepted Answer · 2019-07-03 05:05:14Z

-1

select
    distinct on (file_key)
    file_key, fullfilepath
from "table"
order by file_key

That will return one record for each file_key.
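If a reproducible row per file_key is wanted, a variant along these lines may help. It is a sketch under assumptions: the index name table_file_key_path_idx is made up for illustration, and whether the planner uses the index at all depends on the table's size and statistics.

```sql
-- Hypothetical index; the table and column names are the question's.
CREATE INDEX table_file_key_path_idx ON "table" (file_key, fullfilepath);

-- Adding fullfilepath to ORDER BY picks the smallest path per file_key
-- instead of an arbitrary one, and matches the index column order.
SELECT DISTINCT ON (file_key)
       file_key, fullfilepath
FROM "table"
ORDER BY file_key, fullfilepath;
```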
