0

Here is a simple query to get the projects list ordered by participations count.

SELECT projects.*, COUNT(participations) as ct from projects
  INNER JOIN participations ON participations.project_id=projects.id
  GROUP BY projects.id
ORDER BY ct DESC
LIMIT 5;

I'm not astonished it's slow (2 seconds). A simple query without ordering by participations count, is super speed (it's normal, ~10ms).

There are ~2k projects and ~430k participations.

What should I do to speed up this request? I feel I have to add an index, but I don't know which one(s?) and how.

EDIT:

Table paticipations:

create table participations
(
    id serial not null
        constraint participations_pkey
            primary key,
    project_id integer not null
        constraint participations_project_id_foreign
            references projects
        constraint participations_service_id_foreign
            references projects,
    status varchar(32) default 'unknown'::character varying not null,
    data jsonb,
    created_at timestamp(0),
    updated_at timestamp(0),
    deleted_at timestamp(0)
);

create index participations_status_index on participations (status);
create index participation_data_idx on participations (data);

Table projects:

create table projects
(
    id serial not null
        constraint services_pkey
            primary key,
    user_id integer not null
        constraint services_user_id_foreign
            references users,
    type varchar(50) not null,
    name varchar(255) not null,
    url varchar(255) not null,
    start_at timestamp(0),
    end_at timestamp(0),
    visibility varchar(255) not null,
    status varchar(255) not null,
    metadata jsonb not null,
    data jsonb not null,
    created_at timestamp(0),
    updated_at timestamp(0),
    deleted_at timestamp(0)
);

create index services_type_index on projects (type);
create index services_name_index on projects (name);
create index services_url_index on projects (url);
create index services_start_at_index on projects (start_at);
create index services_end_at_index on projects (end_at);
create index services_visibility_index on projects (visibility);
create index services_status_index on projects (status);

Query plan:

Limit  (cost=487359.41..487359.43 rows=5 width=2218) (actual time=2177.264..2177.265 rows=5 loops=1)
Buffers: shared hit=37446 read=3493, temp read=21051 written=39402
->  Sort  (cost=487359.41..487366.68 rows=2907 width=2218) (actual time=2177.262..2177.262 rows=5 loops=1)
        Sort Key: (count(participations.*)) DESC
        Sort Method: top-N heapsort  Memory: 33kB
        Buffers: shared hit=37446 read=3493, temp read=21051 written=39402
        ->  GroupAggregate  (cost=477474.47..487311.13 rows=2907 width=2218) (actual time=1170.136..2175.445 rows=1884 loops=1)
            Group Key: projects.id
            Buffers: shared hit=37446 read=3493, temp read=21051 written=39402
            ->  Merge Join  (cost=477474.47..485105.80 rows=435252 width=2218) (actual time=1170.122..2120.557 rows=435207 loops=1)
                    Merge Cond: (projects.id = participations.project_id)
                    Buffers: shared hit=37446 read=3493, temp read=21051 written=39402
                    ->  Sort  (cost=625.30..632.57 rows=2907 width=1131) (actual time=22.443..27.047 rows=2879 loops=1)
                        Sort Key: projects.id
                        Sort Method: external sort  Disk: 3216kB
                        Buffers: shared hit=429, temp read=402 written=402
                        ->  Seq Scan on projects  (cost=0.00..458.07 rows=2907 width=1131) (actual time=0.011..1.532 rows=2879 loops=1)
                                Buffers: shared hit=429
                    ->  Materialize  (cost=476849.12..479025.38 rows=435252 width=1091) (actual time=1147.671..1996.166 rows=435207 loops=1)
                        Buffers: shared hit=37017 read=3493, temp read=20649 written=39000
                        ->  Sort  (cost=476849.12..477937.25 rows=435252 width=1091) (actual time=1147.666..1400.856 rows=435207 loops=1)
                                Sort Key: participations.project_id
                                Sort Method: external merge  Disk: 165040kB
                                Buffers: shared hit=37017 read=3493, temp read=20649 written=20649
                                ->  Seq Scan on participations  (cost=0.00..19528.52 rows=435252 width=1091) (actual time=0.012..318.347 rows=435207 loops=1)
                                    Buffers: shared hit=37017 read=3493
Planning time: 1.309 ms
Execution time: 2203.249 ms
3
  • Please edit your question and add the create table statements for the tables in question (including all indexes), the query you are using and the execution plan generated using explain (analyze, buffers). Formatted text please, no screen shots Commented Mar 16, 2018 at 10:01
  • @a_horse_with_no_name Ok, done. Commented Mar 16, 2018 at 10:11
  • These FK constraints don't look like they would work.(you could use the output of \dt your_table_name instead) Also: foreign keys need a supporting index. Commented Mar 16, 2018 at 11:25

2 Answers 2

1

The order by cannot use an index, because it is based on a calculated value.

I might suggest doing the order by and limit in a subquery:

SELECT pr.*, p.ct 
FROM projects pr JOIN
     (SELECT p.project_id, COUNT(*) as ct
      FROM participations p
      GROUP BY p.project_id
      ORDER BY ct DESC
      LIMIT 5
     ) p
     ON p.project_id = pr.id;
Sign up to request clarification or add additional context in comments.

Comments

1

You could try to aggregate first, then join on the aggregation:

SELECT projects.*, pt.ct 
from projects 
  join (
    select project_id, count(*) as ct 
    from participations 
    GROUP BY project_id
  ) pt ON pt.project_id = projects.id
ORDER BY ct DESC
LIMIT 5;

Also: all your sorts are done on disk. If you increase work_mem this should also improve performance.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.