0

I have a table with multiply columns. But for simplicity purpose, we can consider the following table:

create table tmp_table
(
    entity_1_id varchar(255) not null,
    status integer default 1 not null,
    entity_2_id varchar(255)
);

create index tmp_table_entity_1_id_idx
    on tmp_table (entity_1_id);

create index tmp_table_entity_2_id_idx
    on tmp_table (entity_2_id);

I want to execute this request:

SELECT tmp_table.entity_2_id, COUNT(*) FROM tmp_table 
    WHERE tmp_table.entity_1_id='cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
          tmp_table.status <> 2 AND 
          tmp_table.entity_2_id = ANY (string_to_array('21c5598b-0620-4a8c-b6fd-a4bfee024254,af0f9cb9-da47-4f6b-a3c4-218b901842f7', ',')) 
    GROUP BY tmp_table.entity_2_id;

It works fine, when I send string to string_to_array function with a few values (like 1-20). But when I try to send 500 elements, it works too slow. Unfortunately, I really need 100-500 elements.

1 Answer 1

1

For this query:

SELECT t.entity_2_id, COUNT(*)
FROM tmp_table t
WHERE t.entity_1_id = 'cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
      t.status <> 2 AND 
      t.entity_2_id = ANY (string_to_array('21c5598b-0620-4a8c-b6fd-a4bfee024254,af0f9cb9-da47-4f6b-a3c4-218b901842f7', ',')) 
GROUP BY t.entity_2_id;

I would recommend an index on tmp_table(entity_1_id, entity_2_id, status).

However, you might find this faster:

select rst.entity_2_id,
       (select count(*)
        from tmp_table t
        where t.entity_2_id = rst.entity_2_id and
              t.entity_1_id = 'cedca236-3f27-4db3-876c-a6c159f4d15e' AND 
              t.status <> 2
       ) as cnt
from regexp_split_to_table(str, ',') rst(entity_2_id);

Then you want an index on tmp_table(entity_2_id, entity_1_id, status).

In most databases, this would be faster, because the index is a covering index and this avoids the final aggregation over the entire result set. However, Postgres stores locking information on the data pages, so they still need to be read. It is still worth trying.

Sign up to request clarification or add additional context in comments.

2 Comments

what should I do, if I also need do queries with entity_1_id condition and entity_2_id condition? I mean, should I have 3 differ indexes (two mines and you index from answer). Or, your index allows to remove entity_1_id.
@Max . . . The one composite index should work for the query you have in your question.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.