I need to implement a basic faceted search sidebar in my app. Unfortunately, I can't use Elasticsearch, Solr, or alternatives and am limited to Postgres.

I have 10+ columns ('status', 'classification', 'filing_type', ...) and need to return counts for every distinct value after every search and display them accordingly. I've drafted this bit of SQL; however, it won't take me very far in the long run, as it will slow down massively once the table reaches a high number of rows.

select row_to_json(t) from (
    select 'status' as column, status as value, count(*) from api_articles_mv_temp group by status 
  union
    select 'classification' as column, classification as value, count(*) from api_articles_mv_temp group by classification 
  union 
    select 'filing_type' as column, filing_type as value, count(*) from api_articles_mv_temp group by filing_type
  union
    ...) t;

This yields

 {"column":"classification","value":"State","count":2001}
 {"column":"classification","value":"Territory","count":23}
 {"column":"filing_type","value":"Joint","count":169}
 {"column":"classification","value":"SRO","count":771}
 {"column":"filing_type","value":"Single","count":4238}
 {"column":"status","value":"Updated","count":506}
 {"column":"classification","value":"Federal","count":1612}
 {"column":"status","value":"New","count":3901}

From the query plan, the HashAggregates are slowing it down.

Subquery Scan on t  (cost=2397.58..2397.76 rows=8 width=32) (actual time=212.822..213.022 rows=8 loops=1)
  ->  HashAggregate  (cost=2397.58..2397.66 rows=8 width=186) (actual time=212.780..212.856 rows=8 loops=1)
         Group Key: ('status'::text), api_articles_mv_temp.status, (count(*))
         ->  Append  (cost=799.11..2397.52 rows=8 width=186) (actual time=75.238..212.701 rows=8 loops=1)
               ->  HashAggregate  (cost=799.11..799.13 rows=2 width=44) (actual time=75.221..75.242 rows=2 loops=1)
                     Group Key: api_articles_mv_temp.status
...

Is there a simpler, more optimized way of getting this result?

1 Answer

Reading api_articles_mv_temp only once may improve performance. Here are two examples you can try.

  1. If the combinations of "column" and "value" are fixed, the query looks like this:
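select row_to_json(t) from (
  -- count(a.*) ignores the null row produced by the outer join for an unmatched
  -- value, so a value with no articles is reported as 0 (count(*) would give 1)
  select "column", "value", count(a.*) as "count"
  from column_temp left outer join api_articles_mv_temp a on
    "value"=
    case "column"
      when 'status' then status
      when 'classification' then classification
      when 'filing_type' then filing_type
    end
  group by "column", "value"
) t;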

The column_temp table contains these records:

column         |value
---------------+----------
status         |New
status         |Updated
classification |State
classification |Territory
classification |SRO
filing_type    |Single
filing_type    |Joint
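
For reference, a lookup table like the one listed above could be set up roughly as follows. This is only a sketch: the `text` column types and the primary key are assumptions, since the original table definition is not shown; the rows are copied from the listing.

```sql
-- Hypothetical definition: the column types and primary key are assumptions.
-- "column" is a reserved key word in Postgres, so it must be double-quoted;
-- "value" is quoted only for consistency.
create table column_temp (
  "column" text not null,
  "value"  text not null,
  primary key ("column", "value")
);

-- One row per (facet column, facet value) pair to be counted.
insert into column_temp ("column", "value") values
  ('status',         'New'),
  ('status',         'Updated'),
  ('classification', 'State'),
  ('classification', 'Territory'),
  ('classification', 'SRO'),
  ('filing_type',    'Single'),
  ('filing_type',    'Joint');
```

Because the query starts from column_temp, any value that is missing from this table is left out of the result, so the table has to be kept in sync with the facet values you want to display.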

  2. If only the "column" names are fixed (and the values are not), the query looks like this:
select row_to_json(t) from (
  select "column",
    case "column"
      when 'status' then status
      when 'classification' then classification
      when 'filing_type' then filing_type
    end as "value",
    sum("count") as "count"
  from column_temp a
    cross join (
      select
        status,
        classification,
        filing_type,
        count(*) as "count"
      from api_articles_mv_temp
      group by
        status,
        classification,
        filing_type) b
  group by "column", "value"
) t;

The column_temp table contains these records:

column         
---------------
status         
classification 
filing_type    
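
For this second variant, column_temp only needs the facet column names. A minimal setup sketch, assuming a single `text` column (the original definition is not shown, so the type and the primary key are guesses):

```sql
-- Hypothetical definition: the column type and primary key are assumptions.
-- "column" is a reserved key word in Postgres, so it must be double-quoted.
create table column_temp (
  "column" text primary key
);

-- One row per facet column; the distinct values are discovered by the query itself.
insert into column_temp ("column") values
  ('status'),
  ('classification'),
  ('filing_type');
```

Adding another facet then takes three changes: a new row here, a matching `when` branch in the `case` expression, and the column in the inner `select` and `group by` lists.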
