Possible Duplicate:
Optimization of count query for PostgreSQL

Using PostgreSQL 9.2, we are trying to figure out whether there is a way to keep track of the number of results for a query and return that number efficiently. The query will be executed many times per second (tens, hundreds, or possibly thousands). The query we have right now looks like this, but we wonder whether it is inefficient:

-- Get # of rows that do not have 'parameter value' in array_column
select count(*)
    from table
    where not (ARRAY['parameter value'] <@ table.array_column)

My questions are (an answer might solve multiple problems at the same time):

Is the count(id) (or count(*) for that matter) for that query a linear (O(n)) query?

Is there some way to make this query more efficient in PostgreSQL? Please keep in mind that we need to query for different parameter values, so I believe keeping a materialized view for it is not feasible (although, we might consider creating one for each parameter value if that is considered to be better).

Is there any change I should make to the query, database structure or the configuration of my PostgreSQL server that might help me improve the query performance?

Any pointers or suggestions will be greatly appreciated. If this is a completely wrong way to do this, please let me know.

Edit

Taking into consideration what was answered, I was wondering whether it would be plausible to use materialized views. By this I mean having several materialized views, one per parameter value, each holding the rows where that value is not present. The parameter values are, to a certain extent, predictable, so this doesn't seem too far out there as a solution. This raises another question: would materialized views help here? Is there some limitation (either in definition or performance) on the number of materialized views (or tables) that I can create in a database?

  • Not sure, but I think a GIN index on table.array_column will help to speed this up. You will need to run EXPLAIN to find out. See here: dba.stackexchange.com/a/27505/1822. Also, I think you don't need to create an array from your parameter value (if that is indeed only a single value). Commented Oct 25, 2012 at 19:07
  • This seems to be just the same as stackoverflow.com/q/13075210/398670 ... why? Is this an assignment that's come up somewhere? Commented Oct 26, 2012 at 0:47
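
The GIN index suggested in the first comment could look roughly like the sketch below. The names `items` and `tags` are hypothetical stand-ins for the question's table and array_column, and the column is assumed to be `text[]`. One caveat worth checking with EXPLAIN: a GIN index can serve the positive containment test, but generally not its negation, so the sketch counts the rows that do contain the value and subtracts that from the total. The total itself is still a full count unless it is cached separately.

```sql
-- Hypothetical names: "items" / "tags" stand in for the question's table / array_column.
CREATE TABLE IF NOT EXISTS items (
    id   serial PRIMARY KEY,
    tags text[] NOT NULL DEFAULT '{}'
);

-- GIN index over the array column, as suggested in the comment.
CREATE INDEX items_tags_gin ON items USING gin (tags);

-- Check whether the planner uses the index for the positive containment test.
EXPLAIN
SELECT count(*) FROM items WHERE tags @> ARRAY['parameter value'];

-- Derive the negated count by subtraction.
-- If the column were nullable, the first subquery would need
-- "WHERE tags IS NOT NULL" to match the original query's result.
SELECT (SELECT count(*) FROM items)
     - (SELECT count(*) FROM items WHERE tags @> ARRAY['parameter value'])
       AS rows_without_value;
```

Whether this is faster than the original query depends on how selective the value is and on how expensive the total count is, so it is worth measuring on real data.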

1 Answer

The first idea that comes to mind is to cache the value.

Evaluate how often this value changes and, depending on that, decide whether to have a trigger run whenever the table is updated to compute the new value and cache it somewhere.

The resulting query for that value would be a simple SELECT without any WHERE clause, making it very fast.
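
As a minimal sketch of such a trigger-maintained cache, extended to handle different parameter values: the tables `items`, `tag_counts` and `items_total`, the column `tags`, and the function `items_count_trg` are all hypothetical names, and the array column is assumed to be `text[]` and non-null. The idea is to store, per value, how many rows contain it, plus the total row count; the number of rows without a value is then the difference. This variant does need a primary-key lookup per value. The update-then-insert step is not safe against concurrent inserts of the same new value, and the single-row total can become a contention point under heavy writes, so treat this as a starting point only.

```sql
-- Hypothetical schema: "items" / "tags" stand in for the question's table / array_column.
CREATE TABLE IF NOT EXISTS items (
    id   serial PRIMARY KEY,
    tags text[] NOT NULL DEFAULT '{}'
);

-- One row per distinct value: how many rows of items contain it.
CREATE TABLE tag_counts (
    tag text PRIMARY KEY,
    n   bigint NOT NULL
);

-- Single-row table holding the total number of rows in items.
-- Assumes items is empty when the trigger is installed.
CREATE TABLE items_total (n bigint NOT NULL);
INSERT INTO items_total VALUES (0);

CREATE FUNCTION items_count_trg() RETURNS trigger AS $$
DECLARE
    t text;
BEGIN
    -- Remove the old row's contribution.
    IF TG_OP IN ('DELETE', 'UPDATE') THEN
        UPDATE items_total SET n = n - 1;
        FOR t IN SELECT DISTINCT u FROM unnest(OLD.tags) AS u WHERE u IS NOT NULL LOOP
            UPDATE tag_counts SET n = n - 1 WHERE tag = t;
        END LOOP;
    END IF;
    -- Add the new row's contribution.
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        UPDATE items_total SET n = n + 1;
        FOR t IN SELECT DISTINCT u FROM unnest(NEW.tags) AS u WHERE u IS NOT NULL LOOP
            UPDATE tag_counts SET n = n + 1 WHERE tag = t;
            IF NOT FOUND THEN
                INSERT INTO tag_counts (tag, n) VALUES (t, 1);
            END IF;
        END LOOP;
    END IF;
    RETURN NULL;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_count
AFTER INSERT OR UPDATE OR DELETE ON items
FOR EACH ROW EXECUTE PROCEDURE items_count_trg();

-- Rows that do NOT contain 'parameter value': total minus rows that do.
SELECT (SELECT n FROM items_total)
     - COALESCE((SELECT n FROM tag_counts WHERE tag = 'parameter value'), 0)
       AS rows_without_value;
```

The trade-off is that every write to `items` now costs a few extra updates, which is why the rate of change matters when deciding whether this is worthwhile.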

Or you could simply make the change and gather some timings before and after to see whether you have gained any speed.
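
One way to gather those before-and-after numbers is EXPLAIN ANALYZE, which actually executes the query and reports the plan together with measured timings. The table `items` and column `tags` below are hypothetical stand-ins for the question's table and array_column.

```sql
-- Hypothetical names: "items" / "tags" stand in for the question's table / array_column.
CREATE TABLE IF NOT EXISTS items (
    id   serial PRIMARY KEY,
    tags text[] NOT NULL DEFAULT '{}'
);

-- Run this before and after a change (new index, cached counter, ...)
-- and compare the reported plan and total runtime.
EXPLAIN ANALYZE
SELECT count(*)
FROM items
WHERE NOT (ARRAY['parameter value'] <@ tags);
```

Timings on an empty or tiny table say little, so the comparison is only meaningful against a realistic volume of data.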

See there for further explanations.

2 Comments

Ok, this sounds like it might work, but... where do you mean when you say "somewhere"? If you mean the database, I don't understand how I could cache the results for different parameter values without having a where. On the other hand, it might just be simpler and easier to look into caching it somewhere like Redis.
You could create a dedicated table holding that value. The trigger would just update the value. Or you could dump the value to a file and read its content from your app, but you might run into sync issues.
