3

I have the following constructor (as a test):

CREATE TABLE product (id BIGSERIAL PRIMARY KEY, ext hstore);
CREATE INDEX ix_product_ext ON product USING GIN(ext);

INSERT
INTO    product (id, ext)
SELECT  id, ('size=>' || CEILING(10 + RANDOM() * 90) || ',mass=>' || CEILING(10 + RANDOM() * 90))::hstore
FROM    generate_series(1, 100000) id;

I have the following query, which works ok:

SELECT  COUNT(id)
FROM    (
    SELECT  id
    FROM    product
    WHERE  (ext->'size')::INT >= 41
    AND    (ext->'mass')::INT <= 20
) T

But I believe the correct way to do this is using the @> operator. I have the following, but it gives a syntax error:

SELECT  COUNT(id)
FROM    (
    SELECT  id
    FROM    product
    WHERE  ext @> 'size>=41,mass<=20'
) T

How should I write this?

2 Answers 2

6

Your initial attempt is correct but you need to use (partial) btree indexes and bitmap index scans to rely on it:

create index on product(((ext->'size')::int)) where ((ext->'size') is not null);

The same for mass, and if the planner doesn't get it on the spot add two where clauses, ie where ext->'size' is not null and the same for mass.

If there is a pattern of some kind (which is likely, since most products with a size also have a mass), potentially create a multicolumn index combining the two - one sac, the other desc.

The gin index as you wrote it, along with the accompanying query (with a syntax error) will basically do the same thing but unordered; it'll be slower.

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks Dennis. I had included partial indexes as you show them in my tests, and actually they were slower, both for insert and for query. In fact, querying was substantially faster against the GIN. Why do you say I need partial indexes for reliability?
Actually, technically that is an expression index, not a partial index.
Further testing showed that expression indexes are in fact faster on average. Complex queries win with GIN. This would really be a typical workload test scenario.
Yeah. The one I suggested yesterday is actually expression and partial. If you're repeatedly doing the same kind of query over and over, especially if there is any need for ordering the results and tying them to a limit, my experience is you'll get better performance out of (pre-ordered and partial) expression indexes; GIN will win if you're constantly querying against widely varying conditions (e.g. tsvectors).
3

Reading hstore documentation your (last query) size>=41 does not mean "when size is greater or equal than 41":

text => text    make single-pair hstore

Following that you can't write mass<=20, because there is no such operation. Using @> operator:

hstore @> hstore    does left operand contain right?

you can write:

SELECT count(id)
FROM product
WHERE ext @> 'size=>41,mass=>20';

However it takes only these products where size is equal to 41 and mass is equal to 20.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.