PostgreSQL - query against GIN index of HSTORE value

Question

I have the following setup (as a test):

CREATE TABLE product (id BIGSERIAL PRIMARY KEY, ext hstore);
CREATE INDEX ix_product_ext ON product USING GIN(ext);

INSERT
INTO    product (id, ext)
SELECT  id, ('size=>' || CEILING(10 + RANDOM() * 90) || ',mass=>' || CEILING(10 + RANDOM() * 90))::hstore
FROM    generate_series(1, 100000) id;

I have the following query, which works ok:

SELECT  COUNT(id)
FROM    (
    SELECT  id
    FROM    product
    WHERE  (ext->'size')::INT >= 41
    AND    (ext->'mass')::INT <= 20
) T

But I believe the correct way to do this is using the @> operator. I have the following, but it gives a syntax error:

SELECT  COUNT(id)
FROM    (
    SELECT  id
    FROM    product
    WHERE  ext @> 'size>=41,mass<=20'
) T

How should I write this?

Denis de Bernardy · Accepted Answer · 2011-06-24 22:46:50Z

6

Your initial attempt is correct, but to rely on it you need (partial) btree indexes that the planner can combine with bitmap index scans:

create index on product(((ext->'size')::int)) where ((ext->'size') is not null);

Do the same for mass. If the planner doesn't pick the indexes up right away, add the two matching where clauses to the query, i.e. where ext->'size' is not null and the same for mass.
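
As a rough sketch of that advice, here is the matching index for mass and the original query with the two extra predicates spelled out. The index name ix_product_mass is made up for illustration; the table and column names come from the question.

```sql
-- Partial expression index on mass, mirroring the one on size above.
-- ix_product_mass is an illustrative name, not from the original post.
CREATE INDEX ix_product_mass
    ON product (((ext->'mass')::int))
    WHERE (ext->'mass') IS NOT NULL;

-- Repeating the IS NOT NULL predicates makes it straightforward for the
-- planner to match the query against the partial-index predicates.
SELECT  COUNT(id)
FROM    product
WHERE   (ext->'size') IS NOT NULL
AND     (ext->'mass') IS NOT NULL
AND     (ext->'size')::int >= 41
AND     (ext->'mass')::int <= 20;
```

Whether the extra predicates are actually needed depends on the PostgreSQL version and on what the planner can prove by itself, so treat them as a fallback.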

If there is a pattern of some kind (which is likely, since most products with a size also have a mass), consider creating a multicolumn index combining the two: one asc, the other desc.
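
A minimal sketch of such a multicolumn index, assuming both keys are usually present together. The name ix_product_size_mass is hypothetical, and the column order and sort directions should follow the queries you actually run.

```sql
-- Partial two-column expression index: size ascending, mass descending.
-- ix_product_size_mass is an illustrative name, not from the original post.
CREATE INDEX ix_product_size_mass
    ON product (((ext->'size')::int) ASC, ((ext->'mass')::int) DESC)
    WHERE (ext->'size') IS NOT NULL
    AND   (ext->'mass') IS NOT NULL;
```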

The GIN index as you wrote it, along with the accompanying query (the one with the syntax error), will basically do the same thing but unordered, so it'll be slower.
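
Since relative speed depends on the data and the workload, one way to check which index the planner actually picks is to look at the plan. This is only a sketch; no particular plan or timing is implied.

```sql
-- Refresh planner statistics after the bulk insert.
ANALYZE product;

-- EXPLAIN with ANALYZE really executes the query and reports the chosen
-- plan, including any Bitmap Index Scan nodes and the indexes they use.
EXPLAIN (ANALYZE, BUFFERS)
SELECT  COUNT(id)
FROM    product
WHERE   (ext->'size')::int >= 41
AND     (ext->'mass')::int <= 20;
```

Running it once with only the GIN index in place and once with the expression indexes gives a like-for-like comparison on your own data.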

4 Comments

IamIC Over a year ago

Thanks Denis. I had included partial indexes as you show them in my tests, and actually they were slower, both for insert and for query. In fact, querying was substantially faster against the GIN. Why do you say I need partial indexes for reliability?

IamIC Over a year ago

Actually, technically that is an expression index, not a partial index.

IamIC Over a year ago

Further testing showed that expression indexes are in fact faster on average. Complex queries win with GIN. This would really be a typical workload test scenario.

Denis de Bernardy Over a year ago

Yeah. The one I suggested yesterday is actually expression and partial. If you're repeatedly doing the same kind of query over and over, especially if there is any need for ordering the results and tying them to a limit, my experience is you'll get better performance out of (pre-ordered and partial) expression indexes; GIN will win if you're constantly querying against widely varying conditions (e.g. tsvectors).

Grzegorz Szpetkowski · Accepted Answer · 2011-06-24 22:09:27Z

3

According to the hstore documentation, the size>=41 in your last query does not mean "size is greater than or equal to 41". The => there only pairs a key with a value:

text => text    make single-pair hstore

It follows that you can't write mass<=20 either, because no such operation exists inside an hstore literal. The @> operator is defined as:

hstore @> hstore    does left operand contain right?

you can write:

SELECT count(id)
FROM product
WHERE ext @> 'size=>41,mass=>20';

However, this matches only products whose size is exactly 41 and whose mass is exactly 20.
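
If the goal is to keep using the GIN index while still filtering by range, one possible sketch is to combine an hstore operator that GIN can index (@> for exact pairs, ?& for key existence) with the range comparisons from the original query. These queries are illustrations under that assumption, not something from the original posts.

```sql
-- Exact match on mass via containment (GIN-indexable),
-- plus a range check on size evaluated on the extracted value.
SELECT  COUNT(id)
FROM    product
WHERE   ext @> 'mass=>20'
AND     (ext->'size')::int >= 41;

-- Require both keys to exist (?& is also GIN-indexable),
-- then apply both range checks on the extracted values.
SELECT  COUNT(id)
FROM    product
WHERE   ext ?& ARRAY['size', 'mass']
AND     (ext->'size')::int >= 41
AND     (ext->'mass')::int <= 20;
```

In the test data every row has both keys, so the key-existence check narrows nothing there; it only helps when the keys are sparse.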
