Hi all! I have a table in PostgreSQL 9.3 that contains about 24 million records, with the following structure:

CREATE TABLE jtest
(
  recordid uuid NOT NULL,
  details json,
  tsv tsvector,
  CONSTRAINT jtest_pkey PRIMARY KEY (recordid)
)
WITH (
  OIDS=FALSE
);

The tsv column is filled by a trigger based on the json contents: to_tsvector('english', json_extract_path_text(details, 'info')).
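
For reference, such a trigger could look like this (a minimal sketch; the function and trigger names are illustrative, not from the question):

CREATE OR REPLACE FUNCTION jtest_tsv_update() RETURNS trigger AS $$
BEGIN
  -- Extract the 'info' key from the json column and build the tsvector;
  -- coalesce guards against a missing key producing NULL.
  NEW.tsv := to_tsvector('english',
                         coalesce(json_extract_path_text(NEW.details, 'info'), ''));
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER jtest_tsv_trigger
  BEFORE INSERT OR UPDATE ON jtest
  FOR EACH ROW EXECUTE PROCEDURE jtest_tsv_update();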

The info field in the json structure has similar content across records, for example "Milk from manufacturer one" or "Milk from another two". There are about 1 million records that contain "milk".

If I use the following query:

select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

and don't use any index on the tsv column, the query takes about 250 seconds. If I create a GIN index on the tsv column, it takes about 200 seconds.
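
For reference, the GIN index would be created with something like this (the index name is illustrative, since the question doesn't show the DDL):

CREATE INDEX jtest_tsv_idx ON jtest USING gin (tsv);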

Is there a possibility to increase performance?

Comment: Why is someone searching for milk if it's in every record?

1 Answer

The answer lies in the question:

There are about 1 million records that contain "milk".

Index or not, you still need to retrieve those million rows. Also note that if a million rows means most of the table, Postgres will ignore the index entirely and seq scan the whole table.
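
You can confirm what the planner chooses with EXPLAIN (a sketch; the actual plan and timings will vary):

explain (analyze, buffers)
select recordid from jtest where tsv @@ to_tsquery('english', 'milk');

For a common term like 'milk', expect a sequential scan or a bitmap scan that still touches a large share of the table; for a selective term, the plan should switch to a Bitmap Index Scan on the GIN index.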

The index will help if you change the query to:

select recordid from jtest where tsv @@ to_tsquery('english', 'rare & string');