2

I've got a PostgreSQL array of strings as a column in a table. I created an index using the GIN method. But ANY queries won't use the index (instead, they're doing a sequential scan of the whole table with a filter). What am I missing?

Here's my migration:

class CreateDocuments < ActiveRecord::Migration
  def up
    create_table :documents do |t|
      t.string :title
      t.string :tags, array: true, default: []
      t.timestamps
    end

    add_index :documents, :tags, using: 'gin'

    (1..100000).each do |i|
      tags = []
      tags << 'even' if (i % 2) == 0
      tags << 'odd' if (i % 2) == 1
      tags << 'divisible by 3' if (i % 3) == 0
      tags << 'divisible by 4' if (i % 4) == 0
      tags << 'divisible by 5' if (i % 5) == 0

      Document.create(
        title: i,
        tags: tags
      )
    end
  end

  def down
    drop_table :documents
  end
end

Here's my query, with the resulting number of rows.

Document.where("'divisible by 5' = ANY (tags)").explain
    Document Load (249.8ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    D, [2014-03-07T17:09:49.689709 #41937] DEBUG -- :   Document Load (249.8ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    => EXPLAIN for: SELECT "documents".* FROM "documents"  WHERE ('divisible by 5' = ANY (tags))
                       QUERY PLAN
    -----------------------------------------------------------------
    Seq Scan on documents  (cost=0.00..3500.00 rows=20057 width=69)
      Filter: ('divisible by 5'::text = ANY ((tags)::text[]))
    (2 rows)

Document.where("'divisible by 5' = ANY (tags)").length
    Document Load (258.0ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    D, [2014-03-07T17:09:55.536517 #41937] DEBUG -- :   Document Load (258.0ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    => 20000
8
  • How many rows does the table products have? Commented Mar 7, 2014 at 16:37
  • What is the cardinality of that column?How many unique records reported to the number of rows? Commented Mar 7, 2014 at 16:43
  • @IgorRomanchenko The table has just over 100,000 rows Commented Mar 7, 2014 at 17:50
  • @Mihai There are over 100,000 unique rows. The column in question has over 18,000 unique values Commented Mar 7, 2014 at 17:52
  • Probably that ANY keyword make the optimizer choose a sequential scan as the best option.How many rows does that query returns? Commented Mar 7, 2014 at 18:02

1 Answer 1

5

To work with a GIN index use the <@ ("is contained by") operator instead of the ANY construct.

The manual states here that default GIN indexes currently only support these operators (additional functionality is shipped with extensions):

<@
@>
=
&&

So try this query:

Document.where("'{divisible by 5}' <@ tags").explain

Note that the left hand side needs to be in array notation, too, even if it's a single element. The operator <@ works for arrays. Hence '{divisible by 5}'.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.