PostgreSQL doesn't use index with unaccent function

Question

I have the following table:

CREATE TABLE products (
  id   bigserial NOT NULL PRIMARY KEY,
  name varchar(2048)
  -- Many other columns
);

I want to run a case- and diacritic-insensitive LIKE query on name.

For that I have created the following function:

CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE OR REPLACE FUNCTION immutable_unaccent(varchar)
  RETURNS text AS $$
    SELECT unaccent($1)
  $$ LANGUAGE sql IMMUTABLE;

And then created an index on name using this function:

CREATE INDEX products_search_name_key ON products(immutable_unaccent(name));

However, when I run a query, it is very slow (about 2.5s for 300k rows). I'm pretty sure PostgreSQL is not using the index.

-- Slow (~2.5s for 300k rows)
SELECT products.* FROM products
    WHERE immutable_unaccent(products.name) LIKE immutable_unaccent('%Hello world%')

-- Fast (~60ms for 300k rows), and there is no index
SELECT products.* FROM products
    WHERE products.name LIKE '%Hello world%'

I have tried creating a separate column with a case and diacritics insensitive copy of the name like so, and in that case the query is fast:

ALTER TABLE products ADD search_name varchar(2048);
UPDATE products
    SET search_name = immutable_unaccent(name);

-- Fast (~60ms for 300k rows), and there is no index
SELECT products.* FROM products
    WHERE products.search_name LIKE immutable_unaccent('%Hello world%')

What am I doing wrong? Why doesn't my index approach work?

Edit: Execution plan for the slow query

explain analyze SELECT products.* FROM products
    WHERE immutable_unaccent(products.name) LIKE immutable_unaccent('%Hello world%')

Seq Scan on products  (cost=0.00..79568.32 rows=28 width=2020) (actual time=1896.131..1896.131 rows=0 loops=1)
  Filter: (immutable_unaccent(name) ~~ '%Hello world%'::text)
  Rows Removed by Filter: 277986
Planning time: 1.014 ms
Execution time: 1896.220 ms

add execution plan - explain analyze SELECT products.* FROM products WHERE immutable_unaccent(products.name) LIKE immutable_unaccent('%Hello world%')
– Vao Tsun, Commented Apr 8, 2017 at 8:42
@ClodoaldoNeto Now that you say it, it makes sense I guess. I was hoping it would use the index because it should already contain the immutable_unaccent computed value. Is using a copy column the only way then?
– deadbeef, Commented Apr 8, 2017 at 9:05
Even if you copy the column unaccented, you're still not going to be able to use an index with LIKE patterns starting with %. Postgres has pretty good full text search functionality built in, however; maybe you should have a look at that. postgresql.org/docs/current/static/textsearch-intro.html
– Ede, Commented Apr 8, 2017 at 13:55
@Ede Thanks. I understand my queries still don't use an index, but at least the unaccent function is already precomputed, which is much faster. I will take a look at full text search and see if it can solve my problem better.
– deadbeef, Commented Apr 9, 2017 at 8:17
@ClodoaldoNeto: I wouldn't say never. It will if you use a gist/gin trigram index (see my answer below).
– Joe Love, Commented Apr 17, 2017 at 14:36

Joe Love · Accepted Answer · 2018-08-27 20:30:08Z

2

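If you want to run a like '%hello world%' type of query, you must find another way to index it: a plain B-tree index, which is what your CREATE INDEX builds, cannot be used for a LIKE pattern that starts with a wildcard.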

You may first have to install a couple of contrib modules. To do so, log in as a superuser (for example the postgres user) and issue the following commands.

Prerequisite:

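CREATE EXTENSION IF NOT EXISTS pg_trgm;        -- provides gist_trgm_ops and gin_trgm_ops
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;  -- optional: not required by the trigram index below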

Try the following:

create index on products using gist (immutable_unaccent(name) gist_trgm_ops);

It should use an index with your query at that point.

select * from products
where immutable_unaccent(name) like '%Hello world%';
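
To check whether the planner actually picks the trigram index, the query can be run under EXPLAIN ANALYZE. This is only a sketch, assuming the products table and the immutable_unaccent function from the question and the GiST index created above; the plan you get depends on your data, statistics, and PostgreSQL version.

```sql
-- Refresh planner statistics after creating the index
-- (this also gathers statistics for the indexed expression).
ANALYZE products;

-- Same filter as in the question, run under EXPLAIN ANALYZE.
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE immutable_unaccent(name) LIKE immutable_unaccent('%Hello world%');
```

If the index is being used, the plan should show an index scan or bitmap index scan rather than the "Seq Scan on products" node from the question.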

Note: this index could get large. The name column allows up to 2048 characters, so the actual size depends on how long your product names really are.

You could also use full text search, but that's a lot more complicated.

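What the above scenario does is index "trigrams" of the name, i.e. each set of 3 consecutive characters within the name. So if the product is called "hello world", it would index roughly "hel", "ell", "llo", "lo ", " wo", "wor", "orl", and "rld" (pg_trgm pads each word with spaces, which adds a few more trigrams at the word boundaries). It can then break your search term into trigrams too and use the index to narrow down the candidate rows in a more efficient way. You can use either a gist or a gin index type if you like.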
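
To see the trigrams pg_trgm really extracts for a given string, the module's show_trgm function can be called directly. A minimal sketch, assuming pg_trgm is installed in the current database:

```sql
-- Make sure the module is available.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Returns a text[] containing the trigrams of the input string.
SELECT show_trgm('hello world');
```

The result includes the space-padded trigrams at the start and end of each word, so it will be slightly longer than a plain three-character sliding window over the string.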

Basically, GiST will be slightly slower to query but faster to update. GIN is the opposite.
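
For comparison, a GIN version of the same index might look like the sketch below. The index name products_name_trgm_gin is made up for illustration, and ILIKE is swapped in for LIKE here because the question asks for case-insensitive matching; pg_trgm indexes support both operators.

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; GIN counterpart of the GiST index above.
CREATE INDEX products_name_trgm_gin
    ON products USING gin (immutable_unaccent(name) gin_trgm_ops);

-- Accent-insensitive via immutable_unaccent, case-insensitive via ILIKE.
SELECT *
FROM products
WHERE immutable_unaccent(name) ILIKE immutable_unaccent('%Hello world%');
```

Which of the two index types works better for you depends on how often the table is written to compared with how often it is searched, so it is worth measuring both on your own data.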

edited Aug 27, 2018 at 20:30

answered Apr 10, 2017 at 15:31

2 Comments

deadbeef Over a year ago

Thanks, I will try to take a look at that option

Joe Love Over a year ago

Let me know how it works. I added instructions on installing the prerequisite contrib modules that come with PG, so please review the new instructions if you had any trouble with the prior instructions.
