0

DB-Type: PostgreSQL
DB-Version: 11

We have a column whose value is always a single word, with a maximum length of 10 characters.

The values in this column are always unique within the table.

This column is never updated; new rows are only inserted.

We would like to support LIKE queries on this column.

Should we consider the PostgreSQL pg_trgm (TRGM) extension with a GIN index, or will a normal index suffice in this case?

The queries will be like this:

select * from my_table where my_column like '%abc%';

The question arises from the fact that TRGM is quite powerful when full text search is required for a long text with many words, but we wanted to know whether it is also better than a normal index in the single-word scenario.

  • What would the queries look like? Commented Feb 18, 2020 at 15:03

3 Answers

1

A trigram index is the only index that could help with a LIKE query with a leading wildcard. For short search strings like the one you show, it may still be slow if the trigram occurs in many words. But that's the best you can get.
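As a rough sketch of what such a trigram index could look like for the table in the question: `my_table` and `my_column` come from the question, while the index name is made up for illustration. Whether the planner actually picks the index depends on the data, so it is worth checking with EXPLAIN.

```sql
-- pg_trgm ships with PostgreSQL as a contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the GIN operator class from pg_trgm.
CREATE INDEX my_table_my_column_trgm_idx
    ON my_table USING gin (my_column gin_trgm_ops);

-- Inspect the plan to see whether the index is used for the leading-wildcard query.
EXPLAIN
SELECT * FROM my_table WHERE my_column LIKE '%abc%';
```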

For a LIKE condition without a wildcard at the beginning, a b-tree index might well be faster.

1

A "regular" index (b-tree) will generally be able to resolve:

where x like 'abcdefghij'
where x = 'abcdefghij'

It can also be used for prefix matches:

where x like 'abcd%'

However, it cannot be used when the pattern starts with a wildcard:

where x like '%hij'

So, whether the index is used depends on the patterns you are going to search for. If the pattern starts with a wildcard, then a GIN index using the pg_trgm operator class (gin_trgm_ops) could be used.

I should add that regardless of the index, there are considerations if you want case-independence or are mixing collations.
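To illustrate the collation point for prefix matches: in a database that does not use the C locale, a plain b-tree index is generally not usable for LIKE 'abcd%', and the usual workaround is an index with the text_pattern_ops operator class. The sketch below assumes the table and column names from the question and a made-up index name.

```sql
-- Hypothetical index name. text_pattern_ops compares strings character by
-- character, which lets the b-tree serve left-anchored LIKE patterns
-- regardless of the database locale.
CREATE INDEX my_table_my_column_prefix_idx
    ON my_table (my_column text_pattern_ops);

-- A prefix match that this index can support.
EXPLAIN
SELECT * FROM my_table WHERE my_column LIKE 'abcd%';
```

This only helps with left-anchored patterns; it does nothing for '%hij' or '%abc%'.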

3 Comments

Thanks, this helps a lot! We do have wildcards at the front. Luckily, we do not have case issues.
@MozenRath . . . Then a GIN index is a better choice. That said, with only 10 characters there might be alternative representations of the data that work. For instance, if they were North American telephone numbers, splitting by area code, exchange, and line might suffice.
Can you update your answer accordingly? Then I will accept it.
-1

I think you have a fundamental (but kind of common) misunderstanding here:

> The question arises from the fact that TRGM is quite powerful when full text search is required for a long text with many words

No, that is what Full Text Search is for, which is very different from pg_trgm.

pg_trgm is fairly bad at long text with many words (not as bad since 9.6 as it was before that, but still not its strongest point), but it is good at this very thing you want.

The problem is that the search pattern needs to contain trigrams to start with, that is, at least three consecutive characters between the wildcards. If your query were changed to like '%ab%', then pg_trgm would probably be worse than not having the index at all. So it might be worthwhile to validate the pattern in the app or on the client side and reject attempts to specify such patterns.
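One possible way to express such a validation, shown here in SQL for illustration only: the check below is a simplification that merely tests whether the pattern contains three consecutive characters that are neither % nor _. It ignores escaped wildcards and assumes the search text is made of word characters, as in the single-word column from the question. The same regular expression could just as well be applied in the application before the query is sent.

```sql
-- Simplified guard: true only if the LIKE pattern contains at least three
-- consecutive non-wildcard characters. Escaped wildcards are not handled.
SELECT pattern,
       pattern ~ '[^%_]{3}' AS has_trigram
FROM (VALUES ('%abc%'), ('%ab%'), ('%a%bc%')) AS t(pattern);
-- '%abc%'  -> true
-- '%ab%'   -> false
-- '%a%bc%' -> false
```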

7 Comments

So even TRGM is useless for contains? I am not sure if that's the case. Do you have a reference for this?
I found documentation that says otherwise: "Beginning in PostgreSQL 9.1, these index types also support index searches for LIKE and ILIKE, for example SELECT * FROM test_trgm WHERE t LIKE '%foo%bar'; The index search works by extracting trigrams from the search string and then looking these up in the index. The more trigrams in the search string, the more effective the index search is. Unlike B-tree based searches, the search string need not be left-anchored."
What specifically do you mean by "contains"? Can you translate that into SQL? Do I have a reference for what?
Can you see the example in the question? It clearly states like '%abc%'.
And I clearly stated that it works for that when the column has word or short phrases.