I was assigned to develop full-text search functionality on PostgreSQL 9.3, and I'd be glad to hear other opinions and advice on this matter.

The problem is that I need to implement partial word matching. A user will send a string that can contain partial words, separated by spaces and in no particular order.

For example, the string "lue ped zeb" should find a row with "Blue striped zebra" in it (in one column). The match should be case-insensitive and the order of the words should not matter (but these conditions are insignificant for this question).

The problem is performance. The table being searched has over 5 million rows, and I need very short execution times.

An example query would be "SELECT * FROM table WHERE LOWER(text) LIKE ('%lue%ped%zeb%');", which I suspect will be VERY slow because the wildcard in the first position prevents the query from using an ordinary index.
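To illustrate why the leading wildcard matters, here is a minimal sketch. It assumes a hypothetical table named items with a text column named text (neither name comes from the question), and it only asks the planner for its plan rather than asserting what the plan will be:

```sql
-- Assumes a table "items" with a text column "text" already exists (hypothetical names).

-- A B-tree index built for pattern matching on the lowercased column.
CREATE INDEX items_text_btree_idx ON items (lower(text) text_pattern_ops);

-- Left-anchored pattern: the fixed prefix 'blue' can be turned into an index range condition.
EXPLAIN SELECT * FROM items WHERE lower(text) LIKE 'blue%';

-- Leading wildcard: there is no fixed prefix, so this B-tree index cannot narrow the search.
EXPLAIN SELECT * FROM items WHERE lower(text) LIKE '%lue%';
```

Comparing the two plans on a realistically sized table should show whether the second query falls back to scanning every row.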

So far, I've found http://www.sai.msu.su/~megera/wiki/wildspeed, which is an index that could help me (the size of the index doesn't really matter in this case), but the production server is running MS Windows and I don't know whether this extension can be compiled on Windows. (I will try it and update my question.)

I'm not a database developer and usually use Postgres only from applications, so I don't have much experience with database optimization and lower-level operations.

Does anyone have experience with a similar problem, or advice or an example that could help me with this task?

  • The problem is that what you've described is not a Google-like full-text search (e.g. Google won't find "zebra" if you search for "zeb"). However, Postgres does have real full-text search support: postgresql.org/docs/9.3/static/textsearch.html Commented May 19, 2014 at 8:58
  • Thanks for your fast reply! I'm sorry, the comparison to Google wasn't correct. I am (vaguely) aware of the PostgreSQL full-text search capabilities, but that isn't exactly what I'm looking for. I really need the partial word match as stated above (return "Zebra" for "ebr") Commented May 19, 2014 at 9:44

1 Answer

Trigram matching (pg_trgm) is a contrib module for Postgres that can help you achieve your goal. There is a complete example of its usage in the docs.

Beginning in 9.1, trigram indexes support index searches for the LIKE and ILIKE operators.

Beginning in 9.3, they also support index searches for regular-expression matches (the ~ and ~* operators).
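As a rough sketch of the setup, assuming a hypothetical table items with a text column named text (these names and the index name are illustrative, not taken from the question), a trigram index on the lowercased column could look like this:

```sql
-- pg_trgm ships with the contrib modules; creating the extension needs sufficient privileges.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN trigram index on the same expression the query filters on.
CREATE INDEX items_text_trgm_idx ON items USING gin (lower(text) gin_trgm_ops);

-- Inspect which trigrams pg_trgm extracts from a value.
SELECT show_trgm('Blue striped zebra');

-- Check whether the planner picks the trigram index for an unanchored pattern.
EXPLAIN ANALYZE
SELECT * FROM items WHERE lower(text) LIKE '%zeb%';
```

The index is built on lower(text) so that it matches the LOWER(text) LIKE form of the query; an index on the plain column would instead suit ILIKE. Fragments shorter than three characters contain no complete trigram, so the index is unlikely to help much for those.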

But if you want to match the provided partial words in any order, you should query for each word separately:

...
WHERE LOWER(text) LIKE '%lue%'
   OR LOWER(text) LIKE '%ped%'
   OR LOWER(text) LIKE '%zeb%'
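Note that OR returns rows containing any one of the fragments. If every fragment has to be present, as in the question's "lue ped zeb" example, the conditions would be combined with AND instead. A minimal sketch, assuming a hypothetical table items with a text column named text:

```sql
-- Each fragment is tested on its own, so the order of words in the row does not matter.
-- AND requires all three fragments; swap in OR to accept rows containing any of them.
SELECT *
FROM items
WHERE lower(text) LIKE '%lue%'
  AND lower(text) LIKE '%ped%'
  AND lower(text) LIKE '%zeb%';
```

When the fragments come from user input, any % or _ characters in them act as LIKE wildcards unless they are escaped, and the fragments should be lowercased to match the lower(text) expression.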

2 Comments

Thanks very much! Trigram seems like it could solve my problem and I will definitely test it. I will post my results here when I'm done.
Ok, so I tried the Trigram extension and it really works in the way I need it to. Thanks - you saved me a lot of time!
