I am looking for a database solution for real full-text indexing. I have read the full text search chapter of the PostgreSQL documentation, but it describes text searching that is not a "full" index and is heuristic in nature.
However, I found the contrib/fulltextindex module (https://pgpedia.info/f/fulltextindex.hml), which sounds promising.

So my questions are as follows:
Why was it removed in PostgreSQL 8.1?
How can I use it?
Are there alternative database solutions that support this kind of feature?
What performance can one expect?

  • You will need to specify exactly what "full" means to you. Most people do want stemming etc. Commented Jun 17, 2022 at 5:57
  • There are at least 3 reasons for this post to be closed: (1) "A community-specific reason: Seeking recommendations for books, tools, software libraries, and more" (2) "Needs details or clarity: This question should include more details and clarify the problem". (3) "Needs more focus: This question currently includes multiple questions in one. It should focus on one problem only" Commented Jun 17, 2022 at 6:01
  • @DavidדודוMarkovitz Where would it be suitable to ask this kind of question? Commented Jun 17, 2022 at 6:12
  • It is less about the "where" and more about the "how". (1) You are not giving any context (What is it for? School project? Production system that needs to serve 1M requests per second?). (2) You are using obscure statements ("real full text indexing", "not a "full" index", "heuristic in nature") instead of describing what features you actually need / what you have found missing. Commented Jun 17, 2022 at 7:19
  • "what is the performance one can expect?" - getting back to the context. You haven't given any information regarding your usage patterns, SLO or hardware. If you have tried something and don't get the performance you need, you can then open an SO question and ask if there's something that can be done about it. Commented Jun 17, 2022 at 7:19

1 Answer

The index to use for full-text search is a GiST index, and there is nothing heuristic about it (except the "picksplit" algorithm). contrib/fulltextindex was removed in 8.2, and full text search was added to core in 8.3, so that's what you should use.

Read the WARNING file from release 8.1:

WARNING
-------

This implementation of full text indexing is very slow and inefficient.  It is
STRONGLY recommended that you switch to using contrib/tsearch which offers these
features:

Advantages
----------
* Actively developed and improved
* Tight integration with OpenFTS (openfts.sourceforge.net)
* Orders of magnitude faster (eg. 300 times faster for two keyword search)
* No extra tables or multi-way joins required
* Select syntax allows easy 'and'ing, 'or'ing and 'not'ing of keywords
* Built-in stemmer with customisable dictionaries (ie. searching for 'jellies' will find 'jelly')
* Stop words automatically ignored
* Supports non-C locales

Disadvantages
-------------
* Only indexes full words - substring searches on words won't work.
    eg. Searching for 'burg' won't find 'burger'

Due to the deficiencies in this module, it is quite likely that it will be removed from the standard PostgreSQL distribution in the future.
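
The disadvantage listed in that file (an index over whole words cannot answer substring searches) can be illustrated with a small sketch. This is a toy model in Python, not PostgreSQL code: the function names and the sample documents are made up for illustration, and the "suffix index" is only an assumption about how an index could support substring matches, not a description of how any particular module is implemented.

```python
from collections import defaultdict


def build_word_index(docs):
    """Map each whole lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def build_suffix_index(docs):
    """Map every suffix of every word to the ids of documents containing that word."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for start in range(len(word)):
                index[word[start:]].add(doc_id)
    return index


def search_words(index, term):
    """Exact whole-word lookup; a partial word never matches."""
    return set(index.get(term.lower(), set()))


def search_substring(suffix_index, term):
    """A substring of a word is a prefix of one of that word's suffixes."""
    term = term.lower()
    hits = set()
    for suffix, doc_ids in suffix_index.items():
        if suffix.startswith(term):
            hits |= doc_ids
    return hits


# Hypothetical sample data.
docs = {1: "cheese burger", 2: "jelly beans"}
word_index = build_word_index(docs)
suffix_index = build_suffix_index(docs)

print(search_words(word_index, "burg"))        # whole-word index: no match
print(search_substring(suffix_index, "burg"))  # suffix index: matches document 1
```

The suffix index is much larger than the word index (one entry per suffix of every word), which hints at why substring support tends to be expensive.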

PostgreSQL is open source. To see the discussion that led to the removal of the module, search the archives. You will find this and this.
