PostgreSQL Full Text search

Question

I need to use full text search with PostgreSQL, but I can't find a way to search for a list of words from a table (using tsquery) against an indexed text field (tsvector data type). Can a tsquery only handle a few literal words, or can it also process multiple values that come from a table?

Thanks in advance for your help.

A tsquery can search for multiple tokens if you separate them with &: to_tsquery('foo & bar & baz'). So you could retrieve the list of words from your table and feed them (as tokens) to the to_tsquery() function, separated by ampersands. That said, a few code examples would help illustrate what you are trying to accomplish.
– Timusan, Commented Oct 26, 2015 at 2:01
I'm trying to do something like this: SELECT * FROM table, to_tsquery(SELECT words FROM another_table) query WHERE table.indexed_text_field @@ query; I think your solution would be great, but I don't know how to feed the words to to_tsquery.
– Char-Lee, Commented Oct 26, 2015 at 6:57

Timusan · Accepted Answer · last edited by Fabio Milheiro 2023-06-27 10:12:28Z

7

Let me try to formulate an answer according to the comments given on the question (if I understand your request correctly).

Problem

You are trying to do a full text search on the table tableA, column indexed_text_field (a tsvector type) based on words that are stored as text in another table tableB in a column called words.

Solution

First, if you wish to feed PostgreSQL multiple tokens (individual words) during a full text search you have two functions at your disposal:

to_tsquery()
plainto_tsquery()

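With the first function you must join the tokens with operators such as the ampersand (&, meaning AND); free text without operators is rejected as a syntax error. The second function accepts any string of text, splits it into tokens for you, and combines them with &. The PostgreSQL full text search documentation covers both in more detail.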
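The difference between the two functions can be seen on literal strings. This is a minimal sketch using the sample tokens from the comment above; it relies on the default text search configuration, as the queries in this answer do.

```sql
-- to_tsquery() expects operators between the tokens.
SELECT to_tsquery('foo & bar & baz');

-- plainto_tsquery() takes free text, tokenizes it, and joins the
-- resulting lexemes with & (AND).
SELECT plainto_tsquery('foo bar baz');

-- Free text passed to to_tsquery() is rejected with a syntax error,
-- so this statement is left commented out.
-- SELECT to_tsquery('foo bar baz');
```

Both working statements should produce a query that requires all three lexemes to be present.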

Your challenge is that you wish to select matches based on words present in another table. This can be done in different ways, for example via a simple (INNER) JOIN:

SELECT a.* FROM tableA a, tableB b WHERE a.indexed_text_field @@ to_tsquery(b.words);

Or, if a row of the words column can contain several words, you should most likely use the plainto_tsquery() function to keep things simple:

SELECT a.* FROM tableA a, tableB b WHERE a.indexed_text_field @@ plainto_tsquery(b.words);
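One thing to keep in mind with the join form is that a row of tableA is returned once for every row of tableB it matches. If each row should appear at most once, one possible alternative (an assumption about what is wanted, not part of the original answer) is to combine all rows of tableB into a single OR query first. The sketch below round-trips each tsquery through text because PostgreSQL has no built-in aggregate for tsquery values.

```sql
SELECT a.*
FROM tableA a
WHERE a.indexed_text_field @@ (
    -- Build one query of the form (w1 & w2) | (w3) | ...
    SELECT string_agg('(' || q::text || ')', ' | ')::tsquery
    FROM (SELECT plainto_tsquery(b.words) AS q FROM tableB b) AS sub
    -- Skip rows that yield an empty query (for example, only stop words),
    -- which would otherwise leave an empty operand in the combined string.
    WHERE numnode(q) > 0
);
```

Because the subquery does not depend on tableA, it only needs to be evaluated once, and the outer condition remains a plain tsvector @@ tsquery match.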

Yet, if you must use the more low-level to_tsquery() version:

SELECT a.* FROM tableA a, tableB b WHERE a.indexed_text_field @@ to_tsquery(replace(b.words, ' ', '&'));

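In the last query, every space between the words is replaced with an ampersand, which makes each word a separate token. This assumes the words are separated by exactly one space: leading, trailing, or repeated spaces produce an empty operand and a syntax error. Note that replace() is applied to the query side (b.words), not to the indexed column, so no expression index is needed for it; the index that matters is the GIN or GiST index on indexed_text_field.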
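If the words column may contain irregular whitespace, a slightly more defensive variant of the to_tsquery() query could look like the sketch below. It is only an illustration: it still assumes the words contain no characters that to_tsquery() treats as operators (such as &, |, !, : or parentheses).

```sql
SELECT a.*
FROM tableA a, tableB b
WHERE a.indexed_text_field
      @@ to_tsquery(
             -- btrim() strips leading and trailing spaces; the regular
             -- expression then turns every run of whitespace into one ' & '.
             regexp_replace(btrim(b.words), '\s+', ' & ', 'g')
         );
```

When the input is free text anyway, plainto_tsquery() remains the simpler choice, since it handles tokenizing itself.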

answered Oct 26, 2015 at 11:08 by Timusan · edited Jun 27, 2023 at 10:12 by Fabio Milheiro


4 Comments

Char-Lee Over a year ago

Thank you very much for the clear answer, it has been really useful. Now I'm facing performance problems when running these queries: the table containing indexed_text_field has 23 million records and the words table has 140 records, and the query takes more than 5 hours to finish. Does that sound right?

Timusan Over a year ago

You are very welcome. If the answer helped you, please mark it as accepted. Regarding performance, 5 hours is awfully long for this query. Without a proper look at your database setup, my first guess would be that your tsvector column is missing an index (GiST or GIN). What kind of indexes do you have on your data?
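For reference, an index of the kind mentioned here could be created roughly as follows. The index name is made up for illustration, and GIN is chosen because the PostgreSQL documentation describes it as the preferred index type for text search; whether it is the better fit for this particular table is an assumption.

```sql
-- Hypothetical index name; the table and column names follow the answer above.
CREATE INDEX tablea_indexed_text_field_gin
    ON tableA USING GIN (indexed_text_field);

-- Refresh planner statistics after building the index.
ANALYZE tableA;
```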

Char-Lee Over a year ago

I'm using GiST for both fields.

Timusan Over a year ago

Well, the next step is to see what the planner is doing. What is the output of EXPLAIN ANALYZE for the query that took 5 hours? (By the way, which of the queries above did you run?)
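Since EXPLAIN ANALYZE actually executes the statement, running it on a query that takes 5 hours is costly. A possible way to inspect the plan more cheaply is sketched below; the choice of the plainto_tsquery() variant and the LIMIT of 5 are assumptions made for illustration.

```sql
-- Plan only: the query is NOT executed, so this returns immediately.
EXPLAIN
SELECT a.*
FROM tableA a, tableB b
WHERE a.indexed_text_field @@ plainto_tsquery(b.words);

-- Plan plus real timings: this DOES execute the query, so the word list
-- is limited to a few rows here (the LIMIT of 5 is an arbitrary choice).
EXPLAIN (ANALYZE, BUFFERS)
SELECT a.*
FROM tableA a,
     (SELECT words FROM tableB LIMIT 5) b
WHERE a.indexed_text_field @@ plainto_tsquery(b.words);
```

In the output, a sequential scan on tableA repeated for every row of tableB would suggest that the index on indexed_text_field is not being used.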
