Consider the following:

CREATE TEMPORARY TABLE foo (string text);
INSERT INTO foo VALUES
    ('the small but capable man'),
    ('the small and strong but capable man');
SELECT * FROM foo WHERE to_tsvector(string) @@ to_tsquery('small<->but');
SELECT * FROM foo WHERE to_tsvector(string) @@ to_tsquery('small<->capable');
SELECT * FROM foo WHERE to_tsvector(string) @@ to_tsquery('small<2>capable');

The first query returns both rows, when it should return only one (because small but appears as a phrase in only the first string). The second query correctly returns no rows because small and capable are never adjacent. The third correctly returns only one row because small and capable are exactly two positions apart in only the first string.

So the question is: why does the first query return both strings? Is there something special about words like but (or maybe and, etc.)?

1 Answer

Ah, answer: this uses the english text search configuration by default, which drops stop words (and, but, etc.), so the but in 'small<->but' is discarded and the query effectively matches on small alone. To really match every word, I use the simple configuration, e.g. SELECT * FROM foo WHERE to_tsvector('simple', string) @@ to_tsquery('simple', 'small<->but').

More here.
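
To see the difference for yourself, here is a minimal sketch that prints what each configuration produces for the strings in the question. It assumes a stock PostgreSQL install where the english and simple configurations are available. Exact output formatting may vary by version, so no expected output is shown.

```sql
-- Which configuration is used when none is passed explicitly?
SHOW default_text_search_config;

-- english: stop words such as "the" and "but" are dropped
-- (and the remaining words are stemmed).
SELECT to_tsvector('english', 'the small but capable man');
SELECT to_tsquery('english', 'small <-> but');

-- simple: no stop-word removal and no stemming, so every word
-- is kept, including "but".
SELECT to_tsvector('simple', 'the small but capable man');
SELECT to_tsquery('simple', 'small <-> but');
```

Comparing the two to_tsquery results should make it clear why the english version matched both rows: once but is removed, nothing is left of the phrase constraint.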
