2

I have a table posts:

CREATE TABLE posts (
  id serial primary key,
  content text
);

When a user submits a post, how can I compare it with the existing posts and find similar ones?
I'm looking for something like StackOverflow does with the "Similar Questions".

2 Answers

5

While Text Search is an option, it is not primarily meant for this type of search. Its typical use case is finding words in a document based on dictionaries and stemming, not comparing whole documents.

I am sure StackOverflow has put some smarts into the similarity search, as this is not a trivial matter.

You can get halfway decent results with the similarity function and operators provided by the pg_trgm module:

SELECT content, similarity(content, 'grand new title asking foo') AS sim_score
FROM   posts
WHERE  content  % 'grand new title asking foo'
ORDER  BY 2 DESC, content;

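Be sure to have a trigram GiST index (operator class gist_trgm_ops) on content for this. The pg_trgm extension has to be installed in the database first.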
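A minimal sketch of that setup, assuming PostgreSQL 9.1 or later (where CREATE EXTENSION is available). The index name posts_content_trgm_idx is only a placeholder:

```sql
-- pg_trgm is one of the contrib modules; install it once per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigram GiST index so the % operator can be answered from the index.
-- The index name is a placeholder; pick whatever fits your naming scheme.
CREATE INDEX posts_content_trgm_idx
    ON posts USING gist (content gist_trgm_ops);

-- % matches when similarity() exceeds the current threshold (0.3 by default).
-- show_limit() reports it; newer releases deprecate it in favor of the
-- pg_trgm.similarity_threshold setting.
SELECT show_limit();
```

If too few or too many rows come back, adjusting the threshold is usually the first knob to try.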

But you'll probably have to do more. You could combine it with Text Search after identifying keywords in the new content.
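One way that combination might look, as a rough sketch: the keywords 'title' and 'foo' are assumed to have been extracted from the new post beforehand (how you pick them is up to you), and the 'english' configuration is an assumption as well. Text Search narrows the candidates, and trigram similarity ranks them:

```sql
-- Optional expression index for the text search filter (name is a placeholder).
CREATE INDEX posts_content_fts_idx
    ON posts USING gin (to_tsvector('english', content));

-- Keep posts that mention at least one of the (hypothetical) keywords,
-- then rank the survivors by trigram similarity to the new post.
SELECT id, content,
       similarity(content, 'grand new title asking foo') AS sim_score
FROM   posts
WHERE  to_tsvector('english', content) @@ to_tsquery('english', 'title | foo')
ORDER  BY sim_score DESC
LIMIT  10;
```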

0

You need to use Full Text Search in Postgres.

http://www.postgresql.org/docs/9.1/static/textsearch-intro.html
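For illustration, a basic Full Text Search query against the posts table could look like the sketch below. The 'english' configuration and the search phrase are assumptions. Note that plainto_tsquery requires every (non-stop) word to be present, so for whole posts you would likely pass only a few keywords rather than the full text:

```sql
-- Rank posts containing all of the given words by text search relevance.
SELECT id, content,
       ts_rank(to_tsvector('english', content), query) AS rank
FROM   posts, plainto_tsquery('english', 'grand new title') AS query
WHERE  to_tsvector('english', content) @@ query
ORDER  BY rank DESC
LIMIT  10;
```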
