40

From what information I could find, they both solve the same problems - more esoteric operations like array containment and intersection (&&, @>, <@, etc). However I would be interested in advice about when to use one or the other (or neither possibly).
The PostgreSQL documentation has some information about this:

  • GIN index lookups are about three times faster than GiST
  • GIN indexes take about three times longer to build than GiST
  • GIN indexes are about ten times slower to update than GiST
  • GIN indexes are two-to-three times larger than GiST

However I would be particularly interested to know if there is a performance impact when the memory to index size ration starts getting small (ie. the index size becomes much bigger than the available memory)? I've been told on the #postgresql IRC channel that GIN needs to keep all the index in memory, otherwise it won't be effective, because, unlike B-Tree, it doesn't know which part to read in from disk for a particular query? The question would be: is this true (because I've also been told the opposite of this)? Does GiST have the same restrictions? Are there other restrictions I should be aware of while using one of these indexing algorithms?

2
  • 1
    These measurements vary according to which version of PG you're using. I just ran a test under 9.6.1 against an hstore column and got the following: GiST took 4.1 times longer to build than GIN and was about 170 times slower for lookups. Not even close to the figures you've quoted. Commented Dec 28, 2016 at 20:25
  • Are your figures (about what is faster) up to date? I haven't found them in the provided link (despite link is really helpful) Commented Jul 10, 2024 at 21:02

1 Answer 1

18

First of all, do you need to use them for text search indexing? GIN and GiST are index specialized for some data types. If you need to index simple char or integer values then the normal B-Tree index is the best.
Anyway, PostgreSQL documentation has a chapter on GIST and one on GIN, where you can find more info.
And, last but not least, the best way to find which is best is to generate sample data (as much as you need to be a real scenario) and then create a GIST index, measuring how much time is needed to create the index, insert a new value, execute a sample query. Then drop the index and do the same with a GIN index. Compare the values and you will have the answer you need, based on your data.

Sign up to request clarification or add additional context in comments.

2 Comments

Yes, but the problem being time, it is more wise to ask in case someone already knows...
I think that generating you own data is the the only answer. I spent a few days reading all I could in the context of full-text and spatial about gist and gin indexes, inserting with indexes on or off, wal settings in postgres.sql and talking to people on #postgis/#postgresql and eventually came to the conclusion that the only way to be really sure, for my very specific usage, was to load test tables with large subsets of my actual data, and test. If you rush during setup to save a day here or there, you will end up wasting weeks over the following years, I have learnt by hard experience.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.