I have a project running on PostgreSQL v9.4 and Django 1.6 (planning to upgrade to v1.8 soon...)

Let's say I have the following model:

from django.db import models

# Shop is another model defined elsewhere in the project.
class Product(models.Model):
    x = models.ForeignKey(Shop)
    y = models.PositiveIntegerField()

    class Meta:
        unique_together = (("x", "y"), )

As far as I understand:

  1. Field x is a foreign key, so unless requested otherwise, Django automatically adds a db index for (x)
  2. Due to the unique_together statement, PostgreSQL implicitly generates a combined index for (x, y)
  3. Generally, an index for (x, y) also serves as an index for (x)
  4. Duplicate indexes consume extra time in SQL INSERT operations
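
To make point 3 concrete, here is a minimal sketch of the "leading prefix" check I have in mind. The helper name redundant_indexes and the tuple representation of an index are my own illustration, not part of Django or PostgreSQL, and the check deliberately ignores index type, uniqueness and partial indexes.

```python
def redundant_indexes(indexes):
    """Return (shorter, longer) pairs where `shorter` is a leading prefix of `longer`.

    Each index is a tuple of column names in index order. This is an
    illustrative helper only; it does not inspect a real database.
    """
    pairs = []
    for shorter in indexes:
        for longer in indexes:
            # An index on (x) is covered by (x, y), but an index on (y) is not.
            if len(shorter) < len(longer) and longer[:len(shorter)] == shorter:
                pairs.append((shorter, longer))
    return pairs


# The automatic foreign-key index on (x) next to the unique index on (x, y).
print(redundant_indexes([("x",), ("x", "y")]))
```

With the two indexes from my model, the (x) index is reported as a prefix of (x, y); an index on (y) alone would not be.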

My conclusion is that in such a scenario it is better to explicitly declare db_index=False for field x, to avoid the duplicate index.

Is this a valid conclusion?

1 Answer

As per the PostgreSQL documentation, a multicolumn index on (x, y) also serves queries that filter on x alone. Because x is the leading column, such queries use it more efficiently than they could use an index on (y, x). So in that sense you are correct.
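
A quick way to see the leading-column behaviour is to ask a query planner. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, since it needs no server. SQLite is not PostgreSQL, so treat it as an illustration of the general B-tree rule rather than a prediction of PostgreSQL's plans (there you would run EXPLAIN on your real table). The table name product and the helper plan_for are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the Product table: only the unique (x, y) index, no index on x alone.
conn.execute(
    "CREATE TABLE product ("
    " id INTEGER PRIMARY KEY,"
    " x INTEGER NOT NULL,"
    " y INTEGER NOT NULL,"
    " UNIQUE (x, y))"
)
conn.executemany(
    "INSERT INTO product (x, y) VALUES (?, ?)",
    [(x, y) for x in range(50) for y in range(20)],
)


def plan_for(sql, params=()):
    """Return the planner's description of how it would run `sql` (SQLite only)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


x_plan = plan_for("SELECT * FROM product WHERE x = ?", (7,))
y_plan = plan_for("SELECT * FROM product WHERE y = ?", (7,))
print("filter on x:", x_plan)
print("filter on y:", y_plan)
```

The plan for the filter on x should show a search on the (x, y) index with x as the lookup key, while the filter on y cannot use the index that way.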

Tables with multiple indexes take longer on insert, because every index has to be updated. How much longer depends on the table structure.

In the end it depends on how much y adds to the size of the index. If there is a lot of variation in y, the (x, y) index may be a lot bigger than an index on just x. In that case, queries on x alone that go through the (x, y) index have to load and process more index data. If the variation is low and the index is not much larger, it may be better to have only the (x, y) index.

But, as always, this requires benchmarking, benchmarking and benchmarking.

Note on data integrity

If you decide to drop the index on x and keep only the unique index on (x, y), uniqueness is enforced on the pair only: x itself may contain repeated values, as long as y differs. Depending on the implementation this may or may not be important.

2 Comments

Perhaps I was not clear enough in my explanation: I need the (x, y) index for sure (benchmarking showed so as well). The question was about the (x) index that is automatically created because x is a foreign key: does it make sense to explicitly ask Django not to create it (db_index=False)?
Yes, I did understand that. That's why it depends on how much data y would add, so it needs benchmarking, or at least checking index sizes on real data. It also affects data integrity; I will amend that into my answer.
