Simple question. Let's say I have a users table and a cars table. The cars table has a user_id, make, model, and I always query its data with the user_id:

SELECT * FROM cars WHERE user_id = 123 AND make = 'honda'
SELECT * FROM cars WHERE user_id = 123 AND model = 'accord'

Assuming I always query the cars table with a user_id, is it better to add two multicolumn indexes [user_id, make] and [user_id, model] (and potentially more for additional columns), or a single-column index for each user_id, make, and model column?

What's confusing me is the idea of having several multicolumn indexes that all start with the same foreign_key. Seems like that best fits my queries, but not sure how correct/efficient/wasteful it is to the database.
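
For concreteness, the two options being compared could be written roughly as follows. This is only a sketch in PostgreSQL syntax: the column types, the surrogate id column, and all index names are assumptions, not something taken from a real schema.

```sql
-- Hypothetical table definition; types and the id column are assumed.
-- The foreign key to users is omitted to keep the sketch self-contained.
CREATE TABLE cars (
    id      bigserial PRIMARY KEY,
    user_id bigint NOT NULL,
    make    text   NOT NULL,
    model   text   NOT NULL
);

-- Option A: one multicolumn index per query shape, each starting with user_id.
CREATE INDEX cars_user_id_make_idx  ON cars (user_id, make);
CREATE INDEX cars_user_id_model_idx ON cars (user_id, model);

-- Option B: one single-column index per filtered column.
CREATE INDEX cars_user_id_idx ON cars (user_id);
CREATE INDEX cars_make_idx    ON cars (make);
CREATE INDEX cars_model_idx   ON cars (model);
```

Only one of the two options would be created in practice; they are shown together here just to make the comparison explicit.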

  • The difference is that a multicolumn index will be a covering index if you use AND conditions, so the optimizer won't have to touch the table for searching, just for retrieving. So yes, use the first option. I'm not sure, but Postgres won't be able to use two indexes at the same time, so for OR conditions it might pick the index with the most restrictive condition. Commented May 21, 2015 at 6:00
  • @Mihai. That is not what a covered query means. You are confusing the issue. Commented May 21, 2015 at 9:27

1 Answer

This answer considers what is most "correct/efficient/least-wasteful" in the database.

Assuming I always query the cars table with a user_id,

What you do with the data, or how you access it, is not relevant to database design or overall performance. It is relevant only to that single query.

is it better to add two multicolumn indexes [user_id, make] and [user_id, model] (and potentially more for additional columns), or a single-column index for each user_id, make, and model column?

The single-column indexes are superfluous non-performers: they produce no gain.

  • Separately, you should update the statistics for each of those single columns.

First, separately from your question, the PK should be:

    ( user_id, make, model )

because, without seeing the full DDL for the table, that is the only way to provide row uniqueness, which relational databases demand. You do not need additional indices, even if attribute columns are added.

  • If you have a car_id column in that table, it is superfluous and redundant, and it hurts performance because of the additional index it requires. You can safely remove it.
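
As a rough sketch of what that key could look like in PostgreSQL (the column types are assumed, and this is an alternative definition of the table, not the asker's actual DDL):

```sql
-- Hypothetical DDL: types are assumptions; only the key columns come from the answer.
CREATE TABLE cars (
    user_id bigint NOT NULL,
    make    text   NOT NULL,
    model   text   NOT NULL,
    -- Declaring the composite primary key also creates a unique
    -- index on (user_id, make, model), in that column order.
    PRIMARY KEY (user_id, make, model)
);
```

Note that this key permits only one row per (user_id, make, model), so it fits only if a user cannot own two cars of the same make and model.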

Second, that PK index is the only one you need, for the queries you have described.

What's confusing me is the idea of having several multicolumn indexes that all start with the same foreign_key.

Yes, that should raise an alarm. Not that they all start with the same FK, but that they start with the same column(s). The index with the largest set of columns makes the others redundant.
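
A way to check this on your own data is sketched below, assuming PostgreSQL and a cars table with user_id, make, and model columns; the index name is made up. If the table's primary key is already (user_id, make, model), its index plays this role and the CREATE INDEX is unnecessary.

```sql
-- Hypothetical index name. Indexes on (user_id) or (user_id, make)
-- would be leading prefixes of this one.
CREATE INDEX cars_user_id_make_model_idx ON cars (user_id, make, model);

-- Both conditions match the leading columns of the index.
EXPLAIN SELECT * FROM cars WHERE user_id = 123 AND make = 'honda';

-- Only user_id matches the leading column; the model condition can still
-- be checked against index entries, but it does not narrow the scanned range.
EXPLAIN SELECT * FROM cars WHERE user_id = 123 AND model = 'accord';
```

The plans that EXPLAIN reports depend on table size and statistics, so it is worth running ANALYZE on realistic data first; on a tiny table the planner may prefer a sequential scan regardless of the indexes.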

4 Comments

Why are you assuming single user can't have two cars with the same (make, model)?
You can assume there's a PK id, I left that out for simplicity. So adding a single column index for user_id, make, and model would result in my queries only using the user_id index and not the others, it sounds like. And two multicolumn indexes starting with the same column (user_id) is redundant? So what's the solution then? If it's to index [user_id, make, model], it does seem that both queries would use that index - however, if I had more columns I had to query this same way, would I just keep adding them to that index? Seems like that index could get huge.
@RadekPostołowicz. I did not assume anything. I used the info James has given. Your issue is beyond James' question and beyond my answer. If the user can have two cars of the same make & model, then indeed, one should have an additional column, e.g. SequenceNo, to provide uniqueness.
@James. Yes, to most of that. In this case, one index suffices. But if you "keep adding columns" that need to be indexed (the latter being a separate decision), then you have to evaluate the indices again. I would add a separate index, with a different mix of columns and a different starting column. Do not add indices willy-nilly, just to overcome the dumbness of your NONsql.