postgresql: multiple multicolumn indexes with foreign key?

Question

Simple question. Let's say I have a users table and a cars table. The cars table has a user_id, make, model, and I always query its data with the user_id:

    SELECT * FROM cars WHERE user_id = 123 AND make = 'honda';
    SELECT * FROM cars WHERE user_id = 123 AND model = 'accord';

Assuming I always query the cars table with a user_id, is it better to add two multicolumn indexes [user_id, make] and [user_id, model] (and potentially more for additional columns), or a single-column index for each user_id, make, and model column?

What's confusing me is the idea of having several multicolumn indexes that all start with the same foreign_key. Seems like that best fits my queries, but not sure how correct/efficient/wasteful it is to the database.
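
For concreteness, the two options being compared might be written as the DDL below. This is only a sketch: the column types, the surrogate `id` key, and all index names are assumptions for illustration and are not taken from the question.

```sql
-- Hypothetical schema; types and the id column are assumed.
CREATE TABLE users (
    id bigint PRIMARY KEY
);

CREATE TABLE cars (
    id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id bigint NOT NULL REFERENCES users (id),
    make    text   NOT NULL,
    model   text   NOT NULL
);

-- Option A: multicolumn indexes that share the leading column user_id.
CREATE INDEX cars_user_id_make_idx  ON cars (user_id, make);
CREATE INDEX cars_user_id_model_idx ON cars (user_id, model);

-- Option B: one single-column index per queried column.
-- (In practice you would create either option A or option B, not both.)
CREATE INDEX cars_user_id_idx ON cars (user_id);
CREATE INDEX cars_make_idx    ON cars (make);
CREATE INDEX cars_model_idx   ON cars (model);
```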

The difference is that a multicolumn index will be a covering index if you use AND conditions, so the optimizer won't have to touch the table for searching, just for retrieving. So yes, use the first option. Not sure, but Postgres won't be able to use 2 indexes at the same time, so for OR conditions it might pick the index with the most restrictive condition.
– Mihai, Commented May 21, 2015 at 6:00
@Mihai. That is not what a covered query means. You are confusing the issue.
– PerformanceDBA, Commented May 21, 2015 at 9:27

PerformanceDBA · Accepted Answer · 2015-05-21 09:22:38Z


This answer considers what is most "correct/efficient/least-wasteful" in the database.

> Assuming I always query the cars table with a user_id,

What you do with the data, or how you access it, is not relevant to database design or to overall performance. It is relevant only to that single query.

> is it better to add two multicolumn indexes [user_id, make] and [user_id, model] (and potentially more for additional columns), or a single-column index for each user_id, make, and model column?

The single-column index is superfluous: it is a non-performer and produces no gain.

Separately, you should update the statistics for each of those single columns.
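
In PostgreSQL, refreshing the planner statistics for those columns could look like the sketch below. It assumes the table is named `cars` and has the three columns from the question.

```sql
-- Recompute planner statistics for just the three columns of interest.
ANALYZE cars (user_id, make, model);

-- Inspect what the planner now knows about each column.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'cars'
  AND attname IN ('user_id', 'make', 'model');
```

Running plain `ANALYZE cars;` would refresh statistics for every column of the table instead.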

First, separately from your question, the PK should be:

    ( user_id, make, model )

because, without seeing the full DDL for the table, that is the only way to provide row uniqueness, which relational databases demand. You do not need additional indices, even if attribute columns are added.

If you have a car_id column in that table, it is superfluous and redundant, and it hurts performance because of the additional index it requires. You can safely remove it.

Second, that PK index is the only one you need, for the queries you have described.
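
As a sketch of the suggestion above, the composite key could be declared as follows. The column types and the `users` table definition are assumptions, since the full DDL is not shown in the question.

```sql
-- Hypothetical DDL; types are assumed.
CREATE TABLE users (
    id bigint PRIMARY KEY
);

CREATE TABLE cars (
    user_id bigint NOT NULL REFERENCES users (id),
    make    text   NOT NULL,
    model   text   NOT NULL,
    -- The primary key creates a unique B-tree index on these three columns.
    PRIMARY KEY (user_id, make, model)
);

-- Check which plan PostgreSQL actually chooses for each query.
EXPLAIN SELECT * FROM cars WHERE user_id = 123 AND make = 'honda';
EXPLAIN SELECT * FROM cars WHERE user_id = 123 AND model = 'accord';
```

The plans depend on table size and current statistics, so it is worth running `EXPLAIN` against realistic data rather than an empty table.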

> What's confusing me is the idea of having several multicolumn indexes that all start with the same foreign_key.

Yes, that should raise an alarm. Not that they all start with the same FK, but that they start with the same column(s). The index with the largest set of columns makes the others redundant.
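
One way to look for such overlap is to list the index definitions on the table and compare their leading columns. The sketch below assumes a table named `cars`; the index name `cars_user_id_idx` is hypothetical.

```sql
-- List every index on the cars table together with its definition.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'cars'
ORDER BY indexname;

-- Hypothetical example: an index on (user_id) alone is a leading prefix of
-- an index on (user_id, make, model), so it is a candidate for removal.
DROP INDEX IF EXISTS cars_user_id_idx;
```

Before dropping anything, comparing `EXPLAIN` output for the real queries with and without the index is a reasonable sanity check.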


4 Comments

Radek Postołowicz Over a year ago

Why are you assuming single user can't have two cars with the same (make, model)?

James Over a year ago

You can assume there's a PK id, I left that out for simplicity. So adding a single column index for user_id, make, and model would result in my queries only using the user_id index and not the others, it sounds like. And two multicolumn indexes starting with the same column (user_id) is redundant? So what's the solution then? If it's to index [user_id, make, model], it does seem that both queries would use that index - however, if I had more columns I had to query this same way, would I just keep adding them to that index? Seems like that index could get huge.

PerformanceDBA Over a year ago

@RadekPostołowicz. I did not assume anything. I used the info James has given. Your issue is beyond James' question and beyond my answer. If the user can have two cars of the same make and model, then indeed, one should have an additional column, e.g. SequenceNo, to provide uniqueness.

PerformanceDBA Over a year ago

@James. Yes, to most of that. In this case, one index suffices. But if you "keep adding columns" that need to be indexed (the latter being a separate decision), then you have to evaluate the indices again. I would add a separate index, with a different mix of columns and a different starting column. Do not add indices willy-nilly, just to overcome the dumbness of your NONsql.
