Rails with postgres - activerecord query: sort by column ASC, then group by column

Question

I have a model Lap, which belongs_to :car.

Each car has many laps. I need a query that pulls in the 10 fastest laps, but only the fastest lap for each car. In other words, I do not want one car to be allowed to hold 5 of the top 10 spots in the list.

Each lap has a :car_id field. I want to group on this column and then pull out the row with min(Lap.time) from each group (i.e. the fastest lap for that unique :car_id).

So with Postgres, what is my best approach? Can I order all laps first, then group, and then pull the first row from each group? Or will grouping not preserve the sort order?
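
To pin down the intended result independently of SQL, here is a minimal plain-Ruby sketch over in-memory data. The LapRow struct and the sample values are made up for illustration and only stand in for rows of the laps table; on real data this work belongs in the database.

```ruby
# Hypothetical stand-in for rows of the laps table (not the ActiveRecord model).
LapRow = Struct.new(:id, :car_id, :time)

# Returns the fastest lap of each car, fastest first, capped at `limit` laps.
def fastest_per_car(laps, limit = 10)
  laps.group_by(&:car_id)                                   # car_id => laps of that car
      .map { |_car_id, car_laps| car_laps.min_by(&:time) }  # best lap per car
      .sort_by(&:time)                                      # rank cars by their best lap
      .first(limit)
end

laps = [
  LapRow.new(1, 1, 61.2),
  LapRow.new(2, 1, 59.8),
  LapRow.new(3, 2, 60.5),
  LapRow.new(4, 3, 58.9),
  LapRow.new(5, 3, 62.0)
]

fastest_per_car(laps).each { |lap| puts "car #{lap.car_id}: #{lap.time}" }
# prints:
#   car 3: 58.9
#   car 1: 59.8
#   car 2: 60.5
```

Car 1 and car 3 each have two laps in the sample, but each appears only once in the result, with its faster time.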

Here is my schema for the Lap Model:

  create_table "laps", force: true do |t|
    t.decimal  "time"
    t.string   "video_url"
    t.integer  "car_id"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.integer  "user_id"
    t.boolean  "approved"
  end

Do I have to use two combined queries for this?

I can get one lap per unique car_id by doing this:

select('DISTINCT ON (car_id) *')

But I need to order the laps so that I get the lap with min(lap.time) for each car_id. So I add an order and then re-sort in Ruby, like this:

select('DISTINCT ON (car_id) *').order('car_id, time ASC').sort_by! {|ts| ts.time}

This works, but it seems like an odd way to go about it. As soon as I try to change the order, for example by removing car_id from it, I get Postgres errors.

Maybe this? Lap.select('min(laps.time), laps.*').group('laps.id, laps.car_id').limit(10)
– MrYoshiji, Commented Sep 3, 2013 at 15:24
I just tried this. It gives me multiple laps from the same car_id.
– Joel Grannas, Commented Sep 3, 2013 at 15:30
This is what I did, and it works for getting each unique car_id's fastest lap, but when I debug the object it does not seem to be sorted by time. If there is a better way to go about this, please post your answer: select('DISTINCT ON (car_id) *').order('car_id, time ASC')
– Joel Grannas, Commented Sep 3, 2013 at 16:06

PinnyM · Accepted Answer · 2013-09-03 16:52:57Z

4

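As you're discovering, DISTINCT ON isn't going to work for you here because its expression (car_id) doesn't match the first term you want to sort on (time). You'll need to use a GROUP BY, ordered by the aggregate so that the limit keeps the 10 fastest cars. This returns a hash of car_id => minimum time:

Lap.group(:car_id).order(Arel.sql('MIN(time)')).limit(10).minimum(:time)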

Alternatively, you can build a windowed subquery, but that is quite a bit messier. If you need the actual lap information and not just the time, you may have to go that route:

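# Number each car's laps from fastest (1) to slowest. The table is "laps", not "lap".
subquery = Lap.select('laps.*, ROW_NUMBER() OVER ( PARTITION BY car_id ORDER BY time, car_id ) as rowNum').to_sql
# Alias the subquery as "laps" so column references to the laps table still resolve.
Lap.from(Arel.sql("(#{subquery}) laps"))
   .where('rowNum = 1')
   .order('time, car_id')
   .limit(10)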
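
DISTINCT ON can also be kept if it is moved into a subquery: the inner ORDER BY then satisfies Postgres (it must start with car_id), and the outer query is free to sort by time. The following is only a sketch, assuming the laps table from the question. The constant name and the "fastest" alias are made up here, and find_by_sql is used so that nothing depends on a particular Rails version.

```ruby
# Inner query: one row per car, namely its fastest lap. DISTINCT ON keeps the
# first row of each car_id group according to the inner ORDER BY.
# Outer query: re-sort those rows by time and keep the ten fastest.
FASTEST_PER_CAR_SQL = <<~SQL
  SELECT fastest.*
  FROM (
    SELECT DISTINCT ON (car_id) *
    FROM laps
    ORDER BY car_id, time ASC
  ) fastest
  ORDER BY fastest.time ASC
  LIMIT 10
SQL

# In a Rails app (assumes the Lap model from the question):
#   Lap.find_by_sql(FASTEST_PER_CAR_SQL)
```

This does the final sort in the database, so no re-sorting in Ruby is needed afterwards.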

edited Sep 3, 2013 at 16:52

answered Sep 3, 2013 at 16:35


5 Comments

Joel Grannas Over a year ago

I tried this already, but I get: PG::InvalidColumnReference: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions. Do I need to change my select order or something as well?

PinnyM Over a year ago

@JoelGrannas, right - this isn't going to work. Updated with alternative answer.

Joel Grannas Over a year ago

I do need the lap information (time, id, mph, video_url). Can you elaborate on the windowed subquery or provide that method?

Joel Grannas Over a year ago

How do you think the subquery would perform compared to my answer below with the re-sorting?

PinnyM Over a year ago

@JoelGrannas - for small amounts of data it won't matter much. For large amounts of data (several hundred rows) the subquery will significantly outperform your approach, since it won't need to transfer a record for every car_id to be instantiated and sorted in ActiveRecord. For even larger amounts, the windowing will slow down somewhat as well - you'll probably need to change the subquery to use GROUP BY to make use of the indexes, and use the outer query to fetch the entire row of data. But that likely won't happen until you're near the 1 million row mark.
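
The GROUP BY variant mentioned in the comment above could look roughly like this: aggregate the minimum time per car in a subquery, then join back to laps to fetch the whole row. This is a sketch and not PinnyM's code. The helper name and the "best" alias are invented for illustration, and a car whose best time occurs on two laps would appear twice in the result.

```ruby
# Builds the SQL for "fastest lap per car" using GROUP BY plus a join back
# to laps. `limit` is coerced with Integer() so only a number is interpolated.
def fastest_laps_sql(limit = 10)
  <<~SQL
    SELECT laps.*
    FROM laps
    JOIN (
      SELECT car_id, MIN(time) AS best_time
      FROM laps
      GROUP BY car_id
    ) best ON best.car_id = laps.car_id AND best.best_time = laps.time
    ORDER BY laps.time ASC
    LIMIT #{Integer(limit)}
  SQL
end

# In a Rails app (assumes the Lap model from the question):
#   Lap.find_by_sql(fastest_laps_sql(10))
```

Whether Postgres actually uses an index for the aggregate depends on the data and the planner, so it is worth checking the plan with EXPLAIN before relying on this form.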

Joel Grannas · Accepted Answer · 2013-09-03 16:19:29Z

0

This is what I did and it's working. If there is a better way to go about this, please post your answer:

In my model:

def self.fastest_per_car
    select('DISTINCT ON (car_id) *').order('car_id, time ASC').sort_by! {|ts| ts.time}
end

answered Sep 3, 2013 at 16:19


1 Comment

Mohamed El Mahallawy Over a year ago

I'd stay away from the sort_by! and from having Ruby do the heavy lifting that your DB should take care of.
