I have the following code to join two tables, microposts and activities, on the micropost_id column and then order by created_at of the activities table, with distinct micropost ids.

 Micropost.joins("INNER JOIN activities ON 
 (activities.micropost_id = microposts.id)").
 where('activities.user_id= ?',id).order('activities.created_at DESC').
 select("DISTINCT (microposts.id), *")

which should return all of the micropost columns. This is not working in my development environment.

PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list

If I add activities.created_at to the SELECT DISTINCT, I get repeated micropost ids because they have distinct activities.created_at values. I have searched a lot to get this far, but the problem always persists because of this Postgres rule that prevents an arbitrary selection.

I want to select distinct micropost_id values, ordered by activities.created_at.

Please help..

2 Answers

To start with, we need to quickly cover what SELECT DISTINCT is actually doing. It looks like just a nice keyword to make sure you only get back distinct values, which shouldn't change anything, right? Except as you're finding out, behind the scenes, SELECT DISTINCT is actually acting more like a GROUP BY. If you want to select distinct values of something, you can only order that result set by the same values you're selecting -- otherwise, Postgres doesn't know what to do.

To explain where the ambiguity comes from, consider this simple set of data for your activities:

CREATE TABLE activities (
  id INTEGER PRIMARY KEY,
  created_at TIMESTAMP WITH TIME ZONE,
  micropost_id INTEGER REFERENCES microposts(id)
);
INSERT INTO activities (id, created_at, micropost_id)
VALUES (1, current_timestamp,                      1),
       (2, current_timestamp - interval '3 hours', 1),
       (3, current_timestamp - interval '2 hours', 2);

You stated in your question that you want "distinct micropost_id" "based on order of activities.created_at". It's easy to order these activities by descending created_at (1, 3, 2), but activities 1 and 2 both have the same micropost_id of 1. So if you want the query to return just micropost IDs, should it return 1, 2 (ranking micropost 1 by its newest activity) or 2, 1 (ranking it by its oldest)?
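To make the ambiguity concrete, here is a minimal plain-Ruby sketch that mirrors the three sample rows in memory. The names `ACTIVITIES`, `NOW`, and `ordered_micropost_ids` are hypothetical and only for illustration, and the fixed timestamp stands in for current_timestamp. It shows that the order of the distinct micropost ids depends on which activity you let represent each micropost:

```ruby
# Hypothetical in-memory copy of the three sample activities (not a real table).
NOW = Time.utc(2024, 1, 1, 12, 0, 0) # stand-in for current_timestamp
ACTIVITIES = [
  { id: 1, created_at: NOW,            micropost_id: 1 },
  { id: 2, created_at: NOW - 3 * 3600, micropost_id: 1 },
  { id: 3, created_at: NOW - 2 * 3600, micropost_id: 2 }
].freeze

# Order micropost ids (descending) by one representative timestamp per
# micropost: pick is :max for the latest activity, :min for the earliest.
def ordered_micropost_ids(activities, pick)
  activities
    .group_by { |a| a[:micropost_id] }
    .map { |micropost_id, rows| [micropost_id, rows.map { |r| r[:created_at] }.public_send(pick)] }
    .sort_by { |_, timestamp| timestamp }
    .reverse
    .map(&:first)
end

BY_LATEST   = ordered_micropost_ids(ACTIVITIES, :max) # => [1, 2]
BY_EARLIEST = ordered_micropost_ids(ACTIVITIES, :min) # => [2, 1]
```

Both results are "distinct micropost ids ordered by created_at", which is why the query has to say which timestamp counts.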

If you can answer the above question, you need to take your logic for doing so and move it into your query. Let's say that, and I think this is pretty likely, you want this to be a list of microposts which were most recently acted on. In that case, you want to sort the microposts in descending order of their most recent activity. Postgres can do that for you, in a number of ways, but the easiest way in my mind is this:

SELECT micropost_id
FROM activities
JOIN microposts ON activities.micropost_id = microposts.id
GROUP BY micropost_id
ORDER BY MAX(activities.created_at) DESC

Note that I've dropped the SELECT DISTINCT bit in favor of using GROUP BY, since GROUP BY lets the query order by an aggregate of each group. The MAX(activities.created_at) bit tells Postgres to, for each group of activities with the same micropost_id, sort by only the most recent.

You can translate the above to Rails like so:

Micropost.select('microposts.*')
  .joins("JOIN activities ON activities.micropost_id = microposts.id")
  .where('activities.user_id' => id)
  .group('microposts.id')
  .order('MAX(activities.created_at) DESC')

Hope this helps! You can play around with this sqlFiddle if you want to understand more about how the query works.

8 Comments

Will it return all the microposts table rows or just the id? If it doesn't, is there a way to get all the rows with a single query?
The final snippet of Rails code will return what I believe you're looking for (distinct Microposts, ordered by their latest activity). I'd suggest you read my whole post to better understand why, though!
I get this error with your code: PG::GroupingError: ERROR: column "microposts.id" must appear in the GROUP BY clause or be used in an aggregate function, so I changed group('micropost_id') to group('microposts.id'). It's working, but it's not sorted on activities.created_at. I suspect it sorts on created_at only when the ids are the same.
Sorry about the typo; your correction is right. What order do you expect? Like I explained above, there can be multiple values of activities.created_at for each Micropost, so you need to specify an ordering that takes that fact into account. I chose to order by their most recent activity's created_at value, descending. It might help if you uploaded a small dataset somewhere with your input data, table structure, and what your expected result is.
The raw SQL works but the Rails code doesn't. I am unable to understand why. It is not giving output in the order of created_at of the activities table.
0

Try the code below:

Micropost.select('microposts.*, activities.created_at')
.joins("INNER JOIN activities ON (activities.micropost_id = microposts.id)")
.where('activities.user_id= ?',id)
.order('activities.created_at DESC')
.uniq
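One caveat worth checking: on Rails versions where `.uniq` is available on a relation, it adds DISTINCT to the query, and DISTINCT compares whole selected rows. Because each row here also carries activities.created_at, a micropost with several activities can still come back more than once. The plain-Ruby sketch below illustrates the idea with hypothetical in-memory rows (`ROWS` and its timestamps are made up for illustration, not query output):

```ruby
# Hypothetical result rows: [micropost_id, activities.created_at].
# Micropost 1 has two activities, so it appears twice with different timestamps.
ROWS = [
  [1, '2024-01-01 12:00'],
  [2, '2024-01-01 10:00'],
  [1, '2024-01-01 09:00']
].freeze

# Whole-row de-duplication, analogous to SELECT DISTINCT over both columns:
# nothing is removed because no two rows are identical.
WHOLE_ROW_DISTINCT = ROWS.uniq

# De-duplicating on the id alone keeps the first row seen for each micropost.
ID_ONLY_DISTINCT = ROWS.uniq { |micropost_id, _created_at| micropost_id }
```

Here `WHOLE_ROW_DISTINCT` still has three rows, while `ID_ONLY_DISTINCT` has one row per micropost. In SQL, getting one row per micropost means grouping by the micropost id rather than relying on DISTINCT over the full select list.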

1 Comment

It's not returning unique Microposts. I don't understand why.
