3

In this query, each row of table a can have hundreds of associated rows in table b, so the array_agg contains all of those values. I'd like to limit how many values it collects, but inside array_agg I can use ORDER BY and there's no way to set a LIMIT.

select a.column1, array_agg(b.column2)
from a left join b using (id)
where a.column3 = 'value'
group by a.column1

I could use the "slice" syntax on the array but that's quite expensive since it first has to retrieve all the rows then discard the other ones. What's the proper efficient way to do this?
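For reference, the slice approach mentioned above would presumably look like the sketch below. It is only an illustration built from the query in the question (the alias first_ten is made up), and it still aggregates every matching row before trimming the array.

```sql
-- Sketch only: aggregate everything, then keep the first 10 elements.
-- The parentheses around array_agg(...) are required before subscripting.
select a.column1, (array_agg(b.column2))[1:10] as first_ten
from a left join b using (id)
where a.column3 = 'value'
group by a.column1;
```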

3 Answers

3

I would use a lateral join.

select a.column1, array_agg(b.column2)
from a left join lateral 
    (select id, column2 from b where b.id=a.id order by something limit 10) b using (id)
where a.column3 = 'value'
group by a.column1

Since the "id" restriction is already inside the lateral subquery, you could write the join condition as ON true instead of USING (id). I don't know which is less confusing.
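As a rough sketch of the ON true variant described above (not necessarily what the answerer would write): the subquery also returns the placeholder column something, so that array_agg can repeat the ordering explicitly instead of relying on the subquery's order surviving the join. Here something stands for whatever column you want to sort by, as in the query above.

```sql
-- Sketch only: "something" is a placeholder sort column, as in the answer.
select a.column1,
       array_agg(b.column2 order by b.something) as limited
from a
left join lateral (
    select column2, something
    from b
    where b.id = a.id      -- the correlation replaces USING (id)
    order by something
    limit 10               -- at most 10 rows of b per row of a
) b on true
where a.column3 = 'value'
group by a.column1;
```

Note that the limit still applies per row of a (per id), so a column1 value shared by several rows of a can end up with more than 10 elements.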


3 Comments

I think I understand but I'm still trying to wrap my head around LATERAL joins. Could you also please show the subquery equivalent so I can understand how they relate to each other?
I just realized, don't you need to sort the results in the array_agg again? Doesn't it need to be array_agg(b.column2 order by b.something) instead, or was the order by something in the subquery guaranteed to preserve the order in the "outer" query?
@user779159 . . . This doesn't set a limit per column1. It sets a limit per id. That is different from what you are asking.
2

I think you need to count first and then aggregate:

-- the outer query only sees the subquery's columns, so column2 is referenced through the alias a
select a.column1, array_agg(a.column2)
from (select a.column1, b.column2,
             row_number() over (partition by a.column1 order by a.column1) as seqnum
      from a left join
           b 
           using (id)
      where a.column3 = 'value'
     ) a
where seqnum <= 10
group by a.column1
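Because the window above orders by the partition column itself, which 10 rows are kept per column1 is arbitrary. If a particular order matters, a variation along these lines might help. It is a sketch under assumptions: something is a hypothetical sort column on b, and t is just an alias chosen to avoid reusing a.

```sql
-- Sketch only: "something" is a hypothetical sort column on b; "t" is an arbitrary alias.
select t.column1,
       array_agg(t.column2 order by t.seqnum) as limited
from (
    select a.column1,
           b.column2,
           row_number() over (partition by a.column1
                              order by b.something) as seqnum
    from a
    left join b using (id)
    where a.column3 = 'value'
) t
where t.seqnum <= 10       -- keep at most 10 rows per column1
group by t.column1;
```

Ordering the aggregate by seqnum keeps the array in the same order that was used to pick the rows.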

2 Comments

The answer using a LATERAL join looks a bit clearer to understand. Is there a reason this one would be better or is it a matter of preference?
@user779159 . . . The lateral join doesn't do the same thing. This allows you to control the number of rows in the final result. The lateral join only allows you to control the number per id, not per column1. I think this is more in the spirit of what you want to accomplish.
0

In many cases you can create an array from a subselect using the ARRAY constructor (described near the bottom of the linked heading).

-- no GROUP BY here: the correlated subquery references a.id, which would be an ungrouped column
select a.column1, ARRAY(select column2 from b where b.id = a.id limit 10)
from a
where a.column3 = 'value'
