61

(have done this before, but memory fades, as does goggle)

wish to get select from users with the tag.tag_ids for each user returned as an array.

select usr_id,
       name,
       (select t.tag_id from tags t where t.usr_id = u.usr_id) as tag_arr
from   users u;

with the idea embedded query tag_arr would be an array

1
  • 1
    @a_horse_with_no_name - yep. too many languages, too many years - thanks Commented Jul 16, 2015 at 11:56

2 Answers 2

98

Use the aggregate function:

select
    usr_id, 
    name, 
    array_agg(tag_id) as tag_arr
from users
join tags using(usr_id)
group by usr_id, name

or an array constructor from the results of a subquery:

select
    u.usr_id, 
    name, 
    array(
        select tag_id 
        from tags t 
        where t.usr_id = u.usr_id
        ) as tag_arr
from users u

The second option is a simple one-source query while the first one is more generic, especially convenient when you need more than one aggregate from a related table. Also, the first variant should be faster on larger tables.

Note, that for better performance the usr_id columns in both tables should be indexed. While typically users.usr_id is a primary key, sometimes one may forget that the index of referencing column is also useful.

Sign up to request clarification or add additional context in comments.

2 Comments

Of those two options, can the first one using array_agg() have significantly better execution time? I presume it depends on the sizes of the data tables involved. In a real project I just compared the two queries with explain analyze and got execution times of 120ms vs 10s respectively. I'm not expert at db query plans, so any insight on how these two execute differently would be helpful.
@broc.seib - it depends on many factors such as the number of rows in both tables, the average number of tags per user, Postgres version, server and hardware configuration. For larger tables, the first query should be faster than the other. The difference in your case seems too large though. The reason may be the lack of index on tags(usr_id), which would be useful for the first query as well.
8

Use the array constructor of PostgreSQL:

select
    usr_id,
    name,
    array(select t.tag_id from tags t where t.usr_id = u.usr_id) as tag_arr
from users u;

Note:

If you're using psycopg2 with python, then the result will be converted to a python list as well! (Although for uuid[] array, you will need to convert it to text[] array using array(...)::text[], if you want to get IDs in python list). See this for details.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.