
I have a large table with the following structure:

CREATE TABLE t (
  id SERIAL primary key ,
  a_list int[] not null,
  b_list int[] not null,
  c_list int[] not null,
  d_list int[] not null,
  type int not null 
)

I want to query all unique values from a_list, b_list, c_list and d_list for a given type, like this:

    select 
        some_array_unique_agg_function(a_list), 
        some_array_unique_agg_function(b_list), 
        some_array_unique_agg_function(c_list), 
        some_array_unique_agg_function(d_list),
        count(1) 
    from t
    where type = 30

For example, for this data

+----+---------+--------+--------+---------+------+
| id | a_list  | b_list | c_list | d_list  | type |
+----+---------+--------+--------+---------+------+  
| 1  | {1,3,4} | {2,4}  | {1,1}  | {2,4,5} | 30   |
| 2  | {1,2,4} | {2,4}  | {4,1}  | {2,4,5} | 30   |
| 3  | {1,3,5} | {2}    | {}     | {2,4,5} | 30   |
+----+---------+--------+--------+---------+------+

I want the following result:

+-------------+--------+--------+-----------+-------+
| a_list      | b_list | c_list | d_list    | count |
+-------------+--------+--------+-----------+-------+  
| {1,2,3,4,5} | {2,4}  | {1,4}  | {2,4,5}   | 3     |
+-------------+--------+--------+-----------+-------+

Is there an aggregate function that could serve as some_array_unique_agg_function here?

  • uniq() from the intarray extension Commented Aug 5, 2019 at 12:48
  • @a_horse_with_no_name I need some aggregate function like uniq to merge all values from rows Commented Aug 5, 2019 at 12:57
  • array_agg(distinct ...) Commented Aug 5, 2019 at 12:59
  • @a_horse_with_no_name array_agg(distinct ...) works for scalar values, but my columns have type int[] Commented Aug 5, 2019 at 13:02
  • Obviously you need to unnest those values before you can aggregate them with distinct. To be honest: if you need something like that you should probably think about normalizing your model. Commented Aug 5, 2019 at 13:04
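To make the comments above concrete: uniq() deduplicates a single array, so it is not an aggregate by itself, and array_agg(distinct ...) needs scalar input. One way to get the requested behaviour is a small user-defined aggregate that unions arrays row by row. This is only a sketch: the names int_array_union and int_array_union_agg are hypothetical (nothing built in is called that), and it uses plain SQL so it does not depend on intarray.

```sql
-- Hypothetical helper: union of two int arrays, deduplicated and sorted.
create function int_array_union(int[], int[]) returns int[]
language sql immutable
as $$
    select array(select distinct x from unnest($1 || $2) as x order by x)
$$;

-- Hypothetical aggregate built on the helper; starts from an empty array.
create aggregate int_array_union_agg(int[]) (
    sfunc    = int_array_union,
    stype    = int[],
    initcond = '{}'
);

-- Usage against the table from the question.
select
    int_array_union_agg(a_list) as a_list,
    int_array_union_agg(b_list) as b_list,
    int_array_union_agg(c_list) as c_list,
    int_array_union_agg(d_list) as d_list,
    count(*)                    as count
from t
where type = 30;
```

Because the state array is re-sorted on every row, this is likely to be slower than unnesting once and aggregating on a large table, so treat it as a convenience rather than a tuned solution.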

2 Answers


Try this:

with cte as (
    select
        unnest( a_list::text[] )::integer as a_list,
        unnest( b_list::text[] )::integer as b_list,
        unnest( c_list::text[] )::integer as c_list,
        unnest( d_list::text[] )::integer as d_list,
        -- count only the rows of the requested type, not the whole table
        (select count(*) from t where type = 30) as row_count
    from t
    where type = 30
)
select array_agg(distinct a_list),
       array_agg(distinct b_list),
       array_agg(distinct c_list),
       array_agg(distinct d_list),
       row_count
from cte
group by row_count;

Result:

"{1,2,3,4,5}";"{2,4,NULL}";"{1,4,NULL}";"{2,4,5}";3
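
The NULL entries in the result above appear because several unnest() calls in one select list are expanded side by side, and on PostgreSQL 10 and later the shorter arrays are padded with NULL. A possible variation, given only as a sketch, moves the unnest into the FROM clause and filters the padding out. It assumes PostgreSQL 9.4 or later (multi-argument unnest and FILTER), that id is unique as the primary key declares, and that the arrays hold no NULL elements you want to keep; the aliases u, a, b, c, d are made up for illustration.

```sql
select
    array_agg(distinct a order by a) filter (where a is not null) as a_list,
    array_agg(distinct b order by b) filter (where b is not null) as b_list,
    array_agg(distinct c order by c) filter (where c is not null) as c_list,
    array_agg(distinct d order by d) filter (where d is not null) as d_list,
    -- each source row is repeated once per unnested element,
    -- so count distinct ids rather than rows
    count(distinct id) as count
from t
-- left join ... on true keeps rows whose arrays are all empty
left join lateral unnest(a_list, b_list, c_list, d_list) as u(a, b, c, d) on true
where type = 30;
```

If a column has no elements at all among the matching rows, its aggregate comes back as NULL rather than an empty array; wrap it in coalesce(..., '{}') if that matters.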

3 Comments

I get a syntax error: ERROR: column "a_list" does not exist LINE 2: unnest( a_list::text[] )::integer as a_list,
Thank you, works great. One more question: why do I need the cast ::text[] )::integer?
Actually, I thought the data type was string. There is no need for the cast, since you are already using an integer array data type. You can try unnest( a_list::integer[] ) as a_list instead. It works fine. :)

Try this take on an old answer from the post "All Permutations of an Array".

It lists the unique ordered permutations of an INT array, grouped by some key (row_id and instrument_id in my case), in quite decent time for arrays of length <= 10:

You may need to install the intarray extension ...
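
For reference, a minimal sketch of enabling intarray and of the two functions from it that come up in this thread, sort() and uniq(). It assumes you have the privileges to create extensions; the public.sort(...) calls in the query below additionally assume the extension was installed into the public schema.

```sql
-- Install the extension once per database.
create extension if not exists intarray;

-- sort() orders an int[]; uniq() drops adjacent duplicates,
-- so it is normally applied to a sorted array.
select sort('{3,1,2,1}'::int[])       as sorted,       -- {1,1,2,3}
       uniq(sort('{3,1,2,1}'::int[])) as deduplicated; -- {1,2,3}
```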

WITH RECURSIVE data AS (
                           SELECT a1.instrument_id, ARRAY_AGG(a1.obs_pos ORDER BY a1.obs_pos) AS arr
                           FROM tmp_xxxx a1
                           GROUP BY 1
                       )
   , keys           AS (
                           SELECT instrument_id, GENERATE_SUBSCRIPTS(d.arr, 1) AS rn
                           FROM data d
                       )
   , cte            AS (
                           SELECT DISTINCT x.instrument_id
                                         , public.sort(x.initial_arr) AS initial_arr
                                         , public.sort(x.new_arr)     AS new_arr
                                         , public.sort(x.used_rn)     AS used_rn
                           FROM (
                                    SELECT d.instrument_id
                                         , d.arr               initial_arr
                                         , ARRAY [d.arr[k.rn]] new_arr
                                         , ARRAY [k.rn]        used_rn
                                    FROM data d
                                    JOIN keys k
                                         ON d.instrument_id = k.instrument_id
                                ) x
                           UNION ALL
                           SELECT DISTINCT c.instrument_id
                                         , public.sort(initial_arr)                      AS initial_arr
                                         , public.sort(c.new_arr || c.initial_arr[k.rn]) AS new_arr
                                         , public.sort(used_rn || k.rn)                  AS used_rn
                           FROM cte  c
                           JOIN keys k
                                ON c.instrument_id = k.instrument_id AND NOT (k.rn = ANY (c.used_rn))
                       )
INSERT
INTO tmp_xxxx( row_id
                      , instrument_id
                      , obs_pos_array
                      )
SELECT DISTINCT _row_id              AS row_id -- _row_id is not defined in this snippet; presumably a variable from the enclosing function
              , cte.instrument_id
              , public.sort(new_arr) AS obs_pos_array
FROM cte
WHERE ARRAY_LENGTH(new_arr, 1) >= 2 -- change it to your needs
ON CONFLICT ON CONSTRAINT pk_xxxx DO NOTHING;
