0

I have a function which returns a table. One of the columns happens to be a text array. Currently the values in this array column will only ever have at most 2 elements, however there is the instance when the same row will be returned twice with the duplicate row elements in the opposite order. I'm hoping to find a way to only return 1 of these rows and discard the other. To give an example, I run a function as

SELECT * FROM schema.function($1,$2,$3)
WHERE conditions...;

which returns me something like this

ID | ARRAY_COL         | ...
1  | {'Good','Day'}    | ...
2  | {'Day','Good'}    | ...
3  | {'Stuck'}         | ...
4  | {'with'}          | ...
5  | {'array'}         | ...
6  | {'filtering'}     | ...

So in this example, I want to return the whole result set with the exception that I only want either row 1 or 2 as they have the same elements in the array (albeit inverted with respect to each other). I'm aware this is probably a bit of a messy problem, but it's something I need to get to the bottom of. Ideally I would like to stick a WHERE clause at the end of my function call which forced the result set to ignore any array value that had the same elements as a previous row. Pseudo code might be something like

 SELECT * FROM schema.function($1,$2,$3)
 WHERE NOT array_col @> (any previous array_col value);

Any pointers in the right direction would be much appreciated, thanks.

1 Answer 1

1

Not sure is the best solution, but could work especially in cases where you might have partially overlapping arrays.

The solution is in 3 steps:

  1. unnest the array_col column and order by id and the item value
select 
    id, 
    unnest(array_col) array_col_val 
from dataset 
order by 
    id, 
    array_col_val
  1. regroup by id, now row with id 1 and 2 have the same array_col value
select id, 
        array_agg(array_col_val) array_col 
    from ordering 
    group by id
  1. Select the min id grouping by array_col
select array_col, min(id) from regrouping group by array_col

Full statement

with dataset as (
    select 1 id, ARRAY['Good','Day'] array_col UNION ALL
    select 2 id, ARRAY['Day','Good'] array_col UNION ALL
    select 3 id, ARRAY['Stuck'] array_col UNION ALL
    select 4 id, ARRAY['with'] array_col UNION ALL
    select 5 id, ARRAY['array_colay'] array_col UNION ALL
    select 6 id, ARRAY['filtering'] array_col
)
, ordering as 
 (select id, 
    unnest(array_col) array_col_val 
 from dataset 
 order by id, 
 array_col_val)
, regrouping as 
 (
    select id, 
        array_agg(array_col_val) array_col 
    from ordering 
    group by id
)
select array_col, min(id) from regrouping group by array_col;

Result

   array_col   | min
---------------+-----
 {Stuck}       |   3
 {array_colay} |   5
 {filtering}   |   6
 {with}        |   4
 {Day,Good}    |   1
(5 rows)
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for the response. The output looks like what I'm after alright. I'll try work that logic into my code. Much appreciated!
Worked a charm - Thanks for this!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.