0

I have the following nested data in two columns:

Categories_A    Categories_B
{"A"}           {"B","F","C"}
{"B","A"}       {"Z","B","F"}
{"B","F"}       {"A","E","R"}

I would like to return in a new column the missing category/categories in categories_B when compared to categories_A - what is in Categories_A that is not in categories_B. So ideally:

Categories_A    Categories_B      Missing_Category
{"A"}           {"B","F","C"}     {"A"}
{"B","A"}       {"Z","B","F"}     {"A"}
{"B","F"}       {"A","E","R"}     {"B","F"}

So far what I managed is to return if there is a complete match between the columns:

select Categories_A, Categories_B,
    case
        when Categories_A = Categories_B then 'TRUE'
        else 'FALSE'
    end is_a_match
from facts_themes
Categories_A    Categories_B      is_a_match
{"A"}           {"B","F","C"}     FALSE
{"B","A"}       {"Z","B","F"}     FALSE
{"B","F"}       {"A","E","R"}     FALSE
2
  • 2
    Unrelated, but: your CASE expression can be simplified to categories_a = categories_b as is_a_match Commented Jan 24, 2022 at 11:15
  • Please clarify: Your version of Postgres. Exact table definition? Are all arrays 1-dimensional? What makes your arrays "nested"? Can there be duplicate array elements? (How to deal with those?) Can the column be NULL? Can there be NULL array elements? (How to deal with NULL?) Roughly how many rows, and how many elements per array? Commented Jan 24, 2022 at 11:44

2 Answers 2

1

There is no built-in function for that, but it's easy to write your own:

create function array_except(p_one anyarray, p_two anyarray)
  returns anyarray
as
$$
  select array_agg(e)
  from (
    select e
    from unnest(p_one) as p1(e)
    except
    select e
    from unnest(p_two) as p2(e)
  ) x
$$
language sql
immutable
;

Then you can use it like this:

select categories_a, categories_b, 
       array_except(categories_a, categories_b) as missing_categories
from facts_themes

Online example

Sign up to request clarification or add additional context in comments.

5 Comments

Thanks for your answer. However, I'm getting a syntax error near create function. I have cte above it. A comma at the end of it didn't solve the error. The error: ERROR: syntax error at or near "create" Position: 1213
The CREATE FUNCTION is perfectly valid. Maybe you forgot to end the previous statement with ; or something else.
@a_horse_with_no_name would you mind explaining what x is? Can't find it in other create function examples.
@Joehat: It's remarkable how you ask for clarification while you are giving none.
@Joehat: it's a table alias
0

While no duplicates can be involved, use the faster EXCEPT ALL:

SELECT *, ARRAY(SELECT unnest(categories_a) EXCEPT ALL
                SELECT unnest(categories_b)) AS missing
FROM   facts_themes;

db<>fiddle here

If dupes can be involved, you'll first have to define desired behavior. About EXCEPT ALL:

Also note that EXCEPT treats NULL as just another value (while most array operators do not).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.