
I need to create a JSON_OBJECT that can contain multiple nested JSON_OBJECTs, JSON_ARRAYs and JSON_ARRAYAGG results.

I have created this table with some dummy data to demonstrate the problem:

create table test_tbl(
test_col1 varchar2(20), 
test_col2 varchar2(20), 
test_col3 varchar2(20),
test_col4 varchar2(20)
);
insert into test_tbl values('val0', 'val1', 'val2', 'val7');
insert into test_tbl values('val0', 'val3', 'val4', 'val7');
insert into test_tbl values('val0', 'val5', 'val6', 'val7');
insert into test_tbl values('val0', 'val5', 'val6', 'val7');
insert into test_tbl values('val0', 'val5', 'val6', 'val8');
insert into test_tbl values('val1', 'val9', 'val10', 'val7');
insert into test_tbl values('val1', 'val9', 'val10', 'val7');

When using the following query to create the JSON:

SELECT JSON_OBJECT(
         'output' VALUE JSON_ARRAYAGG(
           JSON_OBJECT(
             'common' VALUE test_col1,
             'list'   VALUE JSON_ARRAYAGG(
               JSON_OBJECT(
                 'key1' VALUE test_col2,
                 'key2' VALUE test_col3
               )
             )
           )
         )
       )
FROM   test_tbl
WHERE  test_col4 = 'val7'
GROUP BY
       test_col1

This results in the following JSON, with duplicate key/value pairs in the aggregated arrays:

{
  "output": [
    {
      "common": "val0",
      "list": [
        {
          "key1": "val5",
          "key2": "val6"
        },
        {
          "key1": "val5",
          "key2": "val6"
        },
        {
          "key1": "val3",
          "key2": "val4"
        },
        {
          "key1": "val1",
          "key2": "val2"
        }
      ]
    },
    {
      "common": "val1",
      "list": [
        {
          "key1": "val9",
          "key2": "val10"
        },
        {
          "key1": "val9",
          "key2": "val10"
        }
      ]
    }
  ]
}

Whereas my expected JSON is:

{
  "output": [
    {
      "common": "val0",
      "list": [
        {
          "key1": "val5",
          "key2": "val6"
        },
        {
          "key1": "val3",
          "key2": "val4"
        },
        {
          "key1": "val1",
          "key2": "val2"
        }
      ]
    },
    {
      "common": "val1",
      "list": [
        {
          "key1": "val9",
          "key2": "val10"
        }
      ]
    }
  ]
}

Thanks in advance for any suggestions on how to get the expected JSON above.

3 Comments
  • Your question is clear. It is not clear, however, whether you understand why the current query produces the current result. Your question is similar to this: "3 + 5 + 5 + 1 produces the result 14, but I expect the result 9" (you don't want 5 to be counted twice). Your data contains duplicate rows across the first three columns; all that JSON_ARRAYAGG does is collect its inputs and create an array from them - it is not asked to eliminate duplicate members, and an array may validly contain repeated values. (A small demo of this follows these comments.) Commented Jul 29, 2021 at 14:00
  • In fact, if such duplicates may exist in the input, it is not clear why you want them removed in the output. Commented Jul 29, 2021 at 14:04
  • I agree that JSON_ARRAYAGG is working to spec. However, the dataset I am working with is not normalized; it is all just dumped into this huge table with lots of duplicate data. I am trying to remove the duplicates when creating a JSON from it. Commented Jul 29, 2021 at 16:54
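
As a quick illustration of the point in the first comment: JSON_ARRAYAGG simply collects every row it is given, duplicates included. A minimal, untested sketch (the values and the dual-based subquery are just for illustration):

-- Minimal demo: JSON_ARRAYAGG collects every input row, so duplicate
-- inputs become duplicate array members (member order may vary).
SELECT JSON_ARRAYAGG(x)
FROM   (
  SELECT 5 AS x FROM dual UNION ALL
  SELECT 5      FROM dual UNION ALL
  SELECT 3      FROM dual
);
-- expected result: something like [5,5,3]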

1 Answer


Use a sub-query with DISTINCT to remove the duplicates:

SELECT JSON_OBJECT (
         'output' VALUE JSON_ARRAYAGG(
           JSON_OBJECT(
             'common' VALUE test_col1,
             'list'   VALUE JSON_ARRAYAGG(
               JSON_OBJECT(
                 'key1' VALUE test_col2,
                 'key2' VALUE test_col3
               )
             )
           )
         )
       )
FROM   (
  SELECT DISTINCT
         test_col1, test_col2, test_col3
  FROM   test_tbl
  WHERE  test_col4 = 'val7'
)
GROUP BY
       test_col1
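
The DISTINCT in the inline view removes the repeated (test_col1, test_col2, test_col3) rows before either JSON_ARRAYAGG runs, so the aggregates only ever see one copy of each row and the result matches the expected JSON above.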

4 Comments

On further testing, the above solution works if there is one list in the output; if there are two lists, each with duplicate elements, then it starts to produce duplicate key/value pairs within each list. I am not sure if I should update the post above or raise a new one for this problem.
If there are two more columns - test_col5 varchar2(20), test_col6 varchar2(20) - with values: insert into test_tbl values('val1', 'val9', 'val10', 'val7', 'val11', 'val12'); insert into test_tbl values('val1', 'val9', 'val10', 'val7', 'val13', 'val14'); and the following additional node as a sibling to 'list': 'anotherlist' VALUE JSON_ARRAYAGG(JSON_OBJECT('key1' VALUE test_col5, 'key2' VALUE test_col6)).
@Oten That seems like a very different problem if you have two lists in the same table and want to aggregate the distinct values from each. If you want a complete answer then you should ask a new question, but it will probably just be to SELECT DISTINCT from the first list and aggregate, then SELECT DISTINCT from the second list and aggregate, and then JOIN the two lists on their shared primary key (and to question why you are storing the data like that rather than in different tables). A rough sketch of that approach follows these comments.
This data is from a third-party app and I am trying to remove duplicates when creating the JSON. Created this new post - stackoverflow.com/questions/68610906/…
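
For reference, here is a minimal, untested sketch of the approach suggested in the comments above, assuming the hypothetical extra columns test_col5 and test_col6 from the second comment: each list is de-duplicated and aggregated in its own subquery, and the two aggregates are then joined on the shared key test_col1. FORMAT JSON is used so the pre-built arrays are embedded as JSON rather than escaped as strings.

-- Untested sketch; test_col5/test_col6 are the hypothetical extra columns
-- described in the comments above.
SELECT JSON_OBJECT(
         'output' VALUE JSON_ARRAYAGG(
           JSON_OBJECT(
             'common'      VALUE l1.test_col1,
             -- FORMAT JSON keeps the pre-built arrays from being escaped as plain strings
             'list'        VALUE l1.list_json        FORMAT JSON,
             'anotherlist' VALUE l2.anotherlist_json FORMAT JSON
           )
         )
       )
FROM   (
  -- de-duplicate and aggregate the first list on its own
  SELECT   test_col1,
           JSON_ARRAYAGG(JSON_OBJECT('key1' VALUE test_col2, 'key2' VALUE test_col3)) AS list_json
  FROM     (SELECT DISTINCT test_col1, test_col2, test_col3 FROM test_tbl WHERE test_col4 = 'val7')
  GROUP BY test_col1
) l1
JOIN   (
  -- de-duplicate and aggregate the second list on its own
  SELECT   test_col1,
           JSON_ARRAYAGG(JSON_OBJECT('key1' VALUE test_col5, 'key2' VALUE test_col6)) AS anotherlist_json
  FROM     (SELECT DISTINCT test_col1, test_col5, test_col6 FROM test_tbl WHERE test_col4 = 'val7')
  GROUP BY test_col1
) l2
ON     l1.test_col1 = l2.test_col1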
