When using this algorithm:

jq -s 'def deepmerge(a;b):
             reduce b[] as $item (a;
               reduce ($item | keys_unsorted[]) as $key (.;
                 $item[$key] as $val | ($val | type) as $type | .[$key] =
                 if ($type == "object") then
                   deepmerge({}; [if .[$key] == null then {} else .[$key] end, $val])
                 elif ($type == "array") then
                   (.[$key] + $val | unique)
                 else
                   $val
                 end)
               );
             deepmerge({}; .)' test1.json test2.json > merged-1.json

arrays nested inside array elements are not merged; the elements are simply appended one after another. I assume this is because, as the call stack of the recursive algorithm unwinds, values are written from the leaf nodes back towards the root, so deeper nested child values are always overwritten by their parents.
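As a hedged aside, the array branch can be tried in isolation. The sketch below assumes jq is on the PATH and uses two hypothetical one-element arrays; it only illustrates that `+` followed by `unique` treats each array element as an opaque value, so objects that differ anywhere are kept side by side rather than merged.

```shell
# Hypothetical inputs: two arrays whose single objects differ only in "a".
# `+` concatenates the arrays and `unique` drops exact duplicates only.
appended=$(jq -nc '[{"a":[1]}] + [{"a":[2]}] | unique')

echo "$appended"   # [{"a":[1]},{"a":[2]}]
```

If that is what happens here, the elements of nestedItemsArray1 never reach the recursive object branch, which would match the actual result shown further down.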

Input files:

test1.json

{
  "item1" : "test1",
  "item2" : "test2",
  "nestedItemsArray1" : [{
    "nestedItem1Item1" : "nestItem1Item1",
    "nestedItem1Array1" : ["nestedItem1Array1Item1", "nestedItem1Array1Item2"] 
  },
  {
      "nestedItem1Item1" : "nestItem1Item2",
      "nestedItem1Array1" : ["nestedItem1Array1Item3", "nestedItem1Array1Item4"]
  }]
}

test2.json

{
  "item1" : "test3",
  "item2" : "test4",
  "array1" : [ "array1item3", "array1item4" ],
  "nestedItemsArray1" : [{
    "nestedItem1Item1" : "nestItem1Item1",
    "nestedItem1Array1" : ["nestedItem1Array1Item5", "nestedItem1Array1Item6"]
  },
  {
      "nestedItem1Item1" : "nestItem1Item2",
      "nestedItem1Array1" : ["nestedItem1Array1Item7", "nestedItem1Array1Item8"]
  }]
}

Actual result:

{
  "item1": "test3",
  "item2": "test4",
  "nestedItemsArray1": [
    {
      "nestedItem1Item1": "nestItem1Item1",
      "nestedItem1Array1": [
        "nestedItem1Array1Item1",
        "nestedItem1Array1Item2"
      ]
    },
    {
      "nestedItem1Item1": "nestItem1Item2",
      "nestedItem1Array1": [
        "nestedItem1Array1Item3",
        "nestedItem1Array1Item4"
      ]
    },
    {
      "nestedItem1Item1": "nestItem1Item1",
      "nestedItem1Array1": [
        "nestedItem1Array1Item5",
        "nestedItem1Array1Item6"
      ]
    },
    {
      "nestedItem1Item1": "nestItem1Item2",
      "nestedItem1Array1": [
        "nestedItem1Array1Item7",
        "nestedItem1Array1Item8"
      ]
    }
  ],
  "array1": [
    "array1item3",
    "array1item4"
  ]
}

Expected result:

{
  "item1": "test3",
  "item2": "test4",
  "nestedItemsArray1": [
    {
      "nestedItem1Item1": "nestItem1Item1",
      "nestedItem1Array1": [
        "nestedItem1Array1Item1",
        "nestedItem1Array1Item2",
        "nestedItem1Array1Item5",
        "nestedItem1Array1Item6"
      ]
    },
    {
      "nestedItem1Item1": "nestItem1Item2",
      "nestedItem1Array1": [
        "nestedItem1Array1Item3",
        "nestedItem1Array1Item4",
        "nestedItem1Array1Item7",
        "nestedItem1Array1Item8"
      ]
    }
  ],
  "array1": [
    "array1item3",
    "array1item4"
  ]
}

Of course this is just one array deep, but I would like this to work on any number of nested levels, regardless of values being nested inside arrays or objects.

1 Answer

There is no canonical way of merging. jq implements one (basic) interpretation by simply overwriting matching paths, and those paths include array indices (which start again at zero for a colliding array). As I understand your expected output, you want to extend the array indices if the array is the parent of a scalar, and rely on the builtin merging otherwise.

To achieve this, I rebuilt the way jq merges as jq --stream -n 'reduce (inputs | select(has(1))) as [$p,$v] (.; setpath($p;$v))', then shifted the path's last item to the colliding array's length whenever the original index is smaller than that length:

jq --stream -n '
  reduce (inputs | select(has(1))) as [$p,$v] (.; setpath(
    if $p | last | type == "number"
    then $p[:-1] + [fmax($p | last; getpath($p[:-1]) | length)]
    else $p end;
    $v
  ))
' test1.json test2.json
{
  "item1": "test3",
  "item2": "test4",
  "nestedItemsArray1": [
    {
      "nestedItem1Item1": "nestItem1Item1",
      "nestedItem1Array1": [
        "nestedItem1Array1Item1",
        "nestedItem1Array1Item2",
        "nestedItem1Array1Item5",
        "nestedItem1Array1Item6"
      ]
    },
    {
      "nestedItem1Item1": "nestItem1Item2",
      "nestedItem1Array1": [
        "nestedItem1Array1Item3",
        "nestedItem1Array1Item4",
        "nestedItem1Array1Item7",
        "nestedItem1Array1Item8"
      ]
    }
  ],
  "array1": [
    "array1item3",
    "array1item4"
  ]
}
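To see what the index adjustment changes, here is a minimal comparison on two tiny documents. This is only a sketch: the inline documents are hypothetical, and it assumes a reasonably recent jq that provides fmax is on the PATH.

```shell
# Two hypothetical documents that both contain the array "a".
docs='{"a":[1,2]} {"a":[3]}'

# Baseline: plain setpath, so the second document's index 0 overwrites the first's.
plain=$(printf '%s\n' "$docs" | jq --stream -nc '
  reduce (inputs | select(has(1))) as [$p,$v] (.; setpath($p; $v))')

# Adjusted: a numeric last path item is moved to at least the current array length.
shifted=$(printf '%s\n' "$docs" | jq --stream -nc '
  reduce (inputs | select(has(1))) as [$p,$v] (.; setpath(
    if $p | last | type == "number"
    then $p[:-1] + [fmax($p | last; getpath($p[:-1]) | length)]
    else $p end;
    $v
  ))')

echo "$plain"     # {"a":[3,2]}
echo "$shifted"   # {"a":[1,2,3]}
```

Only the last path item is adjusted, so an index in the middle of a path (an object inside an array) still collides and is merged in place.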

Note: If the command-line option --stream gets in the way of your other processing, you can move the streaming inside your custom function by using the tostream builtin instead.

def deepmerge(in): reduce (in | tostream | select(has(1))) as [$p,$v] (.;
  setpath(
    if $p | last | type == "number"
    then $p[:-1] + [fmax($p | last; getpath($p[:-1]) | length)]
    else $p end;
    $v
  )
);

deepmerge(inputs)
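A possible way to invoke this function, sketched with two hypothetical inline documents instead of the question's files (jq is assumed to be on the PATH). The -n option matters here: it keeps jq from consuming the first document as `.`, so the reduction starts from null and inputs yields every document.

```shell
# Two hypothetical documents: an array of objects, each holding an array of scalars.
merged=$(printf '%s\n' \
  '{"xs":[{"k":"a","v":[1]}]}' \
  '{"xs":[{"k":"a","v":[2]}]}' | jq -nc '
  def deepmerge(in): reduce (in | tostream | select(has(1))) as [$p,$v] (.;
    setpath(
      if $p | last | type == "number"
      then $p[:-1] + [fmax($p | last; getpath($p[:-1]) | length)]
      else $p end;
      $v
    )
  );
  deepmerge(inputs)')

echo "$merged"   # {"xs":[{"k":"a","v":[1,2]}]}
```

With the files from the question, the same filter would be called as jq -n '...' test1.json test2.json.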
