Lodash merging and unioning of nested array / object structure

Question

I have two arrays that need merging in JavaScript. They are arranged as follows:

arrayA = [town1A, town2A, town3A];
arrayB = [town3B, town5B];

Each town is an object with a townName: 'town1' (matching the object variable name). Each town also has an array of occupants: [{}, {}], each of which has its own name and a status: 'dead' or 'alive'.

My goal is that, after merging, the new array will contain every unique town according to townName (town3B and town3A both have townName: 'town3').

arrayC = [town1, town2, town3, town5]

Any new towns in arrayB (i.e., town5) should be added directly to the list. Any towns with the same name (i.e., town3) should combine their lists of occupants, but remove any "dead" people. ArrayB has priority over ArrayA when determining status, as it is "overwriting" the old data. For example:

arrayA.town3.occupants = [{name: 'Bob', status: 'alive'}, {name: 'Joe', status: 'alive'}];
arrayB.town3.occupants = [{name: 'Bob', status: 'dead'}, {name: 'Alice', status: 'alive'}];

arrayC.town3.occupants = [{name: 'Joe', status: 'alive'}, {name: 'Alice', status: 'alive'}];

I'm just struggling with the logic sequence process here and need a nudge to figure out what tools to use. Currently I'm trying to work with Lodash's _.merge and _.union in some combination. It seems I can use _.mergeWith or _.unionBy to "nest" the merging steps without resorting to manually looping over the arrays, but their usage is going over my head. If a solution exists that uses one of those, I would like to see an example to learn better how they work.

Edit: I was asked for the entire contents of an example arrayA and arrayB:

arrayA = [
    {
        townName: 'town1',
        occupants: [
            {name: 'Charlie', status: 'alive'},
            {name: 'Jim', status: 'dead'}
        ]
    },
    {
        townName: 'town2',
        occupants: [
            {name: 'Rachel', status: 'alive'},
        ]
    },
    {
        townName: 'town3',
        occupants: [
            {name: 'Bob', status: 'alive'},
            {name: 'Joe', status: 'alive'}
        ]
    }
];

arrayB = [
    {
        townName: 'town3',
        occupants: [
            {name: 'Bob', status: 'dead'},
            {name: 'Alice', status: 'alive'}
        ]
    },
    {
        townName: 'town5',
        occupants: [
            {name: 'Sam', status: 'dead'},
            {name: 'Ray', status: 'alive'},
            {name: 'Bob', status: 'alive'},
        ]
    }
];

The output I expect is:

arrayC = [
    {
        townName: 'town1',
        occupants: [
            {name: 'Charlie', status: 'alive'},
        ]
    },
    {
        townName: 'town2',
        occupants: [
            {name: 'Rachel', status: 'alive'},
        ]
    },
    {
        townName: 'town3',
        occupants: [
            {name: 'Joe', status: 'alive'},
            {name: 'Alice', status: 'alive'}
        ]
    },
    {
        townName: 'town5',
        occupants: [
            {name: 'Ray', status: 'alive'},
            {name: 'Bob', status: 'alive'},
        ]
    }
];

The main text already shows everything, but I have now extended it out into full arrays at the bottom of the question. I also included the result array I expect.
– Trevor Buckner, commented Aug 3, 2022 at 15:28

Enlico · Accepted Answer · 2022-08-04 06:21:30Z

The complexity with this problem is that you want to merge on 2 different layers:

you want to merge two arrays of towns, so you need to decide what to do with towns common to the two arrays;
when handling two towns with common name, you want to merge their occupants.

Now, both _.merge and _.mergeWith are good candidates to accomplish the task, except that they are for operating on objects (or associative maps, if you like), whereas you have vectors of pairs (well, not really pairs, but objects with two elements with fixed keys; name/status and townName/occupants are fundamentally key/value) at both layers mentioned above.

One function that can be useful in this case is one that turns an array of pairs into an object. Here's such a utility:

arrOfPairs2Obj = (k, v) => (arr) => _.zipObject(..._.unzip(_.map(arr, _.over([k, v]))));

Try executing the following

townArr2townMap = arrOfPairs2Obj('townName', 'occupants');
mapA = townArr2townMap(arrayA);
mapB = townArr2townMap(arrayB);

to see what it does.
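If you don't have a console at hand, here is roughly what to expect. This is a plain-JavaScript stand-in that uses Object.fromEntries instead of the Lodash calls (arrOfPairs2ObjPlain and sampleTowns are names made up for illustration), so treat it as a sketch of the idea rather than the one-liner itself:

```javascript
// Plain-JavaScript stand-in for arrOfPairs2Obj (assumption: no Lodash available).
// It turns [{ [k]: key, [v]: value }, ...] into { key: value, ... }.
const arrOfPairs2ObjPlain = (k, v) => (arr) =>
  Object.fromEntries((arr || []).map((item) => [item[k], item[v]]));

// Hypothetical sample data, a subset of arrayA from the question.
const sampleTowns = [
  { townName: 'town2', occupants: [{ name: 'Rachel', status: 'alive' }] },
  { townName: 'town3', occupants: [{ name: 'Bob', status: 'alive' }, { name: 'Joe', status: 'alive' }] },
];

const sampleMap = arrOfPairs2ObjPlain('townName', 'occupants')(sampleTowns);
// sampleMap has one key per town name, each holding that town's occupants array.
console.log(JSON.stringify(sampleMap, null, 2));
```

The same helper with ('name', 'status') would turn an occupants array into an object such as { Bob: 'alive', Joe: 'alive' }.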

Now you can merge mapA and mapB more easily…

_.mergeWith(mapA, mapB, (a, b) => {
    // … well, not that easily
})

Again, a and b are arrays of "pairs" name/status, so we can reuse the abstraction I showed above, defining

personArr2personMap = arrOfPairs2Obj('name', 'status');

and using it on a and b.

But still, there are some problems. I thought that the (a, b) => { … } I wrote above would be called by _.mergeWith only for elements which have the same key across mapA and mapB, but that doesn't seem to be the case, as you can verify by running this line

_.mergeWith({a: 1, b: 3}, {b:2, c:4, d: 6}, (x, y) => [x, y])

which results in

{
  a: 1,
  b: [3, 2],
  c: [undefined, 4],
  d: [undefined, 6]
}

revealing that the working lambda is called for the "clashing" keys (in the case above just b), and also for the keys which are absent in the first object (in the case above c and d), but not for those absent in the second object (in the case above a).
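To make that observation concrete, here is a deliberately simplified, shallow model of the key handling. It is not Lodash's actual implementation (shallowMergeWith is a hypothetical name, and the real _.mergeWith also recurses and falls back to its default merge when the customizer returns undefined); it only shows why keys that exist solely in the first object never reach the lambda:

```javascript
// Simplified, shallow model of which keys reach the customizer.
// Assumption: this only mimics the observed behaviour, not Lodash's real code.
function shallowMergeWith(object, source, customizer) {
  for (const key of Object.keys(source)) {
    // The customizer runs once per key of the source,
    // whether or not the destination already has that key.
    object[key] = customizer(object[key], source[key], key);
  }
  // Keys found only in `object` (here: a) were never visited.
  return object;
}

const modelResult = shallowMergeWith({ a: 1, b: 3 }, { b: 2, c: 4, d: 6 }, (x, y) => [x, y]);
console.log(modelResult);
```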

This is a bit unfortunate, because, while you could filter dead people out of towns which are only in arrayB, and you could also filter out those people which are dead in arrayB while alive in arrayA, you'd still have no place to filter dead people out of towns which are only in arrayA.

But let's see how far we can get. _.merge doc reads

Source objects are applied from left to right. Subsequent sources overwrite property assignments of previous sources.

So we can at least handle the merging of towns common across the array in a more straightforward way. Using _.merge means that if a person is common in the two arrays, we'll always pick the one from arrayB, whether that's (still) alive or (just) dead.

A strategy like this doesn't give you exactly the solution you want, but it gets fairly close:

notSoGoodResult = _.mergeWith(mapA, mapB, (a, b) => {
   return _.merge(personArr2personMap(a), personArr2personMap(b));
})

its result being the following

{
  town1: [
    {name: "Charlie", status: "alive"},
    {name: "Jim", status: "dead"}
  ],
  town2: [
    {name: "Rachel", status: "alive"}
  ],
  town3: {
    Alice: "alive",
    Bob: "dead",
    Joe: "alive"
  },
  town5: {
    Bob: "alive",
    Ray: "alive",
    Sam: "dead"
  }
}

As you can see

Bob in town3 is correctly dead,
we've not forgotten Alice in town3,
nor have we forgotten about Joe in town3.

What is left to do is

"reshaping" town3 and town5 to look like town1 and town2 (or alternatively doing the opposite),
filtering away all dead people (there are no more people appearing with both the dead and alive status, so you don't risk zombies).

Now I don't have time to finish this up, but I guess the above should point you in the right direction.
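For what it's worth, the two remaining steps could look something like the following. This is only a sketch in plain JavaScript, assuming the mixed-shape result shown above (some towns still arrays of people, others already name/status objects); toPersonArray and notSoGoodResultSample are hypothetical names, and the sample is a shortened copy of that result:

```javascript
// Shortened, hypothetical copy of the mixed-shape result from above.
const notSoGoodResultSample = {
  town1: [{ name: 'Charlie', status: 'alive' }, { name: 'Jim', status: 'dead' }],
  town3: { Alice: 'alive', Bob: 'dead', Joe: 'alive' },
};

// Step 1: reshape. Arrays are kept as they are; name/status objects
// are turned back into arrays of { name, status }.
const toPersonArray = (occupants) =>
  Array.isArray(occupants)
    ? occupants
    : Object.entries(occupants).map(([name, status]) => ({ name, status }));

// Step 2: filter away the dead and rebuild the array of towns.
const finished = Object.entries(notSoGoodResultSample).map(([townName, occupants]) => ({
  townName,
  occupants: toPersonArray(occupants).filter((person) => person.status === 'alive'),
}));

console.log(JSON.stringify(finished, null, 2));
```

Note that occupants rebuilt from an object come out in that object's key order, which is not necessarily the order of the original arrays.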

The bottom line, however, in my opinion, is that JavaScript, even with the power of Lodash, is not exactly the best tool for functional programming. _.mergeWith disappointed me, for the reason explained above.

Also, I want to mention that there is a module named lodash/fp that

promotes a more functional programming (FP) friendly style by exporting an instance of lodash with its methods wrapped to produce immutable auto-curried iteratee-first data-last methods.

This should help you be slightly less verbose. With reference to your self answer, and assuming you wanted to write the lambda

person => {return person.status == "alive";}

in a more functional style, with "normal" Lodash you'd write

_.flowRight([_.curry(_.isEqual)('alive'), _.iteratee('status')])

whereas with lodash/fp you'd write

_.compose(_.isEqual('alive'), _.get('status'))
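If the point-free version looks opaque, it may help to see what it amounts to with hand-rolled helpers. The names compose, getProp and equals below are stand-ins written for this sketch, not the lodash/fp functions themselves (in particular, equals only does a strict comparison, which is enough for strings):

```javascript
// Hand-rolled stand-ins for the curried helpers (hypothetical, not lodash/fp).
const compose = (...fns) => (x) => fns.reduceRight((acc, fn) => fn(acc), x);
const getProp = (key) => (obj) => obj[key];                    // plays the role of _.get('status')
const equals = (expected) => (actual) => actual === expected;  // plays the role of _.isEqual('alive')

// Reads right to left: take the status, then compare it with 'alive'.
const isAlive = compose(equals('alive'), getProp('status'));

const survivors = [
  { name: 'Bob', status: 'dead' },
  { name: 'Alice', status: 'alive' },
].filter(isAlive);

console.log(survivors);
```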

Ahmed Lazhar · Accepted Answer · 2022-08-03 05:15:20Z


You can define a function for merging arrays with a mapper like this:

  const union = (a1, a2, id, merge) => {
    const dict = _.fromPairs(a1.map((v, p) => [id(v), p]))
    return a2.reduce((a1, v) => {
      const i = dict[id(v)]
      if (i === undefined) return [...a1, v] 
      return Object.assign([...a1], { [i]: merge(a1[i], v) })
    }, a1)
  }

and use it like this:

union(
  arrayA, 
  arrayB, 
  town => town.townName, 
  (town1, town2) => ({
    ...town1,
    occupants: union(
      town1.occupants,
      town2.occupants,
      occupant => occupant.name,
      // arrayB has priority, so the newer record always wins
      (occupant1, occupant2) => occupant2
    )
  })
).map(town => ({
  // filter every town, including towns that appear in only one array
  ...town,
  occupants: town.occupants.filter(occupant => occupant.status === 'alive')
}))

answered Aug 3, 2022 at 4:52 by Ahmed Lazhar · edited Aug 3, 2022 at 5:15


Trevor Buckner · Accepted Answer · 2022-08-04 06:00:20Z


I managed to find a consistent way to do this (thanks to @Enlico for some hints). Since _.mergeWith() is recursive, you can watch for a specific nested object property and handle each property differently if needed.

// Turn each array into an Object, using "townName" as the key
var objA = _.keyBy(arrayA, 'townName');
var objB = _.keyBy(arrayB, 'townName');

// Custom handler for _.mergeWith()
function customizer(valueA, valueB, key) {
  if (key == "occupants") {
    // Merge occupants by 'name'. _.unionBy keeps the first instance, so pass B before A
    return _.unionBy(valueB, valueA, 'name');
  }
  // Otherwise return undefined, so _.mergeWith falls back to its normal merge
}

// Merge the keyed objects, then turn the result back into an array
var merged = _.values(_.mergeWith(objA, objB, customizer));

// Remove dead bodies
var filtered = _.map(merged, town => {
  town.occupants = _.filter(town.occupants, person => {return person.status == "alive";});
  return town;
});
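For comparison, the same three steps (key by town, let B overwrite A per person, drop the dead) can be sketched without Lodash. This is an assumption-laden alternative, not a replacement for the code above: mergeTowns, exampleA and exampleB are hypothetical names, and a Map is used in place of _.keyBy. Unlike the snippet above, it does not mutate its inputs:

```javascript
// Plain-JavaScript sketch of the same idea (no Lodash).
function mergeTowns(arrayA, arrayB) {
  const towns = new Map(); // townName -> Map(person name -> person)
  for (const town of [...arrayA, ...arrayB]) {
    if (!towns.has(town.townName)) towns.set(town.townName, new Map());
    const people = towns.get(town.townName);
    // arrayB is processed last, so its records overwrite those from arrayA.
    for (const person of town.occupants) people.set(person.name, person);
  }
  // Turn the maps back into an array of towns, keeping only living occupants.
  return [...towns].map(([townName, people]) => ({
    townName,
    occupants: [...people.values()].filter((person) => person.status === 'alive'),
  }));
}

// Usage with the town3 data from the question.
const exampleA = [{ townName: 'town3', occupants: [{ name: 'Bob', status: 'alive' }, { name: 'Joe', status: 'alive' }] }];
const exampleB = [{ townName: 'town3', occupants: [{ name: 'Bob', status: 'dead' }, { name: 'Alice', status: 'alive' }] }];
const mergedExample = mergeTowns(exampleA, exampleB);
console.log(JSON.stringify(mergedExample));
// town3 ends up with Joe and Alice only
```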

answered Aug 4, 2022 at 3:17 by Trevor Buckner · edited Aug 4, 2022 at 6:00 by Enlico

4 Comments

Enlico Over a year ago

I didn't know of _.keyBy, thanks. Did you find it by truly going through the whole list on the docs?!

Enlico Over a year ago

By the way, if you wanted, the lambda person => {return person.status == "alive";} can be written as _.flowRight([_.curry(_.isEqual)('alive'), _.iteratee('status')]), if one likes the point-free style; in this case it actually obscures readability, also because one has to use _.curry explicitly, but unfortunately _.isEqual (like many other functions in lodash) is not curried. (Notice that _.curry(_.isEqual)('alive') is the same as _.partial(_.isEqual, 'alive').)

Trevor Buckner Over a year ago

@Enlico Your idea of key/value pairs gave me the idea to search the docs for those terms and it turns out there were functions to do that already built in.

Enlico Over a year ago

I should spend more time thinking "let's see if it exists already first"! :D
