Filter duplicates, including original value, from array of objects based on multiple properties

Question

data is a list of objects. We want to filter all duplicates out, including the original value, based on multiple object properties.

My code works in filtering duplicates out based on multiple object properties, but how can we adjust it to filter out the original value as well?

The goal is to end up with a list of these duplicates.

const data = [{
  name: 'x',
  latitude: '45.9',
  longitude: '50.2'
}, {
  name: 'y',
  latitude: '45.9',
  longitude: '50.2'
}, {
  name: 'z',
  latitude: '40.5',
  longitude: '85.7'
}];

const duplicates = data
  .filter((obj, index, array) =>
    array.findIndex(o =>
      o.latitude === obj.latitude &&
      o.longitude === obj.longitude
    ) != index
  );

console.log(duplicates);

Output:

[{
  name: 'y',
  latitude: '45.9',
  longitude: '50.2'
}]

Desired output:

[{
  name: 'x',
  latitude: '45.9',
  longitude: '50.2'
}, {
  name: 'y',
  latitude: '45.9',
  longitude: '50.2'
}]

Yes, updated the question to clarify: I want to keep only the elements that are duplicates.
– brienna, Commented Sep 3, 2021 at 19:01

Tushar Shahi · Accepted Answer · 2021-09-03 19:07:36Z

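Instead of findIndex, you can run a for loop with the extra condition that the index must differ from the index of the element you are currently checking.

As soon as such a match is found, you can return true directly from inside the loop; if the loop finishes without a match, return false.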

var data = [
  { name: 'x', latitude: '45.9', longitude: '50.2' },
  { name: 'y', latitude: '45.9', longitude: '50.2' },
  { name: 'z', latitude: '40.5', longitude: '85.7' },
];

// Keep an element only if some other element (different index)
// has the same latitude and longitude.
var duplicates = data.filter((obj, index, array) => {
  for (let i = 0; i < array.length; i++) {
    if (
      i !== index &&
      array[i].latitude === obj.latitude &&
      array[i].longitude === obj.longitude
    ) {
      return true;
    }
  }
  return false;
});

console.log(duplicates);

edited Sep 3, 2021 at 19:07

answered Sep 3, 2021 at 18:55

3 Comments

Peter Seliger Over a year ago

@brienna ... 1st there is no need for nested loops; 2nd one does not return in the middle of a for loop.

brienna Over a year ago

@PeterSeliger it works though? Also, if I swap the return false with the return true, it results in uniques.

Peter Seliger Over a year ago

@brienna ... "it works though" is not an argument for a badly implemented solution.

Peter Seliger · Accepted Answer · 2021-09-03 20:02:44Z

This reduce-based approach detects duplicates by their geo-coordinate signature, a string key built by joining each item's latitude and longitude property values.

The key is used to group items by coordinates, and the type of the stored group value tells whether the signature refers to a single item so far or to several items with the same coordinates (duplicates). As soon as a second item with the same key is found, both items are also collected in the accumulator's list. The approach therefore iterates just once and delivers the final result at the end of the single reduce cycle.

function collectDuplicates(collector, item) {
  const { index, list } = collector;
  const { latitude, longitude } = item;

  const key = [
    parseFloat(latitude),
    parseFloat(longitude),
  ].join('/');

  const grouped = index[key];

  if (Array.isArray(grouped)) {
    // group already holds 2 or more items; this is a further duplicate.

    grouped.push(item);
    list.push(item);

  } else if (grouped) {
    // first time duplicate detection (2 same items).

    index[key] = [grouped, item];
    list.push(grouped, item);

  } else {
    // register first item of its kind.
    index[key] = item;
  }
  return collector;
}

const data = [{
  name: 'x',
  latitude: '45.9',
  longitude: '50.2'
}, {
  name: 'y',
  latitude: '45.9',
  longitude: '50.2'
}, {
  name: 'z',
  latitude: '40.5',
  longitude: '85.7'
}];

const duplicates =
  data.reduce(collectDuplicates, { index: {}, list: [] }).list;

console.log({ duplicates });

console.log(
  'data.reduce(collectDuplicates, { index: {}, list: [] }) ...',
  data.reduce(collectDuplicates, { index: {}, list: [] })
)

Interesting approach. See my answer, which compares the output structure and performance characteristics of your answer and the others here.

Scott Sauyet · Accepted Answer · 2021-09-05 19:12:45Z

A simple fix to your code to do this might look like the following:

const duplicates = (data) => data
  .filter((obj, index, array) =>
    array.find((o, i) =>
      o.latitude === obj.latitude &&
      o.longitude === obj.longitude &&
      i != index
    ) 
  )

We simply need to test for mismatched indices inside the find callback.
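
As a small variation on the fix above, `some` can replace `find`. This is only a sketch of the same idea: `some` returns a boolean directly, whereas `find` returns the matching element and relies on it being truthy, which holds for objects but would not for values such as `0` or `''`. The sample data is the one from the question.

```javascript
// Same O(n^2) filter as above, but `some` yields true/false
// instead of the matching element.
const duplicates = (data) =>
  data.filter((obj, index, array) =>
    array.some((o, i) =>
      i !== index &&
      o.latitude === obj.latitude &&
      o.longitude === obj.longitude
    )
  );

const data = [
  { name: 'x', latitude: '45.9', longitude: '50.2' },
  { name: 'y', latitude: '45.9', longitude: '50.2' },
  { name: 'z', latitude: '40.5', longitude: '85.7' },
];

// Logs the names of the kept elements: x and y.
console.log(duplicates(data).map((o) => o.name));
```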

But I think there is much to be gained by separating out the filtering/dup-checking logic from the code that tests whether two elements are equal. The breakdown is more logical and we get a potentially reusable function from it.

So I might write it like this:

const keepDupsBy = (eq) => (xs) => xs .filter (
  (x, i) => xs .find ((y, j) => i !== j && eq (x, y))
)

const dupLocations = keepDupsBy ((a, b) => 
  a .latitude == b.latitude && 
  a .longitude == b .longitude
) 

const data = [{name: 'x', latitude: '45.9', longitude: '50.2'}, {name: 'y', latitude: '45.9', longitude: '50.2'}, {name: 'z', latitude: '40.5', longitude: '85.7'}];

console .log (dupLocations (data))

This keeps all the elements in the original array that have duplicates elsewhere, and returns them in their relative order from the original array. This is the same order as the above, but different from the interesting approach in Peter Seliger's answer which groups together all the matching values, returned in the relative order of the first elements of each group.
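
With the sample data from the question both approaches return the same list, so the ordering difference is easy to miss. The sketch below uses hypothetical data with interleaved duplicates (not from the question) and a compact restatement of the reduce-based collector; `keepDupsBy` is written with `some` here, which is an assumption of this sketch and not the exact code above.

```javascript
// Hypothetical data: a/c share coordinates, b/d share coordinates.
const data = [
  { name: 'a', latitude: '1.0', longitude: '1.0' },
  { name: 'b', latitude: '2.0', longitude: '2.0' },
  { name: 'c', latitude: '1.0', longitude: '1.0' },
  { name: 'd', latitude: '2.0', longitude: '2.0' },
  { name: 'e', latitude: '3.0', longitude: '3.0' },
];

// Filter-based version: results keep their relative order from `data`.
const keepDupsBy = (eq) => (xs) =>
  xs.filter((x, i) => xs.some((y, j) => i !== j && eq(x, y)));

const byFilter = keepDupsBy(
  (a, b) => a.latitude === b.latitude && a.longitude === b.longitude
)(data);

// Compact restatement of the reduce-based collector: a pair is
// appended to `list` as soon as its second member is seen.
const collectDuplicates = (collector, item) => {
  const { index, list } = collector;
  const key = [parseFloat(item.latitude), parseFloat(item.longitude)].join('/');
  const grouped = index[key];
  if (Array.isArray(grouped)) {
    grouped.push(item);
    list.push(item);
  } else if (grouped) {
    index[key] = [grouped, item];
    list.push(grouped, item);
  } else {
    index[key] = item;
  }
  return collector;
};

const byReduce = data.reduce(collectDuplicates, { index: {}, list: [] }).list;

const names = (xs) => xs.map((x) => x.name).join(',');
console.log(names(byFilter)); // a,b,c,d
console.log(names(byReduce)); // a,c,b,d
```

Which order is preferable depends on what the caller does with the result; neither is more correct than the other.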

Note too the performance difference if you're expecting to use this on large lists. Your original and all the answers but Peter's operate in O (n^2) time. Peter's operates in O (n). For larger lists, the difference could be substantial. The tradeoff is different when it comes to memory: Peter's uses O (n) additional memory, while all the others here operate in constant additional memory, O (1). None of this is likely to make a difference unless you're working with tens of thousands of elements or more, but it's often worth considering.
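
If both linear time and the original element order matter, one possible middle ground is a two-pass count. The following is a minimal sketch, not taken from any of the answers: `keepDupsByKey` is a hypothetical helper, and the key is a plain string join of the raw property values, so `'45.9'` and `'45.90'` would count as different (unlike the `parseFloat`-based key in Peter's answer). Like Peter's approach, it uses O (n) additional memory.

```javascript
// Hypothetical helper: keep every element whose key occurs more than once.
// Pass 1 counts keys in a Map; pass 2 filters, preserving original order.
const keepDupsByKey = (keyOf) => (xs) => {
  const counts = new Map();
  for (const x of xs) {
    const key = keyOf(x);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return xs.filter((x) => counts.get(keyOf(x)) > 1);
};

// Assumes latitude/longitude strings never contain '/'.
const dupLocations = keepDupsByKey(
  ({ latitude, longitude }) => `${latitude}/${longitude}`
);

const data = [
  { name: 'x', latitude: '45.9', longitude: '50.2' },
  { name: 'y', latitude: '45.9', longitude: '50.2' },
  { name: 'z', latitude: '40.5', longitude: '85.7' },
];

// Logs the names of the kept elements: x and y.
console.log(dupLocations(data).map((o) => o.name));
```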
