17

I have an ArrayList with the following strings:

 List<String> e = new ArrayList<String>();
 e.add("123");
 e.add("122");
 e.add("125");
 e.add("123");

I want to check the list for duplicates and remove them from the list. In this case my list will only have two values, and in this example it would be the values 122 and 125, and the two 123s will go away.

What would be the best way to do this? I was thinking of using a Set, but that will only remove one of the duplicates.

5 Comments
  • You can use a Map<String, Integer> (representing the number of times the String is in the list), then keep only the entries with a value of 1 and collect the corresponding keys into a new list. Commented Oct 14, 2015 at 13:20
  • 1
    @3Kings He wants to remove a value together with its duplicates if it occurs more than once, so in the example above both 123s go. Commented Oct 14, 2015 at 13:20
  • A Set's add() method returns true if the value was not already present and was inserted successfully. You can use that to detect that a value you're inserting is a duplicate, then find and remove the duplicate. Commented Oct 14, 2015 at 13:21
  • You can use a MultiHashMap. Commented Oct 14, 2015 at 13:27
  • A Set would not remove items, it would prevent adding duplicate items. Commented Oct 14, 2015 at 15:23

11 Answers

25

In Java 8 you can do:

e.removeIf(s -> Collections.frequency(e, s) > 1);

If you're not on Java 8, you can create a HashMap<String, Integer>. If the String already appears in the map, increment its value by one; otherwise, add it to the map.

For example:

put("123", 1);

Now let's assume that you encounter "123" again; you should get the current count for the key and add one to it:

put("123", get("123") + 1);

Now you can easily iterate over the map and build a new ArrayList from the keys whose values are less than 2.
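A runnable sketch of this pre-Java-8 counting approach (the class and method names here are mine, for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RemoveDuplicates {
    // Returns a new list containing only the elements that occur exactly once.
    static List<String> removeAllDuplicates(List<String> input) {
        // Count how many times each string occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : input) {
            Integer c = counts.get(s);
            counts.put(s, c == null ? 1 : c + 1);
        }
        // Keep only the strings whose count is less than 2.
        List<String> result = new ArrayList<>();
        for (String s : input) {
            if (counts.get(s) < 2) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> e = Arrays.asList("123", "122", "125", "123");
        System.out.println(removeAllDuplicates(e)); // [122, 125]
    }
}
```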


13 Comments

The Java 8 version does work, but only because it's an ArrayList, and removeIf is overridden to do all the removals in bulk at the end. It doesn't work on a LinkedList, for example.
List::removeIf is a clean solution, but its complexity is O(n²) because of iterating over the List and calling Collections::frequency, am I right?
The complexity of finding the duplicates is O(n²); applying removeIf only adds a constant factor. So the overall complexity is indeed O(n²).
If you create a Map<String, Long> to count the occurrences and then iterate over the entry set to get the unique elements, you've got O(2n) -> O(n) complexity, or am I wrong?
@Taemyr Correct, but m is in O(n) in any implementation of hashmap I know of (I'd go so far as to say it's the only reasonable choice) since the capacity is resized to keep it larger by some fraction than the current size.
12

You can also use filter in Java 8:

e.stream().filter(s -> Collections.frequency(e, s) == 1).collect(Collectors.toList())

Comments

6

You could use a HashMap<String, Integer>.

You iterate over the list and, if the map does not contain the string, add it with a value of 1.

If, on the other hand, you already have the string, you simply increment its counter. The map for your list would then look like this:

{"123", 2}
{"122", 1}
{"125", 1}

You would then create a new list containing every key whose value is 1.
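A runnable sketch of that idea, using a LinkedHashMap so the surviving keys keep their first-seen order (the class and method names are mine):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountAndFilter {
    // Builds the count map described above, then keeps the keys with a count of 1.
    static List<String> uniques(List<String> input) {
        // LinkedHashMap keeps the keys in insertion order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String s : input) {
            if (counts.containsKey(s)) {
                counts.put(s, counts.get(s) + 1);
            } else {
                counts.put(s, 1);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(uniques(Arrays.asList("123", "122", "125", "123"))); // [122, 125]
    }
}
```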

Comments

4

Here is a non-Java 8 solution using a map to count occurrences:

Map <String,Integer> map = new HashMap<String, Integer>();
for (String s : list){
    if (map.get(s) == null){
      map.put(s, 1);
    } 
    else {
      map.put(s, map.get(s) + 1);
    }
}

List<String> newList = new ArrayList<String>();

// Remove from list if there are multiples of them.
for (Map.Entry<String, Integer> entry : map.entrySet())
{
  if(entry.getValue() > 1){
    newList.add(entry.getKey());
  }
}

list.removeAll(newList);

2 Comments

newList adds all the entries with 2 or more. This is a temporary list. list is the original list, so in order to "return list", I'm modifying this one by removing all the entries with count = 1.
The question is to remove the ones with count >= 2.
2

A solution using only ArrayList, via indexOf and lastIndexOf:

public static void main(String args[]) throws Exception {
      List<String> e = new ArrayList<String>();
      List<String> duplicate = new ArrayList<String>();
      e.add("123");
      e.add("122");
      e.add("125");
      e.add("123");

      for(String str : e){
          if(e.indexOf(str) != e.lastIndexOf(str)){
              duplicate.add(str);
          }
      }

      for(String str : duplicate){
          e.remove(str);              
      }

      for(String str : e){
          System.out.println(str);
      }
  }

Comments

2

The simplest solutions using streams have O(n^2) time complexity. If you try them on a List with millions of entries, you'll be waiting a very, very long time. An O(n) solution is:

list = list.stream()
           .collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
           .entrySet()
           .stream()
           .filter(e -> e.getValue() == 1)
           .map(Map.Entry::getKey)
           .collect(Collectors.toList());

Here, I used a LinkedHashMap to maintain the order. Note that static imports can simplify the collect part.
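With the static imports applied, the pipeline reads like this (an equivalent rewrite, wrapped in a helper method for illustration):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

public class StaticImportVersion {
    // Same O(n) pipeline as above, with the collectors statically imported.
    static List<String> uniques(List<String> list) {
        return list.stream()
                   .collect(groupingBy(identity(), LinkedHashMap::new, counting()))
                   .entrySet()
                   .stream()
                   .filter(e -> e.getValue() == 1)
                   .map(Map.Entry::getKey)
                   .collect(toList());
    }

    public static void main(String[] args) {
        System.out.println(uniques(Arrays.asList("123", "122", "125", "123"))); // [122, 125]
    }
}
```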

This is so complicated that I think using for loops is the best option for this problem.

Map<String, Integer> map = new LinkedHashMap<>();
for (String s : list)
    map.merge(s, 1, Integer::sum);
list = new ArrayList<>();
for (Map.Entry<String, Integer> e : map.entrySet())
    if (e.getValue() == 1)
        list.add(e.getKey());

7 Comments

Stream complexity is also O(2*n), therefore O(n)
@Flown It says O(n)
You're saying O(n^2)
@Flown I'm not. It says the simplest solutions using streams are O(n^2). My solution is not the simplest.
you could also use .collect(groupingBy(identity(), counting()))
2
List<String> e = new ArrayList<String>();
e.add("123");
e.add("122");
e.add("125");
e.add("123");
e.add("125");
e.add("124");
List<String> result = new ArrayList<String>();
for (String current : e) {
    if (!result.contains(current)) {
        result.add(current);
    } else {
        result.remove(current);
    }
}
e.clear();
e.addAll(result);

Comments

1

I'm a fan of the Google Guava API. Using the Collections2 utility and a generic Predicate implementation it's possible to create a utility method to cover multiple data types.

This assumes that the Objects in question have a meaningful .equals implementation.

@Test
public void testTrimDupList() {
    Collection<String> dups = Lists.newArrayList("123", "122", "125", "123");
    dups = removeAll("123", dups);
    Assert.assertFalse(dups.contains("123"));

    Collection<Integer> dups2 = Lists.newArrayList(123, 122, 125, 123);
    dups2 = removeAll(123, dups2);
    Assert.assertFalse(dups2.contains(123));
}

private <T> Collection<T> removeAll(final T element, Collection<T> collection) {
    return Collections2.filter(collection, new Predicate<T>() {
        @Override
        public boolean apply(T arg0) {
            return !element.equals(arg0);
        }
    });
}

Thinking about this a bit more

Most of the other examples in this page are using the java.util.List API as the base Collection. I'm not sure if that is done with intent, but if the returned element has to be a List, another intermediary method can be used as specified below. Polymorphism ftw!

@Test
public void testTrimDupListAsCollection() {
    Collection<String> dups = Lists.newArrayList("123", "122", "125", "123");
    // List used here only to get access to the .contains method for validating behavior.
    dups = Lists.newArrayList(removeAll("123", dups));
    Assert.assertFalse(dups.contains("123"));

    Collection<Integer> dups2 = Lists.newArrayList(123, 122, 125, 123);
    // List used here only to get access to the .contains method for validating behavior.
    dups2 = Lists.newArrayList(removeAll(123, dups2));
    Assert.assertFalse(dups2.contains(123));
}

@Test
public void testTrimDupListAsList() {
    List<String> dups = Lists.newArrayList("123", "122", "125", "123");
    dups = removeAll("123", dups);
    Assert.assertFalse(dups.contains("123"));

    List<Integer> dups2 = Lists.newArrayList(123, 122, 125, 123);
    dups2 = removeAll(123, dups2);
    Assert.assertFalse(dups2.contains(123));
}

private <T> List<T> removeAll(final T element, List<T> collection) {
    return Lists.newArrayList(removeAll(element, (Collection<T>) collection));
}

private <T> Collection<T> removeAll(final T element, Collection<T> collection) {
    return Collections2.filter(collection, new Predicate<T>() {
        @Override
        public boolean apply(T arg0) {
            return !element.equals(arg0);
        }
    });
}

Comments

1

Something like this (using a Set):

Set<Object> set = new HashSet<>();
Set<Object> blackList = new HashSet<>();

public void add(Object object) {
    if (blackList.contains(object)) {
        return;
    }
    boolean added = set.add(object);
    if (!added) {
        set.remove(object);
        blackList.add(object);
    }
}

4 Comments

And what do you do if you have 3 times 123 in the list?
You will get one instance of 123; whether that's bad or good depends on what you're trying to accomplish. Technically, when you insert values one by one you handle them one by one, so after the second insert of 123 you remove 123. At that point 123 is no longer in your collection, so it's OK to reinsert it. If you want to prevent that, use a blacklist.
Whether it's bad or good is not the point: this is not what the OP wants to do. If a string appears more than once in his list, he doesn't want it in the final list at all.
You're right, my bad! I've changed the code. It's not tested, but the idea should be obvious.
1

If you are going to use sets, you can achieve this with two sets, keeping the duplicate values in the second one as follows:

List<String> duplicateList = new ArrayList<String>();

duplicateList.add("123");
duplicateList.add("122");
duplicateList.add("125");
duplicateList.add("123");
duplicateList.add("127");
duplicateList.add("127");

System.out.println(duplicateList);

Set<String> nonDuplicateList = new TreeSet<String>();
Set<String> duplicateValues = new TreeSet<String>();

if(nonDuplicateList.size()<duplicateList.size()){
    for(String s: duplicateList){
        if(!nonDuplicateList.add(s)){
            duplicateValues.add(s);
        }
    }

    duplicateList.removeAll(duplicateValues);

    System.out.println(duplicateList);
    System.out.println(duplicateValues);
}

Output:

Original list: [123, 122, 125, 123, 127, 127]
After removing duplicates: [122, 125]
Values which are duplicates: [123, 127]

Note: This solution might not be optimized. You might find a better
solution than this.

Comments

0

With the Guava library, using a multiset and streams:

e = HashMultiset.create(e).entrySet().stream()
    .filter(me -> me.getCount() > 1)
    .map(me -> me.getElement())
    .collect(toList());

This is pretty, and reasonably fast for large lists (O(n) with a rather large constant factor). But it does not preserve order (LinkedHashMultiset can be used if that is desired) and it creates a new list instance.

It is also easy to generalise, to instead remove all triplicates for example.
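For example, a sketch of that generalisation using only plain streams (no Guava; the class and method names are mine): dropping every element that occurs exactly three times just changes the filter predicate.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class RemoveTriplicates {
    // Drops every element whose total count is exactly 3, keeping everything else.
    static List<String> withoutTriplicates(List<String> input) {
        Map<String, Long> counts = input.stream()
            .collect(Collectors.groupingBy(Function.identity(),
                     LinkedHashMap::new, Collectors.counting()));
        return input.stream()
                    .filter(s -> counts.get(s) != 3)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> e = Arrays.asList("a", "a", "a", "b", "b", "c");
        System.out.println(withoutTriplicates(e)); // [b, b, c]
    }
}
```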

In general, the multiset data structure is really useful to keep in one's toolbox.

Comments
