0

I wanted to 'remove' elements in ArrayList by creating a new ArrayList that contains all elements except the duplicates. Duplicates are selected by an comparator and the ordering should be maintained. The algorithm is just iterating over the ArrayList and adding each element that isn't already in the new list. The work is done in a method that is going through the entire ArrayList and checking for equality. Here is the code:

public static <T> ArrayList<T> noDups(Comparator<T> cmp, ArrayList<T> l) {

    ArrayList<T> noDups = new ArrayList<T>();
    for(T o : l) {
        if(!isIn(cmp, o, l)) 
            noDups.add(o);
    }
    return noDups;
}

public static <T> boolean isIn(Comparator<T> cmp, T o, ArrayList<T> l) {
    Iterator<T> i = l.iterator();
    if (o==null) {
        while (i.hasNext())
            if (i.next()==null)
                 return true;
    } else {
        while (!(i.hasNext()))
            if (o.equals(i.next()))
                return true;
    }
    return false;
}

I'm curious about the time complexity this algorithm has. My thoughts are: For the first step there is no work, because the ArrayList noDups is empty. For the second step there is one check, for the third 3 and so on to n-1. So we have 1+2+...+n-1. Is that correct and what time complexity would that be?

6
  • 1
    This is O(n^2). For each element in the array you have to check every single one added before for a possible duplicate. Creating a HashSet from ArrayList would be O(n*log(n)), which is better in terms of time-complexity Commented Jan 18, 2018 at 20:33
  • As far as I know HashSets don't maintain the ordering of the ArrayList, which I don't want. Commented Jan 18, 2018 at 20:35
  • Did you mean "which I want to maintain"? If you care about the order, consider LinkedHashSet, which additionally keeps a linked list that represents the order of the added elements Commented Jan 18, 2018 at 20:36
  • Yes, that's what I meant. I am not very familiar with HashSets, how would this work with a LinkedHashSet? Commented Jan 18, 2018 at 20:41
  • Let me write an answer then Commented Jan 18, 2018 at 20:41

2 Answers 2

3

The time-complexity of your algorithm is O(n^2), since for each element in the array you have to compare it with every single one added before. You can improve that to O(n*log(n)) actually O(n) using a Set.

You mentioned that you care about the order and as far as you know, the HashSet does not maintain it. It's true, but there is a way to make it work. Use LinkedHashSet like so:

ArrayList<Integer> array = 
    new ArrayList<>(Arrays.asList(1, 5, 4, 2, 2, 0, 1, 4, 2));
LinkedHashSet<Integer> set = new LinkedHashSet<>(array);

This will construct a LinkedHashSet, which time-complexity of insertion, deletion and finding is O(1). The important part is the difference between this data structure and your typical HashSet. LinkedHashSet additionally creates a linked list that indicates the order of the elements in terms of insertion.

If you run the above code with this loop:

for(Integer i : set) {
    System.out.print(i + " ");
}

The output will be: 1 5 4 2 0 - as you wanted.

Sign up to request clarification or add additional context in comments.

6 Comments

This helped a lot, thanks. However I am wondering on how to do this with any equality check given by a comparator. I think your code works only for already implemented equality checks like '==' for ints, correct me if I'm wrong.
If you wish to use ArrayList<YourClass> and convert it into LinkedHashSet<YourClass> you should implement equals and hashCode in YourClass.
Simply List<Integer> x = Arrays.asList(1, 5, 4, 2, 2, 0, 1, 4, 2); will create that initial list, no new ArrayList() required. You don't really care what kind of list it is; ArrayList or LinkedList etc.
I completely disagree. We do care what kind of list it is, since OP specifially mentioned ArrayList in his question and we should not assume that it might not matter.
The OP specification that states what needs to be accomplished will work on anything that implements the List interface; there is no reason to limit it to one specifically chosen implementation of a List. One can (and I have) return the same kind of thing that was passed in, if that's even a requirement (so pass in a LinkedList, get a LinkedList back; pass in an ArrayList get an ArrayList back). If you absolutely must have an ArrayList (for some totally bizarre reason) you can new ArrayList<>(noDups(origList))
|
0

You are correct that the time complexity of this would be O(N^2). However, "a list without duplicates" is the definition of a set. This question has some details on implementation and corner cases that might cover what you're interested in.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.