42

I read data from a text file, so there may be:

John
Mary
John
Leeds

I now need to get 3 unique elements in the ArrayList, because there are only 3 unique values in the file output (as above).

I could add the values to a Hashtable and then simply copy its contents into the List. Are there other solutions?

8 Answers

78

Why do you need to store it in a List? Do you actually require the data to be ordered or support index-based look-ups?

I would suggest storing the data in a Set. If ordering is unimportant you should use HashSet. However, if you wish to preserve ordering you could use LinkedHashSet.
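A minimal sketch of this suggestion, applied to the asker's case of reading names line by line. The class name UniqueLines and the helper readUnique are made up for illustration, and a StringReader stands in for the text file, which is an assumption about how the data is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Set;

class UniqueLines {

    // Hypothetical helper: reads every line and keeps each distinct value once,
    // in the order in which it first appears.
    public static Set<String> readUnique(Reader source) {
        Set<String> unique = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                unique.add(line); // leaves the set unchanged if the value is already present
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return unique;
    }

    public static void main(String[] args) {
        // The StringReader replaces the real file here; a FileReader could be passed instead.
        Set<String> names = readUnique(new StringReader("John\nMary\nJohn\nLeeds\n"));
        System.out.println(names); // prints [John, Mary, Leeds]
    }
}
```

Swapping LinkedHashSet for HashSet in the helper would still remove the duplicates, but the iteration order would no longer be guaranteed.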

9 Comments

Why? Because Lists are used most often and they are easy to work with. I think if you write code for other developers it's better to return a List type than any other type.
Not necessarily. The only case where you'd need a List is if you want random indexed access to the items in the collection. Otherwise it would be preferable to go with a plain HashSet or LinkedHashSet.
@EugeneP: Don't write C code in Java. In every language, it is better to take advantage of its own features.
@EugeneP: I disagree that it's better to return a List than any other type. It's better to return the appropriate data structure. If uniqueness is important but ordering isn't then return a Set. If neither are important then return a Collection. If you only wish to iterate over the elements then return an Iterable. The more decoupled the calling code is from the internal implementation the easier it is to make changes or substitute in a different implementation.
@EugeneP: I guess your point is that there is no reason to use something just to conform with dogma. I don't disagree. But in this case, a List is both unnecessary and less efficient (O(n^2) vs O(n)). Unless you actually need to preserve the order of the Strings, using a List is really a worse choice in practice. That is not the case for for(;;). It may be that using a List comes more naturally to you because C does not really have any Set like data structure, but it is really not the natural thing to use.
71

If you have a List containing duplicates, and you want a List without, you could do:

List<String> newList = new ArrayList<String>(new HashSet<String>(oldList));

That is, wrap the old list into a set to remove duplicates and wrap that set in a list again.

2 Comments

I guess there is no guarantee that the order of newList will be the same as that of oldList, is there?
O.k. I found the answer to my comment: If you use LinkedHashSet instead of HashSet, you preserve the order.
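The order-preserving variant mentioned in the comment above could look roughly like this. The class name OrderedDedup and the helper withoutDuplicates are illustrative names, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedup {

    // Hypothetical helper: copies the list through a LinkedHashSet, which drops
    // repeated elements while keeping the position of each first occurrence.
    public static <T> List<T> withoutDuplicates(List<T> oldList) {
        return new ArrayList<>(new LinkedHashSet<>(oldList));
    }

    public static void main(String[] args) {
        List<String> oldList = Arrays.asList("John", "Mary", "John", "Leeds");
        System.out.println(withoutDuplicates(oldList)); // prints [John, Mary, Leeds]
    }
}
```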
18

You can check list.contains() before adding.

if(!list.contains(value)) {
    list.add(value);
}

I guessed it would be obvious! However, adding the items to a HashSet and then creating a list from that set would be more efficient, because list.contains() has to scan the list on every call.
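One way to keep the list and still avoid the repeated scans is to track the values already added in a HashSet. This is only a sketch; the class name SeenSetDedup and the helper addUnique are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SeenSetDedup {

    // Hypothetical helper: builds a duplicate-free list in first-seen order.
    public static List<String> addUnique(Iterable<String> values) {
        List<String> list = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String value : values) {
            // Set.add returns false when the value is already in the set,
            // so it replaces the list.contains() check.
            if (seen.add(value)) {
                list.add(value);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(addUnique(Arrays.asList("John", "Mary", "John", "Leeds")));
        // prints [John, Mary, Leeds]
    }
}
```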

3 Comments

I suppose it uses the "equals" method?
Yes, it uses equals() on the string internally.
Yep. Thanks. In my case that would be preferable, 'cause the code will be kept clean. But I guess the HashTable algorithm is faster than contains method, unless they use a HashTable internally.
5

Use a Set instead of a List. Take a look at the Java Collections Tutorials, and specifically the Java Sets Tutorial.

In a nutshell, sets contain one of something. Perfect :)

3

Here is how I solved it (in Groovy):

import groovy.io.*;
def arr = ["5", "5", "7", "6", "7", "8", "0"]
List<String> uniqueList = new ArrayList<String>(new HashSet<String>( arr.asList() ));
System.out.println( uniqueList )

2

Another approach would be to use distinct() from the Java 8 Stream API:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// public static void main(String args[]) ...

// list of strings, including some nulls and blanks as well ;)
List<String> list = Arrays.asList("John", "Mary", "John", "Leeds", 
null, "", "A", "B", "C", "D", "A", "A", "B", "C", "", null);
 
// collect distinct without duplicates
List<String> distinctElements = list.stream()
                        .distinct()
                        .collect(Collectors.toList());
 
// unique elements
System.out.println(distinctElements);

Output:

 [John, Mary, Leeds, null, , A, B, C, D]

1
 List<String> distinctElements = list.stream()
            .distinct().filter(s -> s != null && !s.isEmpty()) // isEmpty() compares content; != "" only compares references
            .collect(Collectors.toList());

This collects the distinct items and also skips null and empty Strings.

-1
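import java.util.ArrayList;
import java.util.HashSet;

class HashSetList<T> extends ArrayList<T> {

    // Tracks the elements already in the list so add() can reject duplicates.
    private final HashSet<T> seen = new HashSet<>();

    @Override
    public boolean add(T obj) {
        // Store the element itself, not obj.hashCode(): two different objects
        // can share a hash code and would otherwise be treated as duplicates.
        if (seen.add(obj)) {
            super.add(obj);
            return true;
        }
        return false;
    }
}

I now use this kind of structure for small programs: with little overhead you keep the List's index-based getters and setters and also get uniqueness. Moreover, you can override equals() and hashCode() to decide whether one item equals another.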

1 Comment

Nice idea, but... what about remove and add(int, *) etc? This solution is not complete.
