57

I'm trying to read two files and store their contents in two separate ArrayLists. Each line of a file contains either a single word or several words separated by commas. I read each file with the following code (not complete):

ArrayList<String> temp = new ArrayList<>();

FileInputStream fis;
fis = new FileInputStream(fileName);

Scanner scan = new Scanner(fis);

while (scan.hasNextLine()) {
    Scanner input = new Scanner(scan.nextLine());
    input.useDelimiter(",");
    while (input.hasNext()) {
        String md5 = input.next();
        temp.add(md5);
    }
}
scan.close();    

return temp;

I now need to read both files and remove from the first list every word that also exists in the second file (the files contain some duplicate words). I have tried for-loops and similar approaches, but nothing has worked, so any help would be greatly appreciated!

Bonus question: I also need to find out how many duplicates there are in the two files. I've done this by adding both ArrayLists to a HashSet and then subtracting the size of the set from the combined size of the two ArrayLists. Is this a good solution, or could it be done better?


4 Answers

72

You can use the removeAll method to remove the items of one list from another list.

To obtain the duplicates, you can use the retainAll method, though your approach with the set is also good (and probably more efficient).
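
As a minimal sketch of both calls, assuming two small in-memory lists stand in for the file contents (the class name `ListDifferenceSketch` and its method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListDifferenceSketch {

    // Returns a copy of first without any element that also occurs in second.
    static List<String> difference(List<String> first, List<String> second) {
        List<String> result = new ArrayList<>(first);
        result.removeAll(second);
        return result;
    }

    // Returns the elements of first that also occur in second
    // (repeated elements of first are kept).
    static List<String> common(List<String> first, List<String> second) {
        List<String> result = new ArrayList<>(first);
        result.retainAll(second);
        return result;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("a", "b", "b", "c");
        List<String> file2 = Arrays.asList("b", "d");
        System.out.println(difference(file1, file2)); // [a, c]
        System.out.println(common(file1, file2));     // [b, b]
    }
}
```

Both helpers work on a copy, so the original lists stay untouched; calling removeAll or retainAll directly on a list modifies it in place.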


4 Comments

Thanks! I tried using removeAll like so: ArrayList<String> file1 = readFile(fileName1); ArrayList<String> file2 = readFile(fileName2); file1.removeAll(file2); return file1; Any idea why this is not working?
Your problem must be elsewhere. Try printing the contents of file1, file2, and file1 after the remove operation to see what is happening.
Problem is that the removeAll operation makes the entire thing hang. I let it run for 50 minutes and nothing happened - everything I do until I call the method works fine (and if I remove the removeAll operation the method works fine). Literally the only thing not working is the removeAll, which is what confuses me.
You should post a new question with the relevant parts of the code (this one doesn't get many views anymore). It's impossible to say what could be wrong. Try using a debugger to see what the program is doing while it's stuck.
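
One possible explanation, which the thread does not confirm, is list size: ArrayList.removeAll calls contains on its argument for every element, and contains on an ArrayList is a linear scan, so two very large lists cost roughly the product of their sizes. A sketch of a workaround, assuming the words fit in memory, passes a HashSet copy of the second list instead (the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FastRemoveSketch {

    // Removes from first every word that occurs in second.
    // Membership tests against a HashSet take expected constant time.
    static void removeAllFast(List<String> first, List<String> second) {
        Set<String> lookup = new HashSet<>(second);
        first.removeAll(lookup);
    }

    public static void main(String[] args) {
        List<String> file1 = new ArrayList<>(Arrays.asList("a", "b", "b", "c"));
        List<String> file2 = Arrays.asList("b", "d");
        removeAllFast(file1, file2);
        System.out.println(file1); // [a, c]
    }
}
```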
28

The Collection interface has a convenient method for this purpose:

list1.removeAll(list2);


18

First, you need to override the equals method in your custom class to define when two elements count as a match for removal:

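import java.util.Objects;

public class CustomClass {

    private String name;

    public String getName() {
        return name;
    }

    // Two instances match when their names are equal
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof CustomClass)) {
            return false;
        }
        CustomClass licenceDetail = (CustomClass) obj;
        return Objects.equals(name, licenceDetail.getName());
    }

    // Keep hashCode consistent with equals
    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}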

Second, call the removeAll() method:

list1.removeAll(list2);
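
To see the effect of equals on removeAll, here is a small self-contained sketch; the `Licence` class and its `name` field are stand-ins invented for the example, not classes from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class CustomRemoveSketch {

    // Returns the names left in first after removing every element
    // that equals some element of second.
    static List<String> remainingNames(List<Licence> first, List<Licence> second) {
        List<Licence> result = new ArrayList<>(first);
        result.removeAll(second);
        List<String> names = new ArrayList<>();
        for (Licence licence : result) {
            names.add(licence.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<Licence> list1 = Arrays.asList(new Licence("a"), new Licence("b"));
        List<Licence> list2 = Arrays.asList(new Licence("b"));
        System.out.println(remainingNames(list1, list2)); // [a]
    }
}

// Hypothetical element type used only for this example.
class Licence {
    private final String name;

    Licence(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Two licences match when their names are equal.
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Licence)) {
            return false;
        }
        return Objects.equals(name, ((Licence) obj).name);
    }

    // Kept consistent with equals so hash-based collections also work.
    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}
```

Without the equals override, the two distinct `Licence("b")` objects would be compared by identity and nothing would be removed.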

1 Comment

Upvoting this as it is the only answer which explicitly mentions that the equals method needs to be overridden in the classes which are in the Lists. This may not be obvious to some people and removeAll just won't work as expected if equals is not implemented correctly for your custom classes.
4

As others have mentioned, use the Collection.removeAll method if you wish to remove from the Collection you invoke removeAll on every element that also exists in another Collection.

As for your bonus question, I'm a huge fan of Guava's Sets class. I would suggest the use of Sets.intersection as follows:

Sets.intersection(wordSetFromFile1, wordSetFromFile2).size();

Assuming you have created a Set of words from each file, that one-liner tells you how many distinct words the two files have in common.
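
If Guava is not available, roughly the same count can be obtained with the JDK alone. This is only a sketch, assuming each file's words have already been collected into a Set<String>; the class and method names are illustrative:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CommonWordsSketch {

    // Counts the distinct words present in both sets without modifying either.
    static int countCommon(Set<String> first, Set<String> second) {
        Set<String> common = new HashSet<>(first);
        common.retainAll(second);
        return common.size();
    }

    public static void main(String[] args) {
        Set<String> words1 = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> words2 = new HashSet<>(Arrays.asList("b", "c", "d"));
        System.out.println(countCommon(words1, words2)); // 2
    }
}
```

Unlike Sets.intersection, which returns a view, this copies the first set before filtering it.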
