I'm having trouble removing duplicate objects from an ArrayList. I'm parsing XML into what I call an IssueFeed object, which consists of a symptom, a problem, and a solution.

Most of my objects are unique and don't share a symptom, problem, or solution, but some share the same symptom and have a different problem.

I'm trying to accomplish several things:

  1. Capture the objects that share the same symptom in a separate duplicate ArrayList.
  2. Remove the duplicate items from the main list, leaving at least one item with that symptom to be displayed.
  3. When the user clicks on an item that we know has duplicates, set the duplicate data ArrayList in my ListView/adapter.

Steps I've taken:

  1. I've tried sorting the objects, and I am able to capture the duplicates, but I'm not sure how to remove all but one of them from the main list.
  2. I've tried two nested loops over the list, looking for objects that aren't the same object but have the same symptom, then removing them and updating both my duplicate list and my main list.

Some code:

IssueFeed - object

public IssueFeed(String symptom, String problem, String solution) {
    this.symptom = symptom;
    this.problem = problem;
    this.solution = solution;
}
public String getSymptom() {
    return symptom;
}
public String getProblem() {
    return problem;
}
public String getSolution() {
    return solution;
}

My ArrayList<IssueFeed> lists

duplicateDatalist = new ArrayList<IssueFeed>(); // list of objects that share a symptom

list_of_non_dupes = new ArrayList<IssueFeed>(); // list of only objects with unique symptom

mIssueList = mIssueParser.parseLocally(params[0]); // returns ArrayList<IssueFeed> of all objects

I can obtain the duplicates with the sort code below.

Collections.sort(mIssueList, new Comparator<IssueFeed>(){
            public int compare(IssueFeed s1, IssueFeed s2) {
                if (s1.getSymptom().matches(s2.getSymptom())) {
                    if (!duplicateDatalist.contains(s1)) {
                        duplicateDatalist.add(s1);
                        System.out.print("Dupe s1 added" + " " + s1.getSymptom() + ", " + s1.getProblem() + "\n");
                    }
                    if (!duplicateDatalist.contains(s2)) {
                        duplicateDatalist.add(s2);
                        System.out.print("Dupe s2 added" + " " + s2.getSymptom() + ", " + s2.getProblem() + "\n");
                    }
                }
                return s1.getSymptom().compareToIgnoreCase(s2.getSymptom());
            }
        });

Now I need to create the new list of non-dupes. This code just ended up adding all of the objects. :/

for (int j = 0; j < mIssueList.size(); j++) {
            IssueFeed obj = mIssueList.get(j);

            for (int i = 0; i < mIssueList.size(); i++) {
                IssueFeed obj_two = mIssueList.get(j);

                if (obj.getSymptom().matches(obj_two.getSymptom())) {
                    if (!list_non_dupes.contains(obj_two)) {
                        list_non_dupes.add(obj_two);
                    }
                    break;
                } else {
                    if (!list_non_dupes.contains(obj_two)) {
                        list_non_dupes.add(obj_two);
                    }
                }
            }
        }

2 Answers

If you can modify the IssueFeed class, then consider overriding the equals() and hashCode() methods and using a Set to find duplicates. For example:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IssueFeed {
    private String symptom;
    private String problem;
    private String solution;

    public IssueFeed(String symptom, String problem, String solution) {
        this.symptom = symptom;
        this.problem = problem;
        this.solution = solution;
    }
    public String getSymptom() {
        return symptom;
    }
    public String getProblem() {
        return problem;
    }
    public String getSolution() {
        return solution;
    }
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((symptom == null) ? 0 : symptom.hashCode());
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        IssueFeed other = (IssueFeed) obj;
        if (symptom == null) {
            if (other.symptom != null)
                return false;
        } else if (!symptom.equals(other.symptom))
            return false;
        return true;
    }
    @Override
    public String toString() {
        return "IssueFeed [symptom=" + symptom + ", problem=" + problem
                + ", solution=" + solution + "]";
    }
}

public class Sample {

    public static void main(String[] args) {
        List<IssueFeed> mainList = new ArrayList<IssueFeed>(
                Arrays.asList(new IssueFeed[] {
                        new IssueFeed("sym1", "p1", "s1"),
                        new IssueFeed("sym2", "p2", "s2"),
                        new IssueFeed("sym3", "p3", "s3"),
                        new IssueFeed("sym1", "p1", "s1") }));
        System.out.println("Initial List : " + mainList);
        Set<IssueFeed> list_of_non_dupes = new LinkedHashSet<IssueFeed>();
        List<IssueFeed> duplicateDatalist = new ArrayList<IssueFeed>(); 
        for(IssueFeed feed : mainList){
            if(!list_of_non_dupes.add(feed)) {
                duplicateDatalist.add(feed);
            }
        }
        // Remove the duplicate items from the main list, leaving one item per symptom to be displayed
        mainList = new ArrayList<IssueFeed>(list_of_non_dupes);
        // Drop every symptom that had a duplicate, leaving only objects with a unique symptom
        list_of_non_dupes.removeAll(duplicateDatalist);
        System.out.println("Final main list : " + mainList);
        System.out.println("Unique symptom : " + list_of_non_dupes);
        System.out.println("Duplicate symptom : " + duplicateDatalist);
    }
}

9 Comments

I think list_of_non_dupes is for objects in mainList with a unique symptom. See the OP's comments in the code.
I read it as the OP's second requirement: "Remove the duplicate items from the main list, leaving at least 1 item with that symptom to be displayed".
However, the OP says that is what the main list would be, and seems to be handling three lists. It's possible to accomplish both of the things we mentioned in linear time and without overriding equals() and hashCode().
Modified to have all 3 lists. :)
@SyamS this is extremely close to what I need. However, the reason for the duplicate list is to obtain objects where the symptom is the same but the problem is different, e.g. new IssueFeed("sym1", "p1", "s1"), new IssueFeed("sym2", "p2", "s2"), new IssueFeed("sym3", "p3", "s3"), new IssueFeed("sym1", "p4", "s4")

You should iterate through the ArrayList twice. Using this approach, you don't even need to sort the ArrayList by duplicates (Collections.sort is an O(n log n) operation) and can process the list in linear time. You also don't need to override equals() and hashCode() for IssueFeed objects.

In the first iteration, you should fill a HashMap that maps each symptom to the number of times it occurs in the ArrayList. It would probably look like this:

class SymptomInfo {
    int incidence;
    boolean used;
}
HashMap<String, SymptomInfo> symptomIncidence = new HashMap<String, SymptomInfo>();

However, if you read and write the map from multiple threads, you may want a thread-safe map such as ConcurrentHashMap instead of a plain HashMap.

In the second iteration through the ArrayList, you should look up the incidence value in the HashMap to find the total number of occurrences of that symptom. This is a quick and easy way to find out whether the object should be added to duplicateDatalist or list_of_non_dupes. Also, the first time you encounter an object with a particular symptom value, you can set used to true. So, if you encounter an object whose symptom already has used set to true, you know it is a duplicate occurrence and can remove it from the main list.
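
The two passes described above could be sketched roughly as follows. This is only an illustration, assuming an IssueFeed class like the one in the question; the SymptomSplit class, its of method, and its field names are hypothetical names made up for this example, and the sketch builds a new main list rather than removing items from the original in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal stand-in for the IssueFeed class from the question.
class IssueFeed {
    private final String symptom;
    private final String problem;
    private final String solution;

    IssueFeed(String symptom, String problem, String solution) {
        this.symptom = symptom;
        this.problem = problem;
        this.solution = solution;
    }
    String getSymptom() { return symptom; }
    String getProblem() { return problem; }
    String getSolution() { return solution; }
}

// Per-symptom bookkeeping, as suggested in the answer.
class SymptomInfo {
    int incidence; // how many items share this symptom
    boolean used;  // true once the first item with this symptom has been kept
}

// Hypothetical helper (not from the original post) that holds the three lists.
class SymptomSplit {
    final List<IssueFeed> mainList = new ArrayList<IssueFeed>();          // one item per symptom
    final List<IssueFeed> listOfNonDupes = new ArrayList<IssueFeed>();    // symptom occurs exactly once
    final List<IssueFeed> duplicateDataList = new ArrayList<IssueFeed>(); // symptom occurs more than once

    static SymptomSplit of(List<IssueFeed> all) {
        Map<String, SymptomInfo> symptomIncidence = new HashMap<String, SymptomInfo>();

        // First pass: count how often each symptom occurs.
        for (IssueFeed feed : all) {
            SymptomInfo info = symptomIncidence.get(feed.getSymptom());
            if (info == null) {
                info = new SymptomInfo();
                symptomIncidence.put(feed.getSymptom(), info);
            }
            info.incidence++;
        }

        // Second pass: route each item using the counts and the used flag.
        SymptomSplit result = new SymptomSplit();
        for (IssueFeed feed : all) {
            SymptomInfo info = symptomIncidence.get(feed.getSymptom());
            if (info.incidence > 1) {
                result.duplicateDataList.add(feed);
            } else {
                result.listOfNonDupes.add(feed);
            }
            if (!info.used) {
                info.used = true; // keep only the first item seen for each symptom
                result.mainList.add(feed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<IssueFeed> all = Arrays.asList(
                new IssueFeed("sym1", "p1", "s1"),
                new IssueFeed("sym2", "p2", "s2"),
                new IssueFeed("sym3", "p3", "s3"),
                new IssueFeed("sym1", "p4", "s4"));
        SymptomSplit split = SymptomSplit.of(all);
        System.out.println(split.mainList.size());          // prints 3
        System.out.println(split.listOfNonDupes.size());    // prints 2
        System.out.println(split.duplicateDataList.size()); // prints 2
    }
}
```

With the sample data from the comments, both sym1 items (p1 and p4) land in the duplicate list, while the main list keeps only the first of them. Because symptoms are compared as map keys, IssueFeed itself does not need equals() or hashCode().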

1 Comment

Thank you for this suggestion. Between your answer and Syam's, I have fixed my original issue.
