3

I am performing some maintenance tasks on an old system. I have an ArrayList that contains the following values:

a,b,12
c,d,3
b,a,12
d,e,3
a,b,12

I used the following code to remove duplicate values from the ArrayList:

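ArrayList<String> arList;

  // Removes exact duplicates only; HashSet does not keep the original order.
  public static void removeDuplicate(ArrayList<String> arlList)
  {
   HashSet<String> h = new HashSet<String>(arlList);
   arlList.clear();
   arlList.addAll(h);
  }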

It works fine when the duplicates are exactly identical. However, if you look at my data carefully, some entries are duplicates whose fields appear in a different order. For example, a,b,12 and b,a,12 are the same entry in a different order.

How can I remove this kind of duplicate entry from the ArrayList?

Thanks

5
  • You're really close, try a set of sets. Commented Jan 24, 2011 at 3:14
  • @Cpfohl - could you be more precise? Commented Jan 24, 2011 at 3:18
  • Yup, but first can you answer this: Your array list is a set of what? (Character arrays? Numbers? Classes?) Commented Jan 24, 2011 at 3:19
  • My ArrayList<String> is a list of strings. Commented Jan 24, 2011 at 3:26
  • Try this simple solution...(No Set interface used) stackoverflow.com/a/19434592/369035 Commented Oct 18, 2013 at 5:23
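
The "set of sets" suggestion above could look roughly like the sketch below. It assumes each entry is a comma-separated String and that the order of the fields inside an entry does not matter; the class and method names (SetOfSetsDemo, uniqueRows) are made up for illustration. It relies on the fact that two java.util.Set instances are equal when they contain the same elements, regardless of order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetOfSetsDemo {
    // Treats each row as an unordered set of comma-separated tokens.
    // Caveat: a Set also collapses repeated tokens inside one row,
    // so "a,a,b" and "a,b,b" would both become {a, b}.
    static Set<Set<String>> uniqueRows(List<String> rows) {
        // LinkedHashSet keeps the first occurrence of each row in insertion order.
        Set<Set<String>> unique = new LinkedHashSet<Set<String>>();
        for (String row : rows) {
            unique.add(new LinkedHashSet<String>(Arrays.asList(row.split(","))));
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a,b,12", "c,d,3", "b,a,12", "d,e,3", "a,b,12");
        System.out.println(uniqueRows(rows)); // [[a, b, 12], [c, d, 3], [d, e, 3]]
    }
}
```

Note that the result holds sets of tokens rather than the original strings, so this only fits if the original field order can be discarded.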

5 Answers

3

Assuming the entries are Strings, you can sort the characters of each entry to build a canonical key and run the duplicate check against that key. Store each entry in a map under its key and use containsKey(key) to see whether an equivalent entry already exists.

EDIT: added a complete code example.

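import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class Test {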

    /**
     * @param args
     */
    public static void main(String[] args) {
        Test test = new Test();
        List<String> someList = new ArrayList<String>(); 
        someList.add("d,e,3");
        someList.add("a,b,12");
        someList.add("c,d,3");
        someList.add("b,a,12");
        someList.add("a,b,12");
        // TreeMap iterates in sorted-key order, so the output follows the
        // sorted keys rather than the order in which entries were added
        Map<String,String> dupMap = new TreeMap<String,String>();
        String key = null;
        for(String some:someList){
            key = test.sort(some);
            if(key!=null && key.trim().length()>0 && !dupMap.containsKey(key)){
                dupMap.put(key, some);
            }
        }
        List<String> uniqueList = new ArrayList<String>(dupMap.values());
        for(String unique:uniqueList){
            System.out.println(unique);
        }

    }
    private String sort(String key) {
      if(key!=null && key.trim().length()>0){
        char[] keys = key.toCharArray();
        Arrays.sort(keys);
        return String.valueOf(keys);
      }
      return null;
   }
}

Prints:

a,b,12
c,d,3
d,e,3
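
One limitation of sorting individual characters is that entries such as a,b,12 and a,b,21 end up with the same key. A possible variant, sketched here under the assumption that fields are separated by commas, sorts whole tokens instead and uses a LinkedHashMap so that the first occurrence of each entry keeps its original position. The names TokenKeyDedup, canonicalKey and removeDuplicates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TokenKeyDedup {
    // Canonical key: the row's comma-separated tokens in sorted order,
    // e.g. "b,a,12" and "a,b,12" both give "[12, a, b]".
    static String canonicalKey(String row) {
        String[] tokens = row.split(",");
        Arrays.sort(tokens);
        return Arrays.toString(tokens);
    }

    // Keeps the first row seen for each canonical key, in insertion order.
    static List<String> removeDuplicates(List<String> rows) {
        Map<String, String> firstSeen = new LinkedHashMap<String, String>();
        for (String row : rows) {
            String key = canonicalKey(row);
            if (!firstSeen.containsKey(key)) {
                firstSeen.put(key, row);
            }
        }
        return new ArrayList<String>(firstSeen.values());
    }
}
```

With the list from the example above (d,e,3 / a,b,12 / c,d,3 / b,a,12 / a,b,12) this should return d,e,3, a,b,12 and c,d,3 in that order. Each row is split once, so the cost stays linear in the number of rows.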


6 Comments

The first solution is expensive in terms of performance, and it can also cause ordering problems. The second solution does not work because containsKey will not match entries whose fields are in a different order.
Why do you think String.split is too expensive? I don't think it could be done faster. What ordering problems? Do you care about the order of items in your strings or not? Shouldn't you use List<Set<String>>, or better Set<Set<String>>, instead of List<String>? Your duplicate removal destroys the order anyway, so why use a List?
I do care about the order within each string. This is a small part of the main processing module of the software, and if I start splitting, it may cost me performance. All the ArrayLists contain a couple of hundred thousand records.
I am open to using any data structure other than ArrayList if it can solve my problem. Could you provide pseudocode?
@Tweety - I will provide you code details. It's useful to know that you do care about the order.
2

Wrap each element in a class such as Foo instead of using a plain String; the rest of the removeDuplicate code stays the same:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class Foo {
    private final String s1;
    private final String s2;
    private final String s3;

    public Foo(String s1, String s2, String s3) {
        this.s1 = s1;
        this.s2 = s2;
        this.s3 = s3;
    }

    @Override
    public int hashCode() {
        // Must not depend on field order: an order-sensitive hash
        // (31 * result + ...) would put 'a,b,12' and 'b,a,12' in different
        // HashSet buckets, so equals() would never be consulted.
        return sortedFields().hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Foo other = (Foo) obj;
        // Notice here: 'a,b,12' and 'b,a,12' will be the same.
        // Comparing sorted lists (rather than containsAll) keeps equals()
        // symmetric when a value is repeated, e.g. 'a,a,b' vs 'a,b,b'.
        return sortedFields().equals(other.sortedFields());
    }

    // The three fields in sorted order, nulls first (Comparator.nullsFirst needs Java 8+).
    private List<String> sortedFields() {
        List<String> l = new ArrayList<String>(3);
        l.add(s1);
        l.add(s2);
        l.add(s3);
        Collections.sort(l, Comparator.nullsFirst(Comparator.<String>naturalOrder()));
        return l;
    }
}

Then arList will be an ArrayList<Foo>.


1

Create a class to wrap around a row string (triplet) to provide your equality semantics. Implement the equals() and hashCode() methods. Then use the HashSet method to remove duplicates.
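
As a rough sketch of that idea, assuming rows are comma-separated Strings whose field order should be ignored, a wrapper could compute a canonical form once in its constructor and base equals() and hashCode() on it. The class name Row and the helper removeDuplicates are invented for this example, and a LinkedHashSet is used in place of a HashSet only to keep the first occurrence of each row in its original position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class Row {
    private final String original;   // row text as it appeared in the list
    private final String canonical;  // sorted tokens, computed once per row

    Row(String original) {
        this.original = original;
        String[] tokens = original.split(",");
        Arrays.sort(tokens);
        this.canonical = Arrays.toString(tokens);
    }

    String original() {
        return original;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Row && canonical.equals(((Row) o).canonical);
    }

    @Override
    public int hashCode() {
        // Derived from the same canonical form as equals(), so equal rows hash alike.
        return canonical.hashCode();
    }

    // Wraps every row, lets the set drop equal rows, then unwraps the survivors.
    static List<String> removeDuplicates(List<String> rows) {
        Set<Row> unique = new LinkedHashSet<Row>();
        for (String r : rows) {
            unique.add(new Row(r));
        }
        List<String> result = new ArrayList<String>(unique.size());
        for (Row r : unique) {
            result.add(r.original());
        }
        return result;
    }
}
```

Because the original text is kept alongside the canonical form, the order of fields inside each surviving string is untouched.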


1

Try this simple solution...(No Set interface used)

https://stackoverflow.com/a/19434592/369035


-1

An ArrayList has no built-in way to remove duplicate elements directly. You can do it with a set, because sets do not allow duplicates, so it is better to use the HashSet or LinkedHashSet class. See reference.
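
For illustration, a minimal sketch of the LinkedHashSet variant follows; the class and method names are hypothetical. Unlike HashSet, LinkedHashSet keeps elements in insertion order. Like the code in the question, it only removes exact duplicates, so a,b,12 and b,a,12 both survive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class ExactDedup {
    // Drops exact duplicates while keeping the first occurrence of each element in place.
    static <T> List<T> removeExactDuplicates(List<T> items) {
        return new ArrayList<T>(new LinkedHashSet<T>(items));
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a,b,12", "c,d,3", "b,a,12", "d,e,3", "a,b,12");
        System.out.println(removeExactDuplicates(rows)); // [a,b,12, c,d,3, b,a,12, d,e,3]
    }
}
```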

