I'm trying to find the index position of the duplicates in an arraylist of strings. I'm having trouble figuring out a way to efficiently loop through the arraylist and report the index of the duplicate. My initial thought was to use Collections.binarySearch() to look for a duplicate, but I'm not sure how I would be able to compare the elements of the arraylist to each other with binarySearch. The only other thought I had would involve looping through the list, which is quite massive, too many times to even be feasible. I have limited java knowledge so any help is appreciated.
4 Answers
Not elegant, but should work:
Map<String, List<Integer>> indexList = new HashMap<String, List<Integer>>();
for (int i = 0; i < yourList.size(); i++) {
String currentString = yourList.get(i);
List<String> indexes = indexList.get(currentString);
if (indexes == null) {
indexList.put(currentString, indexes = new LinkedList<Integer>());
}
indexes.add(i);
if (indexes.size() > 1) {
// found duplicate, do what you like
}
}
// if you skip the last if in the for loop you can do this:
for (String string : indexList.keySet()) {
if (indexList.get(string).size() > 1) {
// String string has multiple occurences
// List of corresponding indexes:
List<Integer> indexes = indexList.get(string);
// do what you want
}
}
5 Comments
It sounds like you're out of luck.
You will have to inspect every element (i.e. iterate through the whole list). Think about it logically - if you could avoid this, it means that there's one element that you haven't inspected. But this element could be any value, and so could be a duplicate of another list element.
Binary searches are a smart way to reduce the number of elements checked when you are aware of some relationship that holds across the list - so that checking one element gives you information about the others. For instance, for a sorted list if the middle element is greater than 5, you know that every element after it is also greater than five.
However, I don't think there's a way to make such an inference when it comes to duplicate checking. You'd have to sort the list in terms of "number of elements that this duplicates" (which is begging the question), otherwise no tests you perform on element x will give you insight into whether y is a duplicate.
Comments
Now this may not be a memory efficient solution but yes I guess this is what you were looking for.. May be this program could be further improved.
import java.io.*;
import java.util.*;
class ArrayList2_CountingDuplicates
{
public static void main(String[] args)throws IOException
{
ArrayList<String> als1=new ArrayList<String>();
ArrayList<String> als2=new ArrayList<String>();
int arr[];
int n,i,j,c=0;
String s;
BufferedReader p=new BufferedReader(new InputStreamReader(System.in));
n=Integer.parseInt(p.readLine());
arr=new int[n];
for(i=0;i<n;i++)
als1.add(p.readLine());
for(i=0;i<n;i++)
{
s=als1.get(i);
als1.remove(i);
als2.add(s);
arr[c]=1;
while(als1.contains(s))
{
j=als1.indexOf(s);
als1.remove(j);
arr[c]=arr[c]+1;
}
n=n-arr[c];
c=c+1;
i=-1;
}
for(i=0;i<c;i++)
System.out.println(als2.get(i)+" has frequency "+arr[i]);
}
}
Comments
I was looking for such a method and eventually I came up with my own solution with a more functional approach to solve the problem.
public <T> Map<T, List<Integer>> findDuplicatesWithIndexes(List<T> elems) {
return IntStream.range(0, elems.size())
.boxed()
.collect(Collectors.groupingBy(elems::get))
.entrySet().stream()
.filter(e -> e.getValue().size() > 1)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
}
It returns a map consisting of duplicated elements as the keys and list of all indexes of repeating element as the value.
binarySearchonly works with sorted list.Collections.sort(yourList);the duplicates will be right next to each other, but I guess this is not applicable to your use case, is it?