3

I have an Array of Strings and want to count the occurrences of any single String.

I have already sorted it. (It's a long Array and I wanted to get rid of the O(n²)-loop)

Here is my code. Obviously it throws an index-out-of-bounds exception. The reason is clear, but I don't know how to solve it.

for (int i = 0; i < patternsTest.length-1; i++) {
        int occ=1;
        String temp=patternsTest[i];
        while(temp.equals(patternsTest[i+1])){
            i++;
            occ++;
        }
    }
4
  • Why not use Map<String, Integer>? Commented May 24, 2013 at 23:24
  • I need the raw counts. I don't know if I would create a Map only for that... Commented May 24, 2013 at 23:28
  • Why wouldn't you? It'd be faster, and easier to modify in the future. Commented May 24, 2013 at 23:29
  • Do you want to avoid a Map for efficiency reasons? By sorting you are losing a lot of efficiency to start with, and using a map means you don't need to presort. But if you really don't want to use a map, just explain the reasons, or I guess just say so :) Commented May 24, 2013 at 23:30

6 Answers 6

11

This would be a good place for a HashMap: the key would be the word, and the value the number of times it occurs. For a HashMap, the containsKey and get methods are expected constant-time lookups, which are very fast.

Map<String,Integer> map = new HashMap<String,Integer>();
for (int i = 0; i < patternsTest.length; i++) {
    String word=patternsTest[i];
    if (!map.containsKey(word)){
        map.put(word,1);
    } else {
        map.put(word, map.get(word) +1);
    }
}

As a side benefit you don't even need to sort beforehand!


7 Comments

This is a good answer, however I would change the declaration to the interface type: Map<String,Integer> map = new HashMap<String,Integer>(); amazon.com/Effective-Java-2nd-Joshua-Bloch/dp/0321356683
But doesn't .containsKey() iterate over the whole Map? And the existing entry gets overwritten every time? That seems inefficient. Not saying that it is a bad approach.
containsKey is an O(1) lookup, which means it will not iterate over the entire map. It's more akin to indexing into an array than a full search. I'll update the answer with this as well.
How efficient do you need it to be? You can make this slightly faster with an extra line of code, if you would like me to put that version up instead of this one.
Nice question :P Everybody was looking for the fastest, right? ;) It's enough so far. I think a working version of my approach would have been slower, with all the sorting and so on. Thank you very much :)
4

You can use Java HashMap:

Map<String, Integer> occurrenceOfStrings = new HashMap<String, Integer>();

for(String str: patternsTest)
{
    Integer currentValue = occurrenceOfStrings.get(str);
    if(currentValue == null)
        occurrenceOfStrings.put(str, 1);
    else
        occurrenceOfStrings.put(str, currentValue + 1);
}
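
On Java 8 or later, the same get-then-put logic can be written more compactly with `Map.merge`. The following is only a sketch under that assumption; the class and method names (`MergeCounter`, `countOccurrences`) are made up for illustration and do not come from the answer above.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCounter {

    // Builds a map from each distinct string to the number of times it appears.
    static Map<String, Integer> countOccurrences(String[] values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : values) {
            // Stores 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] patternsTest = {"b", "a", "c", "a"}; // unsorted on purpose
        Map<String, Integer> counts = countOccurrences(patternsTest);
        System.out.println(counts.get("a"));             // 2
        System.out.println(counts.getOrDefault("z", 0)); // 0
    }
}
```

`getOrDefault` avoids a null check when asking for a string that never occurred.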


0

This does not throw an index-out-of-bounds exception:

String[] patternsTest = {"a", "b"};
for (int i = 0; i < patternsTest.length-1; i++) {
    int occ=1;
    String temp=patternsTest[i];
    while(temp.equals(patternsTest[i+1])){
        i++;
        occ++;
    }
}

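You can cause an index-out-of-bounds exception by making the array end with a repeated value. The inner loop then advances i to the last index and reads patternsTest[i+1] past the end of the array: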

String[] patternsTest = {"a", "a"};
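
One way to keep the sorted-array approach and avoid the exception is to check the index bound before reading `patternsTest[i + 1]`. The sketch below assumes the array is already sorted. The class and method names (`SortedRunCounter`, `countRuns`) and the `LinkedHashMap` used to collect the results are illustrative choices, not part of the original code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class SortedRunCounter {

    // Counts runs of equal neighbours; the input must already be sorted.
    static Map<String, Integer> countRuns(String[] sorted) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int i = 0;
        while (i < sorted.length) {
            String current = sorted[i];
            int occ = 1;
            // Check the bound first so sorted[i + 1] is never read past the end.
            while (i + 1 < sorted.length && current.equals(sorted[i + 1])) {
                i++;
                occ++;
            }
            counts.put(current, occ);
            i++; // step past the last element of this run
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] patternsTest = {"b", "a", "a", "c", "b", "a"};
        Arrays.sort(patternsTest);
        System.out.println(countRuns(patternsTest)); // {a=3, b=2, c=1}
    }
}
```

Because `&&` short-circuits, the `equals` call is skipped whenever `i + 1` is out of range, which covers both the `{"a", "a"}` case and an empty array.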


0

You could try a map and only one loop (this relies on the array being sorted and non-empty):

Map<String, Integer> occurrences = new HashMap<String, Integer>();
String currentString = patternsTest[0];
Integer count = 1;

for (int i = 1; i < patternsTest.length; i++) {
    if (currentString.equals(patternsTest[i])) {
        count++;
    } else {
        // The run of equal strings has ended, so store its count.
        occurrences.put(currentString, count);
        currentString = patternsTest[i];
        count = 1;
    }
}
// Store the count for the final run.
occurrences.put(currentString, count);


0

Guava Multiset solution (two lines of code):

Multiset<String> multiset = HashMultiset.create();
multiset.addAll(Arrays.asList(patternsTest));

// Then you could do...
multiset.count("hello"); // Returns the number of occurrences of "hello".

It works with both sorted and unsorted arrays, and the code is easy to maintain.


0

My solution is:

public int cantOccurences(String pattern, String[] values){
  int count = 0;

  for (String s : values) {
    // Count only the elements that equal the pattern exactly.
    if (s.equals(pattern)) {
      count++;
    }
  }
  return count;
}
