I have looked through Stack Overflow, but none of the examples I have tried work in my case.
I want to count how many times a word occurs in an array. The array is built by splitting an input String, such as "Henry and Harry went out", into substrings of a given length (2 in the following example), and I then want to count how often each distinct substring occurs. Please forgive me if my style is bad; it's my first project. The expected counts look like this:
He = 1
en = 2
nr = 1
ry = 2
a = 1
an = 1
etc.

Here is my code for the constructor:
    public NgramAnalyser(int n, String inp)
    {
        boolean processed = false;
        ngram = new HashMap<>(); // used to store the ngram strings and count
        alphabetSize = 0;
        ngramSize = n;
        ArrayList<String> tempList = new ArrayList<String>();
        System.out.println("inp length: " + inp.length());
        System.out.println();
        int finalIndex = 0;
        // collect every ngram that fits entirely inside the input
        for (int i = 0; i < inp.length() - (ngramSize - 1); i++)
        {
            tempList.add(inp.substring(i, i + ngramSize));
            alphabetSize++;
            // if i (the index) has reached the last valid start position
            // (before substring would throw), remember it and stop
            if (i == (inp.length() - ngramSize))
            {
                processed = true;
                finalIndex = i;
                break;
            }
        }
        if (processed)
        {
            // add the ngrams that wrap around from the end of the input to its start
            for (int i = 1; i < ngramSize; i++)
            {
                String startString = inp.substring(finalIndex + i, inp.length());
                String endString = inp.substring(0, i);
                tempList.add(startString + endString);
            }
        }
        for (String item : tempList)
        {
            System.out.println(item);
        }
    }
    // code for counting the ngrams and sorting them
Where does ngramSize come from?

Have a look at the StringUtils class of Apache Commons Lang. The class has many useful methods for this. You can use split(String, char) to split the string and then use countMatches(String, String) to find how many times a string occurs.
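If you would rather not pull in a library, the counting step can also be done with a plain Map from the JDK. The following is only a minimal sketch, not the asker's actual class: `NgramCountSketch`, `ngrams` and `countOccurrences` are hypothetical names made up for illustration, and it assumes the n-grams should wrap around the end of the input, as the constructor in the question appears to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the code in the question.
class NgramCountSketch {

    // Builds every n-gram of `text`, wrapping around the end of the string
    // (assumption: this mirrors what the constructor in the question collects).
    static List<String> ngrams(String text, int n) {
        List<String> result = new ArrayList<>();
        int len = text.length();
        if (n <= 0 || len < n) {
            return result; // input too short: nothing to extract
        }
        for (int i = 0; i < len; i++) {
            StringBuilder sb = new StringBuilder(n);
            for (int j = 0; j < n; j++) {
                sb.append(text.charAt((i + j) % len)); // modulo gives the wrap-around
            }
            result.add(sb.toString());
        }
        return result;
    }

    // Counts how often each string occurs. A TreeMap keeps the keys sorted,
    // so iterating over the result yields the n-grams in natural String order.
    static Map<String, Integer> countOccurrences(List<String> items) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countOccurrences(ngrams("Henry and Harry went out", 2));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println("\"" + e.getKey() + "\" = " + e.getValue());
        }
    }
}
```

In the constructor from the question, the same `merge` call could be applied to each item of `tempList` to fill the `ngram` map. If you need the result ordered by count rather than by key, you would have to sort the entries separately.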