I'm trying to extract only the lowercase alphanumeric characters from a document with this:
String delim = "abcdefghijklmnopqrstuvwxyz0123456789";
StringTokenizer strtok = new StringTokenizer(str, delim, true);
String newstr = "";
while (strtok.hasMoreTokens()) {
    newstr = newstr + strtok.nextToken();
}
return newstr;
Note that the document is already lowercase only. But for some reason all of the punctuation characters are still being returned, along with parentheses, slashes, etc.
I thought passing true as the third argument (returnDelims) when creating the tokenizer would make it return the delimiters as tokens?
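
For reference, here is a minimal self-contained sketch of the behavior, as far as I understand it. With returnDelims set to true, StringTokenizer returns each delimiter character as its own one-character token, but it still returns the runs of non-delimiter text between them as tokens too, so concatenating every token just rebuilds the original string. The class name TokenizerDemo and the three method names are made up for illustration; the filtering and regex variants are only possible ways to get the intended result, not necessarily the best ones.

```java
import java.util.StringTokenizer;

// Hypothetical demo class; names are illustrative only.
class TokenizerDemo {
    static final String DELIMS = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Same logic as the snippet in the question: every token is appended,
    // whether it is a delimiter or the text between delimiters.
    static String concatAllTokens(String str) {
        StringTokenizer strtok = new StringTokenizer(str, DELIMS, true);
        StringBuilder sb = new StringBuilder();
        while (strtok.hasMoreTokens()) {
            sb.append(strtok.nextToken());
        }
        return sb.toString();
    }

    // Variant that appends a token only if it is a single delimiter character.
    static String keepDelimitersOnly(String str) {
        StringTokenizer strtok = new StringTokenizer(str, DELIMS, true);
        StringBuilder sb = new StringBuilder();
        while (strtok.hasMoreTokens()) {
            String tok = strtok.nextToken();
            if (tok.length() == 1 && DELIMS.indexOf(tok.charAt(0)) >= 0) {
                sb.append(tok);
            }
        }
        return sb.toString();
    }

    // Alternative without a tokenizer: strip everything that is not a-z or 0-9.
    static String keepWithRegex(String str) {
        return str.replaceAll("[^a-z0-9]", "");
    }
}
```

On an input such as "foo (bar)/baz, 42!", the first method should hand back the input unchanged, which matches what I'm seeing, while the other two should drop the punctuation and whitespace.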