Tokenize a Java source file
This example demonstrates how to tokenize a Java source file using StreamTokenizer.
In short, to tokenize a Java source file you should:
- Create a new FileReader.
- Create a new StreamTokenizer that parses the given FileReader.
- Use the parseNumbers() API method of StreamTokenizer, which specifies that numbers should be parsed by this tokenizer.
- Use the wordChars(int low, int hi) API method, which specifies that all characters c in the range low <= c <= hi are word constituents.
- Use the eolIsSignificant(boolean flag) method, which determines whether or not ends of line are treated as tokens.
- Use the ordinaryChars(int low, int hi) method, which specifies that all characters c in the range low <= c <= hi are “ordinary” in this tokenizer.
- Use the slashSlashComments(boolean flag) method, which determines whether or not the tokenizer recognizes C++-style comments.
- Use the slashStarComments(boolean flag) API method, which determines whether or not the tokenizer recognizes C-style comments.
- Iterate over the tokens of the tokenizer and, for every token, check whether it is a number, a word, a quoted string, the end of a line, or some other character.
- Close the FileReader using its close() API method.
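To see what the two comment flags from the list above do in isolation, here is a minimal sketch that tokenizes an in-memory string instead of a file. The class name CommentFlagsSketch and the helper words(...) are hypothetical names used only for this illustration. One assumption worth stating: a new StreamTokenizer treats '/' as a single-line comment character by default, so the sketch first makes '/' an ordinary character to keep the effect of the flags visible.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class CommentFlagsSketch {

    // Hypothetical helper: returns only the word tokens found in the source text.
    static List<String> words(String source, boolean skipComments) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(source));
        // '/' is a comment character by default; make it ordinary so that only
        // the two flags below decide how comments are handled.
        tokenizer.ordinaryChar('/');
        // Treat '_' as part of a word so that max_size is a single token.
        tokenizer.wordChars('_', '_');
        tokenizer.slashSlashComments(skipComments);
        tokenizer.slashStarComments(skipComments);

        List<String> result = new ArrayList<>();
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                result.add(tokenizer.sval);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String source = "int max_size; // upper bound\n/* unused */ int count;";
        // Flags on: the text inside both comments is skipped.
        System.out.println(words(source, true));   // [int, max_size, int, count]
        // Flags off: the words inside the comments show up as tokens.
        System.out.println(words(source, false));  // [int, max_size, upper, bound, unused, int, count]
    }
}
```

With both flags enabled, the words inside the comments never reach the caller, which is usually what you want when tokenizing source code.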
Let’s take a look at the code snippet that follows:
package com.javacodegeeks.snippets.core;

import java.io.FileReader;
import java.io.StreamTokenizer;

public class Main {

    public static void main(String[] argv) throws Exception {
        FileReader fileReader = new FileReader("C:/Users/nikos7/Desktop/Main.java");
        StreamTokenizer tokenizer = new StreamTokenizer(fileReader);

        // Configure the tokenizer for Java-like source text.
        tokenizer.parseNumbers();
        tokenizer.wordChars('_', '_');
        tokenizer.eolIsSignificant(true);
        tokenizer.ordinaryChars(0, ' ');
        tokenizer.slashSlashComments(true);
        tokenizer.slashStarComments(true);

        // Read the next token in the loop condition so that the first token
        // is handled too and the end-of-file marker is never processed.
        int tok;
        while ((tok = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
            switch (tok) {
                case StreamTokenizer.TT_NUMBER:
                    double n = tokenizer.nval;
                    System.out.println(n);
                    break;
                case StreamTokenizer.TT_WORD:
                    String word = tokenizer.sval;
                    System.out.println(word);
                    break;
                case '"':
                    // Body of a double-quoted string.
                    String doublequote = tokenizer.sval;
                    System.out.println(doublequote);
                    break;
                case '\'':
                    // Body of a single-quoted literal; the quote must be escaped here.
                    String singlequote = tokenizer.sval;
                    System.out.println(singlequote);
                    break;
                case StreamTokenizer.TT_EOL:
                    break;
                default:
                    // Any other single character, such as an operator or a bracket.
                    char character = (char) tokenizer.ttype;
                    System.out.println(character);
                    break;
            }
        }
        fileReader.close();
    }
}
Output:
ch
=
(
char
)
tokenizer.ttype
;
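If you want to try the tokenizer without depending on a file at a fixed path, the same settings can be applied to any Reader. The following is a minimal sketch under that assumption: the class name TokenizeSketch, the helper tokenize(...) and the NUMBER/WORD/QUOTED/CHAR labels are hypothetical and exist only for this illustration. It collects the tokens in a list instead of printing them, skips whitespace characters (which are ordinary characters under these settings), and uses try-with-resources as an alternative to calling close() explicitly.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class TokenizeSketch {

    // Hypothetical helper: uses the same tokenizer settings as the example
    // above and returns one labelled entry per token.
    static List<String> tokenize(Reader reader) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        tokenizer.parseNumbers();
        tokenizer.wordChars('_', '_');
        tokenizer.eolIsSignificant(true);
        tokenizer.ordinaryChars(0, ' ');
        tokenizer.slashSlashComments(true);
        tokenizer.slashStarComments(true);

        List<String> tokens = new ArrayList<>();
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            switch (tokenizer.ttype) {
                case StreamTokenizer.TT_NUMBER:
                    tokens.add("NUMBER " + tokenizer.nval);
                    break;
                case StreamTokenizer.TT_WORD:
                    tokens.add("WORD " + tokenizer.sval);
                    break;
                case '"':
                case '\'':
                    tokens.add("QUOTED " + tokenizer.sval);
                    break;
                case StreamTokenizer.TT_EOL:
                    break;
                default:
                    // Spaces and tabs arrive here as ordinary characters; skip them.
                    if (!Character.isWhitespace(tokenizer.ttype)) {
                        tokens.add("CHAR " + (char) tokenizer.ttype);
                    }
                    break;
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // try-with-resources closes the reader even if tokenizing fails.
        try (Reader reader = new StringReader("int x = 42; // answer")) {
            for (String token : tokenize(reader)) {
                System.out.println(token);
            }
        }
    }
}
```

To tokenize a real file, you could pass a FileReader to tokenize(...) in place of the StringReader.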
This was an example of how to tokenize a Java source file with StreamTokenizer.
