I've got some text files I need to extract data from. The file itself contains around a hundred lines and the interesting part for me is:

AA====== test==== ====================================================/
AA    normal         low          max          max2         max3      /
AD     .45000E+01   .22490E+01   .77550E+01   .90000E+01   .47330E+00 /

Say I need to extract the double values under "normal", "low" and "max". Is there any efficient and not-too-error-prone solution other than regexing the hell out of the text file?

3
  • What's wrong with split()? Commented May 30, 2014 at 18:19
  • I need to know the context; there are many other lines in the file. Commented May 30, 2014 at 18:21
  • I would look into grammars :) Commented May 30, 2014 at 19:14

3 Answers

2

If you really want to avoid regexes, and assuming you'll always have this same basic format, you could do something like:

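// needs: import java.io.File; import java.util.HashMap; import java.util.Map; import java.util.Scanner;
// (the enclosing method must declare or handle FileNotFoundException)
Map<String, Double> map = new HashMap<>();
try (Scanner scan = new Scanner(new File(filePath))) { //or your preferred input mechanism; new Scanner(String) would scan the path text itself
    String top = scan.nextLine(); //remove the top line (outside the assert, so it still runs when assertions are disabled)
    assert top.startsWith("AA===="); //ensure it is the top line

    while (scan.hasNextLine()) {
        String[] headings = scan.nextLine().trim().split("\\s+"); //("\t") can be used if you're sure the delimiters will always be tabs
        if (!scan.hasNextLine()) break; //heading line without a matching value line
        String[] vals = scan.nextLine().trim().split("\\s+");
        assert headings[0].equals("AA"); //ensure this is a heading line
        assert vals[0].equals("AD"); //ensure this is a value line
        for (int i = 1; i < Math.min(headings.length, vals.length); i++) { //start with 1 to skip the "AA"/"AD" tags
            if (vals[i].equals("/")) continue; //skip the trailing "/" marker, which is not a number
            map.put(headings[i], Double.parseDouble(vals[i]));
        }
    }
}
//to make sure a certain value is contained in the map:
assert map.containsKey("normal");
//use it:
double normalValue = map.get("normal");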

Code is untested as I don't have access to an IDE at the moment. Also, I obviously don't know what's variable and what will remain constant here (read: the "AD", "AA", etc.), but hopefully you get the gist and can modify as needed.
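Since the question mentions that the block sits among many other lines, here is a minimal self-contained sketch of the same idea that first looks for the separator line and then pairs the heading line with the value line. It reads through a BufferedReader rather than a Scanner. The class name BlockParser, the method name parseBlock, and the assumption that the block always starts with a line beginning "AA======" are illustrative choices, not something the file format guarantees.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class BlockParser {
    /** Skips lines until the "AA======" separator, then maps each heading to the value below it. */
    static Map<String, Double> parseBlock(BufferedReader in) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("AA======")) {
                    break; // found the (assumed) start of the block
                }
            }
            if (line == null) {
                throw new IllegalArgumentException("separator line not found");
            }
            String headerLine = in.readLine();
            String valueLine = in.readLine();
            if (headerLine == null || valueLine == null) {
                throw new IllegalArgumentException("block is truncated");
            }
            String[] headings = headerLine.trim().split("\\s+");
            String[] vals = valueLine.trim().split("\\s+");
            Map<String, Double> result = new LinkedHashMap<>();
            int n = Math.min(headings.length, vals.length);
            for (int i = 1; i < n; i++) { // index 0 holds the "AA"/"AD" tags
                if (vals[i].equals("/")) {
                    continue; // trailing end-of-line marker, not a number
                }
                result.put(headings[i], Double.parseDouble(vals[i]));
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "some unrelated line",
            "AA====== test==== ====================================================/",
            "AA    normal         low          max          max2         max3      /",
            "AD     .45000E+01   .22490E+01   .77550E+01   .90000E+01   .47330E+00 /",
            "another unrelated line");
        Map<String, Double> values = parseBlock(new BufferedReader(new StringReader(sample)));
        System.out.println(values.get("normal") + " " + values.get("low") + " " + values.get("max"));
    }
}
```

For a real file, the StringReader would be replaced by a FileReader (or Files.newBufferedReader). Throwing instead of asserting keeps the checks active even when the JVM runs without -ea.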

3 Comments

Better to use scan.nextLine().split("\\s+") if you are not sure which whitespace character is being used.
Rather than nextLine().split(), you may find it cleaner to use Scanner.findInLine() and then Scanner.nextLine().
@KelvinNg I agree, but I didn't want the solution to be dependent on Scanner methods. .nextLine().split() can be easily adapted to be used with any input mechanism (for instance, a BufferedReader can be used by changing it to readLine().split()), whereas Scanner.findInLine() is rather unique.
0

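If each line always has this exact form, you can use String.split(). Note that split() takes a regular expression, so the dot must be escaped; an unescaped "." matches every character and leaves you with an empty array.

String line = "AD     .45000E+01   .22490E+01   .77550E+01   .90000E+01   .47330E+00 /"; // one value line from the file
String[] cols = line.split("\\.");

// cols[0] is the "AD" tag, so the values start at index 1
String normal = "." + cols[1].trim();
String low = "." + cols[2].trim();
String max = "." + cols[3].trim();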
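As a variation on the same idea, splitting the value line on whitespace avoids having to re-attach the leading dot. The sketch below assumes the first token is a tag such as "AD" and that a lone "/" closes the line; the class name ValueLineSplit and the method name parseValues are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ValueLineSplit {
    /** Returns the numeric fields of a value line, skipping the leading tag and the trailing "/". */
    static List<Double> parseValues(String line) {
        String[] cols = line.trim().split("\\s+");
        List<Double> values = new ArrayList<>();
        for (int i = 1; i < cols.length; i++) { // cols[0] is the tag, e.g. "AD"
            if (!cols[i].equals("/")) {
                // Double.parseDouble accepts forms such as ".45000E+01"
                values.add(Double.parseDouble(cols[i]));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "AD     .45000E+01   .22490E+01   .77550E+01   .90000E+01   .47330E+00 /";
        List<Double> values = parseValues(line);
        // normal, low and max are the first three columns in the sample
        System.out.println(values.get(0) + " " + values.get(1) + " " + values.get(2));
    }
}
```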

0

If you know the index at which each value starts, you can just take substrings of the row. (String.split() technically takes a regex, so this avoids regexes entirely.)

i.e.

 String normal = line.substring(x, y).trim();
 String low = line.substring(z, w).trim();

etc.
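To make the substring approach concrete, here is a small sketch for a fixed-width layout. The offsets are assumptions (a 7-character tag column followed by 13-character value columns) and would need to be checked against the real file; the class name FixedWidthFields and the method name field are hypothetical.

```java
class FixedWidthFields {
    // Assumed layout, to be verified against the actual file:
    static final int FIRST_FIELD = 7;   // width of the leading tag column ("AD" plus padding)
    static final int FIELD_WIDTH = 13;  // width of each value column including padding

    /** Parses the value in the given zero-based column using fixed offsets only. */
    static double field(String line, int index) {
        int start = FIRST_FIELD + index * FIELD_WIDTH;
        int end = Math.min(start + FIELD_WIDTH, line.length());
        return Double.parseDouble(line.substring(start, end).trim());
    }

    public static void main(String[] args) {
        // Build a line that matches the assumed layout exactly.
        String line = String.format("%-7s%-13s%-13s%-13s", "AD", ".45000E+01", ".22490E+01", ".77550E+01");
        double normal = field(line, 0);
        double low = field(line, 1);
        double max = field(line, 2);
        System.out.println(normal + " " + low + " " + max);
    }
}
```

This only works while the columns stay aligned. A column that also contains the trailing "/" would need that character stripped before parsing.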
