3

Hey, I need to read a text file in Java. The problem is that the file has the following format:

Id time1 time2 time3 ...
ID2 time1 time2 time3 ...

I need to be able to first read all the IDs, then all the time1 values, then all the time2 values, and so on. Can anyone give me some hints on how to do this in Java? Efficiency is important here, since this needs to be done thousands of times; that is my problem. Thanks in advance for your help.

  • Please see Google for approximately 1 billion examples of how to read in a file line by line in Java. Or search SO. Commented Apr 19, 2011 at 12:10
  • The problem is efficiency. I have already naively implemented this by reading line by line and skipping to the specified time column, but it is taking quite long. Commented Apr 19, 2011 at 12:12
  • @Richard I don't think his question had to do with reading in a text file, but reading a text file of that particular structure efficiently... Commented Apr 19, 2011 at 12:12
  • @tzer: you can only read in a file as fast as your disk-access will allow. AFAIK you can't really do better than BufferedReader or whatever. Commented Apr 19, 2011 at 12:18

5 Answers

2

The simplest way would be to read the whole file line by line once, parsing the lines as you go - then you can very easily get "all the IDs" followed by "all the first times" etc.
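
A minimal sketch of this single-pass approach, assuming whitespace-separated fields and the same number of fields on every row. The class name ColumnReader and method name readColumns are made up for illustration; they are not from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads the file once and groups the fields by column.
class ColumnReader {

    // columns.get(0) holds all IDs, columns.get(1) all time1 values, and so on.
    static List<List<String>> readColumns(Path file) throws IOException {
        List<List<String>> columns = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = trimmed.split("\\s+");
                for (int i = 0; i < fields.length; i++) {
                    if (columns.size() <= i) {
                        columns.add(new ArrayList<>()); // first time this column is seen
                    }
                    columns.get(i).add(fields[i]);
                }
            }
        }
        return columns;
    }
}
```

After one call, iterating over columns.get(0), then columns.get(1), and so on gives the order asked for in the question without touching the disk again. If rows can have different lengths, the columns no longer line up row by row, so that case would need extra handling.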

If the file is too large to do that, you may want to consider writing a tool to change the file structure: open several files for writing (one per column), then read an input line, write each field to its column's file, move on to the next line, and so on. You can do this once and then read each file as and when you need it.
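
The one-file-per-column idea could be sketched roughly as follows. The names ColumnSplitter, split, and the col<i>.txt file naming are assumptions chosen for the example, and fields are assumed to be whitespace-separated.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-off conversion tool: one output file per input column.
class ColumnSplitter {

    // Writes column i of the input to outDir/col<i>.txt, one value per line.
    // Returns the number of column files created.
    static int split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<BufferedWriter> writers = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = trimmed.split("\\s+");
                for (int i = 0; i < fields.length; i++) {
                    if (writers.size() <= i) {
                        // Open the writer for a column the first time it appears.
                        writers.add(Files.newBufferedWriter(
                                outDir.resolve("col" + i + ".txt"), StandardCharsets.UTF_8));
                    }
                    writers.get(i).write(fields[i]);
                    writers.get(i).newLine();
                }
            }
        } finally {
            for (BufferedWriter w : writers) {
                w.close();
            }
        }
        return writers.size();
    }
}
```

Only one input line is held in memory at a time, but one file handle stays open per column, so a file with a very large number of columns might run into the operating system's open-file limit.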

2

Transpose the file: IDs on line 1, time1 values on line 2, and so on. Of course, this only pays off if the transposition can be done once and many reads of the transposed file are expected afterwards.
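
A rough sketch of such a transposition, assuming the input fits in memory during the one-off conversion and that fields are whitespace-separated. Transposer and transpose are illustrative names only.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-off tool: rows of the input become columns of the output.
class Transposer {

    static void transpose(Path input, Path output) throws IOException {
        List<String[]> rows = new ArrayList<>();
        int width = 0; // number of fields in the longest row
        for (String line : Files.readAllLines(input, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split("\\s+");
            rows.add(fields);
            width = Math.max(width, fields.length);
        }
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (int col = 0; col < width; col++) {
                StringBuilder sb = new StringBuilder();
                for (String[] row : rows) {
                    if (col < row.length) { // shorter rows simply contribute nothing here
                        if (sb.length() > 0) {
                            sb.append(' ');
                        }
                        sb.append(row[col]);
                    }
                }
                out.write(sb.toString());
                out.newLine();
            }
        }
    }
}
```

Once the file is transposed, reading "all the IDs" is a single readLine() call, and each following line yields the next time column.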

2

One solution is to parse the file once and build an index of the position of each ID in the file. Then you can reposition the reading 'cursor' to any ID as needed.

EDIT

This solution is practical if the whole file content cannot be loaded into memory. To limit the number of physical reads, an LRU cache keeping the most recently read or used ID-times combinations could improve performance.
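
One possible shape for the index plus LRU cache is sketched below. It assumes a single-byte-compatible encoding such as ASCII (RandomAccessFile.readLine does not decode multi-byte characters) and whitespace-separated fields. IndexedTimes and its members are hypothetical names, and the LRU cache is a LinkedHashMap in access order, which is one common way to build one rather than the only way.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical index: maps each ID to the byte offset of its line.
class IndexedTimes implements AutoCloseable {
    private final RandomAccessFile file;
    private final Map<String, Long> offsets = new HashMap<>();
    private final Map<String, String[]> cache;

    IndexedTimes(String path, final int cacheSize) throws IOException {
        file = new RandomAccessFile(path, "r");
        // Access-ordered LinkedHashMap that evicts the least recently used row.
        cache = new LinkedHashMap<String, String[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String[]> eldest) {
                return size() > cacheSize;
            }
        };
        // Single pass over the file to record where each line starts.
        long start = file.getFilePointer();
        String line;
        while ((line = file.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                offsets.put(trimmed.split("\\s+", 2)[0], start);
            }
            start = file.getFilePointer();
        }
    }

    // Returns the fields of the row for this ID (ID first), or null if the ID is unknown.
    String[] row(String id) throws IOException {
        String[] cached = cache.get(id);
        if (cached != null) {
            return cached; // served from memory, no disk access
        }
        Long offset = offsets.get(id);
        if (offset == null) {
            return null;
        }
        file.seek(offset); // jump straight to the start of the line
        String[] fields = file.readLine().trim().split("\\s+");
        cache.put(id, fields);
        return fields;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Note that RandomAccessFile.readLine is unbuffered, so building the index is slow on large files; the cost is paid once, and later lookups need only a seek plus one line read.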

1

We can't read files column by column. Read the whole file into memory (with a FileReader from java.io, or with java.nio) and parse the content (String#split on each line) into a data structure like

Map<String, List<String>>

where the map's key is the ID (ID, ID2, ...) and the value is a simple list that contains all the time values.
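
As a sketch of that structure, assuming whitespace-separated fields and a file small enough to load at once: TimesById, load, and column are names invented for this example, and a LinkedHashMap is used so the IDs keep their file order.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical loader for the Map<String, List<String>> structure.
class TimesById {

    // Key: the ID in the first field. Value: the remaining fields (time1, time2, ...).
    static Map<String, List<String>> load(Path file) throws IOException {
        Map<String, List<String>> times = new LinkedHashMap<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String[] fields = line.trim().split("\\s+");
            if (fields[0].isEmpty()) {
                continue; // blank line
            }
            times.put(fields[0],
                    new ArrayList<>(Arrays.asList(fields).subList(1, fields.length)));
        }
        return times;
    }

    // Collects the k-th time value (1-based) of every ID, in file order.
    static List<String> column(Map<String, List<String>> times, int k) {
        List<String> result = new ArrayList<>();
        for (List<String> row : times.values()) {
            result.add(row.get(k - 1)); // throws if a row has fewer than k times
        }
        return result;
    }
}
```

With this in place, times.keySet() gives all the IDs and column(times, 1), column(times, 2), and so on give each time column in turn.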

0

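If you're on a Linux/UNIX platform, you could do some preprocessing with the cut command. For example, assuming the fields are separated by single spaces, cut -d ' ' -f 1 extracts the first column (the IDs).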
