
I am reading a newline-separated text file into a String array.

Since I know the delimiter will always be \n, I should be able to append each line to a StringBuilder, then split the result using the delimiter.

Simply put, which method should I use and why?

Method A:
1. Create an ArrayList (or another more suited Collection)
2. Append each row to the list
3. Return list.toArray()
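
A minimal sketch of Method A (test.txt is a placeholder file name; the usual java.io and java.util imports are assumed):

List<String> rows = new ArrayList<>();
try (BufferedReader reader = new BufferedReader(new FileReader("test.txt")))
{
    String row;
    while ((row = reader.readLine()) != null)
    {
        rows.add(row); // step 2: append each row to the list
    }
}
String[] result = rows.toArray(new String[0]); // step 3: copy the list into a String array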

Method B:
1. Create a StringBuilder
2. Append each row to the builder
3. Return builder.toString().split("\n") (StringBuilder itself has no split method)
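
And a minimal sketch of Method B (same assumptions; note the delimiter has to be re-appended, since readLine strips it):

StringBuilder builder = new StringBuilder();
try (BufferedReader reader = new BufferedReader(new FileReader("test.txt")))
{
    String row;
    while ((row = reader.readLine()) != null)
    {
        builder.append(row).append('\n'); // re-insert the \n that readLine strips
    }
}
String[] result = builder.toString().split("\n");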

  • You'd better return a List<String> from the method rather than an array. But whatever you use, most of the time will be spent in reading the file, not in processing the String/List. Reading a file from a disk is very slow compared to transforming things in memory.

3 Answers


Not sure it makes much of a difference; the toArray method is most likely faster, as there is less String processing. The split would have to process the entire data with a regex, while the toArray method would just need to loop over the Collection.

If you amend your Method B so that you don't read the file line by line into the StringBuilder, but instead use Files.readAllBytes to get the entire file as a String and then split it, you will probably find the performance more or less identical.
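
For example (a sketch; readAllBytes returns a byte[], so the String has to be constructed from it, here assuming a UTF-8 encoded file and the java.nio.file imports):

byte[] bytes = Files.readAllBytes(Paths.get("test.txt")); // read the whole file in one call
String content = new String(bytes, StandardCharsets.UTF_8);
String[] lines = content.split("\n");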

If you have Java 8:

final Path path = Paths.get("test.txt"); // placeholder path
final String[] lines = Files.lines(path).toArray(String[]::new);

Note, your Method A can be improved by using Files.readAllLines:

final String[] lines = Files.readAllLines(path, StandardCharsets.UTF_8)
    .toArray(new String[0]);



There's probably very little difference. I don't think you're working with very large files anyway, so it shouldn't matter. You can profile the different approaches if you're really interested, but the choice you make is quite irrelevant here.

I would go with the ArrayList way if it was my choice, since concatenation just for splitting afterwards seems redundant.

8 Comments

Relatively large (>100,000 lines). I agree the ArrayList is prettier and looks a bit more robust; just curious regarding performance. :)
Well, a LinkedList would save you the underlying array resizing then. 100,000 lines is not even relatively large though.
ArrayList has the same array limit, but it's very large (Integer.MAX_VALUE); if your list is really that long, maybe a LinkedList is what you need.
@MarcoAcierno The issue is that ArrayList will resize its internal array as it grows (unless you initialize it with a large enough size at creation). I already recommended LinkedList, since it doesn't require the resizing (the max length of an array isn't really an issue here).
I benchmarked the process using ArrayList and LinkedList and, surprisingly, LinkedList took an average of 1.5-2x the time of ArrayList (did multiple runs).
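
(For reference, the resizing mentioned above can be avoided by giving the ArrayList an initial capacity; a one-line sketch, assuming a rough estimate of the line count:)

List<String> lines = new ArrayList<>(100_000); // initial capacity; no resizing until the estimate is exceeded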

Wait, if you read a file in this format:

A
B
C
D
E
F

Why not just read it and save the lines at the same time?

Something like:

List<String> lines = new ArrayList<String>();

// try-with-resources closes the reader automatically
try (BufferedReader bufferedReader = new BufferedReader(new FileReader("test.txt")))
{
    for (String line; (line = bufferedReader.readLine()) != null; )
    {
        lines.add(line);
    }
}

System.out.println(lines);

And you will have [A, B, C, D, E, F] in your lines List.

2 Comments

The result has to be a String array. Its size cannot be known until all elements are processed, thus elements cannot be added to it while processing.
lines.toArray(); --- About "not be added while processing" I need more info.
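
(For reference, the no-arg toArray in the comment above returns an Object[]; to get a String[], pass a typed array:)

String[] array = lines.toArray(new String[0]); // typed overload returns String[]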
