I'm using StringBuilder.append() to parse and process a file as follows:
StringBuilder csvString = new StringBuilder();
// collect every non-comment, non-blank line into one big in-memory buffer
bufferedReader.lines()
        .filter(line -> !line.startsWith(HASH) && !line.isEmpty())
        .map(String::trim)
        .forEachOrdered(line -> csvString.append(line).append(System.lineSeparator()));
// locate the section markers inside the buffered contents
int startOfFileTagIndex = csvString.indexOf(START_OF_FILE_TAG);
int startOfFieldsTagIndex = csvString.indexOf(START_OF_FIELDS_TAG, startOfFileTagIndex);
int endOfFieldsTagIndex = csvString.indexOf(END_OF_FIELDS_TAG, startOfFieldsTagIndex);
int startOfDataTagIndex = csvString.indexOf(START_OF_DATA_TAG, endOfFieldsTagIndex);
int endOfDataTagIndex = csvString.indexOf(END_OF_DATA_TAG, startOfDataTagIndex);
int endOfFileTagIndex = csvString.indexOf(END_OF_FILE_TAG, endOfDataTagIndex);
int timeStartedIndex = csvString.indexOf("TIMESTARTED", endOfFieldsTagIndex);
int dataRecordsIndex = csvString.indexOf("DATARECORDS", endOfDataTagIndex);
int timeFinishedIndex = csvString.indexOf("TIMEFINISHED", endOfDataTagIndex);
if (startOfFileTagIndex != 0 || startOfFieldsTagIndex == -1 || endOfFieldsTagIndex == -1
|| startOfDataTagIndex == -1 || endOfDataTagIndex == -1 || endOfFileTagIndex == -1) {
log.error("not in correct format");
throw new Exception("not in correct format.");
}
The problem is that when the file is quite large I get an OutOfMemoryError. Can you help me transform my code so that it avoids this error with large files?
Edit: As I understand it, loading a huge file into a StringBuilder is not a good idea and won't work. So the question is: which structure in Java is more appropriate for parsing my huge file, dropping some lines, finding the lines that mark each section, splitting the file into parts according to those markers (and where should I store those parts, which can themselves be huge?), and finally producing an output file?
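For example, would a line-by-line streaming approach along these lines be the right direction? This is only a rough sketch of what I have in mind, not working code for my real format: the tag constants and file names below are placeholders, and it assumes each tag sits at the start of its own line.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CsvSplitter {

    // placeholders for the real tag values in my file format
    private static final String HASH = "#";
    private static final String START_OF_FILE_TAG = "START-OF-FILE";
    private static final String START_OF_FIELDS_TAG = "START-OF-FIELDS";
    private static final String END_OF_FIELDS_TAG = "END-OF-FIELDS";
    private static final String START_OF_DATA_TAG = "START-OF-DATA";
    private static final String END_OF_DATA_TAG = "END-OF-DATA";
    private static final String END_OF_FILE_TAG = "END-OF-FILE";

    public static void main(String[] args) throws IOException {
        Path input = Paths.get("input.csv");      // placeholder input path
        Path fieldsOut = Paths.get("fields.part"); // placeholder output for the fields section
        Path dataOut = Paths.get("data.part");     // placeholder output for the data section

        boolean inFields = false;
        boolean inData = false;
        boolean sawFileTag = false;
        boolean sawEndOfFile = false;

        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter fieldsWriter = Files.newBufferedWriter(fieldsOut, StandardCharsets.UTF_8);
             BufferedWriter dataWriter = Files.newBufferedWriter(dataOut, StandardCharsets.UTF_8)) {

            String raw;
            while ((raw = reader.readLine()) != null) {
                String line = raw.trim();
                // skip comments and blank lines, as before
                if (line.isEmpty() || line.startsWith(HASH)) {
                    continue;
                }
                // toggle the current section when a tag line is seen
                if (line.startsWith(START_OF_FILE_TAG)) { sawFileTag = true; continue; }
                if (line.startsWith(START_OF_FIELDS_TAG)) { inFields = true; continue; }
                if (line.startsWith(END_OF_FIELDS_TAG)) { inFields = false; continue; }
                if (line.startsWith(START_OF_DATA_TAG)) { inData = true; continue; }
                if (line.startsWith(END_OF_DATA_TAG)) { inData = false; continue; }
                if (line.startsWith(END_OF_FILE_TAG)) { sawEndOfFile = true; continue; }

                // stream each section straight to its own file instead of buffering everything
                if (inFields) {
                    fieldsWriter.write(line);
                    fieldsWriter.newLine();
                } else if (inData) {
                    dataWriter.write(line);
                    dataWriter.newLine();
                }
            }
        }

        if (!sawFileTag || !sawEndOfFile) {
            throw new IOException("not in correct format.");
        }
    }
}

With something like this I would never hold more than one line in memory, but I'm not sure it is the idiomatic way to split the sections, or whether writing the parts to temporary files is the right place to store sections that can themselves be huge.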