
I have a database import/export file that I need to edit (the file is 1.4 million lines). The editing itself is quite fast, even though I have to go through every line, check that two fields have the correct length, and pad them to the correct length if they don't.

The problem I am having is that the export into the CSV file is really inefficient.

I am currently using a method I wrote that builds a string, appending a new line for every object in the list (as seen in the code below). The problem is that it runs really slowly. I suspect it is slow because each object in the list has 14 properties that get concatenated into the string (the code is trimmed here so it isn't that long). Is there any way to make this faster?

public static string CsvExport(List&lt;DataLine&gt; inputList)
{
    string exportString = "";
    string delimiter = ";";

    foreach (DataLine line in inputList)
    {
        exportString += line.SCREENINGREQUESTUNIQUEID + delimiter + line.REQUESTTIMESTAMP + "\n";
    }

    return exportString;
}

2 Answers


First: use the right tool for the job - StringBuilder.

var exportString = lines
    .Aggregate(
        new StringBuilder(), 
        (builder, line) => 
        {
            builder.Append(line.SCREENINGREQUESTUNIQUEID);
            builder.Append(";"); 
            builder.AppendLine(line.REQUESTTIMESTAMP);

            return builder;
        })
    .ToString();

Notice that string concatenation with the + operator instantiates a new string every time + is used, while the previous string instance remains in memory until it is garbage collected (for 1.4M lines this becomes significant). With StringBuilder you avoid creating a new string per line: it grows a single internal buffer rather than copying all previous values on every append, as Thomas mentions in his answer.
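As a further refinement (not from the original answer): if you can estimate the final output size, pre-sizing the StringBuilder also avoids its internal buffer re-allocations as it grows. A minimal sketch, assuming a DataLine shaped like the one in the question:

```csharp
using System.Collections.Generic;
using System.Text;

public static class CsvExporter
{
    // Assumed shape, matching the two fields used in the question's code.
    public record DataLine(string SCREENINGREQUESTUNIQUEID, string REQUESTTIMESTAMP);

    public static string CsvExport(List<DataLine> inputList)
    {
        const string delimiter = ";";

        // Rough capacity guess: ~100 characters per line; tune for your data.
        var builder = new StringBuilder(inputList.Count * 100);

        foreach (var line in inputList)
        {
            builder.Append(line.SCREENINGREQUESTUNIQUEID)
                   .Append(delimiter)
                   .AppendLine(line.REQUESTTIMESTAMP);
        }

        return builder.ToString();
    }
}
```

Note that AppendLine emits Environment.NewLine ("\r\n" on Windows), whereas the question's code emitted "\n"; if the exact line ending matters for your importer, use Append("\n") instead.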

Second: if you are writing text to a file, you can do it in a more memory-efficient way with StreamWriter (though probably not faster).

using (var csv = new StreamWriter("pathToFile"))
{
    foreach (var line in lines)
    {
        csv.Write(line.SCREENINGREQUESTUNIQUEID);
        csv.Write(";"); 
        csv.WriteLine(line.REQUESTTIMESTAMP);
    }
}



You say that you need to export to a CSV file, but actually, you're exporting to a string in RAM.

Using += on a string like you do will create at least 1.4 million temporary strings, which will all need to be garbage collected. If each line is 100 characters long, you'll have a memory throughput of

200 + 400 + 600 + ...

with 1.4 million terms. That is 200 B * sum(1..1.4M), or roughly 196 TB. At DDR3-1333 bandwidth of 10.6 GB/s, that takes about 18,500 seconds, or roughly 5 hours.
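The back-of-the-envelope estimate above can be checked directly (the numbers are the answer's assumptions: 1.4M lines, ~100 chars ≈ 200 bytes per line in UTF-16, 10.6 GB/s memory bandwidth):

```csharp
using System;

class ConcatCostEstimate
{
    static void Main()
    {
        double lines = 1_400_000;
        double bytesPerLine = 200;  // 100 UTF-16 characters
        double bandwidth = 10.6e9;  // bytes per second

        // Each += copies the whole string built so far, so the total bytes
        // moved are ~ 200 * (1 + 2 + ... + N) = 200 * N * (N + 1) / 2.
        double totalBytes = bytesPerLine * lines * (lines + 1) / 2;
        double seconds = totalBytes / bandwidth;

        Console.WriteLine($"{totalBytes / 1e12:F0} TB, {seconds / 3600:F1} hours");
        // Prints: 196 TB, 5.1 hours
    }
}
```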

Write to the file using a StreamWriter. This will save you tons of RAM and be faster, since writing to disk can already happen while you're still computing.
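A minimal sketch of this suggestion (the DataLine shape and file path are assumptions, mirroring the question's fields):

```csharp
using System.Collections.Generic;
using System.IO;

public static class StreamingExport
{
    // Assumed shape, matching the two fields used in the question's code.
    public record DataLine(string SCREENINGREQUESTUNIQUEID, string REQUESTTIMESTAMP);

    public static void Export(IEnumerable<DataLine> lines, string path)
    {
        // StreamWriter buffers internally and flushes to disk as it goes,
        // so the whole 1.4M-line file never has to exist in memory at once.
        using var writer = new StreamWriter(path);
        foreach (var line in lines)
        {
            writer.Write(line.SCREENINGREQUESTUNIQUEID);
            writer.Write(';');
            writer.WriteLine(line.REQUESTTIMESTAMP);
        }
    }
}
```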
