12

I'm trying to parse a CSV file into a 2D array in C#. I'm having a very strange issue, here is my code:

string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
StreamReader sr = new StreamReader(filePath);
data = null; 
int Row = 0;
while (!sr.EndOfStream)
{
    string[] Line = sr.ReadLine().Split(',');
    if (Row == 0)
    {
        data = new string[Line.Length, Line.Length];
    }
    for (int column = 0; column < Line.Length; column++)
    {
        data[Row, column] = Line[column];
    }
    Row++;
    Console.WriteLine(Row);
}

My .csv file has 87 rows, but something strange happens during execution: the first 15 rows are read into the data array exactly as expected, but when it reaches the data[Row, column] = Line[column]; line for the 16th time, it seems to break out of the entire loop (without meeting the sr.EndOfStream condition) and reads no more data into the data array.

Can anyone explain what might be happening?

8
  • 2
    Is the number of columns the same for each row? And is the number of columns equal to the number of rows? You are initialising the total rows in your array to the number of columns in the first line of the CSV. Commented Sep 14, 2013 at 21:58
  • I think you have some special character in your CSV file. First remove the first 15 lines from the CSV and then upload it. If you get the same error, reply. Commented Sep 14, 2013 at 22:08
  • I removed the 16th line and the same thing happened; I removed several lines around the 15th line and the same thing happened. It seems it's only capable of reading 15 lines but gives no explanation why, and actually the code never leaves the while loop and doesn't execute anything afterwards. This is the strangest thing I've ever encountered in programming. Commented Sep 14, 2013 at 22:09
  • @MattR There are 87 rows, and not all rows have the same number of columns, but for the first 15 rows it just fills the empty spaces with blank values exactly as expected, so I don't think this is the issue. Commented Sep 14, 2013 at 22:11
  • Is the number of columns on line 16 larger than in the first row? Commented Sep 14, 2013 at 22:13

5 Answers

17

A shorter version of the code above:

var filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
var data = File.ReadLines(filePath).Select(x => x.Split(',')).ToArray();

Note the use of ReadLines instead of ReadAllLines, which is more efficient on larger files, as per the MSDN documentation:

When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.
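The lazy behaviour described in that quote only pays off when the sequence is not fully materialised. As a minimal sketch, assuming only the first few rows are needed, the following stops reading once enough lines have been taken. The class name ReadLinesSketch and the helper ReadFirstRows are hypothetical names for illustration, and the temporary file exists only to make the example runnable.

```csharp
using System;
using System.IO;
using System.Linq;

public static class ReadLinesSketch
{
    // Hypothetical helper: returns the first `count` rows, split on commas.
    // File.ReadLines is lazy, so lines after the first `count` are not requested.
    public static string[][] ReadFirstRows(string path, int count)
    {
        return File.ReadLines(path)
                   .Take(count)
                   .Select(line => line.Split(','))
                   .ToArray();
    }

    public static void Main()
    {
        // Write a small temporary file so the sketch runs on its own.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "a,b", "c,d", "e,f" });

        string[][] rows = ReadFirstRows(path, 2);
        Console.WriteLine(rows.Length); // 2

        File.Delete(path);
    }
}
```

When every line ends up in the array anyway, as in the one-liner above, the two methods behave much the same.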

3 Comments

But since we are converting to array immediately it does not make any difference here.
This solution has the same problem as Khan's. x.Split() will split cell data if it contains a comma.
Great solution, shortest lines I've found
12

Nothing in your code gets the number of lines out of your file in time to use it.

Line.Length represents the number of columns in your csv, but it looks like you're also trying to use it to specify the number of lines in your file.

This should get you your expected result:

string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
StreamReader sr = new StreamReader(filePath);
var lines = new List<string[]>();
int Row = 0;
while (!sr.EndOfStream)
{
    string[] Line = sr.ReadLine().Split(',');
    lines.Add(Line);
    Row++;
    Console.WriteLine(Row);
}

var data = lines.ToArray();

4 Comments

This is not a robust solution. Suppose you have this data: 1, 2, "This, you see, is text." The output from .Split() will contain 5 items instead of 3.
Depends on the data. If you know the data it's dealing with isn't going to contain commas, then it should be fine to do this.
I always change my CSV default to a pipe "|" delimited file format, for this reason.
This does not work for cells that contain a comma, like: data1,data2,data3,"data,with,comma",data5
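The quoted-comma problem raised in these comments could be handled with a small quote-aware splitter instead of Split(','). The following is only a sketch: the class CsvSketch and the helper SplitCsvLine are hypothetical names, and it assumes double-quoted fields with "" as an escaped quote and no line breaks inside a field.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CsvSketch
{
    // Hypothetical helper: splits one CSV line on commas that are outside double quotes.
    // Assumes "" inside a quoted field means a literal quote; does not handle
    // fields that span multiple lines.
    public static string[] SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // escaped quote inside a quoted field
                    i++;                 // skip the second quote of the pair
                }
                else
                {
                    inQuotes = !inQuotes; // opening or closing quote
                }
            }
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString()); // field boundary
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field on the line
        return fields.ToArray();
    }

    public static void Main()
    {
        string[] cells = SplitCsvLine("data1,data2,data3,\"data,with,comma\",data5");
        Console.WriteLine(cells.Length); // 5
        Console.WriteLine(cells[3]);     // data,with,comma
    }
}
```

For production data, a dedicated CSV parser is usually a safer choice than a hand-written splitter like this one.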
4

This is the same as posted by Pavel, but it ignores empty lines that may cause your program to crash.

var filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";

// Skip empty lines, then split each remaining line on '|' (use ',' for a comma-delimited file).
string[][] data = File.ReadLines(filePath).Where(line => line != "").Select(x => x.Split('|')).ToArray();


0

Without knowing the contents of your csv file, I would assume that the error is generated by this line:

if (Row == 0)
{
    data = new string[Line.Length, Line.Length];
}

By initialising the number of rows to the number of columns in the first line of the CSV, you are assuming that the number of rows is always equal to the number of columns.

As soon as the number of rows exceeds the number of columns in the first line of the CSV, you overrun the data array by attempting to access a row that isn't there, which throws an IndexOutOfRangeException.
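The overrun can be reproduced in isolation. This sketch assumes the first line of the file has 15 columns, which would match the 15 rows reported in the question but is not confirmed there; OverrunSketch and RowsStoredBeforeOverrun are hypothetical names.

```csharp
using System;

public static class OverrunSketch
{
    // Hypothetical repro: sizes a square array from the first line's column count,
    // as the question's code does, and reports how many rows fit before the overrun.
    public static int RowsStoredBeforeOverrun(int columnsInFirstLine, int totalRows)
    {
        var data = new string[columnsInFirstLine, columnsInFirstLine];
        int row = 0;
        try
        {
            for (; row < totalRows; row++)
            {
                data[row, 0] = "x"; // throws once row reaches columnsInFirstLine
            }
        }
        catch (IndexOutOfRangeException)
        {
            // row now holds the index of the first row that did not fit.
        }
        return row;
    }

    public static void Main()
    {
        // Assumed numbers: 15 columns in the first line, 87 rows in the file.
        Console.WriteLine(RowsStoredBeforeOverrun(15, 87)); // 15
    }
}
```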

You can simplify your code by changing your data to be a list to allow for dynamic adding of items:

string filePath = @"C:\Users\Matt\Desktop\Eve Spread Sheet\Auto-Manufacture.csv";
StreamReader sr = new StreamReader(filePath);
List<string[]> data = new List<string[]>();
int Row = 0;
while (!sr.EndOfStream)
{
    string[] Line = sr.ReadLine().Split(',');
    data.Add(Line);
    Row++;
    Console.WriteLine(Row);
}
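If a rectangular string[,] is still wanted afterwards, as in the question, the collected rows can be copied into one once the row count and the widest row are known. This is a sketch under that assumption: GridSketch and ToRectangular are hypothetical names, and padding short rows with empty strings is a choice made here, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GridSketch
{
    // Hypothetical helper: copies jagged rows into a rectangular array,
    // padding rows that are shorter than the widest one with empty strings.
    public static string[,] ToRectangular(IReadOnlyList<string[]> rows)
    {
        int rowCount = rows.Count;
        int columnCount = rowCount == 0 ? 0 : rows.Max(r => r.Length);
        var grid = new string[rowCount, columnCount];

        for (int r = 0; r < rowCount; r++)
        {
            for (int c = 0; c < columnCount; c++)
            {
                grid[r, c] = c < rows[r].Length ? rows[r][c] : "";
            }
        }
        return grid;
    }

    public static void Main()
    {
        var rows = new List<string[]>
        {
            "a,b,c".Split(','),
            "d".Split(','),
            "e,f".Split(',')
        };

        string[,] grid = ToRectangular(rows);
        Console.WriteLine(grid.GetLength(0) + " x " + grid.GetLength(1)); // 3 x 3
    }
}
```

Sizing the array only after all lines have been read avoids guessing the row count from the first line.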


0

With Open File Dialog

OpenFileDialog opn = new OpenFileDialog();

if (opn.ShowDialog() == DialogResult.OK)
{
    StreamReader sr = new StreamReader(opn.FileName);
    List<string[]> data = new List<string[]>();
    int Row = 0;

    while (!sr.EndOfStream)
    {
        string[] Line = sr.ReadLine().Split(',');
        data.Add(Line);
        Row++;
        Console.WriteLine(Row);
    }
}
