
I'm having a problem parsing a CSV file. I connect to the file using the following:

string connString = "Provider=Microsoft.Jet.OLEDB.4.0;"
    + "Data Source=\"" + dir + "\\\";"
    + "Extended Properties=\"text;HDR=No;FMT=Delimited\"";

//create the database query
string query = "SELECT * FROM [" + file + "]";

//create a DataTable to hold the query results
DataTable dTable = new DataTable();

//create an OleDbDataAdapter to execute the query
OleDbDataAdapter dAdapter = new OleDbDataAdapter(query, connString);

//fill the DataTable
dAdapter.Fill(dTable);

return dTable;

For some reason, the first row reads fine as a header (i.e. with HDR=Yes the values are displayed). The problem is that with HDR=No, nothing after the first 'cell' in that row is displayed. However, I need HDR=No because I'll be writing the CSV back out later.

As a quick aside, the rest of that row only has a value in every other column, and each of those values contains a period. Any help?

Cheers.

EDIT: Here are a fake few lines similar to the CSV:

//Problem row->>    
File:,GSK1.D,,GSK2.D,,GSK3.D,
//The following rows, however, are fine:
 / 69,120.3,16.37%,128.9,7.16%,188.92,13.97%
D / 71,48.57,75.50%,32.15,26.65%,58.35,71.43%
T / 89,35.87,45.84%,50.01,28.87%,15.38,43.30%

EDIT: When I put any value into the "blank spaces" above, it is parsed, but no matter what I put into the problematic cells (e.g. GSK1.D) they won't parse - unless it is a number! Is there any chance it is automatically converting the column to a "float" type? And how can I stop it doing this?

  • Any chance you can paste in the first few lines of the CSV? Commented Nov 24, 2009 at 14:07
    @David: Then paste in the first few lines of a dummy file. For Pete's sake, if you want help, cooperate! Commented Nov 24, 2009 at 14:16
  • Sounds like there is possibly some formatting issue with the file itself? Commented Nov 24, 2009 at 14:16
  • And while you're at it, I'd highly recommend you remove that try/catch block. You're not handling the error, you're just masking any potential problems. Commented Nov 24, 2009 at 14:18
  • @Ken I was in the middle of pasting a fake one when you made that comment, see above. Commented Nov 24, 2009 at 14:19

3 Answers


At CodeProject there is a CSV parsing library: http://www.codeproject.com/KB/database/CsvReader.aspx

It comes with an interesting article on how this stuff works. According to the author, it is faster than the OLE DB provider.
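A minimal sketch of loading a CSV into a DataTable with that library (assuming the LumenWorks namespace used in the article; the file name is a placeholder):

```csharp
using System.Data;
using System.IO;
using LumenWorks.Framework.IO.Csv; // namespace from the CodeProject CsvReader

// hasHeaders: false, matching HDR=No in the question
using (CsvReader csv = new CsvReader(new StreamReader("data.csv"), false))
{
    DataTable dTable = new DataTable();
    // CsvReader implements IDataReader, so Load works directly
    dTable.Load(csv);
}
```

This sidesteps the Jet text driver entirely, so no type inference is applied to the columns.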


2 Comments

Thanks for the link, but as mentioned in the comments in the previous answer I am most of the way there and would not like to rebuild now if possible.
You can easily create a DataTable from a reader, and I think there's also a table adapter so you can fill DataSets.

I have finished this, just to let anyone know who may have this problem in the future. It turns out nothing was being read in because ADO tries to infer a type for each column. If other values in that column are not of the inferred type, it discards them completely.

To counter this, you need to create a schema.ini file, like so:

//dTable is the DataTable filled earlier; one Coln entry is needed per column
StreamWriter writer = new StreamWriter(File.Create(dir + "\\schema.ini"));
writer.WriteLine("[" + fileToBeRead + "]");
writer.WriteLine("ColNameHeader = False");
writer.WriteLine("Format = CSVDelimited");
writer.WriteLine("CharacterSet=ANSI");

//declare every column as Char so type inference can't drop any values
for (int i = 1; i <= dTable.Columns.Count; i++)
{
    writer.WriteLine("Col" + i + "=Col" + i + "Name Char Width 20");
}

//writer.WriteLine("Col1=Col1Name Char Width 20");
//writer.WriteLine("Col2=Col2Name Char Width 20");
//etc.

writer.Close();

Thanks for everyone's suggestions!
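For a seven-column file like the sample in the question, the generated schema.ini ends up along these lines (the file name in the section header is an example; it must match the CSV file name exactly, and schema.ini must sit in the same directory as the CSV):

```ini
[data.csv]
ColNameHeader = False
Format = CSVDelimited
CharacterSet=ANSI
Col1=Col1Name Char Width 20
Col2=Col2Name Char Width 20
Col3=Col3Name Char Width 20
Col4=Col4Name Char Width 20
Col5=Col5Name Char Width 20
Col6=Col6Name Char Width 20
Col7=Col7Name Char Width 20
```

Because every column is declared as Char (equivalent to Text), the driver no longer guesses a numeric type and the "GSK1.D" values survive.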

1 Comment

I just found your solution after coming to the same solution on my own project just now. +1 for wishing I had this an hour ago.

I've seldom done well with database-style access to text files - the possibilities for "issues" with the file tend to exceed the theoretical time savings.

Personally, more often than not I've hand-crafted the code to do this - a lot, going back over 20+ years, so generic solutions have been thin on the ground. That said, if I had to process a .csv file now, the first thing I'd reach for would be FileHelpers or similar.
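A rough sketch of the FileHelpers approach (the record class and field names are illustrative, not from the question; the library maps delimited fields onto class fields by position):

```csharp
using FileHelpers;

// one field per CSV column, declared in file order
[DelimitedRecord(",")]
public class CsvRow
{
    public string Label;
    public string Value1;
    public string Percent1;
    // ...further fields for the remaining columns
}

class Program
{
    static void Main()
    {
        var engine = new FileHelperEngine<CsvRow>();
        CsvRow[] rows = engine.ReadFile("data.csv");
    }
}
```

Since every field is declared as string, mixed content like "GSK1.D" parses without any type-inference surprises.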

2 Comments

Murph, have you (or anyone else here) used FileHelpers? Would you recommend it for processing large (~10-50GB) delimited files that may possibly contain malformed records (e.g. quote mismatches, incorrect number of fields per record, etc)?
I have, but not for that sort of file size; most of the processing I've done is of much smaller files, and the malformed stuff I dealt with was in code that predates my finding FileHelpers, so I had my own code to deal with it.
