I need to write a program in C that parses a large CSV file (approx. 2000*2000) and stores it in a double[][] array. I wrote a program that seems to work for small files (I checked with a 4*4 CSV file), but for large files it gives incorrect results: the number of rows and columns is wrong, and the program crashes after that.

This is the code:

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

int main (void)
{
    int rowMaxIndex,columnMaxIndex;
    double **mat;
    double *matc;
    int i,j,idx,len;
    char part[5000];
    char *token;
    char *temp;
    char *delim = ",";
    double var;
{
    FILE *fp;
    fp = fopen("X1_CR2_new1.csv","r");

    if(fp == NULL)
    {
        perror("Error while opening the file.\n");
        exit(EXIT_FAILURE);
    }

    // count loop
    rowMaxIndex = 0;
    columnMaxIndex = 0;
    while(fgets(part,5000,fp) != NULL){
        token = NULL;
        token=strtok(part,delim);
        while(token != NULL){
            if(rowMaxIndex==0)
            {
                columnMaxIndex++;
            }
            token=strtok(NULL,delim);
        }
        rowMaxIndex++;
    }
    fclose(fp);

    printf("Number of rows is %d, and Number of columns is %d", rowMaxIndex, columnMaxIndex);
    // allocate the matrix

    mat = malloc(rowMaxIndex * sizeof(double*));

    for (i = 0; i < rowMaxIndex; i++)
    {
        mat[i] = malloc(columnMaxIndex * sizeof(double));
    }
    // fp was already closed after the count loop, so it is not closed again here
}
    // rewind the file to the beginning. rewind(fp) wasn't working, so I closed and reopened the file.

{
    FILE *fp;
    fp = fopen("X1_CR2_new1.csv","r");

    if(fp == NULL)
    {
        perror("Error while opening the file.\n");
        exit(EXIT_FAILURE);
    }

    // read loop
    i = j = 0;
    while(fgets(part,5000,fp)!=NULL)
    {    
        token=strtok(part,delim);
        j=0;
        while (token != NULL){
              mat[i][j]=atof(token);
              //printf("\n %f", mat[i][j]);
              token=strtok(NULL,delim);
              j++;
          }
        i++;
    }
    printf("\n The value of mat[1][0] is %f", mat[1][0]);  //print some element to check
    for (i = 0; i < rowMaxIndex; i++)
        free(mat[i]);  // free each row before the array of row pointers
    free(mat);
    fclose(fp);
}    

    return 0;
}

1 Answer

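You say your data has 2000 columns, but your fgets() call reads at most 4999 characters at a time. Isn't there a chance a line of your data is wider than 4999 chars? If it is, fgets() returns the rest of that line on the next call, so your count loop sees it as an extra row. You should probably check that each line read in ends with a newline (except perhaps the last line in the file).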
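The newline check can be sketched roughly as follows. This is only an illustration, not the asker's code: `read_full_line` is a made-up helper name, and the return convention (1 for a complete line, 0 for a truncated one, -1 at end of file or on a read error) is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads one line with
 * fgets() and reports whether the whole line fit into buf.
 * Returns 1 for a complete line, 0 if the line was cut off because buf was
 * too small, and -1 at end of file or on a read error. */
int read_full_line(FILE *fp, char *buf, size_t size)
{
    assert(fp != NULL && buf != NULL && size > 1);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return 1;               /* newline present: the line is complete */

    /* No newline: acceptable only for an unterminated last line. */
    int c = fgetc(fp);
    if (c == EOF)
        return 1;
    ungetc(c, fp);              /* put the character back for the next read */
    return 0;                   /* more data follows: the line was truncated */
}
```

In the count loop, a return value of 0 would be the signal to stop and report that the buffer is too small, rather than silently counting the remainder of the line as a new row.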

As an aside, you don't need to reopen the file--just rewind() it.
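A minimal sketch of the two-pass pattern with rewind() might look like this. `count_lines_and_rewind` is a hypothetical name used only for illustration. Note that rewind() only works on a stream that is still open, so it has to be called before any fclose() on that stream.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original post): counts the lines in an
 * open stream, then rewinds it so a second pass can start at the beginning. */
long count_lines_and_rewind(FILE *fp)
{
    assert(fp != NULL);

    long lines = 0;
    int c;
    int last = '\n';            /* treat an empty file as having no lines */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;                /* final line without a trailing newline */

    rewind(fp);                 /* back to offset 0; also clears the EOF and error indicators */
    return lines;
}
```

After this call returns, the same `FILE *` can be handed straight to the read loop, with no second fopen() needed.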


2 Comments

Or, simpler, use POSIX getline(), which always returns a complete line unless it cannot allocate enough memory.
Thanks. I changed the buffer to 20000 chars and it works fine now. However, I tried rewind() but it did not go back to the start of the file; I don't know why. getline() is not included in my default package, so I did not use it.
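For readers on a POSIX system, the getline() suggestion from the first comment could look roughly like the sketch below. `read_csv_row` and its return convention are assumptions made for this example, and it needs a platform that provides getline(), which, as the second comment notes, is not the case everywhere.

```c
#define _POSIX_C_SOURCE 200809L  /* must come before the includes to expose getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper (not from the original post): reads the next row with
 * getline() and stores at most max_cols values in row.
 * Returns the number of fields stored, or -1 at end of file or on error. */
long read_csv_row(FILE *fp, double *row, size_t max_cols)
{
    assert(fp != NULL && row != NULL);

    char *line = NULL;          /* getline() allocates and grows this buffer */
    size_t cap = 0;
    ssize_t n = getline(&line, &cap, fp);
    if (n == -1) {
        free(line);             /* the buffer must be freed even on failure */
        return -1;
    }

    long count = 0;
    /* "\r\n" is part of the delimiter set so the line ending is never a field. */
    for (char *tok = strtok(line, ",\r\n");
         tok != NULL && (size_t)count < max_cols;
         tok = strtok(NULL, ",\r\n"))
        row[count++] = strtod(tok, NULL);

    free(line);
    return count;
}
```

Because getline() sizes the buffer itself, no fixed line-length limit such as 5000 or 20000 has to be guessed. Allocating a fresh buffer for every row keeps the sketch short; a real reader would more likely keep `line` and `cap` across calls and free the buffer once at the end.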
