Read a text file into an array in plain C

Question

Is there a way to read a text file into a one dimensional array in plain C? Here's what I tried (I am writing hangman):

int main() {
    printf("Welcome to hangman!");

    char buffer[81];
    FILE *dictionary;
    int random_num;
    int i;
    char word_array[80368];

    srand ( time(NULL) );

    random_num = rand() % 80368 + 1;
    dictionary = fopen("dictionary.txt", "r");

    while (fgets(buffer, 80, dictionary) != NULL){
        printf(buffer); //just to make sure the code worked;
        for (i = 1; i < 80368; i++) {
            word_array[i] = *buffer;
        }
    }

    printf("%s, \n", word_array[random_num]);
    return 0;
}

What's wrong here?

word_array should be an array of char *, not char. Buffer needs to be dynamically allocated (your single buffer is overwritten on each fgets, and the word_array assignments will all be the same). You should look up some sample C text handling.
– mpez0, Commented Dec 27, 2011 at 18:18

BRPocock · Accepted Answer · 2011-12-27 18:50:21Z

Try changing a couple of things:

First, you're storing a single char. word_array[i] = *buffer; copies a single character (the first one on the line, i.e. buffer[0]) into each and every single-char slot in word_array.

Secondly, your array will hold 80K characters, not 80K words. If 80,368 is the number of words in your dictionary file, you can't fit them all in there using that loop.

I'm assuming you have 80,368 words in your dictionary file. That's about 400,000 fewer words than /usr/share/dict/words on my workstation, but it sounds like a reasonable size for hangman…
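To make the char-versus-string distinction concrete, here is a minimal sketch. The helper names `first_char_of` and `copy_line` are made up for illustration and are not part of the question's code. It also hints at why the final printf in the question misbehaves: passing a char where %s expects a char * is undefined behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: this is all that `word_array[i] = *buffer;` stores. */
char first_char_of(const char *buffer)
{
    return *buffer;   /* identical to buffer[0]: one char, not a string */
}

/* Hypothetical helper: copy the whole line into a slot of slot_size bytes.
 * snprintf truncates if needed and always writes the terminating '\0'. */
void copy_line(char *slot, size_t slot_size, const char *buffer)
{
    assert(slot_size > 0);
    snprintf(slot, slot_size, "%s", buffer);
}
```

A slot filled by `copy_line` can be printed with %s; a value returned by `first_char_of` can only be printed with %c.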

If you want a one-dimensional array intentionally, for some reason, you'll have to do one of three things:

1. Pretend you're on a mainframe, and use 80 chars for every word:

  char word_array[80368 * 80];   /* ~6.4 MB: make it static or malloc it rather than a local */

  memcpy (&(word_array[80 * i]), buffer, 80);

2. Create a parallel array with indices to the start of each line in a huge buffer:

   int last_char = 0;
   int word_start[80368];   /* offsets into word_array, not pointers */
   char word_array[80368 * 80];
   for ( … i++ ) {
       size_t len = strlen (buffer) + 1;   /* + 1 keeps the terminating \0 */
       memcpy (&word_array[last_char], buffer, len);
       word_start[i] = last_char;
       last_char += len;
   }

3. Switch to using an array of pointers to char, one word per slot:

  char* word_array[80368];

  for (int i = 0; i < 80368; i++) {
       if (NULL == fgets (buffer, 80, dictionary)) break;   /* stop at end of file */
       word_array[i] = strdup (buffer);
  }

I'd recommend the last of these, as otherwise you have to guess at the max size or waste a lot of RAM while reading. (If your average word length is around 4-5 chars, as in English, you're wasting about 75 bytes per word on average with fixed 80-char slots.)
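For readers who do want a genuinely one-dimensional char array, the second option above (one big buffer plus a table of start offsets) can be sketched as follows. The struct, the function names, and the tiny capacities are assumptions chosen for illustration; a real dictionary would need far larger limits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_WORDS = 8, MAX_CHARS = 64 };   /* tiny limits, for illustration only */

/* Hypothetical container: words packed back to back in `chars`,
 * with start[k] holding the offset of word k. */
struct packed_words {
    char   chars[MAX_CHARS];
    size_t start[MAX_WORDS];
    size_t count;   /* number of words stored */
    size_t used;    /* bytes of chars in use  */
};

/* Append one word including its terminating '\0'.
 * Returns 0 on success, -1 if either array is full. */
int packed_add(struct packed_words *p, const char *word)
{
    assert(p != NULL && word != NULL);
    size_t len = strlen(word) + 1;
    if (p->count >= MAX_WORDS || p->used + len > MAX_CHARS)
        return -1;
    memcpy(&p->chars[p->used], word, len);
    p->start[p->count++] = p->used;
    p->used += len;
    return 0;
}

/* Return word k, or NULL if k is out of range. */
const char *packed_get(const struct packed_words *p, size_t k)
{
    return k < p->count ? &p->chars[p->start[k]] : NULL;
}
```

Because each word is stored with its terminator, `packed_get` returns an ordinary C string, and the only per-word overhead is one offset.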

I'd also recommend dynamically allocating the word_array:

   int max_word = 80368;
   char** word_array = malloc (max_word * sizeof (char*));

… which can lead you to a safer read, if your dictionary size ever were to change:

   int i = 0;
   while (1) {
        /* If we've filled the preset word list size, increase it. */
        if ( i >= max_word ) {
            max_word = max_word * 1.2 + 1; /* tunable arbitrary value */
            word_array = realloc (word_array, max_word * sizeof(char*));
        }
        /* Try to read a line, and… */
        char* e = fgets (buffer, 80, dictionary);
        if (NULL == e) { /* end of file */
            /* free any unused space (skipped if nothing was read) */
            if (i > 0) word_array = realloc (word_array, i * sizeof(char*));
            /* exit the otherwise-infinite loop */
            break;
        } else {
            /* remove any \r and/or \n end-of-line chars */
            for (char *s = &(buffer[0]); s < &(buffer[80]); ++s) {
               if ('\r' == *s || '\n' == *s || '\0' == *s) {
                  *s = '\0'; break;
               }
            }
            /* store a copy of the word, only, and increment the counter.
             * Note that `strdup` will only copy up to the end-of-string \0,
             * so you will only allocate enough memory for actual word
             * lengths, terminal \0's, and the array of pointers itself. */
            *(word_array + i++) = strdup (buffer);
        }
    }
    /* when we reach here, word_array holds exactly i words */
    if (i > 0) {
        random_num = rand () % i; /* i, not max_word: only i slots are filled */
        printf ("random word #%d: %s\n", random_num, *(word_array + random_num));
    }

Sorry, this was posted in a hurry, so I haven't tested the above. Caveat emptor.
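Pulling the pieces of this answer together, here is a self-contained sketch of the growing pointer array. The names `read_words` and `free_words`, the starting capacity, and the doubling policy are assumptions for illustration, not the answer's exact code. It uses malloc plus memcpy instead of strdup, because strdup comes from POSIX rather than older ISO C standards.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one word per line from fp into a growing array.
 * Stores the number of words in *count. Returns NULL only if the initial
 * allocation fails. Lines longer than the buffer are split across reads. */
char **read_words(FILE *fp, size_t *count)
{
    assert(fp != NULL && count != NULL);
    char buffer[81];
    size_t cap = 4, n = 0;              /* small starting capacity */
    char **words = malloc(cap * sizeof *words);
    *count = 0;
    if (words == NULL)
        return NULL;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        buffer[strcspn(buffer, "\r\n")] = '\0';   /* strip end-of-line chars */
        if (buffer[0] == '\0')
            continue;                             /* skip blank lines */
        if (n == cap) {                           /* array full: double it */
            char **tmp = realloc(words, cap * 2 * sizeof *words);
            if (tmp == NULL)
                break;                            /* keep what we have so far */
            words = tmp;
            cap *= 2;
        }
        size_t len = strlen(buffer) + 1;          /* include the '\0' */
        words[n] = malloc(len);
        if (words[n] == NULL)
            break;
        memcpy(words[n], buffer, len);
        n++;
    }
    *count = n;
    return words;
}

/* Release every word and then the array itself. */
void free_words(char **words, size_t count)
{
    for (size_t k = 0; k < count; k++)
        free(words[k]);
    free(words);
}
```

With this in place, a random word is `words[rand() % count]`, provided `count` is greater than zero.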

sverre · Accepted Answer · 2011-12-27 18:28:41Z

This part is wrong:

while (fgets(buffer, 80, dictionary) != NULL){
    printf(buffer); //just to make sure the code worked;
    for (i = 1; i < 80368; i++) {
        word_array[i] = *buffer;
    }
}

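For every line read, the inner loop writes the first char of buffer (which holds only 81 chars) into 80367 slots of word_array. Change it to copy the buffer itself, 80 chars per line: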

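int j;
i = 0;
/* stop before running past the end of word_array */
while (i + 80 <= (int) sizeof word_array && fgets(buffer, 80, dictionary) != NULL){
    printf("%s", buffer); //just to make sure the code worked;
    for (j = 0; j < 80; j++) {
        word_array[i++] = buffer[j];
    }
}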
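The loop above gives every line a fixed 80-char stride in word_array, so word k starts at offset 80 * k, and a random pick has to choose a slot number rather than a raw character offset. A minimal sketch of that indexing, with made-up helper names `slot_put` and `slot_get`:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { SLOT = 80 };   /* bytes reserved per word, matching the loop above */

/* Hypothetical helper: store line in slot k of a flat array of `slots` slots,
 * dropping any trailing end-of-line chars. Returns 0, or -1 if k is out of range. */
int slot_put(char *flat, size_t slots, size_t k, const char *line)
{
    assert(flat != NULL && line != NULL);
    if (k >= slots)
        return -1;
    char *dst = flat + k * SLOT;
    snprintf(dst, SLOT, "%s", line);      /* always '\0'-terminated */
    dst[strcspn(dst, "\r\n")] = '\0';
    return 0;
}

/* Return the word in slot k, or NULL if k is out of range. */
const char *slot_get(const char *flat, size_t slots, size_t k)
{
    return k < slots ? flat + k * SLOT : NULL;
}
```

Note that an 80368-byte array holds only 1004 such slots, so storing 80,368 words this way needs 80368 * 80 bytes.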

edited Dec 27, 2011 at 18:28

answered Dec 27, 2011 at 18:17

7 Comments

Ahmad Gaffoor Over a year ago

fyi, you missed a semicolon in line 1 ;-)

sverre Over a year ago

@AhmadGaffoor nothing useful, I just didn't remember whether fgets was guaranteed to 0-terminate the buffer and felt paranoid.

Ahmad Gaffoor Over a year ago

printf("%s, \n", word_array[random_num]);

Ahmad Gaffoor Over a year ago

That line does not print the corresponding array value. It prints out (null). What did I do wrong? Is it the identifier?

BRPocock Over a year ago

… this is going to copy 80 chars (needed or not) from buffer into a sequential part of word_array … but then his random-selection code will choose a character offset to jump into the array …
