I'm using the getline() function to read every line of stdin. Each line is a string of a different length:

#include <stdio.h>
#include <stdlib.h>

int main() {
    char *line = NULL;
    size_t foo = 0;
    ssize_t reader;

    while ((reader = getline(&line, &foo, stdin)) != -1) { // %zu of reader is length of line
        printf("%s", line);
    }

    free(line);
    return 0;
}

In every iteration, line holds the current line as a string. How can I store each of these strings in an array? There are several things I have tried but none of them worked or they just led to memory access failure :(

I hope my question is clear. If it's not, please tell me and I will change it!

  • "There are several things I have tried". Please show what you tried and what the error was. Commented Jun 22, 2021 at 21:22
  • @tenepolis - It's good that none of the things you tried led to memory access failure. Commented Jun 22, 2021 at 21:23
  • none of them ...lead to memory access failure? None of them ... lead to OBSERVED memory access failure. FTFY. Commented Jun 22, 2021 at 21:27

2 Answers

Unless you know up front how many lines to expect, you will have to allocate the array dynamically, e.g.:

#include <stdio.h>
#include <stdlib.h>

int main() {
    char *line = NULL;
    size_t foo = 0;
    ssize_t reader;
    int result = 0;

    int numlines = 0, maxlines = 10;
    char **lines = malloc(sizeof(char*) * maxlines);

    if (!lines) {
        fprintf(stderr, "error allocating array\n");
        result = -1; // report the failure to the caller
    }
    else {
        while ((reader = getline(&line, &foo, stdin)) != -1) { // %zu of reader is length of line
            printf("%s", line);

            if (numlines == maxlines) {
                maxlines *= 2; // <-- or use whatever threshold makes sense for you
                char **newlines = realloc(lines, sizeof(char*) * maxlines);
                if (!newlines) {
                    fprintf(stderr, "error reallocating array\n");
                    result = -1;
                    break;
                }
                lines = newlines;
            }

            lines[numlines] = line;
            ++numlines;

            line = NULL;
            foo = 0;
        }
        free(line); // <-- in case getline() or realloc() failed...

        // use lines up to numlines as needed

        // free lines
        for(int i = 0; i < numlines; ++i) {
            free(lines[i]);
        }
        free(lines);
    }

    return result;
}
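
One thing to keep in mind when you use the stored lines: getline() keeps the trailing newline in the buffer whenever the input had one. Here is a minimal sketch of stripping it in place. The helper name chomp is hypothetical and not part of the code above.

```c
#include <string.h>

// Hypothetical helper: cut the string at its first newline, if any.
// Returns the length of the string after trimming.
static size_t chomp(char *s) {
    size_t len = strcspn(s, "\n"); // index of the first '\n', or strlen(s)
    s[len] = '\0';
    return len;
}
```

After the reading loop you could call chomp(lines[i]) for each i below numlines, before using the strings.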

6 Comments

Instead of calling realloc every 10 lines, it would be better to double the buffer size every time the buffer gets full. That way, it is guaranteed that the elements won't have to be copied more than twice, on average. Because you resize by adding a constant value instead, your algorithm has a time complexity of O(n^2), whereas the algorithm of doubling the size every time has a time complexity of O(n).
@AndreasWenzel that would fall under "use whatever threshold makes sense for you" ;-) This was just an example. But OK, I have updated it.
ssize_t reader; and %zu of reader is length of line --> Is there some POSIX spec that supports this? I thought ssize_t may differ in width from size_t - even if that is uncommon.
There's one memory block not freed at exit.
@AndreasWenzel Doubling is only a good strategy if you've got virtual memory. Otherwise, you risk running out of memory prematurely.

You need to create an array of pointers that gets resized when needed:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    // start with an array that ends with a NULL pointer
    // (like argv does)
    size_t numLines = 0;
    char **lines = malloc( ( numLines  + 1 ) * sizeof( *lines ) );
    if ( !lines )
    {
        return( 1 );
    }

    lines[ numLines ] = NULL;

    // break the loop explicitly - easier to handle and much less
    // bug-prone than putting the assignment into a while statement
    for ( ;; )
    {

        // get the next line
        size_t bytes = 0UL;
        char *line = NULL;
        ssize_t result = getline( &line, &bytes, stdin );
        if ( result < 0 )
        {
            // getline() may have allocated a buffer even though it failed
            free( line );
            break;
        }

        // enlarge the array by one: room for the existing lines,
        // the new line, and the terminating NULL
        char **tmp = realloc( lines, ( numLines + 2 ) * sizeof( *tmp ) );
        if ( !tmp )
        {
            free( line );
            break;
        }

        lines = tmp;

        // add the new line to the end of the array
        lines[ numLines++ ] = line;
        lines[ numLines ] = NULL;
    }

    // use lines - then free them 
    return( 0 );
}

That can be optimized by doing the realloc() calls in chunks, such as every 32 or 64 lines. But given that you're already effectively calling malloc() once per line, that might not help much.
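
As a rough illustration of that optimization, here is a minimal sketch that doubles the capacity each time the array fills up, rather than growing by a fixed chunk. It keeps the NULL-terminated layout used above. The function name read_lines, the starting capacity of 8, and the choice to discard everything on allocation failure are all assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L // for getline(); assumes a POSIX system
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

// Hypothetical helper: reads every line of `in` into a NULL-terminated
// array (like argv) and stores the number of lines in *count.
// Returns NULL if an allocation fails.
static char **read_lines(FILE *in, size_t *count) {
    size_t used = 0, cap = 8; // cap counts slots, including the NULL one
    char **lines = malloc(cap * sizeof *lines);
    if (!lines) return NULL;

    for (;;) {
        char *line = NULL;
        size_t bytes = 0;
        if (getline(&line, &bytes, in) < 0) {
            free(line); // getline() may allocate even when it fails
            break;
        }
        if (used + 1 == cap) { // always keep one slot free for the NULL
            char **tmp = realloc(lines, 2 * cap * sizeof *tmp);
            if (!tmp) {
                free(line);
                for (size_t i = 0; i < used; i++) free(lines[i]);
                free(lines);
                return NULL;
            }
            lines = tmp;
            cap *= 2;
        }
        lines[used++] = line;
    }
    lines[used] = NULL;
    *count = used;
    return lines;
}
```

A caller would pass stdin and a size_t for the count, then free each lines[i] followed by lines itself when done. Whether doubling is worth it over a fixed chunk size depends on how many lines you expect and how tight memory is.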

4 Comments

A realloc() on every line would not be a good idea.
@Andrew Henle - A peculiarity of getline easily overlooked: … getline() will allocate a buffer for storing the line. This buffer should be freed by the user program even if getline() failed. But surely you implied line when you wrote use lines - then free them. :-)
"But given that you're already effectively calling malloc() once per line, that might not help much." -- The problem is not calling malloc once per loop, but rather that realloc may have to copy the entire buffer contents to a new location. If you call realloc at fixed intervals, the time complexity of the algorithm is O(n^2). It doesn't matter whether these fixed intervals are 1, 32 or 64, the time complexity stays the same. However, if you don't use fixed intervals, but for example double the buffer size every time it gets full, the time complexity is reduced to O(n).
@AndreasWenzel IMO, until actual performance requirements are not being met and profiling of the application shows that the use of realloc() is a performance bottleneck, that would fall under unnecessary optimizations. If the added complexity of the code doesn't demonstrably improve measured performance, the extra time spent isn't merely wasted - the markedly increased chance of creating a bug for no measurable benefit makes the effort an actual negative contribution. Simpler code is easier to keep bug-free.
