I need to read lines from a file, but I do not know how long a line will be. So far the only thing I could think of was to use fgetc and realloc:

FILE* cFile = fopen(filename, "r");
....
//some while cycle for going from line to line
....
//now for reading the line itself
char* line = malloc(sizeof(char)); //one empty spot for the '\n'
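unsigned int i = 0; //index of the next free spot in line
int c = getc(cFile); //getc returns int, so EOF can be told apart from a character
while (c != '\n') {
    line[i] = c;
    line = realloc(line, (i+2)*sizeof(char)); //make room for one more character
    i++;
    c = getc(cFile);
}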
}
line[i] = c;

I omitted all the checks for EOL and for whether I really got the allocated memory; this is just an example.

My question is: is there a more efficient method of getting a line of unknown length?

  • This line[0] = c; should be line[i] = c;. Commented Apr 21, 2017 at 14:39
  • And the final line[i] = c; should be line[i] = '\0'; Commented Apr 21, 2017 at 14:40
  • "getting a line of unknown length" allows a hacker to overwhelm the memory resources of a computer. Defensive programming would ensure a line of input does not exceed some sane size. So use a big buffer and, if that is not enough, declare an error. Commented Apr 21, 2017 at 14:41
  • Many systems employ a tentative allocation, so allocating a buf = malloc(1024*1204); is not memory inefficient. Real memory is allocated when it is used, not necessarily when *alloc() is called. Commented Apr 21, 2017 at 14:47
  • @ZergOvermind: '\0' is not a newline character. Commented Apr 21, 2017 at 14:50
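
To illustrate the "big buffer, otherwise declare an error" suggestion from the comments, here is a minimal sketch built on fgets. The helper name read_bounded_line and its return codes are hypothetical, chosen only for this example; the caller picks the buffer size that counts as "sane".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread): reads one line into a
 * caller-supplied buffer of `size` bytes and strips the newline.
 * Returns 1 if a line was read, 0 at end of file,
 * and -1 if the line does not fit into the buffer. */
int read_bounded_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;                      /* EOF or read error */

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* whole line fit, drop the newline */
        return 1;
    }

    /* No newline stored: the line either ends right here (EOF, or the
     * newline is the very next character) or it is too long. */
    int c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;
    ungetc(c, fp);                     /* leave the rest of the line unread */
    return -1;
}
```

A caller would typically treat -1 as a hard input error rather than trying to recover the rest of the line.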

2 Answers

1

It would probably be more efficient to grow the buffer by more than one character at a time, for example by starting with a size of 80, doubling the size whenever the buffer is full, and, if necessary, shrinking it at the end.
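
For reference, a minimal sketch of that doubling strategy might look like the following. The function name read_line is hypothetical, and the choices to drop the newline and to return NULL at end of file are assumptions made for this example, not something taken from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from fp into a malloc'd,
 * NUL-terminated buffer (the newline itself is not stored).
 * Returns NULL at end of file with nothing read, or if an
 * allocation fails. The caller must free() the result. */
char *read_line(FILE *fp)
{
    size_t cap = 80;                   /* initial capacity */
    size_t len = 0;                    /* characters stored so far */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    int c;                             /* int, so EOF is distinguishable */
    while ((c = getc(fp)) != EOF && c != '\n') {
        if (len + 1 == cap) {          /* full, keeping one byte for '\0' */
            size_t new_cap = cap * 2;  /* double instead of growing by one */
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {        /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';

    /* Optional: shrink to the size actually used. */
    char *fit = realloc(buf, len + 1);
    return fit != NULL ? fit : buf;
}
```

With doubling, a line of n characters needs only about log2(n/80) calls to realloc instead of n.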

But that makes your code more complicated and therefore more error-prone, so remember the two rules of how to hand-optimize code:

  1. Don't do it.
  2. Only for experts: Don't do it yet.

That is, don't do it, because it is probably not worth the effort. You will spend maybe an extra hour "improving" your code, and unless you know that the speedup is actually needed, you probably won't notice the difference. Add to that the risk of getting the more complicated code wrong, and spending maybe hundreds of hours finding the elusive bug that in the end turns out to be memory corruption caused by this little reading function.

And, if you really know what you are doing, and need the extra speed, don't start optimizing this piece of code until you know (that is, have measured) that it actually is here that the execution time is spent.

2 Comments

I thought of doing that, but I didn't want to risk it, as I expected to make some kind of error. So does the one-by-one solution seem efficient enough?
@ZergOvermind: Whether it is efficient enough depends entirely on your application and the time constraints you have. The realloc-one-char-at-a-time solution reads several hundred thousand lines per second on a standard modern desktop computer. Do you need more than that?
1

If you're using a POSIX system, use getline(3), which does exactly what you want. Otherwise, you can find a free implementation of getline in many places, such as here or here
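
As a rough sketch of how getline is typically used, assuming a POSIX.1-2008 system: the helper longest_line below is hypothetical and only exists to show the calling pattern (pass a NULL pointer and a zero size, let getline allocate and grow the buffer, free it once at the end).

```c
#define _POSIX_C_SOURCE 200809L        /* expose getline() in <stdio.h> */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>                 /* ssize_t */

/* Hypothetical helper: returns the length of the longest line in fp,
 * not counting the trailing newline. */
size_t longest_line(FILE *fp)
{
    char *line = NULL;                 /* getline allocates when this is NULL */
    size_t cap = 0;                    /* buffer size, maintained by getline */
    ssize_t n;                         /* characters read, or -1 at EOF/error */
    size_t longest = 0;

    while ((n = getline(&line, &cap, fp)) != -1) {
        if (n > 0 && line[n - 1] == '\n')
            n--;                       /* ignore the newline getline keeps */
        if ((size_t)n > longest)
            longest = (size_t)n;
    }

    free(line);                        /* one buffer is reused for every line */
    return longest;
}
```

Note that getline keeps the newline in the buffer and reuses the same allocation across calls, so a line must be copied if it has to outlive the next call.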
