I've seen people trying to read files like this in a lot of posts lately:
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
char *path = "stdin";
FILE *fp = argc > 1 ? fopen(path=argv[1], "r") : stdin;
if( fp == NULL ) {
perror(path);
return EXIT_FAILURE;
}
while( !feof(fp) ) { /* THIS IS WRONG */
/* Read and process data from file… */
}
if( fclose(fp) != 0 ) {
perror(path);
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
What is wrong with this loop?
This question is related to
c
file
while-loop
eof
feof
feof()
indicates if one has tried to read past the end of file. That means it has little predictive effect: if it is true, you are sure that the next input operation will fail (you aren't sure the previous one failed BTW), but if it is false, you aren't sure the next input operation will succeed. More over, input operations may fail for other reasons than the end of file (a format error for formatted input, a pure IO failure -- disk failure, network timeout -- for all input kinds), so even if you could be predictive about the end of file (and anybody who has tried to implement Ada one, which is predictive, will tell you it can complex if you need to skip spaces, and that it has undesirable effects on interactive devices -- sometimes forcing the input of the next line before starting the handling of the previous one), you would have to be able to handle a failure.
So the correct idiom in C is to loop with the IO operation success as loop condition, and then test the cause of the failure. For instance:
while (fgets(line, sizeof(line), file)) {
/* note that fgets don't strip the terminating \n, checking its
presence allow to handle lines longer that sizeof(line), not showed here */
...
}
if (ferror(file)) {
/* IO failure */
} else if (feof(file)) {
/* format error (not possible with fgets, but would be with fscanf) or end of file */
} else {
/* format error (not possible with fgets, but would be with fscanf) */
}
No it's not always wrong. If your loop condition is "while we haven't tried to read past end of file" then you use while (!feof(f))
. This is however not a common loop condition - usually you want to test for something else (such as "can I read more"). while (!feof(f))
isn't wrong, it's just used wrong.
feof()
is not very intuitive. In my very humble opinion, the FILE
's end-of-file state should be set to true
if any read operation results in the end of file being reached. Instead, you have to manually check if the end of file has been reached after each read operation. For example, something like this will work if reading from a text file using fgetc()
:
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *in = fopen("testfile.txt", "r");
while(1) {
char c = fgetc(in);
if (feof(in)) break;
printf("%c", c);
}
fclose(in);
return 0;
}
It would be great if something like this would work instead:
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *in = fopen("testfile.txt", "r");
while(!feof(in)) {
printf("%c", fgetc(in));
}
fclose(in);
return 0;
}
It's wrong because (in the absence of a read error) it enters the loop one more time than the author expects. If there is a read error, the loop never terminates.
Consider the following code:
/* WARNING: demonstration of bad coding technique!! */
#include <stdio.h>
#include <stdlib.h>
FILE *Fopen(const char *path, const char *mode);
int main(int argc, char **argv)
{
FILE *in;
unsigned count;
in = argc > 1 ? Fopen(argv[1], "r") : stdin;
count = 0;
/* WARNING: this is a bug */
while( !feof(in) ) { /* This is WRONG! */
fgetc(in);
count++;
}
printf("Number of characters read: %u\n", count);
return EXIT_SUCCESS;
}
FILE * Fopen(const char *path, const char *mode)
{
FILE *f = fopen(path, mode);
if( f == NULL ) {
perror(path);
exit(EXIT_FAILURE);
}
return f;
}
This program will consistently print one greater than the number of characters in the input stream (assuming no read errors). Consider the case where the input stream is empty:
$ ./a.out < /dev/null
Number of characters read: 1
In this case, feof()
is called before any data has been read, so it returns false. The loop is entered, fgetc()
is called (and returns EOF
), and count is incremented. Then feof()
is called and returns true, causing the loop to abort.
This happens in all such cases. feof()
does not return true until after a read on the stream encounters the end of file. The purpose of feof()
is NOT to check if the next read will reach the end of file. The purpose of feof()
is to determine the status of a previous read function
and distinguish between an error condition and the end of the data stream. If fread()
returns 0, you must use feof
/ferror
to decide whether an error occurred or if all of the data was consumed. Similarly if fgetc
returns EOF
. feof()
is only useful after fread has returned zero or fgetc
has returned EOF
. Before that happens, feof()
will always return 0.
It is always necessary to check the return value of a read (either an fread()
, or an fscanf()
, or an fgetc()
) before calling feof()
.
Even worse, consider the case where a read error occurs. In that case, fgetc()
returns EOF
, feof()
returns false, and the loop never terminates. In all cases where while(!feof(p))
is used, there must be at least a check inside the loop for ferror()
, or at the very least the while condition should be replaced with while(!feof(p) && !ferror(p))
or there is a very real possibility of an infinite loop, probably spewing all sorts of garbage as invalid data is being processed.
So, in summary, although I cannot state with certainty that there is never a situation in which it may be semantically correct to write "while(!feof(f))
" (although there must be another check inside the loop with a break to avoid a infinite loop on a read error), it is the case that it is almost certainly always wrong. And even if a case ever arose where it would be correct, it is so idiomatically wrong that it would not be the right way to write the code. Anyone seeing that code should immediately hesitate and say, "that's a bug". And possibly slap the author (unless the author is your boss in which case discretion is advised.)
Source: Stackoverflow.com