Thursday, February 1, 2018

c++ - Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

Bottom-line top: With proper handling of white-space, the following is how eof can be used (and even, be more reliable than fail() for error checking):


while( !(in>>std::ws).eof() ) {
int data;
in >> data;
if ( in.fail() ) /* handle with break or throw */;
// now use data
}

(Thanks Tony D for the suggestion to highlight the answer. See his comment below for an example to why this is more robust.)




The main argument against using eof() seems to be missing an important subtlety about the role of white space. My proposition is that, checking eof() explicitly is not only not "always wrong" -- which seems to be an overriding opinion in this and similar SO threads --, but with proper handling of white-space, it provides for a cleaner and more reliable error handling, and is the always correct solution (although, not necessarily the tersest).


To summarize what is being suggested as the "proper" termination and read order is the following:


int data;
while(in >> data) { /* ... */ }
// which is equivalent to
while( !(in >> data).fail() ) { /* ... */ }

The failure due to read attempt beyond eof is taken as the termination condition. This means is that there is no easy way to distinguish between a successful stream and one that really fails for reasons other than eof. Take the following streams:



  • 1 2 3 4 5

  • 1 2 a 3 4 5

  • a


while(in>>data) terminates with a set failbit for all three input. In the first and third, eofbit is also set. So past the loop one needs very ugly extra logic to distinguish a proper input (1st) from improper ones (2nd and 3rd).


Whereas, take the following:


while( !in.eof() )
{
int data;
in >> data;
if ( in.fail() ) /* handle with break or throw */;
// now use data
}

Here, in.fail() verifies that as long as there is something to read, it is the correct one. It's purpose is not a mere while-loop terminator.


So far so good, but what happens if there is trailing space in the stream -- what sounds like the major concern against eof() as terminator?


We don't need to surrender our error handling; just eat up the white-space:


while( !in.eof() )
{
int data;
in >> data >> ws; // eat whitespace with std::ws
if ( in.fail() ) /* handle with break or throw */;
// now use data
}

std::ws skips any potential (zero or more) trailing space in the stream while setting the eofbit, and not the failbit. So, in.fail() works as expected, as long as there is at least one data to read. If all-blank streams are also acceptable, then the correct form is:


while( !(in>>ws).eof() )
{
int data;
in >> data;
if ( in.fail() ) /* handle with break or throw */;
/* this will never fire if the eof is reached cleanly */
// now use data
}

Summary: A properly constructed while(!eof) is not only possible and not wrong, but allows data to be localized within scope, and provides a cleaner separation of error checking from business as usual. That being said, while(!fail) is inarguably a more common and terse idiom, and may be preferred in simple (single data per read type of) scenarios.

No comments:

Post a Comment

plot explanation - Why did Peaches' mom hang on the tree? - Movies & TV

In the middle of the movie Ice Age: Continental Drift Peaches' mom asked Peaches to go to sleep. Then, she hung on the tree. This parti...