C++ IOStreams In-Depth
New Features in Standard IOStreams
Comparing Classic and Standard IOStreams
Whitepaper, 1998
Introduction

What is Standard IOStreams? It is the standardised version of the classic IOStreams that has been around since the first days of C++. All C++ programmers use it. Just think of the notorious "Hello world" program, which basically consists of one statement:
In the process of standardisation, IOStreams was formally specified,
cleaned up, and enhanced. What are the differences between the traditional
and the new standard IOStreams? This question might pop up in the heads
of those developers who have existing IOStreams applications and want to
migrate to the standard IOStreams. Let's explore the major changes.
Templatizing the IOStreams Classes

When you look at the new IOStreams header files, you will immediately notice that most classes you might know from the traditional IOStreams have turned into class templates in the standard IOStreams. The template parameters are the character type and the character traits type. For example, class ostream turned into:

    template <class charT, class Traits = char_traits<charT> >
    class basic_ostream

The character type is usually one of the built-in character types char or wchar_t. However, it can also be any other conceivable user-defined type. Naturally, a user-defined character type should exhibit the expected behavior of a character type, such as support for comparisons. The exact requirements for a user-defined character type are not specified, though, and depend on the respective implementation of the standard IOStreams.

The character traits type describes the properties of the character type, such as the end-of-file value, the way two characters are compared, and the way character sequences are copied, searched, and measured.
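As a rough illustration of what a traits type supplies, the sketch below relies only on operations that std::char_traits provides (eq, eq_int_type, eof). The helper names count_chars and is_not_eof are made up for this example and are not part of the library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length of a null-terminated sequence, computed only
// with the equality test that the traits type defines.
template <class charT, class Traits = std::char_traits<charT> >
std::size_t count_chars(const charT* s)
{
    std::size_t n = 0;
    while (!Traits::eq(s[n], charT()))   // the traits decide what "equal" means
        ++n;
    return n;
}

// Hypothetical helper: true if the integer value c is a real character and
// not the end-of-file marker defined by the traits type.
template <class charT, class Traits = std::char_traits<charT> >
bool is_not_eof(typename Traits::int_type c)
{
    return !Traits::eq_int_type(c, Traits::eof());
}
```

Because both helpers go through the traits type, they work unchanged for char, for wchar_t, and for any character type that comes with a suitable traits class.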
There is a standard character traits class template defined in the C++ Standard. Its name is char_traits<charT>. Specializations of this class template are defined for the built-in character types char and wchar_t. Every standard-conforming library implementation has to provide them. Note, however, that char_traits<charT> is not meant to be instantiated for an arbitrary character type. It just defines the interface that a specialization of this class template is expected to provide.

For all iostreams class templates, the traits template parameter has a sensible default value. Class template basic_ostream, for instance, is defined as:

    template <class charT, class Traits = char_traits<charT> >
    class basic_ostream { ... };

For ease of use, and for backward compatibility, the standard defines type definitions for the stream class templates instantiated with the character types char and wchar_t. For type char these include:

    typedef basic_ostream<char>  ostream;
    typedef basic_iostream<char> iostream;
    typedef basic_ifstream<char> ifstream;
    typedef basic_ofstream<char> ofstream;
    typedef basic_fstream<char>  fstream;

Splitting Class ios

In the process of transforming the IOStreams classes into class templates, the base class of all traditional IOStreams classes, class ios, was split into class ios_base, which holds everything that is independent of the template parameters, and class template basic_ios<charT, Traits>, which holds the parts that depend on them.
One might expect the error handling to be contained in ios_base because it is character-independent. However, error indication is done in basic_ios<charT, Traits>. This is because ios_base is also used in the locale section of the standard library, where it serves as an abstraction for passing formatting information to the locale. If ios_base contained the error handling, which in the standard iostreams includes the indication of errors by throwing exceptions (see subsequent sections for details), then these exceptions could also be raised by the standard locale. This effect was neither intended nor acceptable. Hence, ios_base only contains the definition of all flags for error indication; the raising of exceptions and the indication of error states are located in basic_ios<charT, Traits>.

The advantage of splitting class ios into class ios_base and class template basic_ios<charT, Traits> is that all behavior that is independent of the template parameters is factored out into a non-template. This minimizes the binary code size of the library as well as of user programs. For instance, if you write a function that resets the formatting of a stream to the default settings, this function does not have to be a function template; it can be an ordinary function receiving a reference to an ios_base object as parameter:

    // reset_format is a placeholder name for this function
    void reset_format(ios_base& str)
    {
        str.width(0);
        str.precision(6);
        str.setf(ios_base::skipws);
        str.setf(ios_base::left, ios_base::adjustfield);
        str.setf(ios_base::dec, ios_base::basefield);
        str.setf(ios_base::fixed, ios_base::floatfield);
    }

In IOStreams, each stream maintains a stream state that indicates success or failure of an operation. The stream state can either be good, or any of the following three states when an exceptional condition occurred in a preceding operation: eof, fail, or bad, indicated by the flags eofbit, failbit, and badbit.
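The stream state can be inspected through rdstate() and the flags defined in ios_base. The sketch below is only an illustration; describe_state and state_after_reading_int are hypothetical helper names, and an input string stream stands in for any input source.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: turns the state flags of a stream into text.
std::string describe_state(const std::ios& s)
{
    const std::ios_base::iostate st = s.rdstate();
    if (st == std::ios_base::goodbit)
        return "good";
    std::string text;
    if (st & std::ios_base::eofbit)  text += "eof ";
    if (st & std::ios_base::failbit) text += "fail ";
    if (st & std::ios_base::badbit)  text += "bad ";
    return text;
}

// Hypothetical helper: tries to extract one int and reports the state
// the stream is left in afterwards.
std::string state_after_reading_int(const std::string& input)
{
    std::istringstream in(input);
    int value = 0;
    in >> value;
    return describe_state(in);
}
```

With this sketch, the input "42 " leaves the stream good, "42" sets only eofbit because the end of input is reached while parsing the number, and "abc" sets only failbit because no number could be parsed.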
The following code example demonstrates how to check whether some text is properly written to standard output:

    void handle_error();
    // some calculation ...
    cout << "The calculated value is: " << value << '\n';
    if (!cout) handle_error();

Before we see how the code example changes when we use exceptions, let's have a more detailed look at exceptions in the standard IOStreams. The classes ios_base and basic_ios<charT, Traits> provide the means for enabling iostreams exceptions. ios_base contains a type definition for a type called iostate along with the following flags of that type:

    static const iostate badbit;
    static const iostate eofbit;
    static const iostate failbit;
    static const iostate goodbit;

basic_ios<charT, Traits> provides the member functions for reading and setting the exception mask:

    iostate exceptions() const;
    void exceptions(iostate except);

The type of the exception that is thrown by the stream is ios_base::failure. To determine which exceptional condition triggered the throw, you can either use the exception's member function what(), which returns a descriptive text of type const char*, or check the stream with one of the stream's member functions shown in table 1. Let's see how exceptions change our example code:

    // some calculation ...
    ios_base::iostate old_flags = cout.exceptions();
    try {
        cout.exceptions(ios_base::badbit | ios_base::failbit);
        cout << "The calculated value is: " << value << '\n';
    }
    catch (ios_base::failure& exc) {
        cerr << exc.what() << endl;
    }
    cout.exceptions(old_flags);
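The same mechanism applies to input streams. The following sketch enables exceptions on a string stream and converts a failed extraction into a boolean result; try_parse_int is a hypothetical helper name chosen for this example.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns true and stores the number in result if text
// starts with an integer; returns false if the extraction throws.
bool try_parse_int(const std::string& text, int& result)
{
    std::istringstream in(text);
    // Only failbit and badbit raise exceptions; reaching end of input
    // (eofbit) after a successful parse does not.
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        in >> result;
        return true;
    }
    catch (const std::ios_base::failure&) {
        return false;
    }
}
```

Leaving eofbit out of the mask is a deliberate choice in this sketch: parsing "17" reaches the end of the input, which is not an error here.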
Note that it is not guaranteed that all exceptions will be suppressed
after a call to exceptions(ios_base::goodbit), although this call clears
all bits in the exception mask. All that is assured is that errors detected
by the stream and the stream buffer are not indicated via exceptions. Any
other kind of error might still result in an exception being thrown. Think,
for instance, of a stream that is instantiated with a user-defined character
and traits type. Imagine that an operation of the character or traits type
throws an exception, e.g. bad_alloc. Such exceptions will not be caught by
the stream and might be propagated into your application.
Internationalizing IOStreams

As already mentioned above, the Standard Library includes a component for internationalization. Internationalization services are bundled into a so-called locale object. The standard IOStreams is internationalized and uses standard locales. Each stream holds a locale object in its base object ios_base. The stream stores an additional locale object in its stream buffer. When a locale is attached to the stream via basic_ios::imbue(locale loc), the locale received is stored in ios_base and, redundantly, in the stream buffer. A locale is a rather lightweight object. Hence, storing two locale objects in each stream does not impose much space overhead. The advantage is that those classes that eventually need a locale for processing have direct access to the locale object. Moreover, the two locale objects are used for different purposes.

The locale in ios_base

The locale that is held in ios_base is used for the formatting of numeric values. In the traditional IOStreams the radix separator was hard-coded as a decimal point. In the standard IOStreams it depends on the locale imbued on the stream. For example, in a German locale the radix separator will be ',' and the output of 0 as a float (in fixed notation) will not be 0.000000 but 0,000000.

This change might lead to surprising results. Consider a situation where a file that was written with a traditional output stream is now to be read with a standard input stream that holds a locale with ',' as the radix separator. If the file contains floating-point numbers that were written in decimal notation, the input stream will try to parse these numbers with its different radix separator. It is very likely that the input stream will fail to produce the same numbers that were once written to the file.
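The effect of the imbued locale on the radix separator can be sketched without relying on a named German locale, which may not be installed on every system. The example below is an illustration only: comma_punct, comma_locale, format_fixed, and read_double are hypothetical names, and the custom numpunct facet merely stands in for a locale whose radix separator is a comma.

```cpp
#include <ios>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the classic numeric punctuation, except that
// the radix separator is a comma.
class comma_punct : public std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
};

// Hypothetical helper: the classic locale with comma_punct installed.
// The locale takes ownership of the facet.
std::locale comma_locale()
{
    return std::locale(std::locale::classic(), new comma_punct);
}

// Formats value in fixed notation using the given locale.
std::string format_fixed(double value, const std::locale& loc)
{
    std::ostringstream out;
    out.imbue(loc);
    out << std::fixed << value;
    return out.str();
}

// Parses a double from text using the given locale.
double read_double(const std::string& text, const std::locale& loc)
{
    std::istringstream in(text);
    in.imbue(loc);
    double value = 0.0;
    in >> value;
    return value;
}
```

Reading the text "3,5" with the comma locale yields 3.5, while the classic locale stops at the comma and yields 3. This is the kind of mismatch that arises when the writing and the reading stream disagree about the radix separator.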
The problem can easily be solved by imbuing an appropriate locale into the standard input stream, i.e. a locale where the radix separator is a decimal point. The best thing to do is not to imbue a locale at all, in which case a default locale will be used. For reasons of compatibility the default locale in standard IOStreams is the US English ASCII locale, i.e. the classic "C" locale.

The locale in ios_base is not only used for formatting of numbers. It is also used to determine which characters of the character set are to be treated as white space characters. This information is needed when an input stream parses input data and has to skip white space during this process. There is a subtle difference between the traditional and the standard IOStreams: In the traditional IOStreams the recognition of white space characters depends on the active C locale, because the functionality is based on the C standard function isspace(), which is internationalized using the C locale. In the standard IOStreams the recognition of white space characters depends on the C++ locale imbued on each stream. However, it is safe to assume that for the same locale the behavior of the standard IOStreams is compatible with the behavior of the traditional IOStreams.

The locale in streambuf

The locale that is held in the stream buffer is used for file i/o when code conversion between the internal and external character set is required. The traditional IOStreams did not perform any code conversions. Code conversion is a new feature in the standard IOStreams. Let's see what it is and why it is needed. Some cultures, such as Japan or China, have large alphabets with tens of thousands of characters. Characters of such a huge alphabet cannot be encoded in just one byte. Instead there are encodings that use two or more bytes for representing a character. Some of these encodings mix characters of different size (multibyte character encodings); in other encodings all characters are of the same size (wide character encodings).
It is common practice to use wide character encodings inside the program and multibyte character encodings outside on the external device.
This chapter briefly sketched some aspects of internationalization that
are related to IOStreams. A more detailed description of the
internationalization support in the Standard C++ Library will be given
in our next column.
Removing _withassign Classes

In the traditional IOStreams the classes istream, ostream, and iostream had a private copy constructor and assignment operator. They were private in order to prevent copy and assignment for objects of these classes because they contained a stream buffer by reference. (To be precise, their common base class ios held a pointer to the stream buffer.) The crucial point is that there is no 'right' semantics for copying or assigning a stream with respect to its stream buffer. There are different possibilities, e.g. sharing the stream buffer after the assignment, or flushing the stream buffer during the assignment and then providing both streams with entirely independent buffers, and so on. None of these possibilities is intuitively right, though. Consequently, copying and assigning was prohibited.

On the other hand, there is a need for assigning streams. The most convincing example is the wish to redirect standard output (or any of the other standard i/o objects) by assigning a valid stream object to cout. In order to satisfy this requirement, the classes istream_withassign, ostream_withassign, and iostream_withassign were introduced. They implemented a public copy constructor and assignment operator, which let both streams share the stream buffer after the copying or assignment.

One might expect that the references to the shared stream buffer would be counted. However, we don't know of any traditional IOStreams implementation that counted the references to shared stream buffers. Instead, the shared stream buffer was deleted when the stream object that was constructed with the stream buffer went out of scope. In other words, the responsibility for the buffer stayed with the stream object that had initially created the stream buffer. Naturally, this imposed dependencies between the lifetimes of the two stream objects used in the copy constructor or assignment operator.
In sum, the correct use of the _withassign classes was rather complicated. This is the reason why in the standard IOStreams the classes istream_withassign, ostream_withassign, and iostream_withassign do not exist anymore. To perform operations equivalent to the copy constructor and the assignment operator of the old _withassign classes, the user of the standard streams has to explicitly implement this functionality. Standard streams have member functions, defined in basic_ios<charT, Traits>, that can be used for this purpose: copyfmt(), rdstate(), clear(), and rdbuf().
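For the redirection use case, sharing the stream buffer through rdbuf() is often all that is needed. The sketch below is one possible way to do it, not a prescribed one; redirect_guard and capture_cout are hypothetical names, and restoring the original buffer in a destructor is a design choice of this example that avoids the lifetime problems described above.

```cpp
#include <iostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical RAII helper: while a guard object is alive, everything
// written to 'target' goes into the stream buffer of 'source'. The original
// buffer of 'target' is restored when the guard is destroyed.
class redirect_guard {
public:
    redirect_guard(std::ostream& target, std::ostream& source)
        : target_(target), old_buf_(target.rdbuf(source.rdbuf())) {}
    ~redirect_guard() { target_.rdbuf(old_buf_); }

    redirect_guard(const redirect_guard&) = delete;
    redirect_guard& operator=(const redirect_guard&) = delete;

private:
    std::ostream&   target_;
    std::streambuf* old_buf_;
};

// Hypothetical demonstration: captures what is written to std::cout.
std::string capture_cout()
{
    std::ostringstream sink;
    {
        redirect_guard guard(std::cout, sink);
        std::cout << "redirected";
    }   // std::cout gets its own buffer back here, before sink is destroyed
    return sink.str();
}
```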
A function template that copies a stream in the manner of the old _withassign classes can be built from these member functions:

    template <class Stream>
    void streamcpy(Stream& dest, const Stream& src)
    {
        typedef basic_ios<typename Stream::char_type,
                          typename Stream::traits_type> StreamBase;
        // Share the buffer first: rdbuf(sb) resets the stream state to good.
        static_cast<StreamBase&>(dest).rdbuf(
            static_cast<const StreamBase&>(src).rdbuf());
        dest.copyfmt(src);            // formatting and exception mask
        dest.clear(src.rdstate());    // stream state
    }

Removing File Descriptors

In the traditional IOStreams all file streams offered a member function fd(). It returned the file descriptor of the file that was associated with the file stream. This feature was helpful when some functionality of the underlying file system was needed that was not available in IOStreams. For example, the function int ftruncate(int fd, off_t length) is available on some UNIX platforms and allows setting a file to a defined length. This non-portable feature was not supported in the traditional IOStreams, but could be reached through fd(). The fd() function is omitted from the C++ Standard. The simple reason is that the C++ Standard does not want to exclude operating systems that do not have file descriptors from providing a standard-conforming IOStreams library.
On the other hand, vendors of the Standard C++ Library are free to extend
the library, as long as these extensions do not conflict with the standard.
Hence it is quite possible that a functionality like fd() will be included
as a non-standard extension in some library implementations.
String Streams: Replacing strstream by stringstream

The string stream classes in the traditional IOStreams, class strstream, istrstream, ostrstream, and strstreambuf, are deprecated features in the standard IOStreams. This means that they are still provided by implementations of the standard IOStreams, but will be omitted in the future. The purpose of string streams is to facilitate text input and output to memory locations. The deprecated strstream classes allow input and output to and from character arrays of type char*. In the standard IOStreams they are replaced by corresponding stringstream classes that allow input and output to and from strings of type basic_string<charT>, charT being char, wchar_t, or any user-defined character type.

The most obvious difference is that instead of providing character arrays to a strstream you now provide string objects to a stringstream. As you can convert character arrays into string objects and vice versa, there are no major restrictions regarding the functionality of string streams. However, there are subtle differences. String streams are dynamic, which means that the internal character buffer is resized and reallocated once it is full. String streams also allow retrieving the content of the internal character buffer by calling the member function str(). In the traditional IOStreams str() returns a pointer to the internal character buffer. After such a call to str() the string stream is frozen, i.e. the buffer is not resized any longer. This is very sensible since every reallocation would invalidate the buffer pointer. In the standard IOStreams string streams are always dynamic; they do not freeze. A call to str() provides a string object that is a copy of the internal buffer, but does not allow access to the buffer itself.
A similar difference occurs regarding the construction of string streams.
There are constructors taking a character array or a string for use as
the internal character buffer. In the traditional IOStreams this character
array was actually used as the internal buffer, and the string stream constructed
this way was frozen. In the standard IOStreams the string is not used as
internal buffer; only its content is copied into an independent internal
buffer area. Again, the internal buffer is not accessible from outside
the string stream and freezing is not necessary.
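Both copying behaviors of the standard string streams can be observed directly. The following is a minimal sketch; snapshot_then_append and source_after_write are hypothetical names used only for this illustration.

```cpp
#include <sstream>
#include <string>

// str() returns a copy, so the stream stays writable afterwards and the
// earlier snapshot is unaffected by later output.
std::string snapshot_then_append(int value)
{
    std::ostringstream out;
    out << "value=" << value;
    std::string snapshot = out.str();   // copy of the internal buffer
    out << "!";                         // no freezing: writing continues
    return snapshot + "|" + out.str();
}

// The constructor copies the content of its string argument, so writing to
// the stream does not modify the original string.
std::string source_after_write()
{
    std::string text = "abc";
    std::ostringstream out(text);       // content copied into the stream
    out << "X";                         // overwrites from the start of the copy
    return text + "|" + out.str();
}
```

In the second function the write position starts at the beginning of the copied content, which is why the stream ends up holding "Xbc" while text still holds "abc".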
Minor changes

In addition to the differences explained above, there are a couple of minor deviations from the traditional IOStreams. Some items are renamed, for instance. One example is the type io_state from the traditional IOStreams, which is now named iostate. The same holds for open_mode and seek_dir, which are now openmode and seekdir. There are a few more renamings of this kind.
A standard IOStreams implementation is allowed to support the old names
and interfaces for the sake of compatibility with the traditional IOStreams.
The C++ standards document contains a list of these compatibility features.
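Code written against the standard names uses them as nested types of ios_base. The sketch below is only an illustration of the new spellings openmode and seekdir; char_at is a hypothetical helper and assumes that offset lies within the text.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: returns the character at the given offset, assuming
// 0 <= offset < text.size(). It uses the standard type names openmode and
// seekdir (formerly open_mode and seek_dir).
char char_at(const std::string& text, std::streamoff offset)
{
    const std::ios_base::openmode mode   = std::ios_base::in;
    const std::ios_base::seekdir  origin = std::ios_base::beg;

    std::istringstream in(text, mode);
    in.seekg(offset, origin);           // position relative to the beginning
    return static_cast<char>(in.get());
}
```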
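Summary

The standard IOStreams are modeled after the traditional IOStreams. However, there are a couple of substantial differences, as discussed above: the stream classes are now class templates; class ios is split into ios_base and basic_ios<charT, Traits>; errors can be indicated by exceptions; IOStreams is internationalized and uses standard locales; the _withassign classes and the fd() member function are removed; and the strstream classes are deprecated in favor of the stringstream classes.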
© Copyright 1995-2003 by Angelika Langer. All Rights Reserved. URL: <http://www.AngelikaLanger.com/Articles/Papers/IOStreams/IOStreams.htm> last update: 22 Nov 2003