Curiously Recurring Manipulators

Using Modern Programming Techniques for Implementation of Stream Manipulators

C/C++ Users Journal, June 2001
Klaus Kreft & Angelika Langer


 
 

Let us revisit stream manipulators. We explained them in two articles in C++ Report last summer [1, 2]. The techniques that we demonstrated there were inspired by the programming techniques traditionally used to implement manipulators for the classic, pre-standard IOStreams library. We took those old ideas, twisted and tweaked them a little bit, and made them work with the new templatized standard IOStreams classes. The resulting solution works, but it turned out that the classic techniques did not fit seamlessly into the context of a library of templates, and almost all standard stream classes are templates these days.

For those readers who cannot recall the two articles, we will give a brief recap of the classic-inspired solution and its downside in a minute. (Similar ground is covered in our IOStreams book; see [3], page 176ff.) But what we really want to discuss in this installment of our column is not manipulators as such, but the application of more modern programming techniques that implement stream manipulators in a significantly more elegant way. We will be using template programming techniques, in particular the Curiously Recurring Template pattern [4], which is also referred to as parameterized inheritance. And we want to demonstrate how useful function object types can be.

Recap: Implementing Manipulators with Parameters — The Old-Style Approach

A stream manipulator is an object that we can insert into an output stream or extract from an input stream. The effect of this insertion or extraction is a manipulation of the stream. An example is the widely known endl manipulator. Here is the classic example of its use:
cout << "Hello World!" << endl;
The endl manipulator is an object that can be inserted into an output stream; the effect is the insertion of an end-of-line character into the stream and a subsequent flush of the stream buffer. The standard IOStreams library predefines many further manipulators, such as boolalpha for switching from numeric to textual display of Boolean values, noskipws for suppressing white-space skipping on input, or setw(n) for setting the field width for the next formatted output.
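As a small illustration of the predefined manipulators just mentioned, here is a sketch that writes to a string stream instead of cout so that the result can be inspected. The helper name demo_predefined is ours and is not part of any library.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (our own name): formats a few values using
// predefined manipulators and returns the text that was produced.
std::string demo_predefined()
{
    std::ostringstream os;
    os << std::boolalpha << true << std::endl;  // textual bool, then '\n' plus flush
    os << std::setw(5) << 42 << std::endl;      // field width affects the next item only
    os << 7 << std::endl;                       // width is back to its default here
    return os.str();
}
```

Note that boolalpha stays in effect until it is reset, whereas the field width set by setw(n) is consumed by the next formatted output.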

User-Defined Stream Manipulators

The IOStreams library is extensible, and we can add user-defined special-purpose manipulators. As an example, let us implement a multi-end-of-line manipulator, pretty much like the standard manipulator endl , but with the additional capability of inserting an arbitrary number of end-of-line characters:
cout << mendl(5);
Implementing such a stream manipulator is pretty straightforward. Here is a solution:
class mendl {
public:
   explicit mendl(unsigned int i) : i_(i) {}
private:
   const unsigned int i_;


   template <class charT, class Traits>
   friend basic_ostream<charT,Traits>&
   operator<<(basic_ostream<charT,Traits>& os
            , const mendl& w)
   { for (unsigned int i=0; i<w.i_; ++i)
        os.put(os.widen('\n')); 
     os.flush();
     return os;   // required: the inserter must return the stream
   }
};
We must provide an inserter, that is, an overloaded version of operator<< , that allows insertion of an object of type mendl into an output stream and performs the desired manipulation of the stream. The inserter is implemented as a friend of class mendl . An expression such as mendl(5) is the construction of an unnamed object of type mendl , with the integer literal 5 as the constructor argument. The constructor argument is stored as a data member and later used to perform the manipulation. That's one of the classic ways of implementing stream manipulators with arguments.
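To check that this simple, non-template manipulator works with streams of any character type, the following sketch inserts it into a narrow and a wide string stream. The helper functions narrow_demo and wide_demo are hypothetical names introduced only for this illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class mendl {
public:
    explicit mendl(unsigned int i) : i_(i) {}
private:
    const unsigned int i_;

    // The inserter is a template on the stream's character type, so the
    // class itself does not have to be a template.
    template <class charT, class Traits>
    friend std::basic_ostream<charT, Traits>&
    operator<<(std::basic_ostream<charT, Traits>& os, const mendl& w)
    {
        for (unsigned int i = 0; i < w.i_; ++i)
            os.put(os.widen('\n'));
        os.flush();
        return os;
    }
};

// Hypothetical helpers: build a short text on a narrow and a wide stream.
std::string narrow_demo()
{
    std::ostringstream os;
    os << "a" << mendl(3) << "b";
    return os.str();
}

std::wstring wide_demo()
{
    std::wostringstream os;
    os << L"a" << mendl(2) << L"b";
    return os.str();
}
```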

Factoring out Common Code into a Manipulator Base Class

Now, manipulators have a lot in common, and in order to avoid redundancies, we aimed to factor out all IOStreams-specific tasks into a manipulator base class. These IOStreams-specific chores can for instance include proper error and exception handling, which uses the stream state judiciously, pays attention to the stream's exception mask, and does everything in perfect conformance with the IOStreams policies. For our purposes in this article, we omit the error and exception handling functionality that the manipulator base class could add. For the missing details, see page 186ff in our IOStreams book [3] . The manipulator base class is called one_arg_manip_weh and is designed as a base class for output stream manipulators that have one argument, like mendl . Here is a sketch of the implementation:
template <class charT, class Traits, class Argument> 
class one_arg_manip_weh {
public:
 typedef void (* manipFct)(basic_ostream<charT,Traits>&, Argument);
 one_arg_manip_weh(manipFct pf, const Argument& arg)
 : pf_(pf), arg_(arg), error_(ios_base::goodbit) {}
private:
 manipFct pf_;
 const Argument arg_;
 ios_base::iostate error_;
    
 // befriend every specialization of the helper template do_manip
 template <class C, class T, class A>
 friend void do_manip(basic_ostream<C,T>& str,
                      const one_arg_manip_weh<C,T,A>& oamw);
};


template <class charT, class Traits, class Argument>
void do_manip(basic_ostream<charT,Traits>& str,
              const one_arg_manip_weh<charT,Traits, Argument>& oamw)
{
// do lots of stuff, and in particular call the manipulation function
  ...
  oamw.pf_(str,oamw.arg_);
  ...
}

template <class charT, class Traits, class Argument>
basic_ostream<charT,Traits>& operator<< 
(basic_ostream<charT,Traits>& os, 
 const one_arg_manip_weh<charT,Traits,Argument>& oamw)
{
 if(os.good())
    do_manip<charT,Traits,Argument>(os,oamw);
 return os;
}
Again, this solution is modeled after classic IOStreams solutions. In fact, you will find a similar base class in many implementations of the IOStreams library, often named smanip or _Smanip or the like, which is used to implement the pre-defined standard manipulators with parameters, such as setw(n). The key idea here is that the manipulator base class stores the manipulator argument and a function that performs the actual manipulation. The manipulator base class provides the required overloaded shift operators and implements them by calling the manipulation function with the stored argument. Note that the manipulation function is represented by a function pointer. That's a classic C/C++ technique for passing functions around.

Using this base class, we could re-implement the mendl manipulator as follows:

template <class charT,class Traits=char_traits<charT> >
class mendl 
: public one_arg_manip_weh<charT,Traits,unsigned int>
{public:
  explicit mendl(unsigned int i) 
  : one_arg_manip_weh<charT,Traits,unsigned int>(mendl::fct,i)
  { }
 private:
  static void fct (basic_ostream<charT,Traits>& os, 
                   unsigned int n) 
  { for (unsigned int i=0; i<n; ++i)
        os.put(os.widen('\n')); 
    os.flush();
  }
};
For better encapsulation, we provide the required manipulation function as a static member function of class mendl and pass the pointer to that function to the manipulator base class along with the argument that must later be passed to the function when it is invoked.
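Put together, the old-style approach might look like the following compilable sketch. It makes several assumptions: the error and exception handling of the base class is left out entirely, the helper do_manip is made a friend through a friend template declaration, and old_style_demo is a hypothetical function added only to exercise the code.

```cpp
#include <ostream>
#include <sstream>
#include <string>

template <class charT, class Traits, class Argument>
class one_arg_manip_weh {
public:
    typedef void (*manipFct)(std::basic_ostream<charT, Traits>&, Argument);
    one_arg_manip_weh(manipFct pf, const Argument& arg) : pf_(pf), arg_(arg) {}
private:
    manipFct pf_;
    const Argument arg_;

    // Grant the helper template access to pf_ and arg_.
    template <class C, class T, class A>
    friend void do_manip(std::basic_ostream<C, T>&,
                         const one_arg_manip_weh<C, T, A>&);
};

template <class charT, class Traits, class Argument>
void do_manip(std::basic_ostream<charT, Traits>& str,
              const one_arg_manip_weh<charT, Traits, Argument>& oamw)
{
    // Error and exception handling is omitted in this sketch.
    oamw.pf_(str, oamw.arg_);
}

template <class charT, class Traits, class Argument>
std::basic_ostream<charT, Traits>&
operator<<(std::basic_ostream<charT, Traits>& os,
           const one_arg_manip_weh<charT, Traits, Argument>& oamw)
{
    if (os.good())
        do_manip(os, oamw);
    return os;
}

template <class charT, class Traits = std::char_traits<charT> >
class mendl : public one_arg_manip_weh<charT, Traits, unsigned int> {
public:
    explicit mendl(unsigned int i)
        : one_arg_manip_weh<charT, Traits, unsigned int>(&mendl::fct, i) {}
private:
    static void fct(std::basic_ostream<charT, Traits>& os, unsigned int n)
    {
        for (unsigned int i = 0; i < n; ++i)
            os.put(os.widen('\n'));
        os.flush();
    }
};

// Hypothetical helper: the character type has to be spelled out at the call site.
std::string old_style_demo()
{
    std::ostringstream os;
    os << "x" << mendl<char>(2) << "y";
    return os.str();
}
```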

The Caveat

So far so good. But now we face a minor inconvenience that stems from the fact that the standard IOStreams classes are templates taking the character type as a template argument. In order to make the manipulator base class universally applicable, that is, to all kinds of streams regardless of their character type, the manipulator base class is a template, too. Consequently, the derived manipulator class mendl is a class template as well.

With mendl being a template, the manipulator expression is less convenient than it used to be. Each time we manipulate a stream by inserting a mendl object, we need to know the character type of that stream. Instead of simply saying:

cout << mendl(5); // wrong ! — does not compile
wcout << mendl(5); // wrong ! — does not compile
we now have to specify the template arguments and say:
cout << mendl<char>(5);
wcout << mendl<wchar_t>(5);
That's ugly. How did that happen? Look at our manipulation function mendl::fct. It calls the stream function flush, which is a member function of output streams, that is, of streams derived from class basic_ostream<charT,Traits>. For this reason, it must take a basic_ostream<charT,Traits>& as an argument, and since it is a static member function of class mendl, it turns the entire class into a class template. There is no way to avoid the templatization of class mendl with this approach.

In addition, our manipulator base class has other deficiencies: it can only serve as a base class for manipulators with exactly one argument, because it has only one data member for storing arguments and it is tied to a specific function signature for declaring the function pointer type.

Can we find an alternative and more flexible solution for the manipulator base class? The goal is to provide a manipulator base class that does not require the manipulator type to be a template and allows manipulators with an arbitrary number and type of arguments. The intended usage for manipulators implemented in terms of this new manipulator base class should ideally be as convenient as:

cout << mendl(5);
wcout << mendl(5);
cout << multi('*',80);
where multi is a manipulator with two arguments that inserts a given character n times into an output stream.

Implementing Manipulators with Parameters — The Modern Approach

In order to find a better solution, let us look at the original solution again. Here is our original base class template:
template <class charT, class Traits, class Argument> 
class one_arg_manip_weh {
public:
 typedef void (* manipFct)(basic_ostream<charT,Traits>&, Argument);
 one_arg_manip_weh(manipFct pf, const Argument& arg)
 : pf_(pf), arg_(arg), error_(ios_base::goodbit) {}
private:
 manipFct pf_;
 const Argument arg_;
 ... 
 // befriend every specialization of the helper template do_manip
 template <class C, class T, class A>
 friend void do_manip(basic_ostream<C,T>& str,
                      const one_arg_manip_weh<C,T,A>& oamw);
};

template <class charT, class Traits, class Argument>
void do_manip(basic_ostream<charT,Traits>& str,
              const one_arg_manip_weh<charT,Traits, Argument>& oamw)
{  ... oamw.pf_(str,oamw.arg_); ...  }

template <class charT, class Traits, class Argument>
basic_ostream<charT,Traits>& operator<< 
(basic_ostream<charT,Traits>& os, 
 const one_arg_manip_weh<charT,Traits,Argument>& oamw)
{
 if(os.good())
    do_manip<charT,Traits,Argument>(os,oamw);
 return os;
}

Replacing Function Pointers by Function Objects

The manipulator base class stores a pointer to the manipulation function and the argument and later uses both in the shift operator for performing the actual manipulation by invoking the manipulation function with its argument (indirectly via the helper function do_manip ). Instead of using a function pointer, we could use a function object.

What is a function object? It is an object that can be called. What does it mean to call a function object? It means invocation of a member function of the respective function object type. Often this member function is an overloaded version of the function call operator, which is just a special case that allows for function call syntax. It is not required, and in our examples below, we will use a member function named fct instead.

Where is the advantage over a function pointer? A function object can know how to "call itself." In particular, it can store the parameters that control the manipulation as data members. When we invoke a function through a pointer, we have to provide the function argument separately. In contrast, when we invoke a function object, the invocation is a call to a member function, which has access to the function object's data members. The information that is provided in the form of the function argument (in the pointer solution) can already be contained in the function object as a data member. Let us see how this can be done.
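Before turning to mendl, the difference can be seen in isolation. The sketch below is a minimal illustration with names of our own choosing (repeat_newline, newline_repeater, via_pointer, via_object); none of them appear in the IOStreams library.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Plain function: the repeat count must be supplied with every call.
void repeat_newline(std::ostream& os, unsigned int n)
{
    for (unsigned int i = 0; i < n; ++i)
        os.put('\n');
}

// Function object: the repeat count is stored once, at construction.
class newline_repeater {
public:
    explicit newline_repeater(unsigned int n) : how_many_(n) {}
    void fct(std::ostream& os) const
    {
        for (unsigned int i = 0; i < how_many_; ++i)
            os.put('\n');
    }
private:
    const unsigned int how_many_;
};

std::string via_pointer()
{
    std::ostringstream os;
    void (*pf)(std::ostream&, unsigned int) = &repeat_newline;
    pf(os, 2);          // the caller has to carry the argument along
    return os.str();
}

std::string via_object()
{
    std::ostringstream os;
    newline_repeater r(2);
    r.fct(os);          // the object already knows the count
    return os.str();
}
```

Whoever invokes the function object needs to know only that it has a member function fct taking a stream, not how many or what kind of parameters control the manipulation.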

Here is the old implementation of mendl , where the manipulation function is a static member function that takes one argument, namely the number of times it must print the end-of-line character:

template <class charT,class Traits=char_traits<charT> >
class mendl 
: public one_arg_manip_weh<charT,Traits,unsigned int>
{public:
  explicit mendl(unsigned int i) 
  : one_arg_manip_weh<charT,Traits,unsigned int>(mendl::fct,i)
  { }
 private:
  static void fct (basic_ostream<charT,Traits>& os, 
                   unsigned int n) 
  { for (unsigned int i=0; i<n; ++i)
        os.put(os.widen('\n')); 
    os.flush();
  }
};
Note that the constructor of mendl receives the integral value that will later be passed as a function argument to the static member function mendl::fct. This integral value is handed over to the manipulator base class along with the pointer to mendl::fct.

If we implemented mendl as a function object type, then it could look like this:

 
class mendl : public manipBase {
public:
   mendl(unsigned int n) 
   : manipBase(...), how_many_(n) {}
private:
   const unsigned int how_many_;
public:
   template <class Ostream>
   Ostream& fct(Ostream& os) const
   {
      for (unsigned int i=0; i<how_many_; ++i)
         os.put(os.widen('\n'));
      os.flush();
      return os;
   }
};
The constructor of mendl still receives the integral value that will later be passed as a function argument to the member function fct . But this time, it stores the integral value as a data member and uses this data member in the implementation of its non-static member function fct .

Implementing the New Manipulator Base Class

We've left open so far what the manipulator base class would then look like. Obviously it does not have to store the function argument any longer, since the mendl object knows itself how many times it must print the end-of-line character. All that the base class needs for performing the manipulation is a mendl object whose member function fct performs the manipulation. Instead of providing a function pointer to the base class, we pass the manipulator object itself to the base class:
class mendl : public manipBase {
public:
   mendl(unsigned int n) 
   : manipBase(*this), how_many_(n) {}
private:
   const unsigned int how_many_;
public:
   template <class Ostream>
   Ostream& fct(Ostream& os) const;
};
The manipulator base class must have a constructor that receives the manipulator function object and stores it somewhere. We'll look into the details of storing and accessing the function object later.

Ideally, the manipulator base class should work with all types of manipulators, and this is easily achievable by making it a template that takes the manipulator type as a template parameter.

In addition, the manipulator base class must provide the required shift operator for manipulator objects. As before, we implement the shift operator in terms of a helper function that does the actual work. The helper function was formerly named do_manip; we renamed it to manipulate to distinguish the two. The helper function performs the manipulation by invoking the manipulator function object (that is, its member function fct), which was previously done through the function pointer, and does all the other interesting IOStreams-specific things exactly as before.

Here is a tentative implementation of the manipulator base class manipBase:

template <class Manip> 
class manipBase
{public:
    manipBase(const Manip& m) 
    { // receive manipulator function object and store it somehow
      // details later ... 
    }
    template <class Stream>
    Stream& manipulate(Stream& str) const
    { // use the manipulator function object and call its fct() member function
      // details later ...
    }
 private:
    // private data considered later ...
};
template <class Ostream, class Manip>
Ostream& 
operator<< (Ostream& os, const manipBase<Manip>& m)
{ return m.manipulate(os); }
We introduced another minor change: we reduced the number of template parameters of the shift operator. Previously it had the character type and the character traits type as template parameters; now we use the stream type instead.

Accessing the Manipulator Function Object inside the Manipulator Base Class

Where do we store the manipulator function object? We could try to store it as a data member of the manipulator base class, like this:
template <class Manip> 
class manipBase
{public:
    manipBase(const Manip& m) : m_(m) {}

    template <class Stream>
    Stream& manipulate(Stream& str) const
    { // calls the manipulators's function fct()
      // ... m_.fct() ...
    }
 private:
    Manip m_;
};
This approach does not work because it introduces a circular type dependency. Since manipBase is a template, class mendl would be defined as:
class mendl : public manipBase<mendl> {
public:
   mendl(unsigned int n) 
   : manipBase<mendl>(*this), how_many_(n) {}
private:
   const unsigned int how_many_;
public:
   template <class Ostream>
   Ostream& fct(Ostream& os) const;
};
The derived manipulator class mendl uses class manipBase as its base class passing its own type as template argument to the base class. The compiler cannot resolve the circular dependency between the two types.

If we cannot store the manipulator function object as a data member and use delegation to the data member for performing the actual manipulation, we can alternatively take advantage of the fact that manipBase is a manipulator base class. Its sole purpose is to serve as a base class for manipulator types that have a public member function fct , which performs the actual manipulation.

It is not intended that we ever create objects of the base class type, because it does not provide any meaningful functionality by itself. Its purpose is to decorate the actual manipulation, which is provided by the derived manipulator type, with IOStream-specific error handling. For this reason, we can safely assume that all references to manipBase objects are actually references to derived manipulator objects. For instance, inside the shift operator, which takes a manipBase& argument, we would know that the reference refers to a concrete derived manipulator object. Under these circumstances, we can safely cast down from the base class reference to the concrete derived class reference. With this downcast, we get access to the actual manipulation functionality, namely member function fct . Here is what it looks like:

template <class Manip> class manipBase {
public:
 template <class Stream>
 Stream& manipulate(Stream& str) const
 { ...
   // call Manip::fct()
   static_cast<const Manip&>(*this).fct(str);
   ...
 }
};
template <class Ostream, class Manip>
Ostream& 
operator<< (Ostream& os, const manipBase<Manip>& m)
{ return m.manipulate(os); }
The obvious concern here is: can this crash? After all, this is not even a safe dynamic_cast, but an unsafe static downcast. The answer is: no, it is perfectly safe. The only place where manipBase's member function manipulate is invoked is the shift operator. In this context, manipulate is invoked on a reference to an object of type manipBase itself or of a type derived from it. Otherwise the invocation of the shift operator would simply not compile. If the object is of a derived type, then the downcast is safe.

Can it happen that the referenced object passed to the shift operator is just a base class object, and not a derived class object? Well, would that make sense? The only accessible functionality of class manipBase is its manipulate function, and this member function will only compile if the generic type Manip (the template type argument) has a fct member function. Theoretically, Manip could be a type that is unrelated to class manipBase and has a fct member function, but in that case a reference to an object of the Manip type cannot be passed to the shift operator, and the manipulate function will never be called. Unless you introduce evil reinterpret_casts somewhere, you cannot break this code.
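The pieces discussed so far can be assembled into a compilable sketch. It rests on a few assumptions of ours: the check of str.good() merely stands in for the error handling that the article omits, the constructor argument *this is dropped because this version of the base class stores no state, and crtp_demo is a hypothetical helper used only to exercise the code.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

template <class Manip>
class manipBase {
public:
    template <class Stream>
    Stream& manipulate(Stream& str) const
    {
        // Stand-in for the omitted error handling: skip streams that
        // are not in a good state.
        if (str.good())
            static_cast<const Manip&>(*this).fct(str);  // call Manip::fct()
        return str;
    }
};

template <class Ostream, class Manip>
Ostream& operator<<(Ostream& os, const manipBase<Manip>& m)
{ return m.manipulate(os); }

class mendl : public manipBase<mendl> {
public:
    explicit mendl(unsigned int n) : how_many_(n) {}

    template <class Ostream>
    Ostream& fct(Ostream& os) const
    {
        for (unsigned int i = 0; i < how_many_; ++i)
            os.put(os.widen('\n'));
        os.flush();
        return os;
    }
private:
    const unsigned int how_many_;
};

// Hypothetical helper: the second insertion is skipped because the
// stream has been put into a failed state.
std::string crtp_demo()
{
    std::ostringstream os;
    os << mendl(2);
    os.setstate(std::ios_base::failbit);
    os << mendl(5);
    return os.str();
}
```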

Evaluation

Let's step back and see what we've done. We've been using two interesting techniques here: parameterized inheritance and function objects.

Parameterized Inheritance

Parameterized inheritance is also known as the Curiously Recurring Template pattern, which can be summarized as follows (quoted from the Pattern Almanac [5] ):
A class is derived from a base class instantiated from a template. The derived class is passed as a parameter to the template instantiation. This pattern captures a circular dependency using inheritance in one direction and templates in the other.
Our manipulator base class manipBase and its concrete derived classes, such as mendl, are an example of this pattern: the derived manipulator type mendl passes itself as a template argument to its base class manipBase. The use of the safe static downcast is typical for implementations of this pattern.

You can also think of the Curiously Recurring Template pattern as a way of static polymorphism: Our manipulator base class manipBase exhibits polymorphic behavior (in its member function manipulate ) depending on the type that was used for instantiation. It's a static polymorphism because no virtual functions are involved.
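Stripped of everything related to streams, the static polymorphism of the pattern can be sketched as follows. All names in this example (greeter_base, alice, bob, name, greet) are invented for illustration.

```cpp
#include <string>

// The base class template calls a function that only the derived class
// provides; the call is resolved at compile time, without virtual functions.
template <class Derived>
struct greeter_base {
    std::string greet() const
    {
        return "Hello, " + static_cast<const Derived&>(*this).name();
    }
};

struct alice : greeter_base<alice> {
    std::string name() const { return "Alice"; }
};

struct bob : greeter_base<bob> {
    std::string name() const { return "Bob"; }
};
```

Here greet behaves differently for alice and bob although neither class overrides anything; the behavior is selected by the template argument alone.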

Function Objects

Replacing function pointers with function objects greatly simplified the solution and added a lot of flexibility. For instance, all the knowledge regarding invocation of the actual manipulation is now encapsulated in the function object type only. Previously, the function signature (number and type of arguments) of the manipulation function was known to the base class, and we needed a different base class for every new signature. The same holds for the function arguments. Knowledge of the parameters that control the manipulation is now located in the function object type only. Previously, these parameters had to be passed to the base class. This better separation of concerns frees the base class from various responsibilities. In particular, the base class is no longer a template that depends in any way on the stream type.

Our initial goal was to find a more flexible solution for the manipulator base class that does not require that a manipulator type is a template and allows manipulators with an arbitrary number and type of arguments. The intended usage for manipulators implemented in terms of this new manipulator base class should ideally be as convenient as:

cout << mendl(5);
wcout << mendl(5);
cout << multi('*',80);
where multi is a manipulator with two arguments that inserts a given character n times into an output stream.

Have we achieved this? Yes, we have. Recall the implementation of mendl using the new manipBase class:

class mendl : public manipBase<mendl> {
public:
   mendl(unsigned int n) 
   : manipBase<mendl>(*this), how_many_(n) {}
private:
   const unsigned int how_many_;
public:
   template <class Ostream>
   Ostream& fct(Ostream& os) const
   {
      for (unsigned int i=0; i<how_many_; ++i)
         os.put(os.widen('\n'));
      os.flush();
      return os;
   }
};
Class mendl no longer depends on the stream type or its character type. Only its member function fct depends on the stream type. Before, the entire manipulator class depended on the stream type because the base class depended on it. This was inevitable because the base class had knowledge of the function signature of the manipulation function. With the improved encapsulation this dependency is eliminated.

Regarding improved flexibility: here is the implementation of the multi manipulator with its entirely different signature. (It takes an additional argument.) It can be easily implemented using the same manipulator base class because the base class is not concerned with the specifics of the manipulation any more:

class multi : public manipBase<multi> {
public:
   multi(char c, size_t n) 
   : manipBase<multi>(*this), how_many_(n), what_(c) {}
private:
   const size_t how_many_;
   const char what_;
public:
   template <class Ostream>
   Ostream& fct(Ostream& os) const
   {
      for (size_t i=0; i<how_many_; ++i)
         os.put(os.widen(what_));   // widen, so that wide streams work too
      os.flush();
      return os;
   }
};
With the original manipulator base class, we would need an additional base class two_arg_manip_weh that can store two arguments instead of just one and accepts a different function pointer type. Again, better encapsulation of concerns simplifies the solution and adds flexibility to boot.
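To see the two-argument manipulator at work on both narrow and wide streams, here is a self-contained sketch. The base class is reduced to the downcast alone, with no error handling, and the helpers narrow_rule and wide_rule are hypothetical names of ours.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

template <class Manip>
class manipBase {
public:
    template <class Stream>
    Stream& manipulate(Stream& str) const
    { return static_cast<const Manip&>(*this).fct(str); }
};

template <class Ostream, class Manip>
Ostream& operator<<(Ostream& os, const manipBase<Manip>& m)
{ return m.manipulate(os); }

// Two arguments, same base class: the base knows nothing about them.
class multi : public manipBase<multi> {
public:
    multi(char c, std::size_t n) : how_many_(n), what_(c) {}

    template <class Ostream>
    Ostream& fct(Ostream& os) const
    {
        for (std::size_t i = 0; i < how_many_; ++i)
            os.put(os.widen(what_));
        os.flush();
        return os;
    }
private:
    const std::size_t how_many_;
    const char what_;
};

std::string narrow_rule()
{
    std::ostringstream os;
    os << multi('*', 5);
    return os.str();
}

std::wstring wide_rule()
{
    std::wostringstream os;
    os << multi('-', 3);
    return os.str();
}
```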

Summary

In this article, we have studied how modern programming techniques greatly simplify programming tasks such as implementation of stream manipulators. We replaced function pointers by function objects, thus gaining a great deal of flexibility, while simplifying the solution at the same time. And we did not lose a bit in efficiency, by the way. We replaced polymorphism via function pointers with static polymorphism via templates using the Curiously Recurring Template pattern. We haven't been using any exotic template features for achieving this. We don't get any more adventurous than using a member template (the base class's manipulate member function is a template), but this feature is supported by most compilers these days. In essence, nothing should prevent you from using this solution.

Acknowledgements

We would like to give credit to our fellow columnist Kevlin Henney, whose email inspired us to write this article. He sent us email last summer as a reaction to our two manipulator articles and suggested an approach very similar to the one discussed in this article, where he creatively mixed various state-of-the-art techniques, including member function templates, function object types, overloaded function call operators, and parameterized inheritance. For those who are interested in his ideas, they are available at http://www.AngelikaLanger.com/IOStreams/forum.htm .

References

[1] Klaus Kreft and Angelika Langer. "Implementing Manipulator With and Without Parameters Using the Standard IOStreams," C++ Report , April 2000.

[2] Klaus Kreft and Angelika Langer. "A More Refined Method for Implementing Manipulator with Parameters," C++ Report , June 2000.

[3] Angelika Langer and Klaus Kreft. Standard C++ IOStreams and Locales (Addison Wesley, January 2000).

[4] James O. Coplien. "Curiously Recurring Template Patterns," C++ Report , February 1995.

[5] Linda Rising. The Pattern Almanac 2000 (Addison-Wesley, 2000).
 
 
 

 

  © Copyright 1995-2007 by Angelika Langer.  All Rights Reserved.    URL: < http://www.AngelikaLanger.com/Articles/Cuj/05.Manipulators/Manipulators.html  last update: 10 Aug 2007