AngelikaLanger.com - Combining OO Design and Generic Programming

Angelika Langer - Training & Consulting

	HOME


	OVERVIEW BY TOPIC JAVA C++ BY COLUMN EFFECTIVE JAVA EFFECTIVE STDLIB BY MAGAZINE JAVA MAGAZIN JAVA SPEKTRUM JAVA WORLD JAVA SOLUTIONS JAVA PRO C++ REPORT CUJ OTHER
	GENERICS
	LAMBDAS
	IOSTREAMS
	ABOUT
	CONTACT

Combining OO Design and Generic Programming

Combining OO Design and Generic Programming
Combining OO Design and Generic Programming

C++ Report, March 1997
Klaus Kreft & Angelika Langer

As an attentive reader of C++ Report it will not have escaped your attention that during the past months there have been three articles explaining the structures and techniques related to iterators in the Standard C++ Library. Well, those were feature articles preceding a regular column on the Standard C++ Library.

We have not yet had the opportunity to introduce you to the why-and-what of this column. Let's make up for this omission now. First of all - its name. We suggested to call the column "Effective Standard C++ Library" in hommage à Scott Meyers and his "Effective C++" books, which we do like quite a bit. Secondly, before we start to explain the intention of the column, let us have a short look at what we mean by Standard C++ Library. The diagram below is meant to be sort of a road map. It gives you an overview of the content and structure of the Standard C++ Library. It will reappear in future columns to help you keeping track of where we are, i.e. we want to prevent you from losing orientation in the library jungle.

So, here is what the Standard C++ Library comprises:

Data Structures and Algorithms are based on a proposal made by Alexander Stepanov and Meng Lee of Hewlett-Packard. The proposal was submitted to the ISO/ANSI committee and accepted as part of the standard in 1994. Early versions of it were made publicly available as the STL, the Standard Template Library. Standardization proceeded; many refinements were suggested and voted in. Today we have two diverging libraries: we have the standardized data structures and algorithms in the Standard C++ Library in parallel to the old, original STL that is available as public domain software. Both libraries comprise various containers like vector, list, map, and set, quite a number of algorithms, and the iterators, which we discussed at length in the preceding three articles. The design of these libraries is a demonstration of generic programming, a novel programming paradigm that separates data structures from algorithms. Function objects are another category of components in the Standard C++ Library; they are used as parameters to data structures and operations and are a powerful means in generic programming. Allocators were added later on, in order to make the data structures more flexible and adaptable to different memory allocation schemes. We will tell you more about them in the future.

Internationalization is supported by the Standard C++ Library by means of the locale class and its facets. This component was designed and suggested by Nathan Myers in 1994. Major design issues were discussed and resolved until 1996. Traditionally, internationalization support was offered by the standard C library. However, the C library is of somewhat limited use, because its locale is a global one, which introduces problems in multilingual environments where multiple locales are needed. The locale framework introduces the notion of facets . A facet maintains a certain set of internationalization services, e.g. numeric facets hold information about the representation of the decimal point, the thousand separator, etc., and knows how to parse and format numbers. The locale framework serves as a container of facets. The framework can be extended; new user defined facet types can be created and new facets objects can be added to a locale object.

Streams Input and Output is the next generation of IOStreams. Since the old days of C++, compilers have come with an IOStreams library. Cfront, which was one of the first C++ compilers, was delivered with an IOStreams library that was designed by Jerry Schwarz at AT&T. Other compiler vendors recognized this IOStreams library as a kind of de-facto standard and offered IOStreams libraries with similar interfaces and behavior. Naturally, Jerry Schwarz was one of the people who contributed substantially to the new IOStreams component for the Standard C++ Library. The new standard IOStreams is a templatized and internationalized component for text and binary input and output. It aimed for staying compliant with the notion and design of the traditional IOStreams. However, fundamental changes were introduced over the time. The most significant was the decision to templatize the IOStreams classes on the character type. The intent was to extend the IOStreams framework so that it does not only work with one-byte-character streams, but also with wide character streams. Another logical consequence was to equip streams with a locale object in order to enable streams to adjust their behavior to local conventions such a numeric formatting rules or character conversion necessities.

Miscellaneous comprises everything else. There is a string class. There is a hierarchy of exception classes, some of which are thrown by the runtime system of the language itself; the operator new() for instance throws a bad_alloc exception if it cannot make any memory available. Other standard exceptions are thrown by library components, such as out_of_range , which is raised by the string class's operator[]() for instance in case the index is greater than the string's size. The hierarchy itself already suggests how the exception should be handled and thus can serve as a base for your own exception classes. For example there is a distinction between run time errors and logical errors. Logical errors are bugs in your program's logic. Runtime errors cannot be anticipated, such as memory shortage for instance.

Another set of classes is for numeric problems; there is a complex number class template and a numeric array. The numeric array uses sophisticated template idioms, so-called expression templates, in order to serve as a high-performance building block for matrices and multidimensional arrays.

What will the column be about?

There is already a lot of printed material available that describes the functionality of the Standard C++ Library and the STL. For example the following books: [i] , [ii] , [iii] , [iv] , or [v] . The majority of those books are about the STL. [i] , [ii] , and [iii] come close to a user's guide and a reference manual to the STL; they show you the public interface of, say, the vector<> class template, or how to apply a merge algorithm. [iv] goes beyond this descriptive approach and demonstrates how one can build more complex data structures on top of the STL containers. [v] covers the entire Standard C++ Library. The focus still is on data structures and algorithms; stream input/output and internationalization are only described in brevity.

The intention of our column, however, is different. It will not be about the plain usage of the library's components, but will focus on explaining the idioms that are used in the design of the library. Take for example the traits technique that we described in one of our former articles [vi] . This is an idiom used in the Standard C++ Library, which can be applied whenever types should be associated to a class and built-in types. Such an idiom is used in the Standard C++ Library on the one hand, but is also helpful to any C++ designer.

Besides the explanation of idioms from the library we also want to show concrete tricks and techniques that are needed to effectively use and extend the Standard C++ Library framework. Our latest article [vii] , where we built a safe iterator, is a general assembly plan for an iterator that complies to the standard library framework. It explained how to extend the Standard C++ Library in adding a new iterator type.

Generic Programming and Its Impact on Software Design

However, before we will delve into the details of Standard C++ Library in future contributions to this column, let us first spend a couple of thoughts on how the philosophy and design of the library will influence the way we design and develop our applications. This time the considerations will focus on application software design rather than implementation techniques used in the Standard C++ Library. Let's restrict the discussion on the most popular part of the library, the data structures and algorithms from the Standard C++ Library and, in analogy, the STL.

Aspects of Generic Programming

The data structures and algorithms in the Standard C++ Library, and the STL, are a demonstration of a programming paradigm called generic programming . The key abstractions of the generic components in the Standard C++ Library are containers , algorithms , and iterators . Containers hold the data. Containers also export pointer-like objects, so-called iterators, that give access to the data elements held in a container. Algorithms do the work. They accept iterators and iterator ranges as arguments and use them to access a container and perform certain operations on the container. In principle there is a clear separation between data and operations on the data; the glue between both elements are the iterators.

Here's an example, a find algorithm applied to a list container:

list<string> Friends; // 
container


// populate the list

list<string>::iterator begin = Friends.begin() // 
iterator
 to the begin of the container
list<string>::iterator end = Friends.begin()   // 
iterator
 to the end of the container

// 
algorithm
 find() search for "Philippe" in the container
if (find(Friends.begin(),Friends.end(),"Philippe") != Friends.end())
cout << "Philippe is a friend of mine" << endl;

There are two types of genericity here.

First, there is the separation of data structures and algorithms. An algorithm like find() for instance is called a generic algorithm because you can apply it to any kind of container that fits into the Standard C++ Library or STL framework. Decoupling of containers and algorithms via iterators facilitates the generic behavior of the elements from the Standard C++ Library.
Secondly, generic components are implemented by using C++ templates. The find() algorithm for instance is a function template ; one of its template parameters is the iterator type. The list container is a class template ; one of its template parameters is the type of elements it holds. Most iterators are class templates too; they take the value type as one of their template arguments.

In sum, we achieve genericity by means of a design idea, i.e. separation of data structures from algorithms, and by use of programming techniques supported by the programming language, i.e. C++ class and function templates.

Although generic programming uses classical object-oriented C++ language features such as class (template) declarations, it is not object-oriented. How does generic programming contrast to object-oriented ideas?

In object-oriented programs abstractions are expressed by means of base classes.

In generic programs the abstractions are described in terms of formal, yet verbal requirements. Examples of such requirements are:

A container must provide certain iterators.
An iterator must provide certain operations, such as increment and dereference.
An algorithm must work on iterator ranges.

Most of these requirements are expressed in C++ by implicit requirements to the template arguments of a generic component. A call to find() for instance will not compile if you provide it with iterators that do not have an increment operation:

template <class Iterator, class T>
Iterator find(Iterator first, Iterator last, const T& value)
{
   while (first != last && *first != value)
          first++; 
// Iterator::operator++() used !

   return first;
}

We note that a significant part of the generic programming paradigm can be expressed in terms of C++ language constructs; other parts cannot be expressed by means of the programming language. There are requirements to containers, iterators, and algorithms that are not expressed by means of language features and hence will not even be detected at compile or template instantiation time. An example is the requirement that a pair of iterators is supposed to describe a contiguous range of elements, i.e. the second iterator must be reachable by incrementing the first iterator. Violations of this requirement are not detected until runtime.

Although both paradigms, object-oriented programming and generic programming, can be expressed in C++ they are fundamentally different. What does this difference mean for the design of applications that use generic components?

Design Elements of Object-Oriented Software

It is common wisdom that you do some design before you start coding whenever you develop software for a system that has a certain degree of complexity. Depending on various factors, such as your budget, the formal requirements for your project, the amount of people that have to understand the design, you can either scribble some notes on a piece of paper, or use a formal design method instead. In a project where numerous people have to understand each other's design, it is good practice to use one of the popular formal object-oriented design methods: Booch Method, Fusion, OMT, ...

What aspects of your software design do you usually model in a object-oriented design language? Typical elements are:

A static class model. It shows the classes, their interfaces, and their relationships (inheritance, containment, associations,...), other details such as the protected and public interface, cardinalities of containment relationships, roles of associations, and so on.
A dynamic model. It shows the interaction of objects in representative scenarios, i.e. it shows typical sequences of calls to (public) member functions.
A state diagram , if needed. It shows the potential states of a class and the state transitions triggered by events.

Software Design and Generic Programming

What do I have do design in a generic program in contrast to an object-oriented program? Can I use object-oriented design methods for modeling a generic program? Well, not really ....

First, not all elements of your generic program are classes. Algorithms for instance play an important role in generic programming, and they are function templates, not classes. If you want to reflect the existence of algorithms, you can be happy if your design method allows free code, as the Booch method does. In some design methods you will not even be able to express function templates in an appropriate way.
Secondly, classes in generic programming are mostly unrelated. There is no inheritance among containers, iterators, or algorithms. There are only few containment relationships among containers, iterators, and algorithms. It's true, containers have an allocator , and associative containers have a comparitor. However, the main classes do not contain each other
Also, class interfaces do not reflect all operations of a generic component. Take the interface of a vector for instance; you will notice that a vector provides iterators. But it is not obvious that you are allowed to use these iterators in a call to a sort -algorithm. Still, it is a property of a vector that it can be sorted via the sort-algorithm. More generally, you cannot see which operations (or algorithms) are applicable to a certain container. In an object-oriented program the sort-algorithm would most likely be a member function of the vector class. Hence the relationship between the container and the algorithm would be visible and could be expressed by means of an object-oriented design method. Since in generic programming many relationships are expressed in terms of implicit requirements to template arguments there is no way of modeling the relationships between elements in a generic program by means of an object-oriented design method.

In sum, object-oriented design is not ideal for generic programming. This does not really come as a surprise because object-oriented design methods essentially capture the semantics of object-oriented programming; that's what they are for.

Using Generic Programming for Application Development

Does that mean we have to wait until someone some day will invent a "generic programming design method", before we will be able to use the Standard C++ Library or the STL in a project that has a complexity which requires design? To answer this question let's have a broader look at the way we can use generic programming and generic components in a C++ project. We can conceive of the following scenarios:

Scenario 1 : Design your application as a generic program.

You would apply the generic programming paradigm to your entire application, i.e. you would design your application as a generic program that consists of various generic components that can be plugged together.

The C++ standard already defines a set of generic components - the data structures and algorithms from the Standard C++ Library. It would certainly be wise to make your abstractions compatible to the standard framework. Now, think of the main elements of generic programming in the Standard C++ Library - containers, algorithms, iterators. Adopting the generic programming paradigm from the Standard C++ Library for an entire application basically means that you have to categorize all significant elements of your software to be either an algorithm, or an iterator, or a container. Is it conceivable to design an entire application in terms of just 3 different abstractions?

It is not as restrictive as is sounds. The notion of these main elements can be broadened to some extent. Take the abstraction of a container for instance. A container in essence is an object that can provide iterators to let algorithms operate on itself. Hence, a container need not necessarily be a typical 'container' data structure such as a list or a queue. Generic algorithms can be applied to the istream from the Standard IOStreams via istream_iterator 's, too.

Also, the generic programming paradigm does not impose any limitations regarding the type or number of abstractions. You are free to invent novel generic abstractions. Allocators are an example of this; they were introduced to the STL in the process of making the STL part of the Standard C++ Library. Allocators added another dimension of genericity to containers, see []. You can also contrive a novel set of requirements to algorithms; you might want to add algorithms that accept containers as arguments rather than iterator ranges. Examples for that can be found in Graham Glass's column "STL in Action".

In principle, it is conceivable that you use the ideas of generic programming in your application design, broaden the standard abstractions and add new generic elements. Still, can you imagine that all elements of your software can be squeezed into the pattern of either algorithms, or iterators, or containers, and a few other generic elements? We think that this will be very rarely true. In some cases it is true, in particular for foundation libraries that provide data structures and algorithms. In those cases, where the generic programming paradigm fits, your design is more or less done, because you already know how all elements of your design can be classified into certain generic abstractions and how they will collaborate. You would not need to do any elaborated design anymore; you would just stick to a well-defined classification of generic components.

Scenario 2 : Design your application as an object-oriented program that uses the generic components .

In this case you will apply classical object-oriented design techniques, and then build this object-oriented application on top of the generic components from the Standard C++ Library or the STL. In principle there are two possibilities for doing this:

You can either use the generic components directly, or
you introduce a middle layer of object-oriented base components that are built on top of the generic components from the Standard C++ Library or the STL.

If you use the generic components directly, i.e. without a middle layer, these generic components will show up in your object-oriented design model. One reasonable way of integrating them is to model the standard containers as classes, and all standard algorithms that can be applied to a container as methods of the respective container class. (This is different from the generic approach where containers and algorithms are more decoupled.) Basically you would fake an object-oriented container model.

In your object-oriented design the application domain objects will have use- and containment-relationships to the faked object-oriented container classes.

However, for purpose of implementation, your application domain classes you will ultimately have to use the generic components as they are: as containers with separate algorithms.

An alternative possibility is to really implement the object-oriented containers from our design model above, i.e. you will implement a middle layer on top of the generic components.

In your object-oriented design the application domain objects will have use- and containment-relationships to the object-oriented container classes from the middle layer.

Different from the approach suggested above, you will also use these object-oriented containers in your implementation. You will not have to switch back to generic programming then.

The generic components from the Standard C++ Library will be hidden behind an object-oriented wrapper and will serve solely as a portability layer. They will be invisible both in design and implementation.

Conclusion

There is a mismatch between object-oriented design / object-oriented programming on the one hand and generic programming / use of generic components on the other hand. You cannot appropriately express generic programming in a an object-oriented design. Either you refrain from doing object-oriented design (scenario 1) and stick to the generic programming paradigm. Or you do object-oriented design, fake an object-oriented container model for that purpose, and accept that your implementation does not exactly match the design (scenario 2a). Or you hide the generic components under an object-oriented layer that you can include both in your design and implementation (scenario 2b). All three alternatives are viable approaches. However, generic programming seems to be suitable for a smaller number of application domains compared to object-oriented programming, which is a true general purpose programming paradigm that fits many needs. For this reason we doubt that basing larger projects solely on generic programming is reasonable. There is evidence that object-oriented programming helps mastering large projects and there is no reason for throwing over board this realization. In any case, C++ is a multi-paradigm programming language and techniques and components from its Standard Library are useful in many cases. We will continue exploring them in further columns.

Stereotypes - A Mechanism for Metaclassification

The Unified Modeling Language (UML), which is currently under definition at Rational Software, introduces a concept called stereotypes in its latest version, which is UML V0.91 (see [ix] for reference). Stereotypes address the problem of metaclassification of an element in UML. An example of an element that needs metaclassification is an exception class; it has the special property that it should generally only be thrown or caught, but cannot be used in a different context: No other use of an exception class is intended. In classifying an exception class as a «exception» stereotype allows to express these special semantics. There are predefined stereotypes in UML; «exception» is one of them. However, UML allows users to add new kinds of stereotypes along with a specification of their semantics. Stereotypes are a means that allows end users some degrees of freedom in tailoring the modeling language to their needs.

We can imagine that stereotypes will help modeling generic components. For instance, we can conceive of semantic specifications for «container», «iterator», and «algorithm» stereotypes, which then can be used for modeling the generic data structures and algorithms from the Standard C++ Library and the STL. That means it looks like UML can serve as a design language that can handle the mix of object oriented design and generic programming elements on one hand and, on the other hand, can be used for a pure generic programming design.

However, this is a look into the crystal ball; stereotypes cannot yet be considered common practice in software design since Unified Modeling Language V1.0 is not yet born. Still, stereotypes might become the "generic programming modeling language" as well as a way to integrate object oriented design and generic programming.

References

Books ... for those who enjoy some German reading once and a while ...

[i] Mark Nelson
C++ Programmer’s Guide
to the Standard Template Library
IDG Books Worldwide, 1995
ISBN 1-56884-314-3 [iv] Ulrich Breymann
Die C++ Standard Template Library
Addison Wesley Longman, 1996

[ii] David R. Musser and Atul Saini
STL Tutorial and Reference Guide
Addison-Wesley Publishing, 1996
ISBN 0-201-63398-1 [v] Nicolai Jossuttis
Die C++ Standardbibliothek
Addison Wesley Longman, 1996

[iii] Graham Glass and Brett Schuchart
The STL<Primer>
Prentice Hall, 1995

Articles URLs

[vi] Klaus Kreft and Angelika Langer
Iterators in the Standard C++ Library
C++ Report, Nov.-Dec. 1996 [ix] Rational Software, Inc.
Unified Modeling Language V0.91 Addendum
http://www.rational.com/ot/uml.html

[vii] Klaus Kreft and Angelika Langer
Building a Safe Iterator
C++ Report, February 1997

[viii] Bjarne Stroustrup
Making a vector fit for a standard
C++ Report, October 1994

If you are interested to hear more about this and related topics you might want to check out the following seminar:

Seminar

Effective STL Programming - The Standard Template Library in Depth
4-day seminar (open enrollment and on-site)

IOStreams and Locales - Standard C++ IOStreams and Locales in Depth
5-day seminar (open enrollment and on-site)