Iterators in the Standard C++ Library
Iterators in the Standard C++ Library
C++ Report, Nov./Dec. 1996
Klaus Kreft & Angelika Langer
THE ISO/ANSI STANDARD C++ library will contain a set of general-purpose
data structures and algorithms. They were proposed to the standards committee
by Alexander Stepanov and Meng Lee of Hewlett-Packard in 1994. The proposal
was based on what is known as the Standard Template Library (STL). The
committee scrutinized the proposal, found it sound and convincing, and
voted it in. Thus, the STL became part of the Standard C++ Library.
At the time the STL was proposed to the committee, it was made
available to the public, thanks to Hewlett-Packard. C++ programmers started
to experiment with it and use it. Independently, the standardization process
continued; changes and additions were made since 1994. Today the containers
and algorithms in the Standard C++ Library depart from the original they
stem from. Nowadays we have both: the original STL, in parallel to the
Standard C++ Library.
In a previous article
[1]
, we introduced you
to the design of iterators in the STL; it was a tutorial on compile-time
polymorphism. Compile-time polymorphism allows the C++ compiler to select
which of your functions to invoke based on specific properties of their
parameters without introducing the runtime overhead of inheritance and
dynamic binding.
Until March 1996, iterators in the STL and iterators in the Standard
C++ Library were almost identical. Today they differ. Bjarne Stroustrup
proposed an alternative design for iterators in the Standard C++ Library,
which was accepted by the standards committee.
In this article we will explain the new design of iterators in the Standard
C++ Library. We want to introduce you to several general-purpose generic
programming techniques and idioms that are used in the standard for building
iterators. However, these techniques and idioms are not limited to this
particular domain. In subsequent issues of the C++ Report, we'll continue
to cover STL and the Standard C++ library in our new column entitled "Effective
C++ Standard Library."
BACKGROUND
Let us start with a short recap: What are iterators
in STL and Standard C++ Library?
Iterators
Iterators are pointer-like objects that allow programs
to step through the elements of a container sequentially without exposing
the underlying representation. Iterators can be advanced from one element
to the next by incrementing them. Some iterators can also be decremented
or allow arbitrary jumps from one element to another, as we will see later.
When they are dereferenced, iterators yield a reference to a container
element. In addition, they can be compared to each other for equality or
inequality.
Iterators interact seamlessly with built-in C++ types. In particular,
native C++ pointers are treated as iterators to C++ arrays. Naturally,
all containers in the Standard C++ Library define an iterator type, i.e.,
a nested type iterator that represent their respective pointer-like type.
Iterator Categories
Iterators fall into categories. This is because
different algorithms impose different requirements on an iterator they
use. For example, the find() algorithms needs an iterator that can be advanced
by incrementing it, whereas the reverse() algorithm needs an iterator that
can be decremented as well, etc. Ultimately, there are five categories
of iterators in STL and Standard C++ Library:
-
input iterators
-
output iterators
-
forward iterators
-
bidirectional iterators
-
random access iterators
For detailed references consult an STL or C++ Standard Library
manual or take a look at one of the publications we have listed as references.
[2-6]
The first sidebar gives a brief overview of Iterator categories.
An iterator category is an abstraction. It represents a set
of requirements to an iterator.
Using the knowledge of an iterator's category one can provide
optimized implementations of an algorithm. The advance() operation is an
example. It increments (or decrements for negative n) an iterator.
template <class Iterator, class Distance>
inline void advance (Iterator& i, Distance n);
Obviously, there are many ways to do this. For a C++ array
one would simply perform pointer arithmetic, i.e., add n to the C++ pointer.
i += n;
For a list, Iterators must step through the sequence and advance
step-by-step.
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
Observation 1
: The iterator category, which is an abstraction
that represents a set of requirements to an iterator, is information related
to an iterator. It is useful for providing optimized versions of an operation
like advance();
Certain type information is related to an iterator. One of the questions
that must be answered when designing the iterators in STL and the Standard
C++ Library is: how can information related to a type be expressed in C++?
Let's defer this discussion for a moment until we discuss other information
related to an iterator.
TYPES ASSOCIATED WITH AN ITERATOR
There are two types that might vary depending on the iterator type:
The Distance Type
An operation like advance() obviously needs
an argument that indicates how far to advance the iterator:
template <class Iterator, class Distance>
inline void advance (Iterator& i, Distance n);
The type of this distance argument must represent the distance between
any two iterators. Hence the distance type depends on the iterator type.
For C++ pointers the distance type is the C++ type ptrdiff_t, which can
represent the differenc between any two C++ pointers. Also, ptrdiff_t is
the distance type of all other iterators in STL and Standard Library. However,
the distance type in STL and the Standard C++ Library is not limited to
ptrdiff_t.
ITERATOR CATEGORIES
There are five categories of Iterators in STL and the Standard
C++ Library:
-
Input iterators allow algorithms to advance the iterator and give "read
only" access to the value.
-
Output iterators allow algorithms to advance the iterator and give "write
only" access to the value.
-
Forward iterators combine read and write access, but only in one direction
(i.e., forward).
-
Bidirectional iterators allow algorithms to traverse the sequence in both
directions, forward and backward.
-
Random access iterators allow jumps and "pointer arithmetics."
Each category adds new features to the previous one. The iterator
categories obey the following order:
The Value Type
An iterator can be dereferenced. It then
returns a reference to a value stored in a container. The type of this
referenced value also depends on the respective iterator. For example,
if the iterator refers to a container holding integers, the value type
will be int. More generally, if the iterator refers to a container that
stores elements of an arbitrary type T, the value type will be T.
Observation 2
: Each iterator has two related types, its value
type and its distance type.
So, we have found more information related to an iterator type. What
is this information used for?
Value type and distance type are sometimes needed to implement algorithms.
In STL and Standard C++ Library algorithms are separated from containers,
i.e., an algorithm takes an iterator and uses it to access the container.
No information about the container itself is available to an algorithm.
This clear separation of containers and algorithms is the basic idea of
Generic Programming, which is the key design idea behind the STL.
To implement algorithms only in terms of iterators it is often necessary
to infer both the value type and the distance type from an iterator. Here
is an example of a fictitious reverse algorithm:
template <class BidirectionalIterator>
void reverse(BidirectionalIterator first, BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{ value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
Note that distance_type and value_type aren't declared anywhere.
How do we know what they are? We need to find a way to make available the
distance type and value type associated to the iterator type.
Also, we want to implement optimized versions of operations like advance()
for different iterator categories. Here we would want to make use of the
related iterator category information. Assuming the category could be represented
as a C++ type a tentative implementation could look like this:
template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
__advance(i, n, iterator_category);
}
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i, Distance n, random_access_iterator_category)
{
i += n;
}
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i, Distance n,
bidirectional_iterator_category)
{
if (n>=0)
while (n--) ++i;
else
while (n++) --i;
}
Again, it is not entirely obvious how we can represent and
eventually find the category information associated to an iterator type.
We'll explain how this is done shortly.
THE REQUIREMENTS
Basically, the requirements to the design of
iterators in the STL and the Standard C++ Library are:
Algorithms shall be generic
, i.e., they shall be implemented
only in terms of iterators. Hence we need to deduce type information associated
with an iterator type, like the value and distance type, directly from
the iterator type itself.
Operations shall be optimized if possible
. In other words, for
reasons of efficiency we want to implement polymorphic operations that
exhibit different behavior depending on some property of their argument
types. For example, different versions of the advance() function, optimized
for each iterator category, should be available.
There are additional requirements that stem from other design principles
of the STL and the Standard C++ Library:
Everything must be resolved at compile-time rather than at runtime
.
The goal is to make components as fast as possible at run-time. Hence,
the function dispatch of polymorphic operations mentioned above can be
performed at compile-time.
C++ pointers are iterators on a C++ array
. They stand on equal
footing with all other iterator types. Hence, whatever solution we choose,
it must work for C++ pointers as well.
The way in which information is associated to an iterator type is
generic
. For example, the fact that an iterator is associated to a
certain value type has nothing to do with the fact that the iterator falls
into a certain category.
In the July 1996 issue,
[1]
we explained the
solution to the above requirements that was used in the original STL. Now,
we'll explore the solution chosen for the Standard C++ Library.
THE SOLUTION CHOSEN FOR THE STANDARD C++ LIBRARY
In the Standard
C++ Library iterators have
traits
. The traits technique is a means
to associate information with class types as well as with nonclass types.
See the second sidebar for more information about the traits technique
in general. Here are the iterator traits:
template <class Iterator>
struct iterator_traits
{
typedef Iterator::distance_type distance_type;
typedef Iterator::value_type value_type;
typedef Iterator::iterator_category iterator_category;
};
The iterator traits are a class template to be instantiated
with an iterator type that provides all the information associated with
the respective iterator type as typedefs. Therefore, all iterators must
define nested types, like Iterator::value_type, for each piece of related
information. The value and distance type of an iterator are types in the
sense of the C++ type system. Similarly, the information about the iterator
category is represented by C++ types as well: There are
tag classes
for each of the iterator categories. They are empty structures named input_iterator_tag,
output_iterator_tag, and so on.
Since native C++ pointers can be treated as iterators, we need traits
for C++ pointers, as well. This is provided as a
partial specialization
of the iterator_traits template:
template <class T>
struct iterator_traits<T*>
{
typedef ptrdiff_t distance_type;
typedef T value_type;
typedef random_access_iterator_tag iterator_category;
};
Now let us see how iterator traits help to meet the requirements. Here
is our ctitious reverse function again:
template <class BidirectionalIterator>
void reverse(BidirectionalIterator first, BidirectionalIterator last)
{
distance_type n = distance(first, last);
--n;
while(n>0)
{ value_type tmp = *first;
*first++ = *--last;
*last = tmp;
n-=2;
}
}
Now that we have iterator traits it is easy to determine the mysterious
distance and value type from the iterator type: The distance type is iterator_traits
<BidirectionalIterator> ::distance_type, and the value type is iterator_traits
<BidirectionalIterator> ::value_type.
Also, we wanted to offer versions of the advance() function that are
optimized for each iterator category. Here is how this function dispatch
is done using iterator traits:
template <class InputIterator, class Distance>
inline void advance (InputIterator& i, Distance n)
{
__advance(i, n, iterator_traits<InputIterator>::iterator_category());
}
template <class RandomAccessIterator, class Distance>
inline void __advance (RandomAccessIterator& i, Distance n, random_access_iterator_tag)
{
i += n;
}
template <class BidirectionalIterator, class Distance>
inline void __advance (BidirectionalIterator& i, Distance n, bidirectional_iterator_tag)
{
if (n >= 0)
while (n--)
++i;
else
while (n++)
--i;
}
The global advance() function calls the overloaded versions of an __advance()
function, which takes an iterator tag as function argument. The iterator
tags are of different types. Each type of iterator tag represents an iterator
category. Consequently, there are five types of iterator tags:
-
input_iterator_tag
-
output_iterator_tag
-
forward_iterator
-
bidirectional_iterator_tag
-
random_access_iterator_tag
These are empty types; their only purpose it to allow the above
described function overloading.
Note that function overloading is always resolved at compile-time. No
runtime overhead is imposed.
In addition, note that there are iterator traits for C++ pointers. Thus,
the solution works for C++ pointers as well, as stated in the original
requirements we listed above.
COMPARING THE STL TECHNIQUE TO THE STANDARD LIBRARY TECHNIQUE
The solution using iterator traits meets the requirements listed above.
So did the solution in STL. What did we gain? Why was it changed? The iterator
traits have an edge over the old solution. Let's see why.
Consider the count algorithm:
template<class InputIterator, class T>
iterator_traits<InputIterator>::distance_type
count (InputIterator first, InputIterator last, const T& val)
{
iterator_traits<InputIterator>::distance_type n = 0;
while (first != last)
if (*first++ == val)
++n;
return n;
}
The distance type is used here as the return type of an algorithm. Without
iterator traits the count algorithm cannot deduce its return type from
its iterator type. This explains why the count algorithm in STL looks like
this:
template <class InputIterator, class T, class Distance>
void count (InputIterator first, InputIterator last, const T& value, Distance& n)
{
while (first != last)
if (*first++ == value)
++n;
}
The distance type is explicitly specified by an additional
template parameter. As template parameters can only be automatically deduced
from the function argument types and never from the function return type,
the result is returned via a fourth function argument. This way a call
to the count algorithm in STL is error-prone and looks counterintuitive:
list<string> lst;
int i = 0; // you'd better not forget to initialize !!!
count(lst.begin(),lst.end(),"Hello",i)
Compare to the call of the count algorithm in the Standard C++ Library:
list<string> lst;
int i = count(lst.begin(),lst.end(),"Hello")
Conclusion: The most convincing advantage of iterator traits compared to
the old technique is that types associated to an iterator type can be deduced
from the iterator alone. Hence associated types can be used as return types
of function arguments, which improves the count algorithm.*
Of course there is more to iterator traits. They improve the design
of iterators in the general. Consider for instance the way the iterator
category is retrieved in STL. It is via the global function
iterator_category().For each type of iterator, including C++ pointers,
there is an overloaded versions of iterator_category(). There is one version
taking C++ pointers, and five function templates taking iterator types
that are derived from certain iterator base classes. For sake of brevity,
we will spare you the details of those base classes. See our July '96 article
for reference
[1]
.
template <class T, class Distance>
inline input_iterator_tag iterator_category
(const input_iterator<T,Distance>&)
{
return input_iterator_tag();
}
// etc. for output_iterator, forward_iterator,
// and bidirectional_iterator
template <class T, class Distance>
inline random_access_iterator_tag iterator_category
(const random_access_iterator<T,Distance>&)
{
return random_access_iterator_tag();
}
template <class T>
inline random_access_iterator_tag iterator_category
(const T*)
{
return random_access_iterator_tag();
}
The iterator_category() functions are empty. Their only purpose is to allow
the selection of the associated iterator tag type, which is then used for
function overload resolution.
It is certainly more intuitive to do a compile-time function dispatch
via a type, i.e., iterator_traits <Iterator> ::itertor_category, opposed
to the return type of a function, iterator_category(). It somehow suggests
that the function has "functionality," i.e., performs something at runtime,
which in fact it doesn't.
SUMMARY
This article is the second in a series of articles on
the iterators in STL and the Standard Library. It described iterator traits,
which is the technique used to design iterators in the Standard C++ Library.
It is a general-purpose technique for association of types to other types;
it meets the following requirements:
The technique must allow a C++ compiler to
deduce the associated type only from the type, i.e., without requiring
any other information or mechanism,
perform the deduction at compile-time so that it can be used for function
overloading, and
associate types with nonclass types.
We compared this technique to the one used in the STL. The iterator
traits technique is more elegant and less baroque than the STL technique.
Also it allows us to improve the interface of certain algorithms.
Our forthcoming column in C++ Report will demonstrate the use of both
techniques in terms of an example-we will illustrate how you can build
a safe iterator adaptor.
THE TRAITS IDIOM
Idioms for the use of templates in C++ are still evolving.
One reason might be that, for a long time, templates were only seen as
a mechanism to implement C++ containers in a generic way. Coplien,
[7]
for instance, discusses only this usage. To our recollection, the rst book
that contained more advanced template idioms was by Barton and Nackman.
[8]
,
One idiom they describe is the nested type definitions for name commonality
idiom. Here is an example for this idiom from the STL:
template <class T>
class vector {
public:
typedef typename allocator_type::types<T>::pointer iterator;
typedef typename allocator_type::types<T>::const_reference
const_reference;
//
};
This code is a simplified excerpt from the class declaration of vector.
Like all other containers, vector establishes common names by using typedefs
in its public interface. The type iterator can be used for implementing
the algorithms, for instance:
template <class Container>
void delete (Container& c,
Container::iterator first,
Container::iterator last);
An alternative design without the nested type definitions for iterator
would require a third template parameter for delete: the type of the container's
iterator.
There are several benefits of using this idiom. First, it reduces the
number of template parameters. Second, it clearly expresses the design
relationship between the template parameter and its associated class(es),
which makes the code more understandable and maintainable. Third, it eliminates
a whole category of mistakes you could otherwise make inadvertently.
Without the nested type definition for a container's iterator the delete
function would look like this:
template <class Container, class Iterator>
void delete (Container& c, Iterator first, Iterator last);
Hence you can do something like this:
void foo(vector<int>& v, int* first, int* last)
{ delete(v,first,last); }
which works fine, because in most implementations of the STL
a vector's iterator type is a plain pointer to the vector's value type.
Now imagine you would change the type of container from a vector to a list
later on:
void foo(list<int>& v, int* first, int* last)
{ delete(v,first,last); }
The catch is that this will compile and crash at runtime. This is different
with the interface for the delete function that uses the nested typedef
for the container's iterator type; it would already cause a compile time
error instead.
Unfortunately the solution is limited: no built-in types can
be used as template parameters because it is not possible to put associated
types into the scope of a built-in type. For the iostream of the Standard
Library Nathan Myers developed an idiom called "traits" which overcomes
this limitation. He gave a detailed description of this idiom in the June
1995 C++ Report.
[9]
,
Basically it is the introduction of an additional helper class template
and its specializations for built-in types which contain the type definitions
of the associated types. In the iostream the associated types are the traits
of a certain character type, e.g. the type of the end-of-file character.
Hence the name traits. Here is an example how we can use this idiom to
enhance our example so that it can be used with built-in C++ arrays.
Here is the above mentioned helper class:
template <class Container>
class container_traits {
public:
typedef typename Container::iterator iterator;
};
and the partial specialization for built-in pointers:
template <class T>
class container_traits<T*> {
public:
typedef typename T* iterator;
};
Our algorithm can be improved and would now work for C++ arrays as well:
template <class Container>
void delete (Container& c,
container_traits<Container>::iterator first,
container_traits<Container>::iterator last);
The caveat with the traits technique is that it often requires partial
specialization of templates. No compiler we know of supports this language
feature at present.
References
1. Kreft, K., and A. Langer. "Iterators in the STL,"
C++ Report 8(7), July 1996.
2. Vilot, M. J. "An introduction to the Standard Template
Library," C++ Report 6(8), Oct. 1994.
3. Nelson, M. C++ Programmer's Guide to the Standard
Template Library, IDG Books, Foster City, CA, 1995.
4. Working Paper for Draft Proposed International Standard
for Information Systems Programming Language C++, Doc. No. X3J16/95-0185
- WG21/N0785, Sept. 26, 1995.
5. Glass, G., and B. Schuchert. The STL<Primer>,
Prentice Hall, Englewood Cliffs, NJ, 1995.
6. Musser, D. R., and A. Saini. STL Tutorial and Reference
Guide, Addison-Wesley, Reading, MA, 1996.
7. Coplien, J. O. Advanced C++ Programming Styles and
Idioms, Addison-Wesley, Reading, MA, 1992.
8. Barton, J. J., and L. R. Nackman. Scientific and
Engineering C++, Addison-Wesley, Reading, MA, 1994.
9. Myers, N. "A new and useful Template technique: 'Traits,'
" C++ Report, June 1995.
If you are interested to hear more about this
and related topics you might want to check out the following seminar:
|
Seminar
|
|