Java Generics - Introduction
Language Features of Java Generics
Introduction and Overview
JavaPro Online, March 2004
Language Features - Overview and Introduction
J2SE 1.5 adds support for generic types and methods to the Java programming language (see / JDK15 /). This new language feature, known as Java Generics (JG), is a major addition to the core language. In this article we will give an overview of the new feature.
The need for generic types arises mainly in the implementation and use of collections, such as those in the Java platform libraries (see / JDK15 /). Typically, the implementation of a collection of objects is independent of the type of the objects that the collection maintains. For this reason, it does not make sense to reimplement the same data structure over and over again, just because it will hold different types of elements. Instead, the goal is to have a single implementation of the collection and use it to hold elements of different types. In other words, rather than implementing a class IntList and a class StringList for holding integral values and strings respectively, we want to have one generic implementation List that can be used in either case.
In Java, this kind of generic programming is achieved today (in non-generic Java) by means of Object references: a generic list is implemented as a collection of Object references. Since Object is the superclass of all classes, a list of Object references can hold references to any type of object. All collection classes in the Java platform libraries (see / JDK15 /) use this programming technique for achieving genericity.
As a side effect of this idiom we cannot have collections of values of primitive type, like a list of integral values of type int, because the primitive types are not subclasses of Object. This is not a major restriction because every primitive type has a corresponding reference type. We would convert ints to Integers before we store them in a collection, a conversion that is known as boxing and that will be supported as an automatic conversion (autoboxing) in JDK 1.5 (see / BOX /).
A Java collection is very flexible; it can be used for holding references to all types of objects. The collection need not even be homogeneous, that is, hold objects of the same type; it can equally well be heterogeneous, that is, contain a mix of objects of different types. Using generic Java collections is straightforward. Elements are added to the collection by passing a reference to the element to the collection. Each time we extract an object from a collection we receive an Object reference. Before we can effectively use the retrieved element, we must restore the element’s type information. For this purpose we cast the returned Object reference down to the element’s alleged type. Here is an example:
LinkedList list = new LinkedList();
list.add(Integer.valueOf(0));
Integer i = (Integer) list.get(0);
We must cast the Object reference returned from method get() down to type Integer. The cast is safe because it is checked at runtime. If we tried a cast to a type different from the extracted element’s actual type, a ClassCastException would be raised, as in the example below:
String s = (String) list.get(0); // fine at compile-time, but fails at runtime with a ClassCastException
The lack of information about a collection’s element type and the resulting need for countless casts in all places where elements are extracted from a collection is the primary motivation for adding parameterized types to the Java programming language. The idea is to adorn the collection types with information about the type of elements that they contain. Instead of treating every collection as a collection of Object references, we would distinguish between collections of references to integers and collections of references to strings. A collection type would be a parameterized (or generic) type that has a type parameter, which would specify the element type. With a generic list, the previous example would look like this:
LinkedList<Integer> list = new LinkedList<Integer>();
list.add(Integer.valueOf(0));
Integer i = list.get(0);
Note that the get() method of a generic list returns a reference to an object of a specific type, in our example of type Integer, so the cast from Object to Integer is no longer needed. Also, use of the extracted element as though it were of a different type would now be caught at compile time rather than at runtime. The example below would simply not compile:
String s = list.get(0); // compile-time error
This way, Java Generics increase the expressive power of the language and increase the type safety of the language by enabling early static checks instead of late dynamic checks.
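To see the two styles side by side, here is a minimal, self-contained sketch. The class name ListStyles and its two helper methods are our own illustrative names, not part of the platform libraries, and the sketch uses Integer.valueOf() to create the element.

```java
import java.util.LinkedList;
import java.util.List;

class ListStyles {
    // Non-generic style: a raw list hands out plain Object references.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean wrongCastFailsAtRuntime() {
        List list = new LinkedList();
        list.add(Integer.valueOf(0));
        Integer i = (Integer) list.get(0);   // correct cast, succeeds
        try {
            String s = (String) list.get(0); // compiles, but the element is an Integer
            return false;
        } catch (ClassCastException e) {
            return true;                     // the mistake only shows up at runtime
        }
    }

    // Generic style: the element type is part of the list's type.
    static Integer firstOfGenericList() {
        List<Integer> list = new LinkedList<Integer>();
        list.add(Integer.valueOf(0));
        // String s = list.get(0);           // would be rejected by the compiler
        return list.get(0);                  // no cast needed
    }

    public static void main(String[] args) {
        System.out.println(wrongCastFailsAtRuntime());
        System.out.println(firstOfGenericList());
    }
}
```

The commented-out line in firstOfGenericList() is the statement that the compiler refuses; removing the comment markers turns the runtime failure of the first method into a compile-time error.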
Java Generics not only provide us with parameterized collection types
like the one we used in the example above, they also allow us to implement
generic types ourselves. In order to see how we can use the new
language feature for our own Java programs, let us explore Java Generics
in more depth. In the following we will briefly look at the syntax of the
definition of generic types and further language features related to Java
Generics.
Parameterized types have type parameters. In our example they have exactly one parameter, namely A. In general, a parameterized type can have arbitrarily many parameters. In our example, the parameter A stands for the type of the elements contained in the collection. A parameter such as A is also called a type variable. Type variables can be imagined as placeholders that will later be replaced by a concrete type. For instance, when an instantiation of the generic type, such as LinkedList<String>, is used, A will be replaced by String.
Later in this article we will see that there are restrictions regarding
the use of type variables, and we will realize that a type variable cannot
be used like a type, i.e. the analogy with a “placeholder for a type”
is not fully correct, just an approximation of what a type variable is.
But for the time being, let’s regard the type variable as a placeholder
for a type, namely the type of the elements contained in the collection, in our
example.
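As an illustration of a parameterized type with a single type variable A, consider the following sketch. It is not the article's own listing; the class SimpleList and its nested Node class are hypothetical names that we introduce only to show where the type variable appears in a definition.

```java
class SimpleList<A> {
    // One node of a singly linked list; T is the node's own type variable.
    private static final class Node<T> {
        final T value;
        Node<T> next;

        Node(T value) {
            this.value = value;
        }
    }

    private Node<A> head;
    private Node<A> tail;
    private int size;

    // The parameter type is the type variable A.
    void add(A element) {
        Node<A> node = new Node<A>(element);
        if (head == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    // The return type is the type variable A, so callers need no cast.
    A get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        Node<A> current = head;
        for (int i = 0; i < index; i++) {
            current = current.next;
        }
        return current.value;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        SimpleList<String> names = new SimpleList<String>(); // A is replaced by String
        names.add("first");
        names.add("second");
        String second = names.get(1);
        System.out.println(second);
    }
}
```

In the instantiation SimpleList<String> every occurrence of A in the signatures of add() and get() reads as String.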
Suppose we want to implement a hash-based collection, like a hash
table. A hash-based collection needs to calculate the entries’ hash
codes. However, the element type is unknown in the implementation
of a parameterized hash table. Only the type variable representing
the element type is available. Listing 2 shows an excerpt of the implementation
of a parameterized hash table. It is a parameterized class that has
two type parameters: one for the key type and one for the associated value type.
As we can see, the implementation of the hash table does not only move around references to the entries, but also needs to invoke methods of the key type, namely hashCode() and equals(). Both methods are defined in class Object. Hence the hash table implementation requires that the type variables Key and Data be replaced by concrete types that are subtypes of Object. Later in this article we will see that this is always guaranteed, because primitive types are prohibited as type arguments to generics. A concrete type that replaces a type variable must be a reference type, and for this reason we can safely assume that the key type has the required methods.
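A parameterized hash table along these lines could be sketched as follows. This is a simplified illustration under our own assumptions, not a reproduction of Listing 2: the class name SimpleHashTable, the fixed number of 16 buckets and the lack of support for null keys are choices made for brevity.

```java
class SimpleHashTable<Key, Data> {
    // A chained entry; K and V are the entry's own type variables.
    private static final class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;

        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Arrays of a parameterized type cannot be created directly, hence the unchecked cast.
    @SuppressWarnings({"unchecked", "rawtypes"})
    private final Entry<Key, Data>[] buckets = (Entry<Key, Data>[]) new Entry[16];

    private int indexFor(Key key) {
        // hashCode() is available because every type argument is a subtype of Object.
        return (key.hashCode() & 0x7fffffff) % buckets.length;
    }

    void put(Key key, Data value) {
        int index = indexFor(key);
        for (Entry<Key, Data> e = buckets[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {          // equals() is inherited from Object as well
                e.value = value;
                return;
            }
        }
        buckets[index] = new Entry<Key, Data>(key, value, buckets[index]);
    }

    Data get(Key key) {
        for (Entry<Key, Data> e = buckets[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;                          // no entry for this key
    }
}
```

The only methods invoked on a key are hashCode() and equals(), which is why no further requirement on the type variable Key is needed here.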
What if we needed to invoke methods that are not defined in class Object? Consider the implementation of a tree-based collection. Tree-based collections, like a TreeMap, require a sorting order for the contained elements. Element types can provide the sorting order by means of the compareTo() method, which is defined in the Comparable interface. The implementation of a tree-based collection might therefore want to invoke the element type’s compareTo() method. Listing 3 below is a first attempt at an implementation of a parameterized TreeMap.
The parameterized class TreeMap has two type parameters, Key and Data; no requirements are imposed on either of these type variables. With this implementation we could create a TreeMap even if the key type did not implement the Comparable interface and had no compareTo() method. The invocation of compareTo(), or more precisely, the cast of the key object to the type Comparable, would then fail at runtime with a ClassCastException for the incomparable key type.
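The runtime failure can be reproduced with a small sketch of a tree node whose key type has no bound. The names UnboundedTreeDemo, Node and Incomparable are hypothetical and chosen only for this illustration; a real tree-based collection would of course contain much more than a single comparison method.

```java
class UnboundedTreeDemo {
    // No requirement is imposed on Key, so the compiler cannot check comparability.
    static final class Node<Key, Data> {
        final Key key;
        final Data data;

        Node(Key key, Data data) {
            this.key = key;
            this.data = data;
        }

        // The only way to compare keys is a cast to Comparable, checked at runtime.
        @SuppressWarnings("unchecked")
        int compareKeyTo(Key other) {
            return ((Comparable<Key>) key).compareTo(other);
        }
    }

    // A key type that does not implement Comparable.
    static final class Incomparable {
    }

    static boolean incomparableKeyFailsAtRuntime() {
        Node<Incomparable, String> node =
            new Node<Incomparable, String>(new Incomparable(), "payload"); // compiles fine
        try {
            node.compareKeyTo(new Incomparable());
            return false;
        } catch (ClassCastException e) {
            return true;   // the cast to Comparable fails only when the code runs
        }
    }

    public static void main(String[] args) {
        Node<String, String> ok = new Node<String, String>("a", "payload");
        System.out.println(ok.compareKeyTo("b") < 0);
        System.out.println(incomparableKeyFailsAtRuntime());
    }
}
```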
In order to allow for an early compile-time check, Java Generics has a language feature named bounds: type variables of a parameterized type can have one or several bounds. Bounds are interfaces or superclasses that a type variable is required to implement or extend. If a parameterized type is instantiated with a concrete type argument that does not implement the required interface(s) or extend the required superclass, then the compiler will catch that violation of the requirements and will issue an error message.
In our example, we could require that the key type of our TreeMap must implement the interface Comparable by specifying it as a bound for the type variable Key. The modified implementation is shown in Listing 5 below.
Now an attempt to use a key type that does not implement the Comparable interface will be rejected by the compiler, as the example in the accompanying listing shows.
The primary purpose of bounds is to enable early compile-time checks.
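A bound on the key type might be written as in the sketch below. The class names BoundedTreeDemo and Node are our own, hypothetical names, and the bound Comparable<Key> is one plausible way to express the requirement; the line that violates the bound is left as a comment because it does not compile.

```java
class BoundedTreeDemo {
    // The bound requires every key type to implement Comparable<Key>.
    static final class Node<Key extends Comparable<Key>, Data> {
        final Key key;
        final Data data;

        Node(Key key, Data data) {
            this.key = key;
            this.data = data;
        }

        // No cast needed: the bound gives access to compareTo().
        int compareKeyTo(Key other) {
            return key.compareTo(other);
        }
    }

    public static void main(String[] args) {
        Node<String, Integer> node = new Node<String, Integer>("a", Integer.valueOf(1));
        System.out.println(node.compareKeyTo("b") < 0);

        // Node<Object, Integer> bad;   // would not compile: Object is not within the bound
    }
}
```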
A type variable can have several bounds. The syntax is:
TypeVariable extends Bound1 & Bound2 & ... & BoundN
Here is an example:
final class Pair<A extends Comparable<A> & Cloneable,
                 B extends Comparable<B> & Cloneable> { /* members omitted */ }
As the example above suggests, type variables can appear in their bounds. For instance, the type variable A is used as type argument to the parameterized interface Comparable, whose instantiation Comparable<A> is a bound of A.
There is a restriction regarding bounds that are instantiations of a parameterized interface: the different bounds must not be instantiations of the same parameterized interface. The following would be illegal:
class SomeType<T extends Comparable<String> & Comparable<StringBuffer>> { } // illegal: Comparable appears twice
This restriction stems from the way Java Generics are implemented and will be explained later in this article.
Classes can be bounds, too. The concrete type argument is then required to be a subclass of the bounding class or it can be the same class as the bounding class. Even final classes are permitted as bounds. Bounding classes, like interfaces, give access to non-static methods that the concrete type argument inherits from its superclass. Bounding classes do not give access to constructors and static methods. The bounding superclass must appear as the first bound in a list of bounds. Hence the syntax for specification of bounds is:
TypeVariable extends Superclass & Interface1 & ... & InterfaceN
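The following sketch shows a type variable with a class bound followed by an interface bound. The class Range is a hypothetical example of ours; the class Number is listed first, as required for a bounding class, and Comparable<T> follows as the interface bound.

```java
class MultipleBoundsDemo {
    // T must extend the class Number and implement the interface Comparable<T>.
    static final class Range<T extends Number & Comparable<T>> {
        private final T low;
        private final T high;

        Range(T a, T b) {
            boolean ordered = a.compareTo(b) <= 0;   // compareTo() comes from Comparable<T>
            this.low = ordered ? a : b;
            this.high = ordered ? b : a;
        }

        double width() {
            return high.doubleValue() - low.doubleValue();   // doubleValue() comes from Number
        }

        boolean contains(T value) {
            return low.compareTo(value) <= 0 && value.compareTo(high) <= 0;
        }
    }

    public static void main(String[] args) {
        Range<Integer> range = new Range<Integer>(Integer.valueOf(5), Integer.valueOf(2));
        System.out.println(range.width());
        System.out.println(range.contains(Integer.valueOf(3)));
    }
}
```

Both bounds contribute methods: the class bound gives access to the non-static methods of Number, the interface bound to compareTo().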
Listing 7 shows the example of a parameterized static method max().
Parameterized methods are invoked like regular non-generic methods.
The type parameters are inferred from the invocation context. In our example,
the compiler would automatically invoke <Byte>max().
The type inference algorithm is significantly more complex than this simple
example suggests, and exhaustive coverage of type inference is beyond the
scope of this article.
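A parameterized method of this kind, together with an inferred and an explicit invocation, might look like the sketch below. The signature of max() is an assumption on our part and not a reproduction of Listing 7; the class MaxDemo and the helper maxOfSampleBytes() are hypothetical names.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;

class MaxDemo {
    // A parameterized static method; A is bounded so that compareTo() is available.
    static <A extends Comparable<A>> A max(Collection<A> xs) {
        Iterator<A> it = xs.iterator();
        A best = it.next();                 // throws NoSuchElementException if xs is empty
        while (it.hasNext()) {
            A candidate = it.next();
            if (best.compareTo(candidate) < 0) {
                best = candidate;
            }
        }
        return best;
    }

    static Byte maxOfSampleBytes() {
        LinkedList<Byte> bytes = new LinkedList<Byte>();
        bytes.add(Byte.valueOf((byte) 0));
        bytes.add(Byte.valueOf((byte) 1));
        return max(bytes);                  // the compiler infers A = Byte
    }

    public static void main(String[] args) {
        System.out.println(maxOfSampleBytes());

        LinkedList<String> words = new LinkedList<String>();
        words.add("apple");
        words.add("pear");
        // The type argument can also be given explicitly.
        System.out.println(MaxDemo.<String>max(words));
    }
}
```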
List<? extends Number> ref = new LinkedList<Integer>();
In this statement List<? extends Number> is a wildcard instantiation, while LinkedList<Integer> is a regular instantiation.
There are three kinds of wildcards: “? extends Type”, “? super Type” and “?”. Each wildcard denotes a family of types. “? extends Number”, for instance, is the family of subtypes of type Number, “? super Integer” is the family of supertypes of type Integer, and “?” is the set of all types. Correspondingly, the wildcard instantiation of a parameterized type stands for a set of instantiations; e.g. List<? extends Number> refers to the set of instantiations of List for types that are subtypes of Number.
Wildcard instantiations can be used for declaration of reference variables, but they cannot be used for creation of objects. Reference variables of a wildcard instantiation type can refer to an object of a compatible type, though. Compatible in this sense are concrete instantiations from the family of instantiations denoted by the wildcard instantiation. In a way, this is similar to interfaces: we cannot create objects of an interface type, but a variable of an interface type can refer to an object of a compatible type, “compatible” meaning a type that implements the interface. Similarly, we cannot create objects of a wildcard instantiation type, but a variable of the wildcard instantiation type can refer to an object of a compatible type, “compatible” meaning a type from the corresponding family of instantiations.
Access to an object through a reference variable of a wildcard instantiation type is restricted. Through a wildcard instantiation with “extends” we must not invoke methods that take arguments of the type that the wildcard stands for. Here is an example:
List<? extends Number> list = new LinkedList<Integer>();
list.add(Integer.valueOf(0)); // compile-time error
The add() method of type List takes an argument of the element type, which is the type parameter of the parameterized List type. Through a wildcard instantiation such as List<? extends Number> it is not permitted to invoke the add() method with any argument other than null. Restrictions also apply to wildcards with “super”: a method whose return type is the type that the wildcard stands for yields a result that can only be treated as an Object. For reference variables with a “?” wildcard both restrictions apply.
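The restriction on “extends” wildcards can be tried out with the following sketch. The class WildcardDemo and its methods are hypothetical names of ours; the statement that violates the restriction appears only as a comment, since it would not compile.

```java
import java.util.LinkedList;
import java.util.List;

class WildcardDemo {
    // Accepts a list of any subtype of Number: List<Integer>, List<Double>, ...
    static double sum(List<? extends Number> numbers) {
        double total = 0.0;
        for (Number n : numbers) {          // reading elements as Number is permitted
            total += n.doubleValue();
        }
        // numbers.add(Integer.valueOf(1)); // would not compile: argument of the wildcard's type
        return total;
    }

    static double sumOfIntegers() {
        List<Integer> ints = new LinkedList<Integer>();
        ints.add(Integer.valueOf(1));
        ints.add(Integer.valueOf(2));
        ints.add(Integer.valueOf(3));
        List<? extends Number> ref = ints;  // a compatible concrete instantiation
        return sum(ref);
    }

    static double sumOfDoubles() {
        List<Double> doubles = new LinkedList<Double>();
        doubles.add(Double.valueOf(0.5));
        doubles.add(Double.valueOf(1.5));
        return sum(doubles);
    }

    public static void main(String[] args) {
        System.out.println(sumOfIntegers());
        System.out.println(sumOfDoubles());
    }
}
```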
This brief overview of wildcard instantiations is far from comprehensive;
exhaustive coverage of wildcards is beyond the scope of this article.
In practice, wildcard instantiations will most frequently show up as argument
or return types in method declarations, and only rarely in the declaration
of variables. The most useful wildcard is the “extends” wildcard.
Examples for the use of this wildcard can be found in the J2SE 1.5 platform
libraries; an example is the method boolean addAll(Collection<? extends ElementType> c) of the List interface.
It allows addition of elements to a list of element type ElementType,
where the elements are taken from a collection of elements that are of
a subtype of ElementType.
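A short sketch shows what the “extends” wildcard in the parameter of addAll() buys us. The class AddAllDemo and the method combine() are hypothetical names; addAll() itself is the library method of java.util.List.

```java
import java.util.LinkedList;
import java.util.List;

class AddAllDemo {
    static List<Number> combine() {
        List<Integer> ints = new LinkedList<Integer>();
        ints.add(Integer.valueOf(1));
        ints.add(Integer.valueOf(2));

        List<Double> doubles = new LinkedList<Double>();
        doubles.add(Double.valueOf(0.5));

        List<Number> numbers = new LinkedList<Number>();
        // Both calls compile because the parameter type is Collection<? extends Number>.
        numbers.addAll(ints);
        numbers.addAll(doubles);
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(combine());
    }
}
```

Had the parameter been declared as a collection of exactly the element type, neither the list of Integer objects nor the list of Double objects could have been passed to a list of Number.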
There are many more details not covered here. We want to use the remainder of the article to explore some of the underlying principles of Java generics, in particular the translation of parameterized types and methods into Java byte code. While this sounds pretty technical and mainly like a compiler builder’s concern, an understanding of these principles aids understanding of many of the less obvious effects related to Java generics.
Generating separate code for every instantiation of a generic type (code specialization) is particularly wasteful in cases where the elements in a collection are references (or pointers), because all references (or pointers) are of the same size and internally have the same representation. There is no need for generation of mostly identical code for a list of references to integers and a list of references to strings. Both lists could internally be represented by a list of references to any type of object. The compiler just has to add a couple of casts whenever these references are passed in and out of the generic type or method. Since in Java most types are reference types, it seems natural that Java chooses code sharing as its technique for translation of generic types and methods. [C#, by the way, uses both translation techniques for its generic types: code specialization for the value types and code sharing for the reference types.]
One downside of code sharing is that it creates problems when primitive types are used as parameters of generic types or methods. Values of primitive type are of different size and require that different code be generated for a list of int and a list of double, for instance. It’s not feasible to map both lists onto a single list implementation. There are several solutions to this problem; the one Java adopts is to prohibit primitive types as type arguments.
The translation technique used by the Java compiler can be imagined as a translation from generic Java source code back into regular Java code. The translation technique is called type erasure: the compiler removes all occurrences of the type variables and replaces them by their leftmost bound, or by type Object if no bound has been specified. For instance, the instantiations LinkedList<Integer> and LinkedList<String> of our previous example (see Listing 1) would be translated into a LinkedList<Object>, or LinkedList for short, and the methods <Integer>max() and <String>max() (from Listing 7) would be translated to <Comparable>max(). In addition to removing all type variables and replacing them by their leftmost bound, the compiler inserts a couple of casts in certain places and adds so-called bridge methods where needed.
The translation from generic Java code into regular Java code was deliberately chosen by the Java designers. One key requirement for all new language features in Java 1.5 is their compatibility with previous versions of Java. In particular it is required that a pre-1.5 Java virtual machine must be capable of executing 1.5 Java code. This is only achievable if the byte code resulting from a 1.5 Java source looks like regular byte code resulting from pre-1.5 Java code. Type erasure meets this requirement: after type erasure there is no longer any difference between a parameterized and a regular type or method.
For explanatory reasons we described type erasure as a translation from generic Java code into regular non-generic Java code. This is not exactly true; the translation is from generic Java code directly to Java byte code. Despite that, we will refer to the type erasure process as a translation from generic Java to non-generic Java in the subsequent explanations.
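Two observable consequences of type erasure can be checked with a small sketch: different instantiations share one runtime class, and the casts that the compiler inserts are real runtime checks. The class ErasureDemo and its methods are hypothetical names; the second method deliberately misuses a raw reference to smuggle an Integer into a list of strings.

```java
import java.util.LinkedList;
import java.util.List;

class ErasureDemo {
    // After erasure both instantiations are represented by the same class.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new LinkedList<Integer>();
        List<String> strings = new LinkedList<String>();
        return ints.getClass() == strings.getClass();
    }

    // The compiler inserts a cast to String where an element is extracted.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean insertedCastFails() {
        List<String> strings = new LinkedList<String>();
        List raw = strings;                 // raw view of the very same list
        raw.add(Integer.valueOf(42));       // accepted with an unchecked warning only
        try {
            String s = strings.get(0);      // the inserted cast from Object to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(insertedCastFails());
    }
}
```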
Listing 8 below illustrates the translation by type erasure; it shows
the type erasure of our previous example of generic types from Listing 1.
As you can see, all occurrences of the type variable A are replaced by type Object. The implementation of our generic collection is now exactly like an implementation that uses the traditional Java technique for genericity, namely implementation in terms of Object references.
The sample code also gives an example of an automatically inserted cast: in the main() method, where a linked list of strings is used, the compiler added a cast from Object to String.
Listing 9 below shows the type erasure of our parameterized method max()
from Listing 7.
Again, all occurrences of type variables are replaced by either type Object (in the Comparable interface) or the leftmost bound (type Comparable in method max()). Again, we see the inserted cast from Object to Byte in the main() method where the generic method is invoked for a collection of Bytes. And we see an example of a bridge method in class Byte.
The compiler inserts bridge methods in subclasses to ensure overriding works correctly. In the example, class Byte implements the interface Comparable<Byte> and must therefore override the superinterface’s compareTo() method. The compiler translates the compareTo() method of the generic interface Comparable to a method that takes an Object, and translates the compareTo() method in class Byte to a method that takes a Byte. After this translation, method Byte.compareTo(Byte) is no longer an overriding version of method Comparable.compareTo(Object), because the two methods have different signatures as a side effect of translation by erasure. In order to enable overriding, the compiler adds a bridge method to the subclass. The bridge method has the same signature as the superclass’s method that must be overridden and delegates to the other method in the derived class that was the result of translation by erasure.
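Bridge methods can be made visible through reflection. The sketch below uses a hypothetical class Weight of our own instead of class Byte; it counts the compareTo() methods that the compiler generated for the class and reports the parameter type of the bridge method.

```java
import java.lang.reflect.Method;

class BridgeDemo {
    // A class that implements an instantiation of the generic interface Comparable.
    static final class Weight implements Comparable<Weight> {
        final int grams;

        Weight(int grams) {
            this.grams = grams;
        }

        // After erasure this method still takes a Weight, not an Object.
        public int compareTo(Weight other) {
            return Integer.compare(grams, other.grams);
        }
    }

    // Counts the declared compareTo() methods that are (or are not) bridge methods.
    static int countCompareTo(boolean bridge) {
        int count = 0;
        for (Method m : Weight.class.getDeclaredMethods()) {
            if (m.getName().equals("compareTo") && m.isBridge() == bridge) {
                count++;
            }
        }
        return count;
    }

    // Returns the parameter type of the compiler-generated bridge method, or null if none exists.
    static Class<?> bridgeParameterType() {
        for (Method m : Weight.class.getDeclaredMethods()) {
            if (m.getName().equals("compareTo") && m.isBridge()) {
                return m.getParameterTypes()[0];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(countCompareTo(false));   // the method written in the source
        System.out.println(countCompareTo(true));    // the bridge method added by the compiler
        System.out.println(bridgeParameterType());
    }
}
```

The class declares one compareTo() method in its source code, yet reflection finds two: the one taking a Weight and the bridge method taking an Object, which casts its argument and delegates.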
© Copyright 1995-2012 by Angelika Langer. All Rights Reserved. URL: <http://www.AngelikaLanger.com/Articles/JavaPro/01.JavaGenericsIntroduction/JavaGenerics.html> last update: 4 Nov 2012