Generics is a Java feature that was introduced with Java SE 5.0 and, a few years after its release, I would bet that every Java programmer out there has not only heard about it, but used it. There are plenty of both free and commercial resources about Java generics.
Despite the wealth of information out there, sometimes it seems to me that many developers still don’t understand the meaning and the implications of Java generics. That’s why I’m trying to summarize the basic information developers need about generics in the simplest possible way.
The Motivation for Generics
The simplest way to think about Java generics is as a kind of syntactic sugar that spares you some casting operations:
    List<Apple> box = ...;
    Apple apple = box.get(0);
The previous code speaks for itself: box is a reference to a List of objects of type Apple. The get method returns an Apple instance and no casting is required. Without generics, this code would have been:
    List box = ...;
    Apple apple = (Apple) box.get(0);
Needless to say, the main advantage of generics is having the compiler keep track of type parameters and perform the type checks and the casting operations: as long as the code compiles without unchecked warnings, the compiler guarantees that the casts will never fail.
Instead of relying on the programmer to keep track of object types and performing casts, which could lead to failures at runtime difficult to debug and solve, the compiler can now help the programmer enforce a greater number of type checks and detect more failures at compile time.
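To make the difference concrete, here is a minimal sketch that contrasts the two styles. The class CastDemo, its nested Apple type and the method names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Apple is a stand-in type used only for illustration.
class CastDemo {
    static class Apple {}

    // Pre-generics style: a raw List holds Objects, so the cast is checked only at runtime.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawCastFails() {
        List box = new ArrayList();
        box.add("not an apple");              // compiles: a raw List accepts anything
        try {
            Apple apple = (Apple) box.get(0); // the programmer's cast is wrong here
            return false;
        } catch (ClassCastException e) {
            return true;                      // the mistake surfaces only when the code runs
        }
    }

    // Generic style: the compiler knows the element type, so no cast is written.
    static Apple genericGet() {
        List<Apple> box = new ArrayList<Apple>();
        box.add(new Apple());
        // box.add("not an apple");           // would be rejected at compile time
        return box.get(0);
    }

    public static void main(String[] args) {
        System.out.println("raw cast failed at runtime: " + rawCastFails());
        System.out.println("generic get returned an Apple: " + (genericGet() != null));
    }
}
```

In the raw version the wrong element is discovered only when the cast executes; in the generic version the offending line would not compile at all.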
The Generics Facility
The generics facility introduced the concept of a type variable. A type variable, according to the Java Language Specification, is an unqualified identifier introduced by:
- Generic class declarations
- Generic interface declarations
- Generic method declarations
- Generic constructor declarations.
Generic Classes and Interfaces
A class or an interface is generic if it has one or more type variables. Type variables are delimited by angle brackets and follow the class (or the interface) name:
    public interface List<T> extends Collection<T> {
        ...
    }
Roughly speaking, type variables act as parameters and provide the information the compiler needs to make its checks.
Many classes in the Java library, such as the entire Collections Framework, were modified to be generic. The List interface we’ve used in the first code snippet, for example, is now a generic interface. In that snippet, box was a reference to a List<Apple> object, an instance of a class implementing the List interface with one type argument: Apple. The type argument is the parameter that the compiler uses when automatically casting the result of the get method to an Apple reference.
In fact, the new generic signature of the get method of the interface List is:
    T get(int index);
The method get returns indeed an object of type T, where T is the type variable specified in the List<T> declaration.
Generic Methods and Constructors
Pretty much the same way, methods and constructors can be generic if they declare one or more type variables.
    public static <T> T getFirst(List<T> list)
This method will accept a reference to a List<T> and will return an object of type T.
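As a minimal sketch, the method above could be implemented and called as follows. The wrapper class GetFirstDemo is a hypothetical name, and the body assumes a non-empty list.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class for the getFirst signature shown above.
class GetFirstDemo {
    // <T> is declared before the return type and inferred from the argument at each call site.
    static <T> T getFirst(List<T> list) {
        // Assumption: the list is not empty; otherwise get(0) throws IndexOutOfBoundsException.
        return list.get(0);
    }

    public static void main(String[] args) {
        String s = getFirst(Arrays.asList("a", "b"));   // T inferred as String
        Integer i = getFirst(Arrays.asList(1, 2, 3));   // T inferred as Integer
        System.out.println(s + " " + i);
    }
}
```

The caller never writes a cast: the compiler substitutes the inferred type for T in the return type.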
Examples
You can take advantage of generics in both your own classes and the generic Java library classes.
Type Safety When Writing…
In the following code snippet, for example, we create a List<String> instance and populate it with some data:
    List<String> str = new ArrayList<String>();
If we tried to put some other kind of object into the List<String>, the compiler would raise an error.
… and When Reading
If we pass the List<String> reference around, we’re always guaranteed to retrieve a String object from it:
    String myString = str.get(0);
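Both guarantees, when writing and when reading, can be seen together in a short sketch. The class TypeSafetyDemo, its methods and the sample strings are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the sample data is made up for illustration.
class TypeSafetyDemo {
    // Writing: only String values are accepted by the compiler.
    static List<String> build() {
        List<String> str = new ArrayList<String>();
        str.add("Hello");
        str.add("World");
        // str.add(42);   // would not compile: an Integer is not a String
        return str;
    }

    // Reading: every element comes back as a String, with no cast.
    static int totalLength(List<String> str) {
        int total = 0;
        for (int i = 0; i < str.size(); i++) {
            String s = str.get(i);
            total += s.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalLength(build()));
    }
}
```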
Iterating
Many classes in the library, such as Iterator<T>, have been enhanced and made generic. The iterator() method of the interface List<T> now returns an Iterator<T> that can be readily used without casting the objects it returns via its T next() method.
    for (Iterator<String> iter = str.iterator(); iter.hasNext();) {
        String s = iter.next(); // no cast needed
    }
Using foreach
The for-each syntax takes advantage of generics, too. The previous code snippet could be written as:
    for (String s : str) { ... }
which is even easier to read and maintain.
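Here is a small runnable sketch of the two iteration styles side by side. The class IterationDemo and its two join methods are hypothetical; they simply concatenate the elements so that the two loops can be compared.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class comparing an explicit Iterator<String> with for-each.
class IterationDemo {
    static String joinWithIterator(List<String> str) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> iter = str.iterator(); iter.hasNext();) {
            String s = iter.next();   // next() returns String, not Object
            sb.append(s);
        }
        return sb.toString();
    }

    static String joinWithForEach(List<String> str) {
        StringBuilder sb = new StringBuilder();
        for (String s : str) {        // the compiler uses the Iterator<String> for us
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> str = Arrays.asList("a", "b", "c");
        System.out.println(joinWithIterator(str).equals(joinWithForEach(str)));
    }
}
```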
Autoboxing and Autounboxing
The autoboxing/autounboxing features of the Java language are automatically used when dealing with generics, as shown in this code snippet:
    List<Integer> ints = new ArrayList<Integer>();
Be aware, however, that boxing and unboxing come with a performance penalty so the usual caveats and warnings apply.
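A minimal sketch of where the conversions happen, assuming a hypothetical AutoboxingDemo class and sumUpTo method:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing where boxing and unboxing take place.
class AutoboxingDemo {
    static int sumUpTo(int n) {
        List<Integer> ints = new ArrayList<Integer>();
        for (int i = 0; i <= n; i++) {
            ints.add(i);              // int is boxed to Integer
        }
        int sum = 0;
        for (int value : ints) {      // each Integer is unboxed to int
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumUpTo(3));
    }
}
```

Every add and every read in this loop allocates or unwraps an Integer, which is where the performance penalty mentioned above comes from.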
Subtypes
In Java, as in other typed object-oriented languages, hierarchies of types can be built.
In Java, a subtype of a type T is either a type that extends T or a type that implements T (if T is an interface) directly or indirectly. Since “being a subtype of” is a transitive relation, if a type A is a subtype of B and B is a subtype of C, then A will be a subtype of C too. In the Fruit hierarchy used throughout this post:
- FujiApple is a subtype of Apple
- Apple is a subtype of Fruit
- FujiApple is a subtype of Fruit.
Every Java reference type is also a subtype of Object.
An instance of a subtype A of a type B may be assigned to a reference of type B:
    Apple a = ...;
    Fruit f = a;
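The hierarchy and the assignment rule can be written down as a small runnable sketch. The class bodies are empty placeholders, HierarchyDemo and asFruit are hypothetical names, and Strawberry is included only because it shows up later in the post.

```java
// Hypothetical demo class; the fruit classes are empty placeholders.
class HierarchyDemo {
    static class Fruit {}
    static class Apple extends Fruit {}
    static class FujiApple extends Apple {}
    static class Strawberry extends Fruit {}

    // An Apple (or any Apple subtype) can be assigned to a Fruit reference.
    static Fruit asFruit(Apple apple) {
        Fruit fruit = apple;
        return fruit;
    }

    public static void main(String[] args) {
        Fruit f = asFruit(new FujiApple());   // FujiApple -> Apple -> Fruit, by transitivity
        Object o = f;                         // every reference type is a subtype of Object
        System.out.println(f instanceof Apple);
        System.out.println(o instanceof Fruit);
    }
}
```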
Subtyping of Generic Types
If a reference of an Apple instance can be assigned to a reference of a Fruit, as seen above, then what’s the relation between, let’s say, a List<Apple> and a List<Fruit>? Which one is a subtype of which? More generally, if a type A is a subtype of a type B, how do C<A> and C<B> relate to each other?
Surprisingly, the answer is: in no way. In more formal words, the subtyping relation between generic types is invariant.
This means that the following code snippet is invalid:
    List<Apple> apples = ...;
    List<Fruit> fruits = apples;
and so is the following:
    List<Apple> apples;
    List<Fruit> fruits = ...;
    apples = fruits;
But why? If an apple is a fruit, isn’t a box of apples (a list) also a box of fruits?
In some sense, it is, but types (classes) encapsulate state and operations. What would happen if a box of apples were a box of fruits?
    List<Apple> apples = ...;
    List<Fruit> fruits = apples;
    fruits.add(new Strawberry());
If it were, we could add other subtypes of Fruit to the list, and this must be forbidden.
The other way round is more intuitive: a box of fruits is not a box of apples, since it may be a box (List) of other kinds (subtypes) of fruits (Fruit), such as Strawberry.
Is It Really a Problem?
It should not be. The strongest reason for a Java developer to be surprised is the inconsistency between the behavior of arrays and generic types. While the subtyping relation of the latter is invariant, the subtyping relation of the former is covariant: if a type A is a subtype of type B, then A[] is a subtype of B[]:
    Apple[] apples = ...;
    Fruit[] fruits = apples;
But wait! If we repeat the argument made in the previous section, we might end up adding strawberries to an array of apples:
    Apple[] apples = new Apple[1];
    Fruit[] fruits = apples;
    fruits[0] = new Strawberry();
The code indeed compiles, but the error will be raised at runtime as an ArrayStoreException. Because of this behavior of arrays, during a store operation, the Java runtime needs to check that the types are compatible. The check also adds a performance penalty that you should be aware of.
Once more, generics are safer to use and “correct” this type safety weakness of Java arrays.
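The array behavior is easy to reproduce. The following sketch wraps the three lines above in a hypothetical ArrayCovarianceDemo class with placeholder fruit types and catches the exception so that the outcome can be inspected.

```java
// Hypothetical demo class; the fruit classes are empty placeholders.
class ArrayCovarianceDemo {
    static class Fruit {}
    static class Apple extends Fruit {}
    static class Strawberry extends Fruit {}

    // Returns true if storing a Strawberry through a Fruit[] view of an Apple[] fails.
    static boolean storeFails() {
        Apple[] apples = new Apple[1];
        Fruit[] fruits = apples;              // allowed: arrays are covariant
        try {
            fruits[0] = new Strawberry();     // compiles, but is checked at runtime
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("store rejected at runtime: " + storeFails());
    }
}
```

The equivalent code with List<Apple> and List<Fruit> never gets this far, because the assignment itself is rejected by the compiler.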
In case you’re now wondering why the subtyping relation for arrays is covariant, I’ll give you the answer that Java Generics and Collections gives: if it were invariant, there would be no way of passing a reference to an array of objects of an unknown type (without copying every time to an Object[]) to a method such as java.util.Arrays.sort(Object[] a).
With the advent of generics, this characteristic of arrays is no longer necessary (as we’ll see in the next part of this post) and should indeed be avoided.
Wildcards
As we’ve seen in the previous part of this post, the subtyping relation of generic types is invariant. Sometimes, though, we’d like to use generic types in the same way we can use ordinary types:
- Narrowing a reference (covariance)
- Widening a reference (contravariance)
Covariance
Let’s suppose, for example, that we’ve got a set of boxes, each one of a different kind of fruit. We’d like to be able to write methods that could accept any of them. More formally, given a subtype A of a type B, we’d like to find a way to use a reference (or a method parameter) of type C<B> that could accept instances of C<A>.
To accomplish this task we can use a wildcard with extends, such as in the following example:
    List<Apple> apples = new ArrayList<Apple>();
    List<? extends Fruit> fruits = apples;
? extends reintroduces covariant subtyping for generic types: Apple is a subtype of Fruit and List<Apple> is a subtype of List<? extends Fruit>.
Contravariance
Let’s now introduce another wildcard: ? super. Given a supertype B of a type A, then C<B> is a subtype of C<? super A>:
    List<Fruit> fruits = new ArrayList<Fruit>();
    List<? super Apple> basket = fruits;
How Can Wildcards Be Used?
Enough theory for now: how can we take advantage of these new constructs?
? extends
Let’s go back to the example we used in Part II when introducing Java array covariance:
    Apple[] apples = new Apple[1];
    Fruit[] fruits = apples;
    fruits[0] = new Strawberry();
As we saw, this code compiles but results in a runtime exception when trying to add a Strawberry to an Apple array through a reference to a Fruit array.
Now we can use wildcards to translate this code to its generic counterpart: since Apple is a subtype of Fruit, we will use the ? extends wildcard to be able to assign a reference of a List<Apple> to a reference of a List<? extends Fruit>:
    List<Apple> apples = new ArrayList<Apple>();
    List<? extends Fruit> fruits = apples;
    fruits.add(new Strawberry());
This time, the code won’t compile! The Java compiler now prevents us from adding a strawberry to a list of fruits. We will detect the error at compile time and we won’t even need any runtime check (such as in the case of array stores) to ensure that we’re adding a compatible type to the list. The code won’t compile even if we try to add a Fruit instance to the list:
    fruits.add(new Fruit());
No way. It turns out that, indeed, you can’t put anything (except null) into a structure whose type uses the ? extends wildcard.
The reason is pretty simple, if we think about it: the ? extends T wildcard tells the compiler that we’re dealing with a subtype of the type T, but we cannot know which one. Since there’s no way to tell, and we need to guarantee type safety, you won’t be allowed to put anything inside such a structure. On the other hand, since we know that whichever type it might be, it will be a subtype of T, we can get data out of the structure with the guarantee that it will be a T instance:
    Fruit get = fruits.get(0);
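The rules for ? extends can be condensed into one runnable sketch. ExtendsWildcardDemo, the placeholder fruit classes and the method first are hypothetical names; the rejected calls are kept as comments because they would stop the example from compiling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the fruit classes are empty placeholders.
class ExtendsWildcardDemo {
    static class Fruit {}
    static class Apple extends Fruit {}
    static class Strawberry extends Fruit {}

    // Reading through "? extends Fruit" is safe: every element is at least a Fruit.
    static Fruit first(List<? extends Fruit> fruits) {
        // fruits.add(new Strawberry());   // would not compile
        // fruits.add(new Fruit());        // would not compile either
        return fruits.get(0);              // assumes a non-empty list
    }

    public static void main(String[] args) {
        List<Apple> apples = new ArrayList<Apple>();
        apples.add(new Apple());
        Fruit f = first(apples);           // List<Apple> is accepted as List<? extends Fruit>
        System.out.println(f instanceof Apple);
    }
}
```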
? super
What’s the behavior of a type that’s using the ? super wildcard? Let’s start with this:
    List<Fruit> fruits = new ArrayList<Fruit>();
    List<? super Apple> basket = fruits;
We know that basket is a reference to a List of something that is a supertype of Apple. Again, we cannot know which supertype it is, but we know that Apple and any of its subtypes will be assignment compatible with it. Indeed, since such an unknown type will be a supertype of both Apple and GreenApple, we can write:
    basket.add(new Apple());
    basket.add(new GreenApple());
If we try to add an instance of any Apple supertype, the compiler will complain:
    basket.add(new Fruit());
    basket.add(new Object());
Since we cannot know which supertype it is, we aren’t allowed to add instances of any.
What about getting data out of such a type? It turns out that the only thing you can get out of it will be Object instances: since we cannot know which supertype it is, the compiler can only guarantee that it will be a reference to an Object, since Object is the supertype of any Java reference type.
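The same exercise for ? super, again as a sketch with hypothetical names (SuperWildcardDemo, addApples and placeholder fruit classes):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the fruit classes are empty placeholders.
class SuperWildcardDemo {
    static class Fruit {}
    static class Apple extends Fruit {}
    static class GreenApple extends Apple {}

    // Writing through "? super Apple" is safe for Apple and its subtypes.
    static Object addApples(List<? super Apple> basket) {
        basket.add(new Apple());
        basket.add(new GreenApple());
        // basket.add(new Fruit());    // would not compile
        // basket.add(new Object());   // would not compile
        return basket.get(0);          // the static type of a read is only Object
    }

    public static void main(String[] args) {
        List<Fruit> fruits = new ArrayList<Fruit>();
        Object firstAdded = addApples(fruits);   // List<Fruit> is accepted as List<? super Apple>
        System.out.println(fruits.size() + " " + (firstAdded instanceof Apple));
    }
}
```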
The Get and Put Principle or the PECS Rule
Summarizing the behavior of the ? extends and the ? super wildcards, we draw the following conclusion:
- Use the ? extends wildcard if you need to retrieve objects from a data structure
- Use the ? super wildcard if you need to put objects in a data structure
- If you need to do both things, don’t use any wildcard.
Bloch’s mnemonic, PECS, comes from “Producer Extends, Consumer Super” and is probably easier to remember and use.
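A classic way to see the principle at work is a copy method, where the source list produces values and the destination list consumes them. The sketch below uses a hypothetical PecsDemo class; java.util.Collections.copy follows the same idea in the standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class illustrating "Producer Extends, Consumer Super".
class PecsDemo {
    // src only produces T values (extends); dst only consumes them (super).
    static <T> void copy(List<? extends T> src, List<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<Number>();
        copy(ints, numbers);           // Integer values flow into a List<Number>
        System.out.println(numbers.size());
    }
}
```

Without the wildcards, the call would only compile when both lists had exactly the same element type.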
Wildcards in Method Signatures
As seen in Part II of this series, in Java (as in many other typed languages), the substitution principle holds: an instance of a subtype can be assigned to a reference of any of its supertypes.
This applies to the assignment of any reference, that is, even when passing parameters to a function or storing its result. One of the advantages of this principle, then, is that when defining class hierarchies, “general purpose” methods can be written to handle entire sub-hierarchies, regardless of the class of the specific object instance being handled. In the Fruit class hierarchy we’ve used so far, a function that accepts a Fruit as a parameter will accept any of its subtypes (such as Apple or Strawberry).
As seen in the previous post, wildcards restore covariant and contravariant subtyping for generic types: using wildcards, then, lets the developer write functions that can take advantage of the benefits presented so far.
If, for example, a developer wanted to define a method eat that accepted a List of any kind of fruit, they could use the following signature:
    void eat(List<? extends Fruit> fruits);
Since a List of any subtype of the class Fruit is a subtype of List<? extends Fruit>, the previous method will accept any such list as a parameter. Note that, as explained in the previous section, the Get and Put Principle (or the PECS Rule) will allow you to retrieve objects from such a list and assign them to a Fruit reference.
On the other hand, if you wanted to put instances on the list passed as a parameter, you should use the ? super wildcard:
    void store(List<? super Fruit> container);
This way, a List of whichever supertype of Fruit could be passed in to the store function and you could safely put whichever Fruit subtype into it.
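Here is a sketch of what the two methods might look like with bodies. The class MethodWildcardDemo and the placeholder fruit classes are hypothetical, and eat returns a count (instead of void, as in the signature above) only so that the result can be checked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the fruit classes are empty placeholders.
class MethodWildcardDemo {
    static class Fruit {}
    static class Apple extends Fruit {}
    static class Strawberry extends Fruit {}

    // Producer: we only read Fruit references out of the list.
    // Assumption: returns the number of fruits eaten so the example is observable.
    static int eat(List<? extends Fruit> fruits) {
        int eaten = 0;
        for (Fruit fruit : fruits) {
            eaten++;
        }
        return eaten;
    }

    // Consumer: we only put Fruit instances (of any subtype) into the list.
    static void store(List<? super Fruit> container) {
        container.add(new Apple());
        container.add(new Strawberry());
    }

    public static void main(String[] args) {
        List<Apple> apples = new ArrayList<Apple>();
        apples.add(new Apple());
        System.out.println(eat(apples));          // a List<Apple> is accepted

        List<Object> anything = new ArrayList<Object>();
        store(anything);                          // a List<Object> is accepted
        System.out.println(anything.size());
    }
}
```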
Bounded Type Variables
The flexibility of generics is greater than this, though. Type variables can be bounded, pretty much in the same way wildcards can be (as we’ve seen in Part II). However, type variables cannot be bounded with super, but only with extends. Look at the following signature:
    public static <T extends I<T>> void name(Collection<T> t);
It takes a collection of objects whose type is bounded: it must satisfy the T extends I<T> condition. Using bounded type variables may not seem more powerful than wildcards at first, but we’ll detail the differences in a moment.
Let’s suppose some, but not all, fruits in your hierarchy can be juicy as in:
    public interface Juicy<T> {
        Juice<T> squeeze();
    }
Juicy fruits will implement this interface and expose the squeeze method.
Now, you write a library method that takes a bunch of fruits and squeezes them all. The first signature you could write might be:
    <T> List<Juice<T>> squeeze(List<Juicy<T>> fruits);
Using bounded type variables, you would write the following (which, indeed, has the same erasure as the previous method):
    <T extends Juicy<T>> List<Juice<T>> squeeze(List<T> fruits);
So far, so good. But limited. We could use the very same arguments used in the previous posts and discover that the squeeze method is not going to work, for example, with a list of red oranges when:
    class Orange extends Fruit implements Juicy<Orange> { ... }
    class RedOrange extends Orange { ... }
Since we’ve already learned about the PECS principle, we’re going to change the method to:
    <T extends Juicy<? super T>> List<Juice<? super T>> squeezeSuperExtends(List<? extends T> fruits);
This method accepts a list of objects whose type extends Juicy<? super T>; in other words, there must exist a type S such that T extends Juicy<S> and S is a supertype of T.
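To check that this signature really accepts a list of red oranges, here is a minimal sketch. The wrapper class SqueezeDemo, the empty Juice class and the method bodies are assumptions made for illustration; only the signatures come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Juice and the fruit classes are minimal placeholders.
class SqueezeDemo {
    static class Juice<T> {}
    interface Juicy<T> { Juice<T> squeeze(); }
    static class Fruit {}
    static class Orange extends Fruit implements Juicy<Orange> {
        public Juice<Orange> squeeze() { return new Juice<Orange>(); }
    }
    static class RedOrange extends Orange {}

    // T may be juicy with respect to one of its supertypes (here: RedOrange is Juicy<Orange>).
    static <T extends Juicy<? super T>> List<Juice<? super T>> squeezeSuperExtends(List<? extends T> fruits) {
        List<Juice<? super T>> result = new ArrayList<Juice<? super T>>();
        for (T fruit : fruits) {
            result.add(fruit.squeeze());
        }
        return result;
    }

    public static void main(String[] args) {
        List<RedOrange> reds = new ArrayList<RedOrange>();
        reds.add(new RedOrange());
        // With the stricter bound <T extends Juicy<T>> this call would be rejected,
        // because RedOrange implements Juicy<Orange>, not Juicy<RedOrange>.
        List<Juice<? super RedOrange>> juices = squeezeSuperExtends(reds);
        System.out.println(juices.size());
    }
}
```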
Recursive Bounds
Maybe you feel like relaxing the T extends Juicy<? super T> bound. This kind of bound is called a recursive bound because the bound that the type T must satisfy depends on T itself. You can use recursive bounds when needed and also mix-and-match them with other kinds of bounds.
Thus you can, for example, write generic methods with such bounds:
    <A extends B<A,C>, C extends D<T>>
Please remember that these examples are only given to illustrate what generics can do. Bounds you’re going to use always depend on the constraints you’re putting into your type hierarchy.
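A familiar recursive bound from everyday code is the one on Comparable. The sketch below, with the hypothetical names RecursiveBoundDemo and max, follows the same pattern as the squeeze example: T must be comparable to itself or to one of its supertypes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class showing a recursive bound on Comparable.
class RecursiveBoundDemo {
    // The bound on T mentions T itself: T must be comparable to T (or to a supertype of T).
    static <T extends Comparable<? super T>> T max(List<? extends T> items) {
        T best = items.get(0);          // assumes a non-empty list
        for (T item : items) {
            if (item.compareTo(best) > 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(3, 1, 2)));
        System.out.println(max(Arrays.asList("a", "c", "b")));
    }
}
```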
Using Multiple Type Variables
Let’s suppose you want to relax the recursive bound we put on the last version of the squeeze method. Let’s then suppose that a type T might extend Juicy<S> although T itself does not extend S. The method signature could be:
    <T extends Juicy<S>, S> List<Juice<S>> squeezeSuperExtendsWithFruit(List<? extends T> fruits);
This signature is pretty much equivalent to the previous one (since we’re only using T in the method arguments) but has one slight advantage: since we’ve declared the type variable S, the method can return List<Juice<S>> instead of List<Juice<? super T>>, which can be useful in some situations, since the compiler will help you identify which type S is according to the method arguments you’ve passed. Since you’re returning a list, chances are you want your caller to be able to get something from it and, as you’ve learned in the previous part, you can only get Object instances from a list such as List<? super T>.
You can obviously add more bounds to S, if you need them, such as:
    <T extends Juicy<S>, S extends Fruit> List<Juice<S>> squeezeSuperExtendsWithFruit(List<? extends T> fruits);
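The effect on the return type can be seen in a short sketch. As before, the wrapper class TwoVariableSqueezeDemo, the empty Juice class and the method bodies are assumptions; the example relies on the type inference of Java 8 or later to work out that S is Orange.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Juice and the fruit classes are minimal placeholders.
class TwoVariableSqueezeDemo {
    static class Juice<T> {}
    interface Juicy<T> { Juice<T> squeeze(); }
    static class Fruit {}
    static class Orange extends Fruit implements Juicy<Orange> {
        public Juice<Orange> squeeze() { return new Juice<Orange>(); }
    }
    static class RedOrange extends Orange {}

    // S names the "juice type" explicitly, so the result is a precise List<Juice<S>>.
    static <T extends Juicy<S>, S extends Fruit> List<Juice<S>> squeezeSuperExtendsWithFruit(List<? extends T> fruits) {
        List<Juice<S>> result = new ArrayList<Juice<S>>();
        for (T fruit : fruits) {
            result.add(fruit.squeeze());
        }
        return result;
    }

    public static void main(String[] args) {
        List<RedOrange> reds = new ArrayList<RedOrange>();
        reds.add(new RedOrange());
        // S is inferred as Orange, so the caller gets Juice<Orange> elements, not a wildcard type.
        List<Juice<Orange>> juices = squeezeSuperExtendsWithFruit(reds);
        Juice<Orange> first = juices.get(0);
        System.out.println(juices.size() + " " + (first != null));
    }
}
```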
Multiple Bounds
What if you want to apply multiple bounds on the same type variable? It turns out that you can only write one bound declaration per type variable. The following bounds are thus illegal:
    <T extends A, T extends B>
The compiler will fail with a message such as:
T is already defined in…
Multiple bounds must be expressed with a different syntax, which turns out to be a pretty familiar notation:
    <T extends A & B>
This means that T extends both A and B. According to the Java Language Specification (section 4.4), a bound is either:
- A type variable.
- A class or interface type, optionally followed by further interface types.
This means that every additional bound after the first must be an interface type. There’s no way of using type variables in a multiple bound and the compiler will fail with a message such as:
A type variable may not be followed by other bounds.
This is not always clear in the documentation I’ve read.
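For a concrete multiple bound, here is a sketch in which the first bound is a class (Number) and the additional bound is an interface (Comparable<T>). MultipleBoundsDemo and largest are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class showing a type variable with two bounds.
class MultipleBoundsDemo {
    // The first bound is a class; every bound after "&" must be an interface.
    static <T extends Number & Comparable<T>> T largest(List<T> values) {
        T best = values.get(0);             // assumes a non-empty list
        for (T value : values) {
            if (value.compareTo(best) > 0) {    // available through Comparable<T>
                best = value;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Double d = largest(Arrays.asList(1.5, 4.0, 2.0));
        System.out.println(d.intValue());   // intValue() is available through Number
    }
}
```

Inside the method, both sets of members are usable on T: compareTo from Comparable<T> and the numeric conversions from Number.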
Happy Coding! Do not forget to share!
Byron