|
||||||||||||||||||||||||||
HOME | COURSES | TALKS | ARTICLES | GENERICS | LAMBDAS | IOSTREAMS | ABOUT | CONTACT | | | | ||||||||||||||||||||||||||
|
Implementing equals() to allow Slice Comparison
|
|||||||||||||||||||||||||
Secrets of equals() - Part 2 How to implement a correct slice comparison in Java
JavaSolutions, August 2002
Implementing equals() To Allow Mixed-Type Comparison All objects in Java are equality-comparable via the equals() method because all classes inherit the equals() method from class Object . As a result, each time a programmer implements a new class, he or she must check whether the inherited version of equals() provides the correct semantics for the new class or whether the default implementation must be overridden. Naturally, any user-defined version of equals() should be correct. What does "correct" mean? It means meeting the requirements of the so-called equals contract, which is specified in JavaDoc of Object.equals . In a previous article we discussed at length how easy it is to inadvertently provide incorrect implementations of equals() . The most common mistake you can find in publications and real-world source code is implementations that do not meet the transitivity requirement imposed by the equals contract: if one object is equal to a second, and the second is equal to a third object, then the first and the third object must be equal as well. The core problem of a non-transitive implementation is that it attempts to allow comparison of objects that belong to the same class hierarchy. All the problems regarding transitivity go away as soon as you disallow mixed-type comparison and only allow comparison of objects of the exact same type. The same-type comparison is easy to implement: you check whether this object and the other object are of the same type by comparing their Class objects which you can obtain via getClass() . Here is an example: ... if (this.getClass() != other.getClass()) return false; else // these objects can be compared; go and compare their fields ... } The Need for Mixed-Type Comparison The idea of comparing objects from the same class hierarchy is not entirely off base. After all, objects from the same class hierarchy (and by class hierarchy we mean all classes derived from a common superclass other than Object ) have something in common, namely at least that superclass part. For instance, in a class hierarchy, where Employee and Student are subclasses of a Person , representing roles of a person, it may make sense to compare an Employee to a Student to see whether they are the same Person . With an implementation of equals() that only allows same-type comparison an Employee and a Student would not be comparable. On the other hand, if you have a class hierarchy with a Fruit superclass and Apple and Pear subclasses, then you might find the comparison of an Apple and a Pear nonsensical.
Which type of comparability makes sense for a given class hierarchy
depends entirely on the semantics of classes. Each type of comparability
has side effects, each of which may or may not be desired (see
sidebar
on "
equals()
and its impact on the rest of your program"). The
point is: same-type-comparison-only is easy to implement, but it may make
sense to allow mixed-type comparison for certain class hierarchies. The
question in these cases is: how can it be implemented correctly, so that
it meets the requirements of the equals contract? As we demonstrated in
a preceding article, providing a correct implementation is a non-trivial
task. In this article we will show one way of doing it.
The Semantics of a Mixed-Type Comparison The most common mistake in implementations of mixed-type comparison is the semantics of the comparison. A: If they have equal superclass parts. Q: How about the subclass-specific parts? The crux lies in the subclass-specific color field. A Point object should only compare equal to a ColoredPoint object if the color field contains a default value. Under this rule, the three objects from the example above do not compare equal any longer. Point s and ColoredPoint s only compare equal when the ColoredPoint has a default color, that is, Point(0,0) is equal to one and only one ColoredPoint with the coordinates (0,0) .
Transitivity is preserved under this rule. It does not matter what this
default color is. The only thing that counts is that for a transitive mixed-type
comparison is that all the fields that cannot be compared (which are the
subclass-specific fields) have well-defined default values. Hence the rule
is:
To make things a little more challenging, let us consider a slightly more complex class hierarchy and see whether the rule holds. Say, we have an additional subclass Point3D that is derived from class Point and that represents 3-dimensional points. Both ColoredPoint and Point3D belong to the same class hierarchy, class Point being the root class of that hierarchy. What does it mean to compare a ColoredPoint to a Point3D ? Under the rule above it means that a ColoredPoint and a Point3D compare equal if and only if they have the same Point part and default values for their subclass-specific parts, namely the color and the 3 rd coordinate. For instance, if black is the default color and 0 the default coordinate, then Point(0,0) and ColoredPoint(0,0,BLACK) and Point3D(0,0,0) would be equal, but neither ColoredPoint(0,0,RED) nor Point3D(0,0,1) would be equal to Point(0,0) . After these preliminaries let us implement it. Implementing the Mixed-Type Comparison The functionality that we aim to implement depends on the type of this object and the other object. There are four cases to distinguish between:
All fields can and must be compared.
The other object is smaller than this object and only the superclass fields can be compared for equality; the subclass-specific fields of this object must have default values. The other object is the larger object that has subclass-specific fields that must have default values. This can be implemented easiest if we let the other object do the work, which gets us back to case (2) with the roles of this and other being switched. Both objects have something in common, which must be compared for equality, and they both have subclass-specific fields, which must checked against the default values. This is the tricky part. First, we must identify the superclass that this and other have in common and second we must check the subclass-specific fields of both objects for default values. We will encapsulate the actual comparison of a class's own fields into a private helper method that every class in a class hierarchy must provide. Here is the sketch of an implementation of the helper method, which we named _compareFields() , for a class with two int fields: static final int field1Default = 0, field2Default = 0; private int field1 = field1Default; private int field2 = field2Default; ... private boolean _compareFields(Object other) { if (other instanceof MyClass) { // at least my type, check fields if ( field1 != ((MyClass)other). field1 || field2 != ((MyClass)other). field2 ) return false; } else { // higher type, check defaults if ( field1 != field1Default || field2 != field2Default ) return false; } return true; } } The implementation of the equals() method in principle calls the helper method and then delegates to super.equals() . However, matters are not that simple. This strategy would be fine for case (1) and (2) from the list above, but more work is needed to address case (3) and (4). Case (3), " this is of a supertype of other ", can best be addressed by passing the bucket on the other object's equals() method. By switching the roles of this and other we simply reduce case (3) to case (2). Case (4), " this and other are from different branches of the class hierarchy", is the really challenging situation, where we must identify the superclass that both objects have in common and must navigate the class tree in a more sophisticated way than simply going straight up from the subclass to the superclass. Before we look in the solution to case (4) let us take a look at the implementation of the equals() method itself. ... public boolean equals(Object other) { if (other == this) return true; // identical if (other == null) return false; // null if (!(other instanceof RootClass)) return false; // incomparable types return _navigateClassHierarchy(other); // subtype of root; go and compare } } The Tree Traversal For purpose of illustration let us consider the following class hierarchy:
There are four relevant cases to consider, which are listed below. For
each of the cases, the diagrams illustrate the location of
thisObject
and
theOtherObject
in the class hierarchy (see box around class
name) and the size of each object in terms of the class slices it conmprises
(see yellow box). For instance, in case (1),
thisObject
is an
object of type
MyClass
and contains a
MySuperClass
and
a
MyClass
part.
The algorithm is easy for the cases (1), " this and other are of the exact same type", and (2), " this is of a subtype of other " . In these cases we will either compare this 's fields to other 's fields (1) or check this 's field against the default values (2). This is exactly what method _compareFields() does. Afterwards we delegate to the superclass for comparison of the inherited fields and start the recursive traversal up the class tree. Here is a first incomplete snippet of the helper method _navigateClassHierarchy() : ... protected boolean _navigateClassHierarchy(Object other) { ... // compare my fields if(!_compareFields(other)) return false; // pass the buck up return super._navigateClassHierarchy(other); } } ... protected boolean _navigateClassHierarchy(Object other) { if (other instanceof MyClass) { // switch roles return ((MyClass)other)._navigateClassHierarchy(this); } else { // compare my fields if(!_compareFields(other)) return false; // pass the buck up return super._navigateClassHierarchy(other); } } } In our example of a case (4) situation, this is a MySubClass_1 object and other is a MySubSubClass object. Control flow in the tentative implementation of _navigateClassHierarchy() shown above will go to the else -branch because other is not an instance of MyClass . The _ compareFields() method will correctly check whether this 's field have default values. Then control flow delegate to the superclass, that is, MyClass._navigateClassHierarchy() will be invoked. This time other is an instance of MyClass , because MyClass happens to be superclass that this and other have in common. Note, we have not yet checked whether the other object has default values for its own subclass-specific fields; in order words, we must still traverse the branch of the class tree from which the other objects stems. This can be achieved by switching roles and calling MySubSubClass._navigateClassHierarchy() . This happens to be exactly what the then -branch does anyway. So far, so good. In MySubSubClass._navigateClassHierarchy() we will check for default values of the subsubclass-specific fields; we will move up the class tree and will process the subclass-specific fields, until we end up in MyClass._navigateClassHierarchy( ) again. With the implementation as it is, control flow would now descend down the other branch again, that we already traversed, that is, we will end up in an infinite loop. To break the cycle we need to memorize that we already traversed that other branch of the tree. For this purpose we introduce a boolean flag, which we name reverseOrder , to maintain this extra piece of information. Here is the complete implementation of the _navigateClassHierarchy() method: ... protected boolean _navigateClassHierarchy(Object other, boolean reverseOrder) { if (other instanceof MyClass && !reverseOrder) { // reverse order return ((MyClass)other)._navigateClassHierarchy(this,true); } else { // compare my fields if(!_compareFields(other)) return false; // pass the buck up return super._navigateClassHierarchy(other,reverseOrder); } } } That's it! This is a way of implementing a correct mixed-type comparison. It is certainly not he only conceivable implementation, but no matter how you achieve the goal, it will take more than a simple instanceof test to get it right. The complete source code for the root class and one of the subclasses can be found in listings 1 and 2. As with all implementations of equals() , no matter whether they are of the same-type-comparison-only flavor or of mixed-type-comparison implementations style, any new class in the class hierarchy must join the game and must employ the same mechanism for its implementation of equals() . In our case it means that all classes in the hierarchy must override the helper methods _navigateClassHierarchy() and _compareFields() . The equals() method itself is identical in all classes and need only be defined once in the root class. The implementations of the _compareFields() methods are entirely class-specific and fully depend on the number and types of the fields that the respective class adds to the hierarchy. Note, that _compareFields() is a private method in our sample implementation. This is intentionally so in order to make sure that the method remains class-specific and cannot be overridden in a subclass. Alternatively we could have implemented a final method in each class that has a class-specific name using a naming scheme like _compare< classname >Fields() . The implementations of _navigateClassHierarchy() are mostly identical. This is really boilerplate code, that you would copy and paste. The few modifications are: the class name in the instanceof test and the cast expression must be changed; it is the name of respective class. And each implementation invokes its class's ( private or final ) _compareFields() method. As a refinement of our implementation one could factor out the commonalities into a single root-class-version of _navigateClassHierarchy() , using reflection to invoke the _compareFields() method, so that only the _compareFields() methods need to be provided when a new classes that is added to the hierarchy. We leave the refinement as an exercise to the reader since a comprehensive discussion is beyond the scope of this article. Acknowledgements
The ideas discussed in this article were inspired by comments we received
from readers of a preceding article on
equals()
(see /
KRE
/). These readers were (in the order their emails arrived) Larry
Kuenning
larry.kuenning@qhpress.org
, Mark Davis
mark.davis@us.ibm.com
, and Frank Griswold
griswolf@roguewave.com
. All three pointed out that there
is
a way of implementing
equals()
so that it performs a transitive mixed-type comparison. From their tentative
first ideas and code snippets they sent we derived the solution that we
explained in this article. Mark Davis's suggestion are the main foundation
of this article; he precisely described the algorithm and provided most
of the implementation. If you're interested in reading his article about
equals()
from his "Durable Java" column in Java Report look up his website (see
/
DAV
/).
|
||||||||||||||||||||||||||
© Copyright 1995-2007 by Angelika Langer. All Rights Reserved. URL: < http://www.AngelikaLanger.com/Articles/JavaSolutions/SecretsOfEquals/Equals-2.html> last update: 10 Aug 2007 |