Angelika Langer - Training & Consulting
HOME | COURSES | TALKS | ARTICLES | GENERICS | LAMBDAS | IOSTREAMS | ABOUT | CONTACT | Twitter | Lanyrd | Linkedin
 
HOME 

  OVERVIEW

  BY TOPIC
    JAVA
    C++

  BY COLUMN
    EFFECTIVE JAVA
    EFFECTIVE STDLIB

  BY MAGAZINE
    JAVA MAGAZIN
    JAVA SPEKTRUM
    JAVA WORLD
    JAVA SOLUTIONS
    JAVA PRO
    C++ REPORT
    CUJ
    OTHER
 

GENERICS 
LAMBDAS 
IOSTREAMS 
ABOUT 
CONTACT 
Implementing equals() to allow Slice Comparison

Implementing equals() to allow Slice Comparison
Secrets of equals() - Part 2
How to implement a correct slice comparison in Java
 

JavaSolutions, August 2002
Angelika Langer & Klaus Kreft


 
 

Implementing equals() To Allow Mixed-Type Comparison

All objects in Java are equality-comparable via the equals() method because all classes inherit the equals() method from class Object . As a result, each time a programmer implements a new class, he or she must check whether the inherited version of equals() provides the correct semantics for the new class or whether the default implementation must be overridden. Naturally, any user-defined version of equals() should be correct. What does "correct" mean? It means meeting the requirements of the so-called equals contract, which is specified in JavaDoc of Object.equals .

In a previous article we discussed at length how easy it is to inadvertently provide incorrect implementations of equals() . The most common mistake you can find in publications and real-world source code is implementations that do not meet the transitivity requirement imposed by the equals contract: if one object is equal to a second, and the second is equal to a third object, then the first and the third object must be equal as well.

The core problem of a non-transitive implementation is that it attempts to allow comparison of objects that belong to the same class hierarchy. All the problems regarding transitivity go away as soon as you disallow mixed-type comparison and only allow comparison of objects of the exact same type. The same-type comparison is easy to implement: you check whether this object and the other object are of the same type by comparing their Class objects which you can obtain via getClass() . Here is an example:

public boolean equals(Object other) {
  ...
  if (this.getClass() != other.getClass())
    return false;
  else
    // these objects can be compared; go and compare their fields
  ...
}
This type of implementation of the equals() method is easy to get right; it is robust, easy to understand, easy to maintain, and for that reason the recommended way of implementing equals() - unless you need something else.
 
 

The Need for Mixed-Type Comparison

The idea of comparing objects from the same class hierarchy is not entirely off base. After all, objects from the same class hierarchy (and by class hierarchy we mean all classes derived from a common superclass other than Object ) have something in common, namely at least that superclass part. For instance, in a class hierarchy, where Employee and Student are subclasses of a Person , representing roles of a person, it may make sense to compare an Employee to a Student to see whether they are the same Person . With an implementation of equals() that only allows same-type comparison an Employee and a Student would not be comparable. On the other hand, if you have a class hierarchy with a Fruit superclass and Apple and Pear subclasses, then you might find the comparison of an Apple and a Pear nonsensical.

Which type of comparability makes sense for a given class hierarchy depends entirely on the semantics of classes. Each type of comparability has side effects, each of which may or may not be desired (see sidebar on " equals() and its impact on the rest of your program"). The point is: same-type-comparison-only is easy to implement, but it may make sense to allow mixed-type comparison for certain class hierarchies. The question in these cases is: how can it be implemented correctly, so that it meets the requirements of the equals contract? As we demonstrated in a preceding article, providing a correct implementation is a non-trivial task. In this article we will show one way of doing it.
 

 
equals() and its impact on the rest of your program
equals() is used by various other classes and its semantics, mixed-type comparison or same-type-comparison-only, influences the behavior of other components that you or others use in conjunction with your classes. 

For instance, equals() is closely related to the HashSet collection of the JDK, because its implementation relies on the elements' equals() methods for its internal organization. 

  • Same-type-comparison-only implementation of equals() . With such an implementation of equals() you can store an Employee("Hanni Hanuta") and a Student("Hanni Hanuta") into the same HashSet , but retrieval from the collection will not work as expected. You will not find any of these two contained objects when you ask the HashSet whether it contains a Person("Hanni Hanuta") , because all three object are unequal to each other. 
  • Mixed-type-comparison implementation of equals() . Conversely, with this type of equals() implementation you will have problems storing an Employee("Hanni Hanuta") and a Student("Hanni Hanuta") in the same HashSet . The HashSet will reject the second add() operation, because the collection already contains an element that compares equal to the new element.
Both effects might be surprising and this is not to say that any of them is incorrect or weird. It's just that the semantics of your equals() implementation has an impact on the behavior of other components that rely on the equals() method of your classes.

 

The Semantics of a Mixed-Type Comparison

The most common mistake in implementations of mixed-type comparison is the semantics of the comparison.

Q: Under which circumstances do we consider two objects of different type from the same class hierarchy as equal?

A: If they have equal superclass parts.

Q: How about the subclass-specific parts?

The problem is that the intransitive implementations of equals() do not care about the subclass-specific parts of the objects that they compare. Here is an example: say, we have a Point superclass and a ColoredPoint subclass. When we compare ColoredPoint(0,0,RED) to Point(0,0) , then the typical implementation says: yes, they are equals. If you then go on to compare Point(0,0) to a ColoredPoint(0,0,BLUE) , then they, too, will compare equal. But when we compare ColoredPoint(0,0,RED) and ColoredPoint(0,0,BLUE) , then they should compare equal because of the transitivity requirement of the equals contract, but in fact they compare unequal. What went wrong?

The crux lies in the subclass-specific color field. A Point object should only compare equal to a ColoredPoint object if the color field contains a default value. Under this rule, the three objects from the example above do not compare equal any longer. Point s and ColoredPoint s only compare equal when the ColoredPoint has a default color, that is, Point(0,0) is equal to one and only one ColoredPoint with the coordinates (0,0) .

Transitivity is preserved under this rule. It does not matter what this default color is. The only thing that counts is that for a transitive mixed-type comparison is that all the fields that cannot be compared (which are the subclass-specific fields) have well-defined default values. Hence the rule is:
 
 
Two objects from the same class hierarchy compare equal if and only if they have
    • equal values for all fields that they have in common (equal superclass part) 
and
    • default values for all other fields (default subclass-specific part).

To make things a little more challenging, let us consider a slightly more complex class hierarchy and see whether the rule holds. Say, we have an additional subclass Point3D that is derived from class Point and that represents 3-dimensional points. Both ColoredPoint and Point3D belong to the same class hierarchy, class Point being the root class of that hierarchy. What does it mean to compare a ColoredPoint to a Point3D ?

Under the rule above it means that a ColoredPoint and a Point3D compare equal if and only if they have the same Point part and default values for their subclass-specific parts, namely the color and the 3 rd coordinate. For instance, if black is the default color and 0 the default coordinate, then Point(0,0) and ColoredPoint(0,0,BLACK) and Point3D(0,0,0) would be equal, but neither ColoredPoint(0,0,RED) nor Point3D(0,0,1) would be equal to Point(0,0) .

After these preliminaries let us implement it.

Implementing the Mixed-Type Comparison

The functionality that we aim to implement depends on the type of this object and the other object. There are four cases to distinguish between:

  1. this and other are of the exact same type.

  2. All fields can and must be compared.
  1. this is of a subtype of other .

  2. The other object is smaller than this object and only the superclass fields can be compared for equality; the subclass-specific fields of this object must have default values.
     
  3. this is of a supertype of other .

  4. The other object is the larger object that has subclass-specific fields that must have default values. This can be implemented easiest if we let the other object do the work, which gets us back to case (2) with the roles of this and other being switched.
     
  5. this and other are from different branches of the class hierarchy.

  6. Both objects have something in common, which must be compared for equality, and they both have subclass-specific fields, which must checked against the default values. This is the tricky part. First, we must identify the superclass that this and other have in common and second we must check the subclass-specific fields of both objects for default values.
The principle of the implementation is the usual recursion: every class's implementation of equals() takes care of comparing its own fields and delegates to the superclass's implementation of equals() for comparison of the inherited fields. Comparing one's own fields comes in two flavors: either the other object has fields that this object can compare with its own fields, or not. The first case is relevant when this object is of the same type or of a subtype of the other object (see (1) and (2) in the list above); in this case compare this 's fields with other 's fields. The second case ("there is nothing to compare on this level because the other object is too different") happens in the cases (3) and (4) from the list above; in this case check that this 's fields have default values.

We will encapsulate the actual comparison of a class's own fields into a private helper method that every class in a class hierarchy must provide. Here is the sketch of an implementation of the helper method, which we named _compareFields() , for a class with two int fields:

public class MyClass extends MySuperClass {
  static final int
  field1Default = 0,
  field2Default = 0;
  private int field1 = field1Default;
  private int field2 = field2Default;
  ...
  private boolean _compareFields(Object other) {
    if (other instanceof MyClass) { // at least my type, check fields
    if ( field1 != ((MyClass)other). field1
      || field2 != ((MyClass)other). field2
       )
       return false;
    } else { // higher type, check defaults
       if ( field1 != field1Default
         || field2 != field2Default
          )
          return false;
       }
    return true;
  }
}
Note, that _compareFields() is a private method in our sample implementation. This is intentionally so in order to make sure that the method remains class-specific and cannot be overridden in a subclass.

The implementation of the equals() method in principle calls the helper method and then delegates to super.equals() . However, matters are not that simple. This strategy would be fine for case (1) and (2) from the list above, but more work is needed to address case (3) and (4). Case (3), " this is of a supertype of other ", can best be addressed by passing the bucket on the other object's equals() method. By switching the roles of this and other we simply reduce case (3) to case (2). Case (4), " this and other are from different branches of the class hierarchy", is the really challenging situation, where we must identify the superclass that both objects have in common and must navigate the class tree in a more sophisticated way than simply going straight up from the subclass to the superclass.

Before we look in the solution to case (4) let us take a look at the implementation of the equals() method itself.

public class MyClass extends MySuperClass {
  ...
  public boolean equals(Object other) {
    if (other == this) return true; // identical
    if (other == null) return false; // null
    if (!(other instanceof RootClass)) return false; // incomparable types
    return _navigateClassHierarchy(other); // subtype of root; go and compare
  }
}
The equals() method performs a couple of preliminary checks, like checking whether the other object is a null reference and whether the other object belongs to the same class hierarchy as this object. The assumption here is that the topmost superclass of the class hierarchy is a class named RootClass. After the preliminaries equals() delegates to another helper method named _navigateClassHierarchy() , which controls the class tree traversal and triggers the actual field comparison provided by the other helper method _compareFields() .
 
 

The Tree Traversal

For purpose of illustration let us consider the following class hierarchy:

There are four relevant cases to consider, which are listed below. For each of the cases, the diagrams illustrate the location of thisObject and theOtherObject in the class hierarchy (see box around class name) and the size of each object in terms of the class slices it conmprises (see yellow box). For instance, in case (1), thisObject is an object of type MyClass and contains a MySuperClass and a MyClass part.
 
 
 
MyClass thisObject = new MyClass();
MyClass theOtherObject = null;
 

 
 
 
 
 
  // (1) this and other are of the exact same type 
theOtherObject = new MyClass();
     ... thisObject.equals(theOtherObject) ...

 
 
 
  // (2) this is of a subtype of other
theOtherObject = new MySuperClass();
     ... thisObject.equals(theOtherObject) ...

 
 
 
  // (3) this is of a supertype of other
theOtherObject = new MySubClass_1 ();
     ... thisObject.equals(theOtherObject) ...

 
 
 
  // (4) this and other are from different
// branches of the class hierarchy
thisObject = new MySubClass_1();
theOtherObject = new MySubClass_2();
     ... thisObject.equals(theOtherObject) ...

 

The algorithm is easy for the cases (1), " this and other are of the exact same type", and (2), " this is of a subtype of other " . In these cases we will either compare this 's fields to other 's fields (1) or check this 's field against the default values (2). This is exactly what method _compareFields() does. Afterwards we delegate to the superclass for comparison of the inherited fields and start the recursive traversal up the class tree. Here is a first incomplete snippet of the helper method _navigateClassHierarchy() :

public class MyClass extends MySuperClass {
  ...
  protected boolean _navigateClassHierarchy(Object other) {
     ...
     // compare my fields
     if(!_compareFields(other)) return false;
        // pass the buck up
        return super._navigateClassHierarchy(other);
  }
}
Case (3), " this is of a supertype of other " , is similar to case (2), " this is of a subtype of other " . If we switch the roles of this and other we can use the solution of case (2), which means that we simply call other 's _navigateClassHierarchy() method and provide the this object as an argument. Here is a more elaborate implementation of the tree navigation method: public class MyClass extends MySuperClass {
  ...
  protected boolean _navigateClassHierarchy(Object other) {
    if (other instanceof MyClass)
    { // switch roles
      return ((MyClass)other)._navigateClassHierarchy(this);
    }
    else
    { // compare my fields
      if(!_compareFields(other)) return false;
      // pass the buck up
      return super._navigateClassHierarchy(other);
    }
  }
}
Note, this implementation is still incomplete, since we haven't addressed case (4), " this and other are from different branches of the class hierarchy", yet.

In our example of a case (4) situation, this is a MySubClass_1 object and other is a MySubSubClass object. Control flow in the tentative implementation of _navigateClassHierarchy() shown above will go to the else -branch because other is not an instance of MyClass . The _ compareFields() method will correctly check whether this 's field have default values. Then control flow delegate to the superclass, that is, MyClass._navigateClassHierarchy() will be invoked.

This time other is an instance of MyClass , because MyClass happens to be superclass that this and other have in common. Note, we have not yet checked whether the other object has default values for its own subclass-specific fields; in order words, we must still traverse the branch of the class tree from which the other objects stems. This can be achieved by switching roles and calling MySubSubClass._navigateClassHierarchy() . This happens to be exactly what the then -branch does anyway. So far, so good.

In MySubSubClass._navigateClassHierarchy() we will check for default values of the subsubclass-specific fields; we will move up the class tree and will process the subclass-specific fields, until we end up in MyClass._navigateClassHierarchy( ) again. With the implementation as it is, control flow would now descend down the other branch again, that we already traversed, that is, we will end up in an infinite loop.

To break the cycle we need to memorize that we already traversed that other branch of the tree. For this purpose we introduce a boolean flag, which we name reverseOrder , to maintain this extra piece of information. Here is the complete implementation of the _navigateClassHierarchy() method:

public class MyClass extends MySuperClass {
  ...
  protected boolean _navigateClassHierarchy(Object other, boolean reverseOrder) {
    if (other instanceof MyClass && !reverseOrder)
    { // reverse order
      return ((MyClass)other)._navigateClassHierarchy(this,true);
    }
    else
    { // compare my fields
      if(!_compareFields(other)) return false;
      // pass the buck up
      return super._navigateClassHierarchy(other,reverseOrder);
    }
  }
}
Before we switch the roles of this and other in order to traverse the other branch of the class tree we check whether we did it before. If so, we refrain from doing it again and instead start the normal straight ascent up the hierarchy from here on, since both subbranches have already been traversed. The boolean flag works like a so-called latch: it start with the initial value false , flips exactly once, namely when traversal of the other branch starts, and never changes its value again. It is handed from one _navigateClassHierarchy() method to the next each of which checks the flag in order to find out whether the other branch has already been traversed or not.

That's it! This is a way of implementing a correct mixed-type comparison. It is certainly not he only conceivable implementation, but no matter how you achieve the goal, it will take more than a simple instanceof test to get it right. The complete source code for the root class and one of the subclasses can be found in listings 1 and 2.

As with all implementations of equals() , no matter whether they are of the same-type-comparison-only flavor or of mixed-type-comparison implementations style, any new class in the class hierarchy must join the game and must employ the same mechanism for its implementation of equals() . In our case it means that all classes in the hierarchy must override the helper methods _navigateClassHierarchy() and _compareFields() . The equals() method itself is identical in all classes and need only be defined once in the root class.

The implementations of the _compareFields() methods are entirely class-specific and fully depend on the number and types of the fields that the respective class adds to the hierarchy. Note, that _compareFields() is a private method in our sample implementation. This is intentionally so in order to make sure that the method remains class-specific and cannot be overridden in a subclass. Alternatively we could have implemented a final method in each class that has a class-specific name using a naming scheme like _compare< classname >Fields() .

The implementations of _navigateClassHierarchy() are mostly identical. This is really boilerplate code, that you would copy and paste. The few modifications are: the class name in the instanceof test and the cast expression must be changed; it is the name of respective class. And each implementation invokes its class's ( private or final ) _compareFields() method. As a refinement of our implementation one could factor out the commonalities into a single root-class-version of _navigateClassHierarchy() , using reflection to invoke the _compareFields() method, so that only the _compareFields() methods need to be provided when a new classes that is added to the hierarchy. We leave the refinement as an exercise to the reader since a comprehensive discussion is beyond the scope of this article.

Acknowledgements

The ideas discussed in this article were inspired by comments we received from readers of a preceding article on equals() (see / KRE /). These readers were (in the order their emails arrived) Larry Kuenning larry.kuenning@qhpress.org , Mark Davis mark.davis@us.ibm.com , and Frank Griswold griswolf@roguewave.com . All three pointed out that there is a way of implementing equals() so that it performs a transitive mixed-type comparison. From their tentative first ideas and code snippets they sent we derived the solution that we explained in this article. Mark Davis's suggestion are the main foundation of this article; he precisely described the algorithm and provided most of the implementation. If you're interested in reading his article about equals() from his "Durable Java" column in Java Report look up his website (see / DAV /).
 
 
 
 
Listing 1: An implementation of a mixed-type comparison for the root of a class hierarchy
 
public class RootClass {
    static final int
        a1Default = 0,
        a2Default = 0;
    private int a1 = a1Default;
    private int a2 = a2Default;
 

    public boolean equals(Object other) {
        if (other == this) return true;                        // identical
        if (other == null) return false;                       // null
        if (! (other instanceof RootClass)) return false;      // incomparable types
        return _navigateClassHierarchy(other,false);
    }

    protected boolean _navigateClassHierarchy(Object other, boolean reverseOrder) {
        if (other instanceof RootClass && !reverseOrder) 
        {  // reverse order 
           return ((RootClass)other)._navigateClassHierarchy(this,true);
        }
        else 
        {   // compare my fields 
            if(!_compareFields(other)) return false;
            // since we are the root, succeed
            return true;
        }
    }

    private boolean _compareFields(Object other) {
        if (other instanceof RootClass) {    // at least my type, check fields
            RootClass myType = (RootClass)other;
            if (a1 != myType.a1
                || a2 != myType.a2) return false;
        } else {                            // check defaults
            if (a1 != a1Default
                || a2 != a2Default) return false;
        }
        return true;
    }
}


 
 
Listing 2: An implementation of a mixed-type comparison for a subclass in a class hierarchy
public class SubClass extends RootClass {
    static final int
        b1Default = 0,
        b2Default = 0;
    private int b1 = b1Default;
    private int b2 = b2Default;

    protected boolean _navigateClassHierarchy(Object other, boolean reverseOrder) {
        if (other instanceof SubClass && !reverseOrder) 
        {  // reverse order 
           return ((SubClass)other)._navigateClassHierarchy(this,true);
        }
        else 
        {   // compare my fields 
            if(!_compareFields(other)) return false;
            // pass the buck up
            return super._navigateClassHierarchy(other,reverseOrder);
        }
    }

    private boolean _compareFields(Object other) {
        if (other instanceof SubClass) {    // at least my type, check fields
            SubClass myType = (SubClass)other;
            if (b1 != myType.b1
                || b2 != myType.b2) return false;
        } else {                            // check defaults
            if (b1 != b1Default
                || b2 != b2Default) return false;
        }
        return true;
    }
}

  References
 
 
/KRE/  Secrets of equals
Klaus Kreft & Angelika Langer 
Java Solutions, April 2002
URL: http://www.AngelikaLanger.com/Articles/Java/SecretsOfEquals/Equals.html   or 
URL: http://www.cuj.com/java/articles/a19.htm?topic=java
/DAV/ Durable Java: Liberté, Égalité, Fraternité
Mark Davis 
Java Report, January 2000 
http://www.macchiato.com/columns/Durable5.html

 
 
 

If you are interested to hear more about this and related topics you might want to check out the following seminar:
Seminar
 
Effective Java - Advanced Java Programming Idioms 
5 day seminar (open enrollment and on-site)
 
  © Copyright 1995-2007 by Angelika Langer.  All Rights Reserved.    URL: < http://www.AngelikaLanger.com/Articles/JavaSolutions/SecretsOfEquals/Equals-2.html  last update: 10 Aug 2007