Cloning in Java

Aliasing occurs whenever two or more references refer to (i.e., point to) the same object. If the object in question is mutable (meaning that its state can be changed), aliasing can be dangerous, because a change made through one reference is visible through all the others.

Example 1: Imagine a class Customer that has an instance variable account of type BankAccount and the following method:

public BankAccount getAccount() { return account; }

Suppose also that BankAccount has public (mutator) methods makeDeposit() and makeWithdrawal(). Then a Customer object's bank account balance could be modified, in a "back door" kind of way, as follows:

   Customer cust = new Customer(...);
   ...
   ...
   BankAccount acc = cust.getAccount();
   acc.makeWithdrawal(100);

Suppose that it was the intent of the developers of Customer for its getAccount() method to provide, in effect, "read-only" access to the returned BankAccount object. There are various ways to accomplish this, one of which would be for the method to return a (reference to a) clone (i.e., duplicate) of that object, rather than (a reference to) the object itself. (Of course, there is nothing to prevent the clone from being modified by the client thereafter, but the point is that doing so would have no effect upon the original object.) End of Example 1.
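One way to realize the "read-only" getter of Example 1 is sketched below. It is only an illustration under stated assumptions: the copy constructor BankAccount(BankAccount), the int balance, the getBalance() methods, and the demo class are hypothetical and do not come from the classes described above. A copy constructor is used here in place of a clone() method simply to keep the sketch short.

```java
// Hypothetical minimal versions of the classes from Example 1.
class BankAccount {
    private int balance;

    BankAccount(int balance) { this.balance = balance; }

    // Copy constructor (an assumption; the original class does not show one).
    BankAccount(BankAccount other) { this.balance = other.balance; }

    int getBalance() { return balance; }
    void makeDeposit(int amount) { balance += amount; }
    void makeWithdrawal(int amount) { balance -= amount; }
}

class Customer {
    private final BankAccount account;

    Customer(int openingBalance) { account = new BankAccount(openingBalance); }

    // Returns a duplicate, so the caller never holds a reference to the
    // Customer's own BankAccount object.
    BankAccount getAccount() { return new BankAccount(account); }

    int getBalance() { return account.getBalance(); }
}

class GetterCopyDemo {
    public static void main(String[] args) {
        Customer cust = new Customer(500);
        BankAccount acc = cust.getAccount();
        acc.makeWithdrawal(100);                // changes only the duplicate
        System.out.println(acc.getBalance());   // prints 400
        System.out.println(cust.getBalance());  // prints 500
    }
}
```

The "back door" withdrawal still compiles and runs, but it now acts on the duplicate, so the customer's real balance is untouched.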

Example 2: Imagine a class KeyPadLock whose instances emulate locking devices with keypads. If the correct sequence of digits is entered on its keypad, such a lock can then be opened. Suppose that a newly-created lock's "secret code" is provided to the class's constructor in the form of an array of int values. The class might look like this:

public class KeyPadLock {

   private int[] secretCode;
   ...
   ...

   public KeyPadLock(int[] code) { 
      secretCode = code;
      ...
   }

   ...
   ...
}

Consider the effect of the following code segment:

   int[] junk = { 1, 5, 0, 2 };
   KeyPadLock kpl = new KeyPadLock(junk);
   junk[1] = 9;
   ...

The assignment to junk[1] has the effect, with respect to the KeyPadLock object referenced by kpl, of changing its secret code! This is due to the fact that the constructor in KeyPadLock causes the instance variable secretCode of the newly-created object to be an alias for the junk variable in the client class. Hence, when the client changes the value of junk[1], the value of kpl.secretCode[1] changes, too. If, as is likely, the class developers intended for a KeyPadLock object's code never to be changed (or changed only via some mutator method, such as setSecretCode()), then this is a bad situation. To remedy it, the constructor in KeyPadLock should assign to secretCode (a reference to) a clone of (the object referenced by) code, rather than code itself. End of Example 2.
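The remedy described in Example 2 might look like the following sketch. Java arrays have a public clone() method, and for an int[] the result is an independent array. The opens() method and the demo class are hypothetical additions, included only so that the effect can be observed.

```java
import java.util.Arrays;

class KeyPadLock {
    private final int[] secretCode;

    KeyPadLock(int[] code) {
        // Store a clone of the caller's array rather than the array itself.
        secretCode = code.clone();
    }

    // Hypothetical helper (not in the original class): does the attempt match?
    boolean opens(int[] attempt) {
        return Arrays.equals(secretCode, attempt);
    }
}

class KeyPadLockDemo {
    public static void main(String[] args) {
        int[] junk = { 1, 5, 0, 2 };
        KeyPadLock kpl = new KeyPadLock(junk);
        junk[1] = 9;   // no longer affects the lock's secret code
        System.out.println(kpl.opens(new int[] { 1, 5, 0, 2 }));  // prints true
        System.out.println(kpl.opens(junk));                      // prints false
    }
}
```

Cloning an array of primitives is enough here. An array of references to mutable objects would need each element copied as well.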

The examples illustrate that there are times when we should make a clone of an object rather than allow multiple references to point to it. Let's explore this further by considering again the Customer and BankAccount classes mentioned in Example 1.

public class Customer{

   private String name;
   private BankAccount account;

   ...
   ...
   public void setName(String newName) { ... }
   public void makeDeposit(int amount) { account.add(amount); }
   public void makeWithdrawal(int amount) { account.subtract(amount); }
   ...
   ...
}


public class BankAccount{

   private double balance;   // double, to match balances such as 39.62 used below

   ...
   ...
   public void add(int num) { balance = balance + num; }
   public void subtract(int num) { balance = balance - num; }
}

Suppose that we want to give clients of Customer the ability to make a copy of an instance of this class. Then we might include the following method within Customer:

public Customer clone() {
   Customer newCust = new Customer();
   newCust.name = this.name;
   newCust.account = this.account;
   return newCust;
}

Would this be adequate? NO! Why? Because what it returns is a shallow copy, by which we mean an object distinct from the original object, but whose instance variables have values identical to those of the original object. To illustrate, consider the code

    Customer a, b;
    a = new Customer("Chris", 39.62);
    b = a.clone();
    b.makeDeposit(6);

Even though a and b refer to distinct Customer objects, their respective account fields refer to the same BankAccount object, as depicted in the diagram below. As a consequence, the invocation of b.makeDeposit() has the effect of changing a's account balance, too.

+-----------------+                                  +-----------------+       
|         +---+   |          +---------+             |   +---+         |
|    name | *-|---|--------->| "Chris" |<------------|---|-* | name    |
|         +---+   |          +---------+             |   +---+         |
|         +---+   |          +---------+             |   +---+         |
| account | *-|---|--------->| balance |<------------|---|-* | account |
|         +---+   |          | +-----+ |             |   +---+         |
+-----------------+          | |45.62| |             +-----------------+
          ^                  | +-----+ |                       ^
          |                  +---------+                       |
          |                                                    |
        +-|-+                                                +-|-+
      a | * |                                              b | * |
        +---+                                                +---+

Hence, what we really want is for clone() to produce a deep copy:

   public Customer clone() {
      Customer newCust = new Customer();
      newCust.name = name;  // aliasing OK because String is immutable
      newCust.account = account.clone();
      return newCust;
   }

This code assumes, of course, that BankAccount has its own clone() method. Notice that we don't bother to make a clone of the object referred to by name; that's because it is a String, which is immutable. (If you examine the String class, you will notice that none of its methods are mutators. Hence, there is no way to change the state of a String object. Allowing multiple references to point to the same immutable object is acceptable, even encouraged, as it consumes less space than having multiple objects that are identical to each other.) Assuming the second version of clone(), the code above involving a and b would produce

+-----------------+                                  +-----------------+       
|         +---+   |          +---------+             |   +---+         |
|    name | *-|---|--------->| "Chris" |<------------|---|-* | name    |
|         +---+   |          +---------+             |   +---+         |
|         +---+   |          +---------+             |   +---+         |
| account | *-|---|--------->| balance |        +----|---|-* | account |
|         +---+   |          | +-----+ |        |    |   +---+         |
+-----------------+          | |39.62| |        |    +-----------------+
          ^                  | +-----+ |        |              ^
          |                  +---------+        |              |
          |                                     |              |
        +-|-+                +---------+        |            +-|-+
      a | * |                | balance |        |          b | * |
        +---+                | +-----+ |        |            +---+
                             | |45.62| |<-------+
                             | +-----+ |
                             +---------+
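
The two situations pictured above can be reproduced with the following sketch. Several things in it are assumptions made for the sake of a runnable illustration: the balance is held as a whole number of cents (3962 for 39.62) to avoid floating-point rounding, the two copying strategies are given the hypothetical names shallowCopy() and deepCopy() so that they can coexist in one class, and BankAccount.copy() stands in for the BankAccount clone() method assumed in the text.

```java
class BankAccount {
    private long cents;

    BankAccount(long cents) { this.cents = cents; }

    void add(long amount) { cents += amount; }
    long getCents() { return cents; }

    // Stand-in for the BankAccount clone() method assumed in the text.
    BankAccount copy() { return new BankAccount(cents); }
}

class Customer {
    private String name;
    private BankAccount account;

    Customer(String name, long cents) {
        this.name = name;
        this.account = new BankAccount(cents);
    }

    private Customer() { }

    void makeDeposit(long amount) { account.add(amount); }
    long getBalanceCents() { return account.getCents(); }

    // First version: both Customers end up sharing one BankAccount.
    Customer shallowCopy() {
        Customer c = new Customer();
        c.name = this.name;
        c.account = this.account;
        return c;
    }

    // Second version: the new Customer gets its own BankAccount.
    Customer deepCopy() {
        Customer c = new Customer();
        c.name = this.name;                 // aliasing OK, String is immutable
        c.account = this.account.copy();
        return c;
    }
}

class CopyDemo {
    public static void main(String[] args) {
        Customer a = new Customer("Chris", 3962);
        Customer b = a.shallowCopy();
        b.makeDeposit(600);
        System.out.println(a.getBalanceCents());  // prints 4562: a changed too

        Customer c = new Customer("Chris", 3962);
        Customer d = c.deepCopy();
        d.makeDeposit(600);
        System.out.println(c.getBalanceCents());  // prints 3962: c is unaffected
        System.out.println(d.getBalanceCents());  // prints 4562
    }
}
```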

Java's developers foresaw the need for cloning and included a clone() method with signature

protected Object clone() throws CloneNotSupportedException

in the class Object. This method returns a shallow copy of the object on which it is invoked, which must be an instance of a class that implements the Cloneable interface. (Otherwise, a CloneNotSupportedException is thrown.) The Cloneable interface, found in the package java.lang (which is visible to every Java class without an explicit import), is a marker interface, meaning that it declares no methods. (Why it does not declare the clone() method is not clear to this author.)
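The behavior of Object's clone() method can be observed with a small experiment such as the one sketched below. The classes Plain and Marked and the method tryClone() are hypothetical names invented for this illustration. Because clone() is protected in Object, each class calls it from within one of its own methods.

```java
class Plain {                          // does NOT implement Cloneable
    int[] data = { 1, 2, 3 };

    Object tryClone() throws CloneNotSupportedException {
        return super.clone();          // Object's clone() will refuse
    }
}

class Marked implements Cloneable {    // the marker interface is the only difference
    int[] data = { 1, 2, 3 };

    Object tryClone() throws CloneNotSupportedException {
        return super.clone();          // Object's clone() makes a shallow copy
    }
}

class ObjectCloneDemo {
    public static void main(String[] args) throws CloneNotSupportedException {
        Marked m = new Marked();
        Marked copy = (Marked) m.tryClone();
        System.out.println(copy != m);             // prints true: a distinct object
        System.out.println(copy.data == m.data);   // prints true: the array is shared

        try {
            new Plain().tryClone();
        } catch (CloneNotSupportedException e) {
            System.out.println("Plain is not Cloneable");
        }
    }
}
```

The second line of output is the important one: the copy's data field is an alias for the original's array, which is exactly the shallow-copy situation discussed earlier.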


Design of a clone() method

Now we illustrate the recommended approach for designing a clone() method for a class A. The first thing that such a method should do is to invoke super.clone(), the effect of which should be to create a new instance of A all of whose inherited fields (refer to objects that) are clones of the (objects referred to by the) original object's corresponding fields and all of whose non-inherited fields are identical in value to the original object's corresponding fields. All that remains to do, then, is to make clones of any mutable objects referred to by non-inherited fields and to make the corresponding fields in the new instance of A refer to these clones. (This relies upon all those classes having clone() methods, of course.)

For example, A might be as follows:

public class A extends B implements Cloneable {

   private T1 f1;   // assume that T1 is mutable
   private T2 f2;   // assume that T2 is mutable
   private T3 f3;   // assume that T3 is immutable or primitive
   private T4 f4;   // assume that T4 is immutable or primitive

   ...
   ...

   @Override
   public Object clone() {
  
      try {

         A newObj = (A) super.clone();  // makes clones of fields inherited 
                                        // from class B and copies of rest
         newObj.f1 = (T1) f1.clone();   // make clones of mutable fields
         newObj.f2 = (T2) f2.clone();    
         return newObj;
      }
      catch (CloneNotSupportedException e) {
         // impossible, because A implements Cloneable; fail loudly
         // rather than hand the caller a null
         throw new AssertionError(e);
      }

   }

   ...
   ...

}

This strategy assumes that any superclass of A that introduced a field of a mutable type has an appropriate clone() method. One might ask why a type cast is applied to the result of each invocation of clone() (e.g., (T1) f1.clone()). The reason is (we are assuming here) that the return type of clone() is Object. (That is, we are assuming that all the relevant types have a clone() method that overrides the one in the class Object, the return type of which is Object, of course.)
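To make the strategy concrete, here is a sketch with a two-level hierarchy. All names in it (Base, Derived, tags, created, label) are hypothetical: Base plays the role of B and Derived the role of A. The sketch also makes one choice of its own, namely to throw an AssertionError in the "impossible" case. Since Base's clone() declares no checked exception, Derived needs no try/catch at all.

```java
import java.util.Date;

class Base implements Cloneable {            // plays the role of B
    private int[] tags = { 1, 2 };           // mutable, so Base must clone it

    int[] getTags() { return tags; }

    @Override
    public Object clone() {
        try {
            Base copy = (Base) super.clone();    // Object's clone(): shallow copy
            copy.tags = tags.clone();            // replace the aliased array
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);         // cannot happen: Base is Cloneable
        }
    }
}

class Derived extends Base {                 // plays the role of A
    private Date created = new Date(0L);     // mutable field
    private String label = "x";              // immutable field, aliasing is fine

    Date getCreated() { return created; }
    String getLabel() { return label; }

    @Override
    public Object clone() {
        Derived copy = (Derived) super.clone();   // Base takes care of tags
        copy.created = (Date) created.clone();    // cast needed: returns Object
        return copy;
    }
}

class CloneDesignDemo {
    public static void main(String[] args) {
        Derived original = new Derived();
        Derived copy = (Derived) original.clone();

        copy.getTags()[0] = 99;             // modify the copy's inherited state
        copy.getCreated().setTime(5000L);   // modify the copy's own state

        System.out.println(original.getTags()[0]);           // prints 1
        System.out.println(original.getCreated().getTime()); // prints 0
    }
}
```

Note that the object returned by super.clone() inside Base is a Derived whenever clone() was invoked on a Derived, which is why the cast in Derived's clone() succeeds.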

None of the above is meant to suggest that aliasing (of mutable objects) is everywhere and always a bad thing. In some circumstances, it is absolutely necessary. As an example, consider a circular list of nodes that can be traversed in either direction or a tree of nodes that can be traversed in a parent-to-child or child-to-parent direction. Obviously, in these cases it is necessary for multiple references to "point to" the same objects. For example, a tree node's parent, as well as each of its children, will have a reference that "points to" that node.
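The tree case can be sketched as follows. The class TreeNode and its members are hypothetical, chosen only to show that the parent's child list and the grandchild's parent field must refer to the very same node object.

```java
import java.util.ArrayList;
import java.util.List;

class TreeNode {
    final String label;
    TreeNode parent;                                    // child-to-parent link
    final List<TreeNode> children = new ArrayList<>();  // parent-to-child links

    TreeNode(String label) { this.label = label; }

    void addChild(TreeNode child) {
        child.parent = this;    // the child now holds a reference to this node
        children.add(child);    // and this node holds a reference to the child
    }
}

class TreeAliasDemo {
    public static void main(String[] args) {
        TreeNode root = new TreeNode("root");
        TreeNode mid = new TreeNode("mid");
        TreeNode leaf = new TreeNode("leaf");
        root.addChild(mid);
        mid.addChild(leaf);

        // The node "mid" is reachable from above and from below,
        // and both paths lead to the same object.
        System.out.println(root.children.get(0) == leaf.parent);  // prints true
    }
}
```

If addChild() stored a clone of the child instead, the child-to-parent and parent-to-child links would refer to different objects and the structure would no longer be a single tree.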