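Consider the following (mathematical) function, the domain of which is the natural numbers (i.e., nonnegative integers):
$$ \mathrm{sumUpTo}(n) = 0 + 1 + 2 + \cdots + n $$
Given that addition is commutative, we could just as well have written this as
$$ \mathrm{sumUpTo}(n) = n + (n-1) + \cdots + 1 + 0 $$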
That is, for any natural number n, sumUpTo(n) is the sum of the integers in the range 0..n. For example, sumUpTo(4) = 4 + 3 + 2 + 1 + 0, or 10.
The given definition lacks rigor, because it relies upon an (intelligent) reader understanding what is meant by "...". Here is a better definition:
$$
\mathrm{sumUpTo}(n) =
\begin{cases}
0 & \text{if } n = 0 \quad (1) \\
n + \mathrm{sumUpTo}(n-1) & \text{otherwise (i.e., } n > 0\text{)} \quad (2)
\end{cases}
$$
This definition is based upon the observation that, for n>0, the "sum up to n" is just n plus the "sum up to n-1". In other words, the "sum up to n" is
$$ n + \bigl((n-1) + (n-2) + \cdots + 1 + 0\bigr) $$
in which the parenthesized part is the "sum up to n-1".
But is our second definition of sumUpTo any more rigorous than the first? In the first definition, the "..." was troublesome, but the second is suspicious in that the body of the definition refers to the function itself. Does this not lead to a "circularity" (as if field were defined as meadow and vice versa)? No! To demonstrate this, let's apply the function using the second definition:
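$$
\begin{aligned}
\mathrm{sumUpTo}(4) &= 4 + \mathrm{sumUpTo}(3) && \text{(by (2))} \\
&= 4 + 3 + \mathrm{sumUpTo}(2) && \text{(by (2))} \\
&= 4 + 3 + 2 + \mathrm{sumUpTo}(1) && \text{(by (2))} \\
&= 4 + 3 + 2 + 1 + \mathrm{sumUpTo}(0) && \text{(by (2))} \\
&= 4 + 3 + 2 + 1 + 0 && \text{(by (1))}
\end{aligned}
$$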
We got the correct result! Moreover, it should be clear to the reader that applying the recursive definition of sumUpTo to any other natural number would lead us, by a similar process, to a correct result.
In the definition, line (1) is referred to as the base case and line (2) is referred to as the recursive case. The distinction is that the base case does not refer to the function being defined, whereas the recursive case does. The key to making this work (i.e., to avoid circularity) is that any application of the function in the recursive case must be to a "smaller" argument (i.e., one that is closer to the base case). That way, during an evaluation of the function, we are certain to reach the base case, eventually (i.e., after finitely many applications of any recursive cases).
For this function, the pattern is obvious: An application of sumUpTo to n leads to an application of sumUpTo to n-1, which leads to an application of sumUpTo to n-2, etc., etc., until eventually it is applied to zero (the base case). This is typical.
Let's develop a more general function that "computes" the sum of any range u..v of integers. The informal definition is
$$ \mathrm{sumOfRange}(u,v) = u + (u+1) + (u+2) + \cdots + v $$
Observing that (under the assumption u≤v) the sum of the integers in the range u..v is u plus the sum of the integers in the range u+1..v, we formulate the recursive definition
$$
\mathrm{sumOfRange}(u,v) =
\begin{cases}
0 & \text{if } u > v \quad (1) \\
u + \mathrm{sumOfRange}(u+1,v) & \text{otherwise (i.e., } u \le v\text{)} \quad (2)
\end{cases}
$$
Using this definition to evaluate sumOfRange(3,6), we get
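$$
\begin{aligned}
\mathrm{sumOfRange}(3,6) &= 3 + \mathrm{sumOfRange}(4,6) && \text{(by (2))} \\
&= 3 + 4 + \mathrm{sumOfRange}(5,6) && \text{(by (2))} \\
&= 3 + 4 + 5 + \mathrm{sumOfRange}(6,6) && \text{(by (2))} \\
&= 3 + 4 + 5 + 6 + \mathrm{sumOfRange}(7,6) && \text{(by (2))} \\
&= 3 + 4 + 5 + 6 + 0 && \text{(by (1))}
\end{aligned}
$$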
Alternatively, we could have observed (again assuming that u≤v) that the sum of the integers in the range u..v is the sum of the integers in the range u..v-1, plus v. So we could have written the recursive case in the definition as
$$ \mathrm{sumOfRange}(u,v-1) + v \qquad \text{otherwise (i.e., } u \le v\text{)} \quad (2) $$
To make this a little more interesting, let's define a function that computes the sum of the values in an array segment a[u..v]. (We assume that the elements of the array contain numbers of some kind.) Informally,
$$ \mathrm{sumOfSeg}(a,u,v) = a[u] + a[u+1] + \cdots + a[v] $$
Using observations analogous to those we employed in developing our recursive definition of sumOfRange, we get
$$
\mathrm{sumOfSeg}(a,u,v) =
\begin{cases}
0 & \text{if } u > v \quad (1) \\
a[u] + \mathrm{sumOfSeg}(a,u+1,v) & \text{otherwise (i.e., } 0 \le u \le v < a.\mathrm{length}\text{)} \quad (2)
\end{cases}
$$
Using this definition to evaluate sumOfSeg(a,3,6), we get
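$$
\begin{aligned}
\mathrm{sumOfSeg}(a,3,6) &= a[3] + \mathrm{sumOfSeg}(a,4,6) && \text{(by (2))} \\
&= a[3] + a[4] + \mathrm{sumOfSeg}(a,5,6) && \text{(by (2))} \\
&= a[3] + a[4] + a[5] + \mathrm{sumOfSeg}(a,6,6) && \text{(by (2))} \\
&= a[3] + a[4] + a[5] + a[6] + \mathrm{sumOfSeg}(a,7,6) && \text{(by (2))} \\
&= a[3] + a[4] + a[5] + a[6] + 0 && \text{(by (1))}
\end{aligned}
$$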
How does this notion of recursion relate to computer programming? Well, just as it makes sense for the "body of" a function's definition to refer to the function itself, it makes sense for a method's body to refer to the method itself. In other words, in programming, recursion manifests itself in having methods that call themselves! Most modern programming languages, including Java, support this technique.
Here again is our recursive definition of the function sumOfRange:
$$
\mathrm{sumOfRange}(u,v) =
\begin{cases}
0 & \text{if } u > v \quad (1) \\
u + \mathrm{sumOfRange}(u+1,v) & \text{otherwise (i.e., } u \le v\text{)} \quad (2)
\end{cases}
$$
Adapting it into a Java method, we get what is shown in the left half of the figure below. The correspondence between the two should be quite clear. Notice that the base case and recursive case correspond to the two branches of the if-statement. Every properly written "recursive" method will be of a similar form (although there could be multiple branches for distinct base cases and/or distinct recursive cases). Of course, what is important is that if a client were to make the method call sumOfRange(3,6), for example, the value returned to it would be 3+4+5+6, or 18. And, indeed, this is the case.
    /** Returns the sum of the integers in the range u..v */
    public static int sumOfRange(int u, int v) {
        int result;
        if (u > v) {
            result = 0;
        } else {
            result = u + sumOfRange(u+1,v);
        }
        return result;
    }
      Incoming
     parameters       value
       u     v       returned
    +-----+-----+-------------+
    |  5  |  8  |  26 (5+21)  |
    +-----+-----+-------------+
    |  6  |  8  |  21 (6+15)  |
    +-----+-----+-------------+
    |  7  |  8  |  15 (7+8)   |
    +-----+-----+-------------+
    |  8  |  8  |   8 (8+0)   |
    +-----+-----+-------------+
    |  9  |  8  |   0         |
    +-----+-----+-------------+
The right half of the figure above is a table that shows a "trace" of the method's execution in response to the call sumOfRange(5,8). As we "manually" simulate execution of the method, we can fill in the columns showing the values of the parameters received by each instance of the method, going from the top row to the bottom row as the method keeps calling itself (i.e., the recursion descends).
When the base case is reached (depicted by the bottom row), we can fill in the "value returned" cell in that row. Here, that is zero. Then we can work from that row upwards to fill in the remaining cells in that column (as we simulate the ascent of the recursion).
Now, each non-base case instance of the method returns the result of adding the value of its parameter u to the value produced by its own call to itself. For example, the instance of the method that received parameters u:6 and v:8 (see second row) adds u (i.e., 6) to the value produced by the call it made to itself (i.e., 15), thereby producing 21.
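The descent and ascent described above can be made visible by instrumenting the method so that it reports each call and each return. What follows is only a minimal sketch: the class name SumOfRangeTrace, the method name traced, and the extra depth parameter (used solely for indentation) are assumptions introduced for illustration and are not part of the method shown in the figure.

```java
class SumOfRangeTrace {

    /** Returns the sum of the integers in u..v, printing every call and return.
     *  The depth parameter is hypothetical: it only controls the indentation. */
    static int traced(int u, int v, int depth) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        System.out.println(indent + "call   sumOfRange(" + u + "," + v + ")");
        int result;
        if (u > v) {
            result = 0;                                // base case
        } else {
            result = u + traced(u + 1, v, depth + 1);  // recursive case
        }
        System.out.println(indent + "return " + result);
        return result;
    }

    public static void main(String[] args) {
        traced(5, 8, 0);   // the call traced in the table
    }
}
```

Run on the call from the table, the "call" lines appear in top-to-bottom row order (the descent) and the "return" lines appear in bottom-to-top row order (the ascent), ending with the value 26.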
Here again is our recursive definition of function sumOfSeg:
$$
\mathrm{sumOfSeg}(a,u,v) =
\begin{cases}
0 & \text{if } u > v \quad (1) \\
a[u] + \mathrm{sumOfSeg}(a,u+1,v) & \text{otherwise (i.e., } 0 \le u \le v < a.\mathrm{length}\text{)} \quad (2)
\end{cases}
$$
Expressing it as a Java method (and assuming that the array holds values of type int), we get
    /** Returns the sum of the values in the segment a[u..v].
     ** pre-condition: 0 <= u,v < a.length */
    public static int sumOfSeg(int[] a, int u, int v) {
        int result;
        if (u > v) {
            result = 0;
        } else {
            result = a[u] + sumOfSeg(a,u+1,v);
        }
        return result;
    }
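The alternative observation made earlier for sumOfRange (peel off the last item rather than the first) carries over to array segments. The sketch below illustrates that variant; the names SegSumSketch, sumOfSegFromRight, and sumOfArray are hypothetical, chosen only for this example.

```java
class SegSumSketch {

    /** Returns the sum of a[u..v], peeling off the LAST element at each step.
     *  Assumes 0 <= u and v < a.length; an empty segment (u > v) sums to 0. */
    static int sumOfSegFromRight(int[] a, int u, int v) {
        if (u > v) {
            return 0;                                      // base case: empty segment
        } else {
            return sumOfSegFromRight(a, u, v - 1) + a[v];  // shorter segment a[u..v-1]
        }
    }

    /** Hypothetical convenience wrapper: sums the whole array. */
    static int sumOfArray(int[] a) {
        return sumOfSegFromRight(a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        int[] a = {4, -2, 7, 1};
        System.out.println(sumOfArray(a));                 // prints 10
        System.out.println(sumOfSegFromRight(a, 1, 2));    // prints 5
    }
}
```

Either decomposition works because each recursive call is applied to a strictly shorter segment, so the empty segment (the base case) is eventually reached.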
It turns out —even if the examples we've done so far might not illustrate it!— that recursion is a powerful problem-solving technique. Indeed, if you have a problem for which the solution (to any non-trivial instance) can be expressed in terms of one or more "smaller" instances of the same problem, you can use recursion to solve it. In most of the examples above, the "problem" involved a range of integers u..v and we were able to express its solution in terms of a solution to the same problem with respect to the smaller range u+1..v. In particular, computing the sum of the elements in a[u..v] was expressed in terms of computing the sum of the elements in (the shorter segment) a[u+1..v].
Lest you think that the only kind of problem to be solved using recursion is one that asks for a sum to be calculated, consider the problem of identifying the maximum element in an array segment. In a segment of length one, the maximum value is, obviously, the lone element in the segment. In a longer segment, the maximum is the larger of the first value and the maximum value in the rest of the segment. Assuming that max is a function that yields the larger of its two arguments, we can express this as
$$
\mathrm{maxInSeg}(a,u,v) =
\begin{cases}
a[u] & \text{if } u = v \quad (1) \\
\max(a[u],\ \mathrm{maxInSeg}(a,u+1,v)) & \text{otherwise (i.e., } 0 \le u < v < a.\mathrm{length}\text{)} \quad (2)
\end{cases}
$$
As a Java method, one translation is
    /** Returns the maximum value in a[u..v]
     ** pre-condition: 0 <= u <= v < a.length */
    public static int maxInSeg(int[] a, int u, int v) {
        int result;
        if (u == v) {
            result = a[u];
        } else {
            result = Math.max(a[u], maxInSeg(a,u+1,v));
        }
        return result;
    }
Had we forgotten about the max() method in the class java.lang.Math, we might have written the else branch as follows:
    int maxInRest = maxInSeg(a,u+1,v);
    if (a[u] > maxInRest) {
        result = a[u];
    } else {
        result = maxInRest;
    }
    /** Returns the lowest-numbered location, within the segment a[u..v],
     ** that is occupied by the maximum value in that segment. */
    public static int locOfMaxInSeg(int[] a, int u, int v) { ... }
All the recursive methods that we have developed so far are observers, in the sense that they calculate results but don't change the state of any object. Let's remedy that by developing a method that reverses the order of the elements in an array segment. The vital observation is that to reverse an array segment, it suffices to swap its first and last elements and to reverse the segment in between them! That is, to reverse the segment a[u..v], we swap the values at locations u and v and reverse the segment a[u+1..v-1]. (In which order we do these does not matter, as neither operation "interferes" with the other.) The base case occurs when the segment's length is one or less, as then nothing needs to be done to effect a reversal. Here is the method:
    /** Reverses the order of the elements in the segment a[u..v].
     ** pre: 0≤u≤v+1≤a.length */
    public static void reverseSeg(int[] a, int u, int v) {
        if (u >= v) {
            // segment length is <= 1, so nothing need be done
        } else {
            swap(a,u,v);              // swap the values at locations u and v
            reverseSeg(a,u+1,v-1);    // reverse elements in a[u+1..v-1]
        }
    }
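The method above calls a swap helper whose body is not shown in the text. The following sketch supplies one plausible version of it (an assumption, as is the class name ReverseSegDemo) together with a small demonstration of the reversal.

```java
import java.util.Arrays;

class ReverseSegDemo {

    /** Exchanges the values at locations i and j of a (assumed helper). */
    static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    /** Reverses a[u..v]: swap the two ends, then reverse what lies between them. */
    static void reverseSeg(int[] a, int u, int v) {
        if (u < v) {                      // segments of length <= 1 need no work
            swap(a, u, v);
            reverseSeg(a, u + 1, v - 1);
        }
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 50, 60};
        reverseSeg(a, 1, 4);
        System.out.println(Arrays.toString(a));   // prints [10, 50, 40, 30, 20, 60]
    }
}
```

Note that each recursive call shrinks the segment by two elements, so the base case must cover both length one and length zero.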
To sort an array segment a[u..v], we can employ Selection Sort, which can be expressed via pseudocode like this:
    for each i in the range u..v {
        Let k be the location of the minimum value in a[i..v];
        swap(a,i,k);
    }
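For concreteness, the loop-based pseudocode above might be rendered in Java roughly as follows. This is a sketch under assumptions: the helper names locOfMinInSeg and swap, and the class name SelectionSortLoop, are hypothetical and do not come from the text.

```java
import java.util.Arrays;

class SelectionSortLoop {

    /** Returns the lowest-numbered location of the minimum value in a[u..v].
     *  Assumes 0 <= u <= v < a.length. */
    static int locOfMinInSeg(int[] a, int u, int v) {
        int k = u;
        for (int j = u + 1; j <= v; j++) {
            if (a[j] < a[k]) {
                k = j;
            }
        }
        return k;
    }

    /** Exchanges the values at locations i and j of a (assumed helper). */
    static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    /** Sorts a[u..v] into ascending order using Selection Sort (loop version). */
    static void selectionSort(int[] a, int u, int v) {
        for (int i = u; i <= v; i++) {
            int k = locOfMinInSeg(a, i, v);   // location of the minimum of a[i..v]
            swap(a, i, k);                    // move that minimum to location i
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 9, 1, 7};
        selectionSort(a, 0, a.length - 1);
        System.out.println(Arrays.toString(a));   // prints [1, 3, 5, 7, 9]
    }
}
```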
Employing pseudo-code again, but expressing ourselves in a recursive fashion, here is how to sort array segment a[u..v] using the Selection Sort approach:
    if (u >= v) {
        // segment length is <= 1, so nothing needs to be done
    } else {
        Let k be the location of the smallest value in a[u..v];
        swap(a,u,k);
        sort the segment a[u+1..v]
    }
Exercise: Translate the pseudocode immediately above into a recursive Java method that sorts array segment a[u..v]
The binary search strategy (in which we find a "solution" within a range low..high by repeatedly cutting the search space in half) can be expressed not only using loops (such as we have done before) but also recursively. The following recursive method finds the unique location k, where low≤k≤high, such that every element in array segment a[low..k-1] is less than key, which is less than or equal to every element in a[k..high-1]. (Of course, a[low..high-1] is assumed to be in ascending order.)
    /** Returns the unique value of k in the range low..high satisfying the
     ** condition a[low..k-1] < key ≤ a[k..high-1]
     ** pre: 0 ≤ low ≤ high ≤ a.length and the elements in a[low..high-1]
     **      are in ascending order */
    int indexOfBinSearch(int[] a, int low, int high, int key) {
        if (low == high) {
            return low;   // the only result possible!
        } else {
            int mid = (low+high) / 2;
            if (a[mid] < key) {
                // result must be in the range mid+1..high
                return indexOfBinSearch(a, mid+1, high, key);
            } else {
                // a[mid] ≥ key, so result must be in the range low..mid
                return indexOfBinSearch(a, low, mid, key);
            }
        }
    }
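To see what the returned index means in practice, here is a small sketch that applies the method to a sample array. Two things differ from the version above and are assumptions of this example: the method is declared static so that main can call it, and the midpoint is computed as low + (high - low) / 2, a common variation that yields the same value for valid indices while avoiding int overflow. The contains helper and the class name BinSearchDemo are hypothetical.

```java
class BinSearchDemo {

    /** Returns the unique k in low..high with a[low..k-1] < key <= a[k..high-1].
     *  Assumes 0 <= low <= high <= a.length and a[low..high-1] is ascending. */
    static int indexOfBinSearch(int[] a, int low, int high, int key) {
        if (low == high) {
            return low;                               // the only result possible
        }
        int mid = low + (high - low) / 2;             // same value as (low+high)/2 here
        if (a[mid] < key) {
            return indexOfBinSearch(a, mid + 1, high, key);
        } else {
            return indexOfBinSearch(a, low, mid, key);
        }
    }

    /** Hypothetical helper: reports whether key occurs in the ascending array a. */
    static boolean contains(int[] a, int key) {
        int k = indexOfBinSearch(a, 0, a.length, key);
        return k < a.length && a[k] == key;
    }

    public static void main(String[] args) {
        int[] a = {2, 4, 4, 7, 9};
        System.out.println(indexOfBinSearch(a, 0, a.length, 4));    // prints 1
        System.out.println(indexOfBinSearch(a, 0, a.length, 5));    // prints 3
        System.out.println(indexOfBinSearch(a, 0, a.length, 10));   // prints 5
        System.out.println(contains(a, 7));                         // prints true
        System.out.println(contains(a, 5));                         // prints false
    }
}
```

The result is always a valid insertion point for key; it refers to an occurrence of key only when it is less than high and the element there equals key, which is what the contains helper checks.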
In Java, one can use the Integer.toString() method to translate a value of type int into a String object that expresses that integer as a decimal numeral (or even a numeral in another base). For example, the call Integer.toString(5723) yields as its result (a reference to) the String "5723".
Even had we not known about the Integer.toString() method, we could have written our own method to do the same thing, like this:
    public static String intToString(int m) {
        return "" + m;
    }
This takes advantage of the fact that, when a numeric value is combined with a String via the concatenation operator, Java automatically converts the number to a String first (in effect, by implicitly calling a toString() method as just described).
As an academic exercise, let's design a recursive version of our method intToString().
Recall from above that a recursive solution to the problem of computing the sum 1 + 2 + 3 + ... + n of the first n positive integers becomes evident when we recognize that, for n > 0, this is the sum of the first n-1 positive integers (i.e., a smaller instance of the same problem), plus n.
Using 4257 as an example, we can make a similar observation regarding the problem under consideration: the string corresponding to 4257 is the concatenation of
(1) the string corresponding to 425 (i.e., "425") with
(2) the character corresponding to 7 (i.e., '7').
To state it more generally, for any integer m, where m ≥ 10, the string corresponding to m is the concatenation of
(1) the string corresponding to m/10 (i.e., m with its ones digit chopped off) with
(2) the character corresponding to m % 10 (i.e., the ones digit of m).
Assuming that we can figure out how to compute the character corresponding to an integer less than 10 (e.g., '7' for 7), it will not be difficult to transform the plan above into a recursive method in which any value in the range 0..9 is covered by the base case and all greater numbers are covered by the recursive case.
Because there are a finite number of cases for it to handle, a method that translates an integer in the range 0..9 into its corresponding char value is not difficult. Here is a brute force approach:
    /* Given an int value in the range 0..9, returns
    ** the corresponding value of type char.
    ** E.g., 4 maps to '4', 7 maps to '7'.
    ** pre: 0 ≤ dig < 10 */
    public static char digitToChar(int dig) {
        if (dig == 0) { return '0'; }
        else if (dig == 1) { return '1'; }
        else if (dig == 2) { return '2'; }
        else if (dig == 3) { return '3'; }
        else if (dig == 4) { return '4'; }
        else if (dig == 5) { return '5'; }
        else if (dig == 6) { return '6'; }
        else if (dig == 7) { return '7'; }
        else if (dig == 8) { return '8'; }
        else if (dig == 9) { return '9'; }
        else { return 'X'; }   // violates precondition
    }
With a little knowledge about how values of type char are represented internally, we could have written the method above to be not only more concise, but also more efficient in terms of execution time. The knowledge we need is that the char values '0' through '9' are represented by consecutive integer values and that, in Java, if k is an expression of type int having value in the range 0..9, then the expression (char)('0' + k) is the value of type char corresponding to k. For example, (char)('0' + 5) has value equal to '5'.
Armed with this knowledge, we can rewrite our digitToChar() method as follows:
    /* Given an int value in the range 0..9, returns
    ** the corresponding value of type char.
    ** E.g., 4 maps to '4', 7 maps to '7'.
    ** pre: 0 ≤ dig < 10 */
    public static char digitToChar(int dig) {
        return (char)('0' + dig);
    }
Using the method above, we can now write the method of interest:
    /* Given a nonnegative int value, returns the corresponding decimal
    ** numeral in the form of a String. E.g., 5234 maps to "5234". */
    public static String intToString(int m) {
        if (m < 10) {
            return "" + digitToChar(m);
        } else {
            return intToString(m / 10) + digitToChar(m % 10);
        }
    }
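Since Integer.toString() can also produce numerals in other bases, the same recursive plan can be generalized by replacing 10 with a base parameter. The sketch below is one possible way to do so, not part of the original development: the class name IntToStringSketch and the two-argument signature are assumptions, and the standard library method Character.forDigit() takes the place of digitToChar() so that digits beyond 9 are handled.

```java
class IntToStringSketch {

    /** Returns the numeral for nonnegative m in the given base.
     *  Assumes m >= 0 and 2 <= base <= 36. */
    static String intToString(int m, int base) {
        char lastDigit = Character.forDigit(m % base, base);   // e.g. 15 -> 'f' in base 16
        if (m < base) {
            return "" + lastDigit;                             // base case: one digit
        } else {
            return intToString(m / base, base) + lastDigit;    // recursive case
        }
    }

    public static void main(String[] args) {
        System.out.println(intToString(5234, 10));   // prints 5234
        System.out.println(intToString(255, 16));    // prints ff
        System.out.println(intToString(5, 2));       // prints 101
    }
}
```

For valid arguments the results can be checked against the library: intToString(m, base) should equal Integer.toString(m, base).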
Interestingly, any algorithm that can be expressed using loops can be expressed recursively, and vice versa. Hence, if we have one construct, we don't need the other. But having both at our disposal is advantageous because, depending upon the circumstance/context, one is sometimes preferable to the other. From the standpoint of efficiency (running time and memory usage), loops are usually superior. From the standpoint of elegance and readability, recursion is often superior. Indeed, some algorithms can be expressed concisely and elegantly using recursion but less so using loops. Perhaps the best-known example is Quick Sort, developed in about 1960 by C.A.R. Hoare.
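As a small illustration of that interchangeability, here is sumOfRange written both ways. This is a sketch only; the class and method names (LoopVersusRecursion, sumOfRangeRecursive, sumOfRangeLoop) are chosen for this example.

```java
class LoopVersusRecursion {

    /** Recursive version: follows the base case and recursive case directly. */
    static int sumOfRangeRecursive(int u, int v) {
        if (u > v) {
            return 0;
        } else {
            return u + sumOfRangeRecursive(u + 1, v);
        }
    }

    /** Loop version: accumulates the same sum without any method calls. */
    static int sumOfRangeLoop(int u, int v) {
        int result = 0;
        for (int i = u; i <= v; i++) {
            result = result + i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sumOfRangeRecursive(3, 6));   // prints 18
        System.out.println(sumOfRangeLoop(3, 6));        // prints 18
    }
}
```

The recursive version keeps one pending method call for every integer in the range, so for a very long range it can exhaust the call stack (in Java, a StackOverflowError), whereas the loop version uses a fixed amount of memory. That difference is one concrete reason loops are usually the more efficient choice.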