SE 504
Developing an Iterative Program to Compute a Tail Recursive Function

A tail recursive function is one for which there exists a definition of the following form:

H.x = { f.x      if b.x
      { H.(g.x)  otherwise (i.e., ¬b.x)

where b, f, and g are functions that can be defined without reference to H. For some types S and T, the signatures of these functions are

H : S ⟶ T, b : S ⟶ bool, f : S ⟶ T, and g : S ⟶ S

Note that either or both of the types S and T may themselves be Cartesian products of types (e.g., S = S1 × S2), so the definition of a tail recursive function should be understood to apply not only to functions of one argument but also to multi-argument functions. If, for example, we had S = S1 × S2 and we preferred to view H (as well as b, f, and g) as a two-argument function, we could write the definition of H as follows:

H.x1.x2 = { f.x1.x2                  if b.x1.x2
          { H.(g1.x1.x2).(g2.x1.x2)  otherwise (i.e., ¬b.x1.x2)

(Here we have that g1 and g2 are such that g.x1.x2 = (g1.x1.x2, g2.x1.x2).)

The generalization of this to S = S1 × S2 × ... × Sk (k > 2) should be obvious.

What makes these definitions "tail" recursive is that, in the recursive case, the result is simply that of an application of the function being defined (with no operators needing to be applied to that result). Thus, when evaluating the function at a given value, the recursive "call" is the very last thing you do (i.e., the tail). Compare this to, say, the standard (non-tail) recursive definition of the factorial function:
Fact.n = { 1               if n=0
         { n × Fact.(n-1)  otherwise (i.e., n>0)

In evaluating Fact.k (for any k>0), we would (recursively) evaluate Fact.(k-1) and then multiply the result by k.

Exploring the function H described above, we find that

H.x  =  H.(g.x)            (assuming ¬b.x)
     =  H.(g.(g.x))        (assuming ¬b.(g.x))
     =  H.(g.(g.(g.x)))    (assuming ¬b.(g.(g.x)))
     =  ...
     =  H.(g[K].x)         (assuming ¬b.(g[i].x) for all i satisfying 0≤i<K)
     =  f.(g[K].x)         (assuming b.(g[K].x))

where by g[j].x we mean g.(g.(...(g.x)...)), where g occurs j times. (Formally, we define g[0].x = x and, for j≥0, g[j+1].x = g.(g[j].x).) In other words, assuming that there exists some i≥0 for which b.(g[i].x) holds, we find that

H.x = f.(g[K].x)

where K is the minimum such i. That is, K = (MIN i | 0≤i ∧ b.(g[i].x) : i).

This suggests the following iterative program for establishing y = H.X.

|[con X : S;  // input 
  con K : int;  { K = (MIN i | 0≤i ∧ b.(g[i].X) : i) }
  var x : S;
  var j : int;
  var y : T;    // output variable
  x,j := X,0;
  { loop invariant I : x = g[j].X  ∧  0≤j≤K }
  { bound function t : K - j }
  do j ≠ K ⟶  x,j := g.x, j+1
  { assertion: x = g[j].X  ∧  j=K; hence, H.X = f.x }
  y := f.x;
  { y = H.X }

The code above suffers from the fact that it relies upon "knowing" (magically, apparently) the value of K. From the loop invariant (and our definition of K) it follows that the loop guard j ≠ K is equivalent to ¬b.x: when j < K, the minimality of K gives ¬b.(g[j].X), which is ¬b.x; when j = K, the definition of K gives b.(g[K].X), which is b.x. By using ¬b.x as our guard, we remove the program's dependence upon K.

Further observation of the code reveals that, except insofar as the loop invariant refers to it, the variable j is useless. It is worth investigating, then, whether we can restate the invariant so as not to mention j. It turns out that, indeed, we can, by observing that, as x assumes the values

X, g.X, g[2].X, g[3].X, ...

on successive iterations of the loop, the property H.x = H.X is preserved. Hence, we can state the loop invariant as I : H.x = H.X and we can omit the variable j from the program altogether. Let us prove that this I is, indeed, a loop invariant. To do so, it suffices to prove that the truth of I is established by the initialization code and that its truth is preserved by an arbitrary iteration of the loop. (These correspond to proof obligations (i) and (ii) in the loop checklist.) That I is established by the initialization is obvious (a proof of {true} x := X {H.x = H.X} is left to the reader); that I is preserved by each loop iteration is proved by showing the Hoare triple

{I ∧ ¬b.x} x := g.x {I}

which is equivalent to

[I ∧ ¬b.x ==> wp.(x := g.x).I]

Here is the proof:
Assume I (i.e., H.x = H.X) and ¬b.x
    wp.(x := g.x).I

 =    < wp assignment law >

    I(x := g.x)

 =    < defn of I >

    (H.x = H.X)(x := g.x)
 =    < textual sub >

    H.(g.x) = H.X

 =    < assumption ¬b.x; by defn. of H, ¬b.x implies H.x = H.(g.x) >

    H.x = H.X

 =    < assumption I >

    true

Unfortunately, the bound function still refers not only to j but also to K. Although it is not as elegant as the original, the bound function can be described as

t : (MIN i | 0≤i ∧ b.(g[i].x) : i)

In other words, t is the number of (additional) times that g must be applied to x in order for the result to satisfy b. Of course, whether or not such a number exists depends upon g, b, and x. In the typical case, x will be an integer, b will be true when x is sufficiently close to zero, and g.x will be a number closer to zero than is x.
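To make the bound function concrete, here is a minimal Python sketch that computes t by brute force and records its value at the start of each loop iteration. The helper name `bound`, the search `limit`, and the sample instance (b.n holds when n ≤ 1, and g halves its argument) are assumptions chosen for illustration; they do not come from the development above.

```python
def bound(b, g, x, limit=10_000):
    """Return t = (MIN i | 0 <= i and b.(g[i].x) : i).

    Searches i = 0, 1, ..., limit-1 and raises if no such i is found,
    which is how this sketch reports that the bound may not exist.
    """
    for i in range(limit):
        if b(x):
            return i
        x = g(x)
    raise ValueError("no i below limit satisfies b.(g[i].x)")


# Hypothetical instance in the spirit of the "typical case":
# x is a natural number, b holds near zero, g moves x toward zero.
def near_zero(n):
    return n <= 1


def halve(n):
    return n // 2


# Record the value of t at the start of every iteration of the loop
# "do not b.x -> x := g.x", starting from x = 100.
x = 100
trace = []
while not near_zero(x):
    trace.append(bound(near_zero, halve, x))
    x = halve(x)
```

In this instance each iteration decreases t by exactly one, which is what a bound function must guarantee for the loop to terminate.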

The final version of the program is as follows:

|[con X : S;
  var x : S;
  var y : T;
  x := X;
  { loop invariant I : H.x = H.X }
  { bound function t : (MIN i | 0≤i ∧ b.(g[i].x) : i) }
  do ¬b.x ⟶  x := g.x
  { assertion: H.x = H.X  ∧  b.x ; hence, H.X = f.x }  
  y := f.x;
  { y = H.X }
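
The final program translates almost line for line into an executable language. The following Python sketch is one such translation; the function name `compute_tail_recursive` and the sample instance (the "odd part" of a positive integer) are assumptions introduced for illustration only.

```python
def compute_tail_recursive(b, f, g, X):
    """Compute H.X, where H.x = f.x if b.x and H.x = H.(g.x) otherwise.

    Loop invariant: H.x = H.X.
    The loop terminates only if some g[i].X satisfies b.
    """
    x = X                # x := X
    while not b(x):      # guard: not b.x
        x = g(x)         # x := g.x
    return f(x)          # y := f.x, so that y = H.X


# Hypothetical instance: H.n = n if n is odd, H.(n div 2) otherwise.
# For n > 0 this yields the largest odd divisor of n.
# (For n = 0 no g[i].n is odd, so the loop would not terminate.)
def is_odd(n):
    return n % 2 == 1


def identity(n):
    return n


def halve(n):
    return n // 2


odd_part_of_96 = compute_tail_recursive(is_odd, identity, halve, 96)
```

Passing b, f, and g as parameters keeps the sketch as general as the abstract program; a hand-written loop for a specific H would simply inline them.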

As a concrete example of a tail recursive function definition, we offer this:
H.n = { n        if 0≤n≤1
      { H.(n-2)  otherwise (i.e., n>1)

It should not take you long to recognize that this function, when applied to a natural number n, yields zero if n is even and one if n is odd.
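
As a quick check of that claim, the sketch below (in Python, with the hypothetical names `parity_recursive` and `parity_iterative`) transcribes the definition directly and also applies the loop derived earlier. It assumes the argument is a natural number.

```python
def parity_recursive(n):
    """H.n = n if 0 <= n <= 1, and H.(n-2) otherwise (n a natural number)."""
    return n if 0 <= n <= 1 else parity_recursive(n - 2)


def parity_iterative(N):
    """Compute H.N with the loop obtained from the tail recursive definition."""
    n = N
    # Invariant: H.n = H.N.  The guard negates the base-case condition.
    while not (0 <= n <= 1):
        n = n - 2
    return n
```

Here b.n is 0≤n≤1, f is the identity function, and g.n is n-2.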

In general, however, "natural" examples of tail recursive function definitions are not often encountered.

Pseudo-tail Recursive Functions

Few "natural" recursive function definitions (i.e., ones that someone is likely to devise via intuition in formally defining a function) are tail recursive. However, some non-tail recursive function definitions can be transformed to obtain a tail recursive definition of a function having one extra argument and in terms of which the original function can be defined directly. In particular, such a transformation can be applied to any definition having the following "pseudo tail recursive" form:
G.x = { f.x            if b.x
      { h.x ⊕ G.(g.x)  otherwise (i.e., ¬b.x)

where ⊕ : T × T → T is an associative operator having an identity element e, where h : S → T is a function that can be defined without reference to G, and where b, f, and g are functions as described before.

Note: The right-hand side of the recursive case in the definition of G could have been G.(g.x) ⊕ h.x (even if ⊕ were not commutative/symmetric). This would simply mean that, in what follows, the two operands in every subexpression of the form a ⊕ b should be swapped. End of note.

Examples of pseudo-tail recursive function definitions:

Example 1: the classic factorial function

  Fact.n = { 1               if n=0
           { n × Fact.(n-1)  otherwise (i.e., n>0)

Example 2: a function that calculates the sum of the digits in the decimal (base ten) numeral describing a number. (In 462, for instance, the sum of the digits is 4+6+2 = 12.)

  digit_sum.n = { 0                              if n=0
                { (n mod 10) + digit_sum.(n/10)  otherwise

Example 3: a function that reports whether or not a specified value (x) occurs among the values in the prefix of a specified length (n) of a specified array (b). That is, it answers the question, Does x occur in b[0..n)?

  occurs_in.x.b.n = { false                                if n=0
                    { (b[n-1] = x) ∨ occurs_in.x.b.(n-1)  otherwise
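
The three definitions can be transcribed directly into Python, as in the sketch below. The comments identify, for each one, the components b, f, g, h, ⊕, and e of the template. The Python names and the use of `//` for integer division are assumptions of this sketch; arguments are assumed to be natural numbers, with n no larger than the array length in the third function.

```python
def fact(n):
    # b: n = 0,  f: 1,  g: n-1,  h: n,  operator: multiplication,  e: 1
    return 1 if n == 0 else n * fact(n - 1)


def digit_sum(n):
    # b: n = 0,  f: 0,  g: n div 10,  h: n mod 10,  operator: +,  e: 0
    return 0 if n == 0 else (n % 10) + digit_sum(n // 10)


def occurs_in(x, b, n):
    # Here S is a triple (x, b, n); only n changes in the recursive case.
    # b: n = 0,  f: false,  g: (x, b, n-1),  h: b[n-1] = x,
    # operator: logical or,  e: false
    return False if n == 0 else (b[n - 1] == x) or occurs_in(x, b, n - 1)
```

In each function the recursive call is not the last thing done: its result is still combined with h.x by the operator, which is exactly what separates pseudo-tail recursion from tail recursion.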

We now show that, for any function that can be defined via a pseudo-tail recursive definition, there exists a "more general" function that can be defined via tail recursion.

Let G be the function defined via the pseudo-tail recursive definition template above. Define H as follows:

H.x.y = y ⊕ G.x

Notice that H has one "extra" argument, y. One might call this the "accumulating argument" in that (something close to) the result of the function application "accumulates" in it as we go deeper and deeper into the recursive applications of the function. This will become evident when we do a concrete example.

Consider the two cases b.x and ¬b.x:

Case b.x:

    H.x.y

 =    < defn of H >

    y ⊕ G.x

 =    < defn of G; assumption b.x > 

    y ⊕ f.x

Case ¬b.x:

    H.x.y

 =    < defn of H >

    y ⊕ G.x

 =    < defn of G; assumption ¬b.x > 

    y ⊕ (h.x ⊕ G.(g.x))

 =    < associativity of ⊕ >

    (y ⊕ h.x) ⊕ G.(g.x)

 =    < defn of H, with x,y := g.x, y ⊕ h.x >

    H.(g.x).(y ⊕ h.x)

What this establishes is that we may characterize H as follows:

H.x.y = { y ⊕ f.x            if b.x
        { H.(g.x).(y ⊕ h.x)  otherwise (i.e., ¬b.x)

But this has the format of a (two-argument) tail recursive function definition. Hence, H is tail recursive!

Taken together with the fact that G.x = e ⊕ G.x = H.x.e (recall that e denotes the identity element of ⊕), we have that G is directly definable in terms of a tail recursive function. Summarizing, the figure below shows the pseudo-tail recursive function definition template of G.x, the corresponding (fully-)tail recursive function definition template of H.x.y (where H.x.y = y ⊕ G.x), and the abstract program that computes G.X (equivalently, H.X.e, where e is the identity element of ⊕).

G.x = { f.x            if b.x
      { h.x ⊕ G.(g.x)  otherwise

H.x.y = { y ⊕ f.x            if b.x
        { H.(g.x).(y ⊕ h.x)  otherwise

|[con X : S;
  var x : S;
  var y,z : T;
  x,y := X,e;
  { loop invariant I : H.x.y = H.X.e 
        (equivalently, y ⊕ G.x = G.X) }
  do ¬b.x ⟶  x,y := g.x, y ⊕ h.x;
  { assertion: (H.x.y = H.X.e) ∧ b.x; 
               hence H.X.e = y ⊕ f.x }  
  { equivalently: (y ⊕ G.x = G.X) ∧ b.x; 
               hence y ⊕ f.x = G.X }
  z := y ⊕ f.x;
  { z = H.X.e; equivalently, z = G.X }
  {Q: z = G.X}
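
The abstract program can be rendered in Python as a function parameterized by b, f, g, h, ⊕, and e. This is a sketch under assumptions: the names `compute_pseudo_tail_recursive`, `factorial_iter`, and `occurs_in_iter` are hypothetical, and the two instances encode Examples 1 and 3 by supplying the template components explicitly.

```python
import operator


def compute_pseudo_tail_recursive(b, f, g, h, op, e, X):
    """Compute G.X, where G.x = f.x if b.x and h.x op G.(g.x) otherwise.

    op must be associative with identity element e.
    Loop invariant: y op G.x = G.X.
    """
    x, y = X, e
    while not b(x):
        # Simultaneous assignment: h is applied to the old value of x.
        x, y = g(x), op(y, h(x))
    return op(y, f(x))


def factorial_iter(N):
    return compute_pseudo_tail_recursive(
        b=lambda n: n == 0, f=lambda n: 1, g=lambda n: n - 1,
        h=lambda n: n, op=operator.mul, e=1, X=N)


def occurs_in_iter(x, arr, n):
    # The state is the triple (x, arr, n); only n changes.
    return compute_pseudo_tail_recursive(
        b=lambda s: s[2] == 0,
        f=lambda s: False,
        g=lambda s: (s[0], s[1], s[2] - 1),
        h=lambda s: s[1][s[2] - 1] == s[0],
        op=operator.or_, e=False, X=(x, arr, n))
```

Python evaluates the whole right-hand side of a tuple assignment before binding, so `x, y = g(x), op(y, h(x))` matches the simultaneous assignment x,y := g.x, y ⊕ h.x of the abstract program.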

A Concrete Example

Let us carry out this transformation on a concrete example. Take the function digit_sum defined above:

  digit_sum.n = { 0                              if n=0
                { (n mod 10) + digit_sum.(n/10)  otherwise

In accord with the procedure suggested above, we define

digit_sum'.n.m = m + digit_sum.n

from which we derive (in a manner analogous to our analysis of H above, which is left to the reader) that

  digit_sum'.n.m = { m + 0                               if n=0
                   { digit_sum'.(n/10).(m + (n mod 10))  otherwise
(Of course, we can omit the "+ 0" in the base case.)

Using either the pseudo-tail recursive definition of digit_sum() or the (fully-)tail recursive definition of digit_sum'(), we derive this program:

|[con N : nat;
  var n : nat;
  var m : nat;
  var z : nat;
  n,m := N,0;
  { loop invariant I : digit_sum'.n.m = digit_sum'.N.0 }
  { equivalently: m + digit_sum.n = digit_sum.N }
  do n ≠ 0 ⟶  n,m := n/10, m + (n mod 10);
  { assertion: digit_sum'.n.m = digit_sum'.N.0  ∧  n=0;
    hence, digit_sum'.N.0 = m + 0 (= m) }  
  { equivalently: m + digit_sum.n = digit_sum.N ∧ n=0; 
                  hence, m+0 = digit_sum.N }  
  z := m + 0;  // obviously, the "+ 0" can be omitted
  { z = digit_sum'.N.0 }; 
  { equivalently: z = digit_sum.N }
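
For readers who want to run the derived program, here is a Python sketch of it. The name `digit_sum_iterative` is hypothetical, `//` stands for the integer division written n/10 above, and the `assert` inside the loop checks the invariant m + digit_sum.n = digit_sum.N against the recursive definition on every iteration.

```python
def digit_sum(n):
    """Pseudo-tail recursive definition, transcribed directly."""
    return 0 if n == 0 else (n % 10) + digit_sum(n // 10)


def digit_sum_iterative(N):
    """Iterative program derived from the tail recursive digit_sum'."""
    n, m = N, 0
    while n != 0:
        # Loop invariant: m + digit_sum.n = digit_sum.N
        assert m + digit_sum(n) == digit_sum(N)
        n, m = n // 10, m + (n % 10)
    # On exit n = 0, so the invariant gives m + 0 = digit_sum.N.
    return m
```

The invariant check calls the recursive function and is there only for illustration; removing it leaves a loop that uses no recursion at all.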

To better appreciate that the "extra" argument that was introduced in transforming pseudo-tail recursive digit_sum into (fully-)tail recursive digit_sum' serves to accumulate the final result, consider this application of digit_sum:

    digit_sum.462

=    < digit_sum.n = digit_sum'.n.0 for all n >

    digit_sum'.462.0

=    < defn of digit_sum', recursive case >

    digit_sum'.46.(0+2)

=    < defn of digit_sum', recursive case >

    digit_sum'.4.(0+2+6)

=    < defn of digit_sum', recursive case >

    digit_sum'.0.(0+2+6+4)

=    < defn of digit_sum', base case >

    0+2+6+4

=    < arithmetic >

    12

End of Concrete Example

Tail-recursive function definitions with more than two cases

Suppose that we have the following function definition:
H.x = { f0.x      if b0.x
      { f1.x      if b1.x
      { H.(g0.x)  if c0.x
      { H.(g1.x)  if c1.x
where [b0.x ∨ b1.x ∨ c0.x ∨ c1.x] (i.e., for every x, at least one of b0, b1, c0, or c1 holds).

This particular example has exactly two base cases and two recursive cases, but it can be easily generalized to any number of each. Strictly speaking, this does not qualify as a tail recursive definition. However, it can be shown that such a definition can be transformed into an equivalent one that is tail recursive. (Exactly how this is accomplished is beyond the scope of this document.) What is important to us is how to transform a function definition such as this into a program that computes the defined function. Well, here it is:

|[con X : S;
  var x : S;
  var y : T;
  x := X;
  { loop invariant I : H.x = H.X }
  do c0.x  ⟶  x := g0.x
  [] c1.x  ⟶  x := g1.x
  { assertion: H.x = H.X  ∧  (b0.x ∨ b1.x) }
  if b0.x  ⟶  {H.x = H.X  ∧  b0.x}  y := f0.x  {y = H.X}
  [] b1.x  ⟶  {H.x = H.X  ∧  b1.x}  y := f1.x  {y = H.X}
  { y = H.X }
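
A Python sketch of this multi-guard program follows. The function `compute_multi_case` and the sample instance (greatest common divisor by repeated subtraction on pairs of natural numbers) are assumptions introduced for illustration. Where the guarded-command loop may choose any enabled guard, this sketch always takes the first one that holds.

```python
def compute_multi_case(bases, steps, X):
    """Compute H.X for a definition with several base and recursive cases.

    bases: list of (b_i, f_i) pairs;  steps: list of (c_i, g_i) pairs.
    Assumes at least one of the b_i, c_i holds for every x.
    Loop invariant: H.x = H.X.
    """
    x = X
    while True:
        for c, g in steps:
            if c(x):
                x = g(x)      # take the first enabled recursive case
                break
        else:
            break             # no c_i holds: the do-loop terminates
    for b, f in bases:
        if b(x):
            return f(x)       # y := f_i.x under guard b_i.x
    raise ValueError("no base-case guard holds")


# Hypothetical instance on pairs (x, y) of natural numbers:
#   H.(x,y) = x          if y = 0
#           = y          if x = 0
#           = H.(x-y,y)  if x >= y and y > 0
#           = H.(x,y-x)  if y > x and x > 0
def gcd_by_subtraction(x, y):
    bases = [(lambda p: p[1] == 0, lambda p: p[0]),
             (lambda p: p[0] == 0, lambda p: p[1])]
    steps = [(lambda p: p[0] >= p[1] > 0, lambda p: (p[0] - p[1], p[1])),
             (lambda p: p[1] > p[0] > 0, lambda p: (p[0], p[1] - p[0]))]
    return compute_multi_case(bases, steps, (x, y))
```

For example, starting from (12, 18) the state passes through (12, 6), (6, 6), and (0, 6), at which point neither recursive guard holds and the second base case yields 6.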

Copyright Robert McCloskey 2004-2023