Note: This is an adaptation of Section 6.4 from Programming: The Derivation of Algorithms, by Anne Kaldewaij (Prentice Hall Europe, 1990).
|[ con W : set of T; var x : T;
   { P: (∃z | z∈W : S.z) }
   x := ?
   { Q: x∈W ∧ S.x }
]|

|[ con W : set of T; var x : T;
   { P: S'∩W ≠ ∅ }
   x := ?
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|
Clearly, we need a loop. To derive a candidate loop invariant, we first rewrite the postcondition (in the second specification) by replacing occurrences of {x} by fresh variable V (of type set of T) and adding the conjunct V = {x}. That gives us this postcondition:

Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x}
Following the standard heuristic, we take the first two conjuncts of Q' as a candidate loop invariant and the negation of the third conjunct as the loop guard, giving us

I: V⊆W ∧ S'∩V ≠ ∅     with guard V ≠ {x}
At this point, we recognize that the assignment V := W will establish the proposed loop invariant. It is not at all clear how to initialize x. The whole point of the program is to "find" a correct value for x, after all. What we need to recognize is that the purpose of V is to act as the set of (remaining) candidates, the last of which will end up being assigned to x. During each loop iteration, one or more of those candidates should be removed (i.e., eliminated) —while preserving the truth of the loop invariant— until such time as V is left with only one member. As that member is guaranteed to satisfy S, it will suffice to make x equal to it.
With this insight, the program skeleton becomes
|[ con W : set of T; var x : T;
   var V : set of T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 --->
      V := V - ?;
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|
Search by Elimination is based on the observation that, for any two elements in V, one of them can be removed from V without falsifying the loop invariant. On any given iteration, we simply need to be sure not to remove from V its lone remaining member that satisfies S, if indeed it has only one such member. This leads us to this refinement of the program:
|[ con W : set of T; var x : T;
   var V : set of T;
   var a,b : T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 --->
      a, b := distinct elements of V;
      if B0 ---> V := V - {a};
      [] B1 ---> V := V - {b};
      fi
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|
What are appropriate guards for the selection command? For the first guarded command, we need to choose B0 that truthifies the Hoare Triple

{ I ∧ |V|≠1 ∧ B0 }  V := V−{a}  { I }

By the Law relating Hoare Triples and wp, this is equivalent to

I ∧ |V|≠1 ∧ B0 ⟹ wp.(V := V−{a}).I

which by (3.65) (Shunting) is equivalent to

I ∧ |V|≠1 ⟹ (B0 ⟹ wp.(V := V−{a}).I)
What this says is that, under the assumption that V contains at least two elements, at least one of which satisfies S, B0 must guarantee that a is not the only member of V that satisfies S. A good choice for B0 is thus ¬S.a ∨ S.b (or, equivalently, S.a ⇒ S.b). (By similar reasoning, a good choice for B1 simply reverses the roles of a and b.) That is, we can be sure that a is not the only member of V that satisfies S if either a does not satisfy S or b does!
This section is optional reading.
Let's calculate B0, based upon the formula derived above,
Assume I ∧ |V|≠1.

    B0 ⟹ wp.(V := V−{a}).I
  =   < wp Assignment Law >
    B0 ⟹ I(V := V−{a})
  =   < defn of I and textual substitution >
    B0 ⟹ V−{a}⊆W ∧ S'∩(V−{a}) ≠ ∅
  =   < Assumption V⊆W guarantees V−{a} ⊆ W, (3.39) >
    B0 ⟹ S'∩(V−{a}) ≠ ∅
At this point, we wonder how the assumption can be used to rewrite the consequent. Well, the second conjunct of assumption I can be manipulated to expose the consequent of the expression above:
    S'∩V ≠ ∅
  =   < a ∈ V, so V = {a} ∪ (V−{a}) >
    S' ∩ ({a} ∪ (V−{a})) ≠ ∅
  =   < ∩ distributes over ∪ >
    (S'∩{a}) ∪ (S'∩(V−{a})) ≠ ∅
  =   < A ∪ B ≠ ∅ ≡ (A ≠ ∅ ∨ B ≠ ∅) >
    S'∩{a} ≠ ∅ ∨ S'∩(V−{a}) ≠ ∅
  =   < (A ∩ {x} ≠ ∅) ≡ x ∈ A >
    a∈S' ∨ S'∩(V−{a}) ≠ ∅
  =   < (3.59) >
    a∉S' ⟹ S'∩(V−{a}) ≠ ∅
  =   < (3.57) p⇒q ≡ p∨q ≡ q >
    (a∉S' ∨ S'∩(V−{a}) ≠ ∅) ≡ (S'∩(V−{a}) ≠ ∅)
What this shows is that the second conjunct of assumption I is equivalent to a∉S' ∨ B ≡ B, where B is the consequent of the last expression in our derivation above. Thus, we are justified in replacing B by a∉S' ∨ B there. Continuing from where we had left off, and using the insight that the set V−{a} (which includes b, as b∈V and a≠b) has nonempty intersection with S' as long as b∈S':
    B0 ⟹ S'∩(V−{a}) ≠ ∅
  =   < Second conjunct of assumption I, as explained above >
    B0 ⟹ a∉S' ∨ S'∩(V−{a}) ≠ ∅
 <==  < strengthening the consequent strengthens an implication; b ∈ V−{a} >
    B0 ⟹ a∉S' ∨ b∈S'
  =   < x∈S' ≡ S.x >
    B0 ⟹ ¬S.a ∨ S.b
We conclude that choosing B0 to be ¬S.a ∨ S.b works. By symmetry, a good choice for B1 (which reverses the roles of a and b) is ¬S.b ∨ S.a.
End of formal calculation of B0 and B1.
Having arrived at our choices for B0 and B1, we get the (still abstract) program
|[ con W : set of T; var x : T;
   var V : set of T;
   var a,b : T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 --->
      a, b := distinct elements of V;
      if ¬S.a ∨ S.b ---> V := V - {a};
      [] ¬S.b ∨ S.a ---> V := V - {b};
      fi
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|
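To make the abstract scheme concrete, here is a small Python sketch of the elimination loop (the function names here are mine, not from the text). The caller supplies implies_S(a, b), which must return True only when S.a ⇒ S.b holds (guard B0); we assume that whenever it returns False, S.b ⇒ S.a holds (guard B1), as is the case in the maximum-finding instantiation shown below.

```python
def search_by_elimination(W, implies_S):
    """Sketch of the abstract program: repeatedly pick two distinct
    candidates and eliminate one without removing every element that
    satisfies S. W is assumed to contain at least one such element."""
    V = set(W)
    while len(V) != 1:
        it = iter(V)
        a, b = next(it), next(it)   # two distinct elements of V
        if implies_S(a, b):         # B0: ¬S.a ∨ S.b, so a is expendable
            V.discard(a)
        else:                       # B1 assumed: ¬S.b ∨ S.a, so b is expendable
            V.discard(b)
    return V.pop()                  # the lone element in V

# Instantiation: S.k ≡ "k is the maximum"; a ≤ b guarantees S.a ⇒ S.b.
print(search_by_elimination({3, 9, 4, 7}, lambda a, b: a <= b))  # prints 9
```

Note that the loop body never evaluates S itself; it only needs the implication test, which is the whole point of the strategy.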
The abstract program we derived seems strange and over-complicated. Why don't we just iterate through the elements of W until such time as we find one that satisfies predicate S? Even if we keep faithful to the idea of "search by elimination" (rather than "by iteration"), the program could look like this:
a := some element of V;
do ¬S.a --->
   V := V - {a};
   a := some element of V;
od
Ah, but the point is that in some circumstances it is much easier to determine the truth or falsity of ¬S.a ∨ S.b (or its equivalent S.a ⇒ S.b) than of ¬S.a.
A good example of that is in performing a search for the location of a maximum element in an array. Let S.k be the proposition that location k of array A[] contains the maximum value in A[]. By simply observing the value A.k, there is no way to tell whether or not S.k holds. However, if i and j are distinct locations in A[], the truth of A.i ≤ A.j guarantees the truth of S.i ⇒ S.j (i.e., if A.i is the maximum value in A, so is A.j). Thus, location i can be eliminated from the search.
In problems amenable to the Search by Elimination strategy, often the set W can be expressed as a range of integers [bottom..top]. In that case, it can be very convenient to make it so that V is maintained to be a range [low..high], with bottom≤low≤high≤top. On each iteration, the program chooses a to be low and b to be high, so that on every iteration either the lowest or highest value in V is eliminated from the search, preserving that V be in the form of a contiguous range of integers.
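As a concrete illustration of the range heuristic, here is a Python sketch (the function name is mine) of the maximum-location search described above, with V maintained as the range [low..high]:

```python
def max_location(A):
    """Search by elimination for an index of a maximum of non-empty
    list A. Invariant: some index of a maximum lies in [low..high]."""
    low, high = 0, len(A) - 1
    while low != high:
        if A[low] <= A[high]:   # S.low ⇒ S.high: low can be eliminated
            low += 1
        else:                   # S.high ⇒ S.low: high can be eliminated
            high -= 1
    return low

print(max_location([5, 1, 8, 8, 3]))  # prints 3; A[3] = 8 is a maximum
```

With ties, the A[low] ≤ A[high] guard eliminates the lower index, so the last index holding a maximum is the one returned.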
Given is a binary relation knows on people (who are identified by the integers in range [0..N)), where knows.i.j ≡ person i "knows" person j. A celebrity is defined to be a person who is known by everyone but who knows no one other than himself. (We assume that knows is reflexive, although it could be argued that celebrities, in particular, tend to be lacking in self-awareness!)
The problem is to identify the celebrity, if one exists, among the persons. To make things somewhat easier, for the moment let us assume that a celebrity exists among the persons. It is clear that there can be at most one, because if there were two of them, say persons a and b, we would need both knows.a.b (for b to be a celebrity) and knows.b.a (for a to be a celebrity) to be true. But the truth of knows.a.b contradicts a being a celebrity, and likewise knows.b.a contradicts b being one.
Imagining that the relation knows is provided in the form of a boolean matrix KNOWS, here is the specification:
|[ con KNOWS : array [0..N)×[0..N) of bool;
   var r : int;
   { P: (∃i | 0≤i<N : isCelebrity.i) }
   r := ?;
   { Q: isCelebrity.r }
]|
where isCelebrity.i ::= (∀j | 0≤j<N ∧ i≠j : KNOWS[j][i] ∧ ¬KNOWS[i][j])
An insight to solving this problem is to recognize that observing the value of KNOWS[i][j], for any distinct i and j, tells us that either i is not a celebrity (in the case KNOWS[i][j]) or that j is not a celebrity (in the case ¬KNOWS[i][j]). Hence, using the heuristic that the set V is a range of integers represented by two integer variables low and high, we get the following program
|[ con N : int; { N≥1 }
   con KNOWS : array [0..N)×[0..N) of bool;
   var r : int;
   var low, high : int;  { V = [low..high] }
   { P: (∃i | 0≤i<N : isCelebrity.i) }
   low, high := 0, N-1;
   { I: (∃i | low≤i≤high : isCelebrity.i) ∧ 0≤low≤high<N }
   { t: high - low }
   do low ≠ high --->
      if KNOWS[low][high] ---> low := low + 1;
      [] ¬KNOWS[low][high] ---> high := high - 1;
      fi
   od
   r := low;
   { Q: isCelebrity.r }
]|
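A direct Python transcription of this program (the function name is mine; the precondition that a celebrity exists is still assumed):

```python
def find_celebrity(knows, N):
    """Search by elimination over the range [low..high], assuming a
    celebrity exists among persons 0..N-1. knows(i, j) plays the role
    of KNOWS[i][j]."""
    low, high = 0, N - 1
    while low != high:
        if knows(low, high):   # low knows someone else: not a celebrity
            low += 1
        else:                  # high is not known by low: not a celebrity
            high -= 1
    return low

# Example relation in which person 2 is the celebrity:
KNOWS = [
    [True,  False, True,  False],
    [True,  True,  True,  False],
    [False, False, True,  False],
    [True,  True,  True,  True ],
]
print(find_celebrity(lambda i, j: KNOWS[i][j], 4))  # prints 2
```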
For convenience, we assumed as a precondition that a celebrity existed. Without making that assumption, the appropriate postcondition would be

Q: (∀i | 0≤i<N ∧ i≠r : ¬isCelebrity.i)
which says that, if a celebrity exists, it must be r. The correct loop invariant would be

I: (∀i | 0≤i<N ∧ ¬(low≤i≤high) : ¬isCelebrity.i) ∧ 0≤low≤high<N
which says that no one outside the range [low..high] is a celebrity. Thus, upon termination of the loop (at which time low = high), no person other than low could be a celebrity, so doing the assignment r := low ensures the postcondition.
To determine whether r really is a celebrity, it would be necessary to check KNOWS[r][0..N) (i.e., row r) to make sure that all its elements (except for KNOWS[r][r]) are false and KNOWS[0..N)[r] (column r) to make sure that all its elements are true.
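Dropping the existence assumption, here is a Python sketch (names are mine) of the elimination pass followed by the O(N) verification just described:

```python
def celebrity_or_none(KNOWS):
    """Eliminate down to the one possible candidate, then verify it.
    KNOWS is an N×N boolean matrix, assumed reflexive. Returns the
    celebrity's index, or None if no celebrity exists."""
    N = len(KNOWS)
    low, high = 0, N - 1
    while low != high:          # afterwards, only low can be a celebrity
        if KNOWS[low][high]:
            low += 1
        else:
            high -= 1
    r = low
    # Verify: column r all true, row r all false off the diagonal.
    for j in range(N):
        if j != r and (KNOWS[r][j] or not KNOWS[j][r]):
            return None
    return r

# Neither person knows the other, so there is no celebrity:
KNOWS = [
    [True,  False],
    [False, True ],
]
print(celebrity_or_none(KNOWS))  # prints None
```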
The running time is O(N), as the program's loop iterates exactly N−1 times, with each iteration taking constant time. Checking to see if r is really a celebrity takes O(N) more time, so the total is O(N). Given that the input size is n = N², an appropriate characterization of the program's running time is O(√n). A naive algorithm, which, say, examines each column in search of one that contains only occurrences of true and, upon finding it, checks the corresponding row to determine if it contains nothing but occurrences of false (except on the main diagonal), would take, in the worst case, time proportional to n = N² (and thus would be linear time).
       0  1  2  3  4  5  6  7  8  9
    +------------------------------+
  0 |  4  6  9 10 12 15 19 22 24 25|
  1 |  5 10 11 12 13 18 21 24 26 27|
  2 |  6 14 18 21 23 24 26 29 30 33|
  3 | 10 15 19 24 25 27 31 34 36 37|
  4 | 11 17 23 25 29 31 33 38 42 43|
  5 | 15 18 27 29 30 32 35 42 46 50|
  6 | 16 19 30 34 35 37 41 43 50 51|
  7 | 20 23 32 37 41 43 44 47 52 53|
  8 | 24 25 33 40 45 46 49 50 54 57|
  9 | 26 29 34 41 48 51 55 58 60 62|
 10 | 30 33 38 43 52 56 58 61 64 67|
    +------------------------------+
A characteristic of efficient search algorithms is that each probe results in a significant decrease in the size of the search space. As you will recall, in the binary search algorithm, each time an array element is probed (i.e., accessed and compared to the search key), the search space is cut in half. That algorithm exploits the fact that the array elements are in ascending order. (Indeed, without the array having that property, binary search would not work.)
Here we have an M×N array in which the elements in each row and each column are increasing. This suggests the possibility that the search space (which here is initially the set of ordered pairs { i,j | 0≤i<M ∧ 0≤j<N : (i,j) }) can be reduced in size quickly.
Where is the "correct" place to probe? If we look in the upper left corner (which necessarily contains the smallest value in the array) and find that the value there is less than the search key, we have reduced the search space by exactly one element! A similar fate awaits us if we probe the lower right corner, which contains the largest value in the array.
A more clever approach would be to probe either the upper right corner or the lower left. We arbitrarily choose the upper right. Referring to our example matrix, we would probe the element at location (0,9). Suppose that it is less than the search key. Since that element (25) is the largest in its row, we know that the search key cannot be in that row. Hence we eliminate that row from further consideration.
Had the value in location (0,9) been greater than the search key, we could eliminate column 9, because the remaining values in that column are even larger.
Taking this idea to its logical conclusion, we maintain variables m and n such that rows [0..m) and columns (n..N) have been eliminated from the search space. During each iteration, either m is increased by one (eliminating yet another row) or n is decreased by one (eliminating yet another column). The loop invariant, in diagram form, looks like this:
      +---------------------------------------------+
    0 |                                             |
    1 |                                             |
    . |      E L I M I N A T E D   F R O M          |
    . |                                             |
      +-----------------------------+ S E A R C H   |
    m |                             |               |
    . |   R E M A I N I N G         | S P A C E     |
    . |   S E A R C H               |               |
  M-1 |   S P A C E                 |               |
      +-----------------------------+---------------+
        0   1          ...          n               N
Here is the program:
|[ con M, N : int; { M>0 ∧ N>0 }
   con A : array [0..M)×[0..N) of int;
   var key : int;
   var m, n : int;  // V = {i,j | m≤i<M ∧ 0≤j≤n : (i,j)}
   m, n := 0, N-1;
   { I: 0 ≤ m ≤ M ∧ -1 ≤ n < N ∧
        (occursIn.key ⟹ (∃i,j | m≤i<M ∧ 0≤j≤n : A[i][j] = key)) }
   { t: M - m + n }
   do m ≠ M ∧ n ≠ -1 ∧ A[m][n] ≠ key --->
      if A[m][n] < key ---> m := m+1;
      [] A[m][n] > key ---> n := n-1;
      fi
   od
   { Q: occursIn.key ⟹ A[m][n] = key }
]|
where occursIn.key ::= (∃i,j | 0≤i<M ∧ 0≤j<N : A[i][j] = key)
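A Python sketch of this search (the function name is mine; key becomes a parameter, and returning None plays the role of the key-not-present outcome):

```python
def saddleback_search(A, key):
    """Probe the upper-right corner of the remaining search space,
    eliminating a row or a column each iteration. A is an M×N matrix
    whose rows and columns are in increasing order. Returns (m, n)
    with A[m][n] == key, or None if key does not occur."""
    M, N = len(A), len(A[0])
    m, n = 0, N - 1
    while m != M and n != -1 and A[m][n] != key:
        if A[m][n] < key:
            m += 1            # A[m][n] is the largest in row m: drop the row
        else:
            n -= 1            # A[m][n] is the smallest in column n: drop the column
    return (m, n) if m != M and n != -1 else None

A = [[ 4,  6,  9, 10],
     [ 5, 10, 11, 12],
     [ 6, 14, 18, 21]]
print(saddleback_search(A, 14))  # prints (2, 1)
print(saddleback_search(A, 7))   # prints None
```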
The program's asymptotic running time is clearly O(M+N). Assuming that M and N are "within a constant factor" of each other, this corresponds to O(√n), where n = M·N is the input size.