SE 504: Search by Elimination

Note: This is an adaptation of Section 6.4 from Programming: The Derivation of Algorithms, by Anne Kaldewaij (Prentice Hall Europe, 1990).

The Problem

We are given a finite set W (which is understood to be a subset of some universal set T) and a predicate S : W → bool such that S.z holds for some (i.e., at least one) z ∈ W. The task of the program is to find such a z. Below are two formal specifications; the second is expressed in terms of the sets S' = { z  |  z∈W ∧ S.z  :  z } and {x} rather than in terms of S and x. (Note that S' is that subset of W containing all, and only, those elements of W that satisfy predicate S.)

|[ con W : set of T;
   var x : T;
   { P: (∃z | z∈W : S.z) }
   x := ?
   { Q: x∈W ∧ S.x }
]|
|[ con W : set of T;
   var x : T;
   { P: S'∩W ≠ ∅ }
   x := ?
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|

Clearly, we need a loop. To derive a candidate loop invariant, we first rewrite the postcondition (in the second specification) by replacing occurrences of {x} by fresh variable V (of type set of T) and adding the conjunct V = {x}. That gives us this postcondition:

Q': V ⊆ W  ∧  S' ∩ V ≠ ∅  ∧  V = {x}

Following the standard heuristic, we take the first two conjuncts of Q' as a candidate loop invariant and the negation of the third conjunct as the loop guard, giving us

|[ con W : set of T;
   var x : T;
   var V : set of T;
   { P: S'∩W ≠ ∅ }
   V, x := ?,?;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   do V ≠ {x} --->  
      V,x := ?,?;
   od
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|

At this point, we recognize that the assignment V := W will establish the proposed loop invariant. It is not at all clear how to initialize x. The whole point of the program is to "find" a correct value for x, after all. What we need to recognize is that the purpose of V is to act as the set of (remaining) candidates, the last of which will end up being assigned to x. During each loop iteration, one or more of those candidates should be removed (i.e., eliminated) —while preserving the truth of the loop invariant— until such time as V is left with only one member. As that member is guaranteed to satisfy S, it will suffice to make x equal to it.

With this insight, the program skeleton becomes

|[ con W : set of T;
   var x : T;
   var V : set of T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 --->  
      V := V - ?;
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|

Search by Elimination is based on the observation that, for any two elements in V, one of them can be removed from V without falsifying the loop invariant. On any given iteration, we simply need to be sure not to remove from V its lone remaining member that satisfies S, if indeed it has only one such member. This leads us to this refinement of the program:

|[ con W : set of T;
   var x : T;
   var V : set of T;
   var a,b : T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 --->  
      a, b := distinct elements of V;
      if B0 --->  V := V - {a};
      [] B1 --->  V := V - {b};
      fi
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|

What are appropriate guards for the selection command? For the first guarded command, we need to choose B0 that truthifies the Hoare Triple

{I ∧ |V|≠1 ∧ B0} V := V−{a} { I }

By the Law relating Hoare Triples and wp, this is equivalent to

[I ∧ |V|≠1 ∧ B0  ⟹  wp.(V := V−{a}).I],

which by (3.65) (Shunting) is equivalent to

[(I ∧ |V|≠1) ⟹ (B0 ⟹ wp.(V := V−{a}).I)]

What this says is that, under the assumption that V contains at least two elements, at least one of which satisfies S, B0 must guarantee that a is not the only member of V that satisfies S. A good choice for B0 is thus ¬S.a ∨ S.b (or, equivalently, S.a ⇒ S.b). (By similar reasoning, a good choice for B1 simply reverses the roles of a and b.) That is, we can be sure that a is not the only member of V that satisfies S if either a does not satisfy S or b does!


Formal Calculation of B0 and B1

This section is optional reading.

Let's calculate B0, based upon the formula derived above,

[(I ∧ |V|≠1) ⟹ (B0 ⟹ wp.(V := V−{a}).I)]

Assume I ∧ |V|≠1 

   B0 ⟹ wp.(V := V−{a}).I

=    < wp Assignment Law >

   B0 ⟹ I(V := V−{a})

=    < defn of I and textual substitution >

   B0 ⟹ V−{a}⊆W ∧ S'∩(V−{a}) ≠ ∅

=    < Assumption V⊆W guarantees V-{a} ⊆ W, (3.39) >

   B0 ⟹ S'∩(V−{a}) ≠ ∅

At this point, we wonder how the assumption can be used to rewrite the consequent. Well, the second conjunct of assumption I can be manipulated to expose the consequent of the expression above:

   S'∩V ≠ ∅

=    < a ∈ V, so V = {a} ∪ V−{a} >

   S' ∩ ({a} ∪ V−{a}) ≠ ∅

=    < ∩ distributes over ∪ >

   (S'∩{a}) ∪ (S'∩(V−{a})) ≠ ∅
   
=    < A ∪ B ≠ ∅ ≡ (A ≠ ∅  ∨  B ≠ ∅) >

   S'∩{a} ≠ ∅ ∨ S'∩(V−{a}) ≠ ∅

=    < (A ∩ {x} ≠ ∅) ≡ x ∈ A >

   a∈S' ∨ S'∩(V−{a}) ≠ ∅

=    < (3.59) >

   a∉S' ⟹ S'∩(V−{a}) ≠ ∅

=    < (3.57) p⇒q ≡ p∨q ≡ q >

   (a∉S' ∨ S'∩(V−{a}) ≠ ∅) ≡ (S'∩(V−{a}) ≠ ∅)

What this shows is that the second conjunct of assumption I is equivalent to a∉S' ∨ B  ≡  B, where B is the consequent of the last expression in our derivation above. Thus, we are justified in replacing B by a∉S' ∨ B there. Continuing from where we had left off, and using the insight that the set V−{a} (which includes b, as b∈V and a≠b) has nonempty intersection with S' as long as b∈S':

   B0 ⟹ S'∩(V−{a}) ≠ ∅

=    < Second conjunct of assumption I, as explained above >

   B0 ⟹ a∉S' ∨ S'∩(V−{a}) ≠ ∅

<==  < strengthening the consequent strengthens an implication; b ∈ V−{a} >

   B0 ⟹ a∉S' ∨ b∈S'

=    < x∈S' ≡ S.x >

   B0 ⟹ ¬S.a ∨ S.b

We conclude that choosing B0 to be ¬S.a ∨ S.b works. By symmetry, a good choice for B1 (which reverses the roles of a and b) is ¬S.b ∨ S.a.
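
As a sanity check on the calculation (not a substitute for it), the claim can be tested exhaustively over a small universe. The sketch below is an illustration only: the function name and the choice of a four-element universe are assumptions, not part of the original derivation. It enumerates every predicate S, every V with at least two elements and at least one member satisfying S, and every ordered pair of distinct a, b in V. It then confirms that whenever ¬S.a ∨ S.b holds, removing a leaves a member of V that satisfies S.

```python
from itertools import combinations, permutations, product


def guard_preserves_invariant(universe_size: int = 4) -> bool:
    """Brute-force check that B0 = (not S.a) or S.b keeps S' ∩ V nonempty."""
    universe = range(universe_size)
    # Every predicate S on the universe, encoded as a tuple of truth values.
    for truth in product([False, True], repeat=universe_size):
        for k in range(2, universe_size + 1):
            for V in combinations(universe, k):
                if not any(truth[z] for z in V):
                    continue  # invariant does not hold for this V; skip it
                for a, b in permutations(V, 2):  # distinct a, b in V
                    if (not truth[a]) or truth[b]:  # guard B0
                        # After V := V - {a}, some member must still satisfy S.
                        if not any(truth[z] for z in V if z != a):
                            return False
    return True


print(guard_preserves_invariant())
```

By symmetry (swapping the roles of a and b in the loop over ordered pairs), the same check covers B1.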

End of formal calculation of B0 and B1.


Having arrived at our choices for B0 and B1, we get the (still abstract) program

|[ con W : set of T;
   var x : T;
   var V : set of T;
   var a,b : T;
   { P: S'∩W ≠ ∅ }
   V := W;
   { I: V⊆W ∧ S'∩V ≠ ∅ }
   { t: |V| }
   do |V| ≠ 1 ---> 
      a, b := distinct elements of V;
      if ¬S.a ∨ S.b --->  V := V - {a};
      [] ¬S.b ∨ S.a --->  V := V - {b};
      fi
   od
   x := lone element in V;
   { Q': V⊆W ∧ S'∩V ≠ ∅ ∧ V={x} }
   { Q: {x}⊆W ∧ S'∩{x} ≠ ∅ }
]|
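
For readers who prefer executable code, here is a minimal Python sketch of the abstract program above. It is only one possible rendering: the function name is hypothetical, the "distinct elements of V" are taken to be whichever two a set iterator yields first, and the nondeterministic selection is resolved with an if/else (when the first guard fails, S.a holds and S.b does not, so the second guard holds).

```python
from typing import Callable, Hashable, Iterable, TypeVar

T = TypeVar("T", bound=Hashable)


def search_by_elimination(W: Iterable[T], S: Callable[[T], bool]) -> T:
    """Return some x in W with S(x); assumes at least one such x exists."""
    V = set(W)                     # V := W establishes the invariant
    while len(V) != 1:             # loop guard |V| != 1
        it = iter(V)
        a, b = next(it), next(it)  # two distinct elements of V
        if (not S(a)) or S(b):     # B0: a is not the only witness in V
            V.remove(a)
        else:                      # S(a) and not S(b), so B1 holds
            V.remove(b)
    (x,) = V                       # x := lone element in V
    return x


print(search_by_elimination(range(10), lambda z: z == 7))
```

Note that this version evaluates S directly, which is exactly the situation in which the elimination scheme offers no advantage over a plain scan; its value is only in mirroring the structure of the derived program.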

The abstract program we derived seems strange and over-complicated. Why don't we just iterate through the elements of W until such time as we find one that satisfies predicate S? Even if we keep faithful to the idea of "search by elimination" (rather than "by iteration"), the program could look like this:

a := some element of V;
do ¬S.a ---> 
   V := V - {a};
   a := some element of V;
od

Ah, but the point is that in some circumstances it is much easier to determine the truth or falsity of ¬S.a ∨ S.b (or its equivalent S.a ⇒ S.b) than of ¬S.a.

A good example of that is in performing a search for the location of a maximum element in an array. Let S.k be the proposition that location k of array A[] contains the maximum value in A[]. By simply observing the value A.k, there is no way to tell whether or not S.k holds. However, if i and j are distinct locations in A[], the truth of A.i ≤ A.j guarantees the truth of S.i ⇒ S.j (i.e., if A.i is the maximum value in A, so is A.j). Thus, location i can be eliminated from the search.

In problems amenable to the Search by Elimination strategy, the set W can often be expressed as a range of integers [bottom..top]. In that case, it is convenient to maintain V as a range [low..high], with bottom≤low≤high≤top. On each iteration, the program chooses a to be low and b to be high, so that either the lowest or the highest value in V is eliminated and V remains a contiguous range of integers.
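
The maximum-location example combines naturally with the range representation. The following Python sketch (the function name is an assumption made for illustration) keeps V as [low..high] and uses the comparison A[low] ≤ A[high] to decide which end to eliminate, never evaluating S.k itself.

```python
from typing import Sequence


def location_of_max(A: Sequence[int]) -> int:
    """Return an index k such that A[k] is a maximum of the non-empty A."""
    low, high = 0, len(A) - 1  # V = [low..high]
    while low != high:
        if A[low] <= A[high]:  # guarantees S.low => S.high: eliminate low
            low += 1
        else:                  # A[high] < A[low], so S.high => S.low: eliminate high
            high -= 1
    return low


A = [3, 9, 2, 9, 5]
k = location_of_max(A)
print(k, A[k])
```

When the maximum occurs more than once, any of its locations is an acceptable answer; which one is returned depends on how ties are resolved in the comparison.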


Concrete Problem: Identifying the Celebrity

Given is a binary relation knows on people (who are identified by the integers in range [0..N)), where knows.i.j ≡ person i "knows" person j. A celebrity is defined to be a person who is known by everyone but who knows no one other than himself. (We assume that knows is reflexive, although it could be argued that celebrities, in particular, tend to be lacking in self-awareness!)

The problem is to identify the celebrity, if one exists, among the persons. To make things somewhat easier, for the moment let us assume that a celebrity exists among the persons. It is clear that there can be at most one, because if there were two of them, say persons a and b, we would need both knows.a.b (for b to be a celebrity) and knows.b.a (for a to be a celebrity) to be true. But the truth of knows.a.b contradicts a being a celebrity, and likewise knows.b.a contradicts b being one.

Imagining that the relation knows is provided in the form of a boolean matrix KNOWS, here is the specification:

|[ con KNOWS : array [0..N)×[0..N) of bool;
   var r : int;
   { P: (∃i | 0≤i<N : isCelebrity.i) }
   r := ?;
   { Q: isCelebrity.r }
]| 
where isCelebrity.i ::= (∀j | 0≤j<N ∧ i≠j : KNOWS[j][i] ∧ ¬KNOWS[i][j])

The key insight is that observing the value of KNOWS[i][j], for any distinct i and j, tells us either that i is not a celebrity (in the case KNOWS[i][j]) or that j is not a celebrity (in the case ¬KNOWS[i][j]). Hence, using the heuristic that the set V is a range of integers represented by two integer variables low and high, we get the following program:

|[ con N : int;  { N≥1 }
   con KNOWS : array [0..N)×[0..N) of bool;
   var r : int;
   var low, high : int;  { V = [low..high] }
   { P: (∃i | 0≤i<N : isCelebrity.i) }
   low, high := 0, N-1;
   { I: (∃i | low≤i≤high : isCelebrity.i) ∧ 0≤low≤high<N }
   { t: high - low }
   do low ≠ high ---> 
      if KNOWS[low][high] --->  low := low + 1;
      [] ¬KNOWS[low][high] --->  high := high - 1;
      fi
   od
   r := low;
   { Q: isCelebrity.r }
]| 
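
A direct Python transcription of this program might look as follows. This is a sketch under the same precondition (N ≥ 1 and a celebrity exists); the function name and the sample matrix are made up for illustration, with person 2 as the celebrity.

```python
from typing import Sequence


def celebrity_candidate(knows: Sequence[Sequence[bool]]) -> int:
    """Return the celebrity, assuming the N x N matrix has one (N >= 1)."""
    low, high = 0, len(knows) - 1  # V = [low..high]
    while low != high:
        if knows[low][high]:       # low knows someone else: not a celebrity
            low += 1
        else:                      # high is unknown to low: not a celebrity
            high -= 1
    return low


# Hypothetical example: person 2 is known by all and knows only themselves.
KNOWS = [
    [True,  True,  True,  False],
    [False, True,  True,  True],
    [False, False, True,  False],
    [True,  False, True,  True],
]
print(celebrity_candidate(KNOWS))
```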

For convenience, we assumed as a precondition that a celebrity existed. Without making that assumption, the appropriate postcondition would be

(∃i | 0≤i<N : isCelebrity.i) ⟹ isCelebrity.r

which says that, if a celebrity exists, it must be r. The correct loop invariant would be

(∀i | 0≤i<low ∨ high<i<N : ¬isCelebrity.i) ∧ 0≤low≤high<N

which says that no one outside the range [low..high] is a celebrity. Thus, upon termination of the loop (at which time low = high), no person other than low could be a celebrity, so doing the assignment r := low ensures the postcondition.

To determine whether r really is a celebrity, it would be necessary to check KNOWS[r][0..N) (i.e., row r) to make sure that all its elements (except for KNOWS[r][r]) are false and KNOWS[0..N)[r] (column r) to make sure that all its elements are true.
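
Dropping the assumption that a celebrity exists, the elimination loop and the row/column check can be combined as in the sketch below. The function names and the use of None to signal "no celebrity" are choices made here for illustration, not part of the original notes.

```python
from typing import Optional, Sequence


def is_celebrity(knows: Sequence[Sequence[bool]], r: int) -> bool:
    """Check row r (all false off the diagonal) and column r (all true)."""
    return all(knows[j][r] and not knows[r][j]
               for j in range(len(knows)) if j != r)


def find_celebrity(knows: Sequence[Sequence[bool]]) -> Optional[int]:
    """Return the celebrity's index, or None if there is none (N >= 1)."""
    low, high = 0, len(knows) - 1
    while low != high:  # nobody outside [low..high] is a celebrity
        if knows[low][high]:
            low += 1
        else:
            high -= 1
    return low if is_celebrity(knows, low) else None


print(find_celebrity([[True, True], [True, True]]))   # two people who know each other
print(find_celebrity([[True, True], [False, True]]))  # person 1 is the celebrity
```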

The running time is O(N), as the program's loop iterates exactly N−1 times, with each iteration taking constant time. Checking to see if r is really a celebrity takes O(N) more time, so the total is O(N). Given that the input size is n = N², an appropriate characterization of the program's running time is O(√n). A naive algorithm, which, say, examines each column in search of one that contains only occurrences of true and, upon finding it, checks the corresponding row to determine if it contains nothing but occurrences of false (except on the main diagonal), would take, in the worst case, time proportional to n = N² (and thus would be linear time).


Another Concrete Problem: Saddleback Search

     0  1  2  3  4  5  6  7  8  9
   +-----------------------------+
 0 | 4  6  9 10 12 15 19 22 24 25|
 1 | 5 10 11 12 13 18 21 24 26 27|
 2 | 6 14 18 21 23 24 26 29 30 33|
 3 |10 15 19 24 25 27 31 34 36 37|
 4 |11 17 23 25 29 31 33 38 42 43|
 5 |15 18 27 29 30 32 35 42 46 50|
 6 |16 19 30 34 35 37 41 43 50 51|
 7 |20 23 32 37 41 43 44 47 52 53|
 8 |24 25 33 40 45 46 49 50 54 57|
 9 |26 29 34 41 48 51 55 58 60 62|
10 |30 33 38 43 52 56 58 61 64 67|
   +-----------------------------+
You are given as input a two-dimensional array (with, say, M rows and N columns) whose values are in increasing order along each row and each column. An example (with 11 rows and 10 columns) appears above. You are also given as input a value which is to be found within the array. What is an efficient way to search for it?

A characteristic of efficient search algorithms is that each probe results in a significant decrease in the size of the search space. As you will recall, in the binary search algorithm, each time an array element is probed (i.e., accessed and compared to the search key), the search space is cut in half. That algorithm exploits the fact that the array elements are in ascending order. (Indeed, without the array having that property, binary search would not work.)

Here we have an M×N array in which the elements in each row and each column are increasing. This suggests the possibility that the search space (which here is initially the set of ordered pairs { i,j | 0≤i<M ∧ 0≤j<N : (i,j) }) can be reduced in size quickly.

Where is the "correct" place to probe? If we look in the upper left corner (which necessarily contains the smallest value in the array) and find that the value there is less than the search key, we have reduced the search space by exactly one element!! A similar fate can await us if we probe the lower right corner, which contains the largest value in the array.

A more clever approach would be to probe either the upper right corner or the lower left. We arbitrarily choose the upper right. Referring to our example matrix, we would probe the element at location (0,9). Suppose that it is less than the search key. Since that element (25) is the largest in its row, we know that the search key cannot be in that row. Hence we eliminate that row from further consideration.

Had the value in location (0,9) been greater than the search key, we could eliminate column 9, because the remaining values in that column are even larger.

Taking this idea to its logical conclusion, we maintain variables m and n such that rows [0..m) and columns (n..N) have been eliminated from the search space. During each iteration, either m is increased by one (eliminating yet another row) or n is decreased by one (eliminating yet another column). The loop invariant, in diagram form, looks like this:

       +---------------------------------------------+
   0   |                                             |
   1   |                                             |
   .   |           E L I M I N A T E D   F R O M     |
   .   |                                             |
       +-----------------------------+  S E A R C H  |
   m   |                             |               |
       |      R E M A I N I N G      |   S P A C E   |
   .   |         S E A R C H         |               |
   .   |          S P A C E          |               |
   M-1 |                             |               |
       +-----------------------------+---------------+
         0 1    ...                 n                 N
 

Here is the program:

|[ con M, N : int;  { M>0 ∧ N>0 }
   con A : array [0..M)×[0..N) of int;
   con key : int;
   var m, n : int;  { V = {i,j | m≤i<M ∧ 0≤j≤n : (i,j)} }
   m, n := 0, N-1;
   { I: 0 ≤ m ≤ M ∧ -1 ≤ n < N ∧
        (occursIn.key ⟹ (∃i,j | m≤i<M ∧ 0≤j≤n : A[i][j] = key)) }
   { t: M - m + n }
   do m ≠ M ∧ n ≠ -1 ∧ A[m][n] ≠ key ---> 
      if A[m][n] < key --->  m := m+1;
      [] A[m][n] > key --->  n := n-1;
      fi
   od
   { Q: occursIn.key ⟹ A[m][n] = key }
]| 
where occursIn.key ::= (∃i,j | 0≤i<M ∧ 0≤j<N : A[i][j] = key)
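
In Python, the program can be sketched as below. The function name and the convention of returning None when the key is absent are assumptions added for illustration; the loop itself follows the program above, starting the probe at the upper right corner.

```python
from typing import Optional, Sequence, Tuple


def saddleback_search(A: Sequence[Sequence[int]],
                      key: int) -> Optional[Tuple[int, int]]:
    """Return (m, n) with A[m][n] == key, or None if key does not occur.

    Assumes A is a non-empty M x N array whose rows and columns are increasing.
    """
    M, N = len(A), len(A[0])
    m, n = 0, N - 1              # rows [0..m) and columns (n..N) eliminated
    while m != M and n != -1 and A[m][n] != key:
        if A[m][n] < key:        # largest remaining value in row m is too small
            m += 1
        else:                    # smallest remaining value in column n is too large
            n -= 1
    return (m, n) if m != M and n != -1 else None


# A small hypothetical array with increasing rows and columns.
GRID = [
    [1, 4, 7],
    [2, 5, 8],
    [3, 6, 9],
]
print(saddleback_search(GRID, 6))
print(saddleback_search(GRID, 10))
```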

The program's asymptotic running time is clearly O(M+N). Assuming that M and N are "within a constant factor" of each other, this corresponds to O(√n), where n = M·N is the input size.