CMPS 260 Spring 2019
Notes on Topics from Linz's Chapters 9-11

Review


Chapter 11: Recursive and RE Languages

Defn: A language is said to be recursively enumerable (RE) if there exists a TM that accepts it. (A TM accepts a string if, when given that string as input, the TM gets to a final halting state.)

Defn: A language is said to be recursive if there exists a TM that accepts it and that halts on every input string.

One useful way to look at it is that, for a language to be recursive, there must exist a membership algorithm, while to be RE requires only that there is a membership semi-algorithm (meaning one that is not guaranteed to halt when the answer to the question "Is w in L?" is NO).

What does the definition of RE have to do with "enumeration"? Answer: A language is RE iff there exists a (necessarily never-ending, in the case of an infinite language) procedure that enumerates (i.e., prints) its members. To qualify as an enumerating procedure for a language L, the procedure must have the property that, for every w ∈ L, w will be printed within a finite amount of time. This does not preclude the possibility that some elements of L will be printed multiple times.

Given a TM/algorithm that accepts L (but perhaps does not halt in some cases when presented with a non-member of L), how can we use it to devise a TM/algorithm that enumerates the members of L?

Answer:
for k = 1 ... ∞ {
   for i = 1 .. k {
      w = wi; // the i-th string in some ordering of Σ*
      if (M accepts w in k or fewer steps) {
         print w;
      }
   }
}

For every value of k, this enumeration procedure prints every string among w1, ..., wk that M accepts in k or fewer steps. If w = wi is in L and M accepts it in n steps, then w is printed for every k ≥ max(i, n), and thus every string in L will be printed over and over again. This is fine, as what is required is only that every string accepted by M is eventually printed at least once, which it will be.
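
The procedure above can be sketched in Java under some loudly stated assumptions: the TM M is replaced by a hypothetical interface StepBoundedAcceptor whose method acceptsWithin(w, steps) reports whether M accepts w within the given number of steps, TOY is a made-up machine used only for the demonstration, and the outer loop is cut off at maxK so that the program terminates. None of these names come from Linz.

```java
import java.util.ArrayList;
import java.util.List;

public class Dovetail {

   /* Hypothetical stand-in for the TM M: answers whether M accepts w
   ** within the given number of steps.  A bounded simulation like this
   ** always halts, even if M itself would not.
   */
   public interface StepBoundedAcceptor {
      boolean acceptsWithin(String w, int steps);
   }

   /* Toy "machine" (an assumption, for illustration only): accepts a
   ** string of an even number of a's after |w|+1 steps and never
   ** accepts anything else.
   */
   public static final StepBoundedAcceptor TOY =
      (w, steps) -> w.matches("(aa)*") && w.length() + 1 <= steps;

   /* Returns the i-th string (i >= 1) over sigma, ordering strings by
   ** length and, within a length, alphabetically.
   */
   public static String nth(String sigma, int i) {
      int k = sigma.length();
      int n = i - 1;
      StringBuilder w = new StringBuilder();
      while (n > 0) {
         n--;
         w.append(sigma.charAt(n % k));
         n /= k;
      }
      return w.reverse().toString();
   }

   /* Runs the outer loop for k = 1..maxK only (the real procedure never
   ** stops) and returns everything "printed", repetitions included.
   */
   public static List<String> enumerate(StepBoundedAcceptor m, String sigma, int maxK) {
      List<String> printed = new ArrayList<>();
      for (int k = 1; k <= maxK; k++) {
         for (int i = 1; i <= k; i++) {
            String w = nth(sigma, i);
            if (m.acceptsWithin(w, k)) {
               printed.add(w);
            }
         }
      }
      return printed;
   }

   public static void main(String[] args) {
      System.out.println(enumerate(TOY, "ab", 5));
   }
}
```

With maxK = 5 the returned list contains the empty string five times and aa twice, which illustrates the remark that members are printed repeatedly.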

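Notice that this procedure assumes that there is an algorithmic way of producing all the strings w1, w2, ..., over any alphabet Σ. But there is: list the strings in order of increasing length and, among strings of the same length, in alphabetical order.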
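
One way to produce w1, w2, ... is sketched below. It is only an illustration: the class name ShortlexStrings and the method nth are made up here, and the ordering chosen (by length, then alphabetically by the order of the symbols in sigma) is one of many that would serve.

```java
public class ShortlexStrings {

   /* Returns the i-th string (i >= 1) over the alphabet sigma, where
   ** strings are ordered by length and, within a length, alphabetically
   ** according to the order of the symbols in sigma.  The first string
   ** is therefore the empty string.
   */
   public static String nth(String sigma, long i) {
      if (sigma.isEmpty() || i < 1) {
         throw new IllegalArgumentException("need a nonempty alphabet and i >= 1");
      }
      int k = sigma.length();
      long n = i - 1;
      StringBuilder w = new StringBuilder();
      // Peel off symbols from right to left (bijective base-k numeration).
      while (n > 0) {
         n--;
         w.append(sigma.charAt((int) (n % k)));
         n /= k;
      }
      return w.reverse().toString();
   }

   public static void main(String[] args) {
      for (long i = 1; i <= 8; i++) {
         System.out.println("w" + i + " = \"" + nth("ab", i) + "\"");
      }
   }
}
```

Over Σ = {a, b} this lists the empty string, a, b, aa, ab, ba, bb, aaa, and so on, so every string in Σ* appears at some finite position.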

For a recursive language, devising an enumeration procedure is simpler, due to the fact that we can use a membership algorithm (as opposed to a semi-algorithm):

for i = 1 ... ∞ {
   w = wi; // the i-th string in some ordering of Σ*
   if (M accepts w) { print w; }
}

Do there exist languages that are not RE? YES!

In order to show this, we first need to introduce the notion of a countable set.

Defn: A set is countable iff it is either finite or can be placed into one-to-one correspondence with the natural numbers.

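Examples of countable sets: any finite set, the natural numbers ℕ, the integers ℤ, the rationals ℚ, and Σ* for any finite alphabet Σ.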

Theorem 11.1: If S is an infinite countable set, then its powerset ℘(S) is not countable.
Note: For the sake of simplicity, it may help to simply take S to be the set ℕ (of natural numbers). The more general result then is quite plausible (and follows easily).

Proof: (Cantor's Diagonalization) Assume that S = {s0, s1, s2, ... } and suppose that we could form a 1-1 mapping between ℘(S) and ℕ. That would mean that the members of ℘(S) could be listed, exhaustively, as S0, S1, S2, etc., etc.

Take S' to be the set { si : si ∉ Si}. In other words, for each i, si ∈ S' iff si ∉ Si. Clearly, S' ⊆ S (i.e., S' ∈ ℘(S)) and thus, because the list S0, S1, S2, ... is exhaustive, there must exist k such that Sk = S'. Now consider whether or not sk ∈ Sk. We have

    sk ∈ Sk

=       < assumption Sk = S' >

    sk ∈ S'

=       < by definition of S' >

    sk ∉ Sk

What we have just proved is that, under the assumption that ℘(S) is countable, sk ∈ Sk is equal to its own negation! Hence, that assumption must be false. ■
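
The diagonal construction can be tried out on any concrete listing. The sketch below takes S = ℕ (so that si = i), represents each subset by its membership test, and uses a hypothetical listing MULTIPLES (Si = the multiples of i+1) chosen purely for illustration. Whatever listing is supplied, the set S' returned by diagonal() disagrees with Si on the element i, so S' cannot appear anywhere in the listing.

```java
import java.util.function.IntFunction;
import java.util.function.IntPredicate;

public class Diagonal {

   /* Given a listing S0, S1, S2, ... of subsets of the natural numbers
   ** (each subset represented by its membership test), returns the
   ** membership test of S' = { i : i is not in Si }.
   */
   public static IntPredicate diagonal(IntFunction<IntPredicate> listing) {
      return i -> !listing.apply(i).test(i);
   }

   /* Hypothetical listing, for illustration only: Si = multiples of i+1. */
   public static final IntFunction<IntPredicate> MULTIPLES =
      i -> (n -> n % (i + 1) == 0);

   public static void main(String[] args) {
      IntPredicate sPrime = diagonal(MULTIPLES);
      for (int i = 0; i < 5; i++) {
         System.out.println("i=" + i
                            + "  i in Si: " + MULTIPLES.apply(i).test(i)
                            + "  i in S': " + sPrime.test(i));
      }
   }
}
```

In every row the two truth values differ, which is exactly the step "sk ∈ S' iff sk ∉ Sk" in the proof.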


Using this result, we can show that non-RE languages exist, simply because there are more languages than there are TM's that can accept them.

Theorem 11.2: For any nonempty alphabet Σ, there exist languages over Σ that are not RE.

Proof: The languages over Σ correspond to the subsets of Σ*. Because Σ* is an infinite countable set, Theorem 11.1 assures us that there are uncountably many such languages. Meanwhile, the set of all TM's is countable, because every TM can be described by a string in {0,1}*, and that set is countable. Hence there are more languages than TM's, and so some language over Σ is accepted by no TM. ■

The above is a nonconstructive proof in that it fails to identify any particular language as being non-RE. The next theorem describes such a language.

Theorem 11.3: There is an RE language (over the alphabet {a}) whose complement is not RE.

Proof: (diagonalization) Let Mi be the i-th TM (in some fixed enumeration of all TM's) and let ai denote the string consisting of i a's. Define L = { ai : ai ∈ L(Mi)}

First we argue that L is RE. This is so because we can devise a procedure that, upon reading ai, constructs Mi and then simulates the computation of Mi on ai. Our procedure will accept ai iff Mi does.

Now to argue that the complement of L is not RE. The complement of L is L' = { ai : ai ∉ L(Mi)}. Another way to describe L' is to say that (for all i) ai ∈ L' iff ai ∉ L(Mi).

Assume, contrary to what is to be proved, that L' is RE. Then, by definition of RE, there exists k such that L(Mk) = L'. What happens when Mk is given ak as input? There are two possibilities, each of which leads to a contradiction:

  1. Case 1: Mk accepts ak (i.e., ak ∈ L(Mk)). But then, by definition of L', ak ∉ L', which contradicts the assumption that L(Mk) = L'.
  2. Case 2: Mk fails to accept ak (i.e., ak ∉ L(Mk)). But then, by definition of L', ak ∈ L', which contradicts the assumption that L(Mk) = L'.

Or, to state the argument in another way, consider the expression ak ∈ L(Mk). We have

    ak ∈ L(Mk)

=       < assumption L(Mk) = L' >

    ak ∈ L'

=       < by definition of L' >

    ak ∉ L(Mk)

Our assumption that L' is RE allowed us to prove that ak ∈ L(Mk) is equal to its own negation. Hence, that assumption must be false. ■

Theorem 11.2 demonstrates that non-RE languages exist, and Theorem 11.3 gave us a specific example. We have yet to show that there is any difference between the class of RE languages and the class of recursive languages. Theorem 11.4, in conjunction with Theorem 11.3, tells us that, indeed, the recursive languages form a proper subclass of the RE languages.

Theorem 11.4: A language L is recursive iff both it and its complement L' are RE.

Proof: The implication from left-to-right is easy: By the definitions of recursive and RE, every recursive language is RE. Thus, L being recursive implies that it is RE. It remains to show only that the complement L' of L is also RE. But it is obvious that the recursive languages are closed under complement. (A decision algorithm for L can be transformed into a decision algorithm for L' simply by negating the "return value".) Thus, L' is recursive and hence RE.

The implication from right to left takes more work. Suppose that both L and its complement L' are RE. We must show that there is a decision algorithm for L. Let ML and ML' be TM's that accept L and L', respectively. (Such TM's exist because L and L' are RE.) Given a string w, to decide whether it is a member of L, give w to both ML and ML' and execute them in parallel (e.g., by alternating between them, one step at a time). Because w is a member of exactly one of L and L', eventually one of the TM's will accept it. If w is accepted by ML, answer YES. If w is accepted by ML', answer NO. ■
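
The right-to-left construction can be sketched as follows. As an assumption made only for this illustration, each of ML and ML' is modeled by a hypothetical interface StepBoundedAcceptor that reports whether the machine accepts w within a given number of steps; "running in parallel" then amounts to giving both machines a step budget k and increasing k until one of them accepts. EVEN and ODD are made-up toy machines for the language of even-length strings and its complement.

```java
public class ParallelDecider {

   /* Hypothetical stand-in for a TM: answers whether the machine
   ** accepts w within the given number of steps.
   */
   public interface StepBoundedAcceptor {
      boolean acceptsWithin(String w, int steps);
   }

   /* Decides membership of w in L, assuming that mL accepts L and mLc
   ** accepts the complement of L.  Under that assumption exactly one of
   ** the two eventually accepts w, so the loop terminates.  If the
   ** assumption is violated, the loop may run forever.
   */
   public static boolean decide(StepBoundedAcceptor mL, StepBoundedAcceptor mLc, String w) {
      for (int k = 1; ; k++) {
         if (mL.acceptsWithin(w, k)) {
            return true;    // answer YES
         }
         if (mLc.acceptsWithin(w, k)) {
            return false;   // answer NO
         }
      }
   }

   /* Toy machines (assumptions, for illustration only). */
   public static final StepBoundedAcceptor EVEN =
      (w, steps) -> w.length() % 2 == 0 && w.length() + 1 <= steps;
   public static final StepBoundedAcceptor ODD =
      (w, steps) -> w.length() % 2 == 1 && 2 * w.length() + 1 <= steps;

   public static void main(String[] args) {
      System.out.println(decide(EVEN, ODD, "abab"));
      System.out.println(decide(EVEN, ODD, "aba"));
   }
}
```

Note that neither toy machine ever accepts a string outside its own language, no matter how large the budget; the decision comes from whichever of the two accepts first.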

Theorem 11.5: There exist RE languages that are not recursive; hence, the recursive languages form a proper subset of the RE languages.

Proof: Follows immediately from Theorem 11.3, which identified an RE language whose complement is not RE, and Theorem 11.4, which tells us that any such language is not recursive. ■


Section 11.2: Unrestricted Grammars

The connection between automata/machines and grammars also applies to TM's in that it can be shown (see Theorems 11.6 and 11.7) that a language is RE iff it is generated by some unrestricted grammar, which is one in which the only restriction on the left-hand side of a production is that it not be λ.

Section 11.3: Context-sensitive Grammars and Languages

Section 11.4: Chomsky Hierarchy

See figures in Linz, pages 306 and 307.

Chapter 12: Limits of Algorithmic Computation

Interestingly, it is possible, in studying the limits of what computers can do, to focus on decision problems, as other kinds of problems (e.g., optimization) can be, in a sense, reduced to decision problems.

Let us consider the so-called Halting Problem: Given a (string encoding a) TM (or a computer program in some language) M and an input w, will M halt when given w as input?

Does a decision algorithm solving this problem exist? The naive non-solution is to simulate M on input w and see what happens. If the computation halts, we answer YES. The problem with this is that, if M fails to halt, our decision algorithm will never provide a verdict. Now, perhaps our algorithm is "smart enough" to detect when M's computation repeats the same configuration for a second time, in which case we "know" that it is in an infinite loop and an answer of NO can be given. But not every infinite loop results in the same configuration being repeated, and so there will be cases in which our decision algorithm never gives an answer, which means that it is not really an algorithm.
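
The naive approach, including the "smart" check for a repeated configuration, might look like the sketch below. Everything in it is an assumption made for illustration: a machine is modeled as a function from a configuration to its successor (or to Optional.empty() once it has halted), and the budget parameter and the UNKNOWN verdict exist only so that the demonstration itself terminates. Without the budget, the third example would run forever, which is precisely the failure described above.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

public class NaiveHaltCheck {

   public enum Verdict { YES, NO, UNKNOWN }

   /* Simulates the machine from configuration start for at most budget
   ** steps.  Returns YES if it halts, NO if some configuration repeats
   ** (so the machine is certainly looping), and UNKNOWN if the budget
   ** runs out before either happens.
   */
   public static <C> Verdict check(C start, Function<C, Optional<C>> step, int budget) {
      Set<C> seen = new HashSet<>();
      C current = start;
      for (int n = 0; n < budget; n++) {
         if (!seen.add(current)) {
            return Verdict.NO;        // same configuration a second time
         }
         Optional<C> next = step.apply(current);
         if (!next.isPresent()) {
            return Verdict.YES;       // the machine halted
         }
         current = next.get();
      }
      return Verdict.UNKNOWN;
   }

   /* Toy machines whose configurations are just integers. */
   public static final Function<Integer, Optional<Integer>> COUNTDOWN =
      c -> c == 0 ? Optional.<Integer>empty() : Optional.of(c - 1);
   public static final Function<Integer, Optional<Integer>> CYCLE =
      c -> Optional.of((c + 1) % 3);
   public static final Function<Integer, Optional<Integer>> COUNT_UP =
      c -> Optional.of(c + 1);

   public static void main(String[] args) {
      System.out.println(check(5, COUNTDOWN, 1000));  // halts
      System.out.println(check(0, CYCLE, 1000));      // repeats a configuration
      System.out.println(check(0, COUNT_UP, 1000));   // loops without ever repeating
   }
}
```

The three calls print YES, NO, and UNKNOWN, respectively. COUNT_UP is the troublesome kind of infinite loop: it never halts and never revisits a configuration, so no amount of waiting turns UNKNOWN into a verdict.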

The discussion above points out the difficulty of devising a solution to the Halting Problem, but it does not prove that no solution exists. What follows is such a proof.

Imagine the following Java program, where we assume that the body of the halts() method is a solution to the Halting Problem.

public class Halting1 {

   /* Prints YES if the Java program provided via the first command-line
   ** argument, when applied to the input provided via the second command-line
   ** argument, halts.  Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[1])) {
         System.out.println("YES");
      }
      else { 
         System.out.println("NO");
      }
   }
   
   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) {
      ...
   }
}

Now suppose we modify the main() method of the program to get this one:

public class Halting2 {

   /* Goes into an infinite loop if the Java program provided via the 
   ** first command-line argument, when applied to the input provided via
   ** the second command-line argument, halts.  Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[1])) {
         while (true) {
            // infinite loop!
         }
      }
      else { 
         System.out.println("NO");
      }
   }
   
   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) {
      ...
   }
}

We can characterize the behavior of Halting2 as follows: Given as inputs a Java program M and a string w, Halting2 halts iff M, when applied to input w, would fail to halt.

Let's modify the program again, almost imperceptibly:

public class Halting3 {

   /* Goes into an infinite loop if the Java program provided via the 
   ** first command-line argument, when applied to itself as input, halts.
   ** Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[0])) {   // <--- change is here
         while (true) {
            // just keep iterating
         }
      }
      else { 
         System.out.println("NO");
      }
   }
   
   
   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) {
      ...
   }
}

We can characterize the behavior of Halting3 as follows: Given as input a Java program M, Halting3 halts iff M, when applied to itself, would fail to halt. Let us define a Java program to be self-convergent if it halts when given its own source code as input. Meanwhile, we define a self-divergent program to be one that is not self-convergent. Then a more concise characterization of Halting3's behavior is this:

Given as input a Java program M, Halting3 halts iff M is self-divergent.

Now we can "prove" a contradiction, which shows that our assumption (which is that the method halts() exists) is false:

    Halting3 is self-divergent

=      < Halting3 halts iff it is given as input a self-divergent program >

    Halting3 halts when given Halting3 as input

=      < definition of self-convergent >

    Halting3 is self-convergent