CMPS 260 Spring 2019
Notes on Topics from Linz's Chapters 9-11

Review

Review idea that using a more complex storage medium (e.g., multi-tape, two-dimensional tape, etc.) does not increase the "power" of a TM (in the sense of what languages can be accepted or what functions can be computed).
Nondeterministic TM: If we allow TM's to be nondeterministic, does that expand the class of languages that can be accepted? NO. Because an NTM can be "simulated" by a DTM. In effect, you can have a DTM do a breadth-first traversal of the tree of possible computations of an NTM. If any of the branches leads to an accepting configuration, the DTM can halt and accept.
Universal TM: This is a TM that, given as input a description of a TM M and a string w, "simulates" the computation of M on w. This is what a real-life digital computer does: when you run a Java (or C, Python, whatever) program P, you are giving the computer the bytecode, or executable, or source code (in the case of interpretation) as input, together with, say, a file containing input data, and the computer is behaving like a universal TM.
Linear Bounded Automata: A TM whose tape does not extend beyond those cells occupied by the given input string. This kind of automaton gives rise to the Context-Sensitive languages, so-called because of the form of productions in the equivalent grammars. E.g., bABd ⟶ baABd is a production that replaces A by aA but can be applied only when A occurs with a b to its immediate left and Bd to its immediate right. (Thus, each production indicates not only an allowable replacement of a variable by a string but also limits the contexts in which such a replacement is allowed to occur.)
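
The breadth-first simulation of an NTM by a DTM mentioned above can be illustrated by a small Java sketch. It is not a TM simulator: configurations are abstract values, and the names acceptsBfs, successors, accepting, and maxConfigs are assumptions made for the illustration. The cutoff maxConfigs exists only so that the demo terminates; a real simulation would have no such bound and might run forever when no branch accepts.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;
import java.util.function.Predicate;

public class BfsSimulation {

    /**
     * Explores the tree of possible computations breadth-first and returns
     * true as soon as some branch reaches an accepting configuration.
     * maxConfigs is a demo-only cutoff (an assumption of this sketch).
     */
    public static <C> boolean acceptsBfs(C start,
                                         Function<C, List<C>> successors,
                                         Predicate<C> accepting,
                                         int maxConfigs) {
        Queue<C> frontier = new ArrayDeque<>();
        frontier.add(start);
        int explored = 0;
        while (!frontier.isEmpty() && explored < maxConfigs) {
            C config = frontier.remove();
            explored++;
            if (accepting.test(config)) {
                return true;
            }
            // Enqueue every configuration reachable in one nondeterministic step.
            frontier.addAll(successors.apply(config));
        }
        return false;
    }

    public static void main(String[] args) {
        // Toy nondeterministic "machine": a configuration is a string over {a, b},
        // and each step nondeterministically appends a or b. Accept on "abba".
        Function<String, List<String>> step = s -> List.of(s + "a", s + "b");
        System.out.println(acceptsBfs("", step, s -> s.equals("abba"), 1000));
    }
}
```

Note that a depth-first traversal would not do here: in the toy example it would follow the branch a, aa, aaa, ... forever and never reach the accepting configuration.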

Chapter 11: Recursive and RE Languages

Defn: A language is said to be recursively enumerable (RE) if there exists a TM that accepts it. (A TM accepts a string if, when given that string as input, the TM gets to a final halting state.)

Defn: A language is said to be recursive if there exists a TM that accepts it and that halts on every input string.

One useful way to look at it is that, for a language to be recursive, there must exist a membership algorithm, while to be RE requires only that there is a membership semi-algorithm (meaning one that is not guaranteed to halt when the answer to the question "Is w in L?" is NO).

What does the definition of RE have to do with "enumeration"? Answer: A language is RE iff there exists a (necessarily never-ending, in the case of an infinite language) procedure that enumerates (i.e., prints) its members. To qualify as an enumerating procedure for a language L, the procedure must have the property that, for every w ∈ L, w will be printed within a finite amount of time. This does not preclude the possibility that some elements of L will be printed multiple times.

Given a TM/algorithm that accepts L (but perhaps does not halt in some cases when presented with a non-member of L), how can we use it to devise a TM/algorithm that enumerates the members of L?

Answer:

for k = 1 ... ∞ {
   for i = 1 .. k {
      w = w_i;   // the i-th string in some ordering of Σ^*
      if (M accepts w in k or fewer steps) {
         print w;
      }
   }
}

For each value of k, this enumeration procedure prints every string among w₁, ..., w_k that is accepted by M in k or fewer steps. Hence, if w_i ∈ L is accepted by M in n steps, then w_i is printed for every k ≥ max(i, n), which means that every string in L will be printed over and over again. This is fine, as what is required is only that every string accepted by M is eventually printed at least once, which it will be.
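
The procedure above can be tried out on a toy example. The following Java sketch makes several assumptions that are not part of the notes: the alphabet is the unary alphabet {a} (so that w_i is simply a string of i-1 a's), the interface StepBoundedAcceptor stands in for "M accepts w in k or fewer steps", and the outer loop is cut off at maxK so that the demo terminates.

```java
import java.util.ArrayList;
import java.util.List;

public class Dovetail {

    /** Hypothetical stand-in for "M accepts w in k or fewer steps". */
    interface StepBoundedAcceptor {
        boolean acceptsWithin(String w, int steps);
    }

    /**
     * Runs the outer loop for k = 1..maxK (the real procedure never stops)
     * and returns the strings in the order in which they would be printed.
     */
    public static List<String> enumerate(StepBoundedAcceptor m, int maxK) {
        List<String> printed = new ArrayList<>();
        for (int k = 1; k <= maxK; k++) {
            for (int i = 1; i <= k; i++) {
                // w_i over the unary alphabet {a}: w_1 = λ, w_2 = a, w_3 = aa, ...
                String w = "a".repeat(i - 1);
                if (m.acceptsWithin(w, k)) {
                    printed.add(w);
                }
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        // Toy machine: accepts a^n for even n after n + 1 steps
        // and never halts when n is odd.
        StepBoundedAcceptor m =
            (w, steps) -> w.length() % 2 == 0 && w.length() + 1 <= steps;
        System.out.println(enumerate(m, 4));
    }
}
```

Because each simulation is cut off after k steps, the inputs on which the toy machine never halts cannot stall the enumeration.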

Notice that this procedure assumes that there is an algorithmic way of producing all the strings w₁, w₂, ..., over any alphabet Σ. But there is.
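
One such way is to list the strings by increasing length and, among strings of the same length, alphabetically. The Java sketch below computes w_i directly under that ordering; the method name nthString and the choice of this particular ordering are assumptions made for the illustration (any ordering in which every string eventually appears would serve).

```java
public class ProperOrder {

    /**
     * Returns w_i (i >= 1) in the ordering of Σ^* by length and then
     * alphabetically: w_1 = λ, then the strings of length 1, then length 2, ...
     * The alphabet must be nonempty.
     */
    public static String nthString(long i, char[] alphabet) {
        int n = alphabet.length;
        long index = i - 1;      // zero-based position in the ordering
        int length = 0;
        long blockSize = 1;      // number of strings of the current length
        while (index >= blockSize) {
            index -= blockSize;  // skip all strings of the current length
            blockSize *= n;
            length++;
        }
        // index is now the position within the block of strings of this length;
        // write it as a base-n numeral having exactly `length` digits.
        char[] w = new char[length];
        for (int pos = length - 1; pos >= 0; pos--) {
            w[pos] = alphabet[(int) (index % n)];
            index /= n;
        }
        return new String(w);
    }

    public static void main(String[] args) {
        char[] sigma = {'a', 'b'};
        for (long i = 1; i <= 7; i++) {
            System.out.println(i + ": \"" + nthString(i, sigma) + "\"");
        }
    }
}
```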

For a recursive language, devising an enumeration procedure is simpler, due to the fact that we can use a membership algorithm (as opposed to a semi-algorithm):

for i = 1 ... ∞ {
   w = w_i;   // the i-th string in some ordering of Σ^*
   if (M accepts w) {
      print w;
   }
}

Do there exist languages that are not RE?? YES!

In order to show this, we first need to introduce the notion of a countable set.

Defn: A set is countable iff it is either finite or can be placed into one-to-one correspondence with the natural numbers.

Examples of countable sets:

Ordered pairs of natural numbers
Ordered pairs of integers
rational numbers
Σ^* for any finite alphabet Σ
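
To make the first example concrete, here is a Java sketch of one well-known one-to-one correspondence between ordered pairs of natural numbers and the natural numbers (the Cantor pairing function, which walks the pairs one anti-diagonal at a time). The method names pair and unpair are chosen for the illustration, and the naturals are taken to start at 0.

```java
public class CantorPairing {

    /** Maps the pair (x, y) of naturals to a natural by walking anti-diagonals. */
    public static long pair(long x, long y) {
        long d = x + y;                  // index of the anti-diagonal containing (x, y)
        return d * (d + 1) / 2 + y;      // pairs on earlier diagonals, plus offset y
    }

    /** Inverse of pair: recovers (x, y) from z. */
    public static long[] unpair(long z) {
        long d = 0;
        // Find the diagonal d whose first element d(d+1)/2 is the largest one <= z.
        while ((d + 1) * (d + 2) / 2 <= z) {
            d++;
        }
        long y = z - d * (d + 1) / 2;
        return new long[] {d - y, y};
    }

    public static void main(String[] args) {
        for (long z = 0; z < 6; z++) {
            long[] p = unpair(z);
            System.out.println(z + " <-> (" + p[0] + ", " + p[1] + ")");
        }
    }
}
```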

Theorem 11.1: If S is an infinite countable set, then its powerset ℘(S) is not countable.
Note: For the sake of simplicity, it may help to simply take S to be the set ℕ (of natural numbers). The more general result then is quite plausible (and follows easily).

Proof: (Cantor's Diagonalization) Assume that S = {s₀, s₁, s₂, ... } and suppose that we could form a 1-1 mapping between ℘(S) and ℕ. That would mean that the members of ℘(S) could be listed, exhaustively, as S₀, S₁, S₂, etc., etc.

Take S' to be the set { s_i : s_i ∉ S_i}. In other words, for each i, s_i ∈ S' iff s_i ∉ S_i. Clearly, S' ⊆ S and thus there must exist k such that S_k = S'. Now consider whether or not s_k ∈ S_k. We have

   s_k ∈ S_k
=    < assumption S_k = S' >
   s_k ∈ S'
=    < by definition of S' >
   s_k ∉ S_k

What we have just proved is that, under the assumption that ℘(S) is countable, s_k ∈ S_k is equal to its own negation! Hence, that assumption must be false. ■
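
The diagonal construction can be seen at work on a finite list, although of course the theorem itself concerns infinite lists. In the Java sketch below (an illustration only, with s_i taken to be the number i and with the class and method names invented for the purpose), the set S' built from the listed sets disagrees with S_i about the element i, for every i, and therefore cannot appear anywhere in the list.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class Diagonal {

    /** Builds S' = { i : i ∉ S_i } from the listed sets S_0, ..., S_{n-1}. */
    public static Set<Integer> diagonalSet(List<Set<Integer>> listed) {
        Set<Integer> diag = new TreeSet<>();
        for (int i = 0; i < listed.size(); i++) {
            if (!listed.get(i).contains(i)) {
                diag.add(i);
            }
        }
        return diag;
    }

    public static void main(String[] args) {
        List<Set<Integer>> listed = List.of(
            Set.of(0, 1), Set.<Integer>of(), Set.of(0, 2), Set.of(1, 2));
        Set<Integer> diag = diagonalSet(listed);
        System.out.println("S' = " + diag);
        for (int i = 0; i < listed.size(); i++) {
            // S' and S_i disagree about i, so they cannot be the same set.
            System.out.println("S' differs from S_" + i + ": "
                + !diag.equals(listed.get(i)));
        }
    }
}
```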

Using this result, we can show that non-RE languages exist, simply because there are more languages than there are TM's that can accept them.

Theorem 11.2: For any nonempty alphabet Σ, there exist languages over Σ that are not RE.

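Proof: The languages over Σ are exactly the subsets of Σ^*. Because Σ^* is an infinite countable set, Theorem 11.1 assures us that the set of all such languages is uncountable. Meanwhile, the set of all TM's is countable, because every TM can be described by a string in {0,1}^*, and that set is countable. Each RE language is accepted by at least one TM, so there are only countably many RE languages, and hence some language over Σ is not RE. ■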

The above is a nonconstructive proof in that it fails to identify any particular language as being non-RE. The next theorem describes such a language.

Theorem 11.3: There is an RE language (over the alphabet {a}) whose complement is not RE.

Proof: (diagonalization) Let M_i be the i-th TM. Define L = { aⁱ : aⁱ ∈ L(M_i)}

First we argue that L is RE. This is so because we can devise a procedure that, upon reading aⁱ, constructs M_i and then simulates the computation of M_i on aⁱ. Our procedure will accept aⁱ iff M_i does.

Now to argue that the complement of L is not RE. The complement of L is L' = { aⁱ : aⁱ ∉ L(M_i)}. Another way to describe L' is to say that (for all i) aⁱ ∈ L' iff aⁱ ∉ L(M_i).

Assume, contrary to what is to be proved, that L' is RE. Then, by definition of RE, there exists k such that L(M_k) = L'. What happens when M_k is given a^k as input? There are two possibilities, each of which leads to a contradiction:

Case 1: M_k accepts a^k (i.e., a^k ∈ L(M_k)). But then, by definition of L', a^k ∉ L', which contradicts the assumption that L(M_k) = L'.
Case 2: M_k fails to accept a^k (i.e., a^k ∉ L(M_k)). But then, by definition of L', a^k ∈ L', which contradicts the assumption that L(M_k) = L'.

Or, to state the argument in another way, consider the expression a^k ∈ L(M_k). We have

   a^k ∈ L(M_k)
=    < assumption L(M_k) = L' >
   a^k ∈ L'
=    < by definition of L' >
   a^k ∉ L(M_k)

Our assumption that L' is RE allowed us to prove that a^k ∈ L(M_k) is equal to its own negation. Hence, that assumption must be false. ■

Theorem 11.2 demonstrates that non-RE languages exist, and Theorem 11.3 gave us a specific example. We have yet to show that there is any difference between the class of RE languages and the class of recursive languages. Theorem 11.4, in conjunction with Theorem 11.3, tells us that, indeed, the recursive languages form a proper subclass of the RE languages.

Theorem 11.4: A language L is recursive iff both it and its complement L' are RE.

Proof: The implication from left-to-right is easy: By the definitions of recursive and RE, every recursive language is RE. Thus, L being recursive implies that it is RE. It remains to show only that the complement L' of L is also RE. But it is obvious that the recursive languages are closed under complement. (A decision algorithm for L can be transformed into a decision algorithm for L' simply by negating the "return value".) Thus, L' is recursive and hence RE.

Showing the implication from right to left is more difficult. Suppose that both L and its complement L' are RE. We must show that there is a decision algorithm for L. Let M_L and M_L' be TM's that accept L and L', respectively. (Such TM's exist because L and L' are RE.) Given a string w, to decide whether it is a member of L, give w to both M_L and M_L' and execute them in parallel (e.g., by alternating between them, one step at a time). Because w is a member of exactly one of L or L', eventually one of the TM's will accept it. If w is accepted by M_L, answer YES. If w is accepted by M_L', answer NO. ■
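
The right-to-left construction can be sketched in Java as follows. The interface StepBoundedAcceptor is a hypothetical stand-in for "the TM accepts w in k or fewer steps", and running both machines with an ever larger step bound is just one way of executing them in parallel. The toy machines in main() are likewise invented for the illustration.

```java
public class ParallelDecider {

    /** Hypothetical stand-in for "the TM accepts w in k or fewer steps". */
    interface StepBoundedAcceptor {
        boolean acceptsWithin(String w, int steps);
    }

    /**
     * Decides membership in L, given acceptors for L and for its complement.
     * Terminates on every w, provided that one of the two machines accepts w.
     */
    public static boolean decide(String w,
                                 StepBoundedAcceptor mL,
                                 StepBoundedAcceptor mComplement) {
        for (int k = 1; ; k++) {
            if (mL.acceptsWithin(w, k)) {
                return true;     // YES: w is in L
            }
            if (mComplement.acceptsWithin(w, k)) {
                return false;    // NO: w is in L'
            }
        }
    }

    public static void main(String[] args) {
        // Toy example: L = strings of even length. Each machine accepts the
        // members of its own language after |w| + 1 steps and never halts
        // on any other input.
        StepBoundedAcceptor even = (w, k) -> w.length() % 2 == 0 && k > w.length();
        StepBoundedAcceptor odd  = (w, k) -> w.length() % 2 == 1 && k > w.length();
        System.out.println(decide("abab", even, odd));
        System.out.println(decide("aba", even, odd));
    }
}
```

Neither toy machine halts on every input, yet decide() does, which is exactly the point of the theorem.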

Theorem 11.5: There exist RE languages that are not recursive; hence, the recursive languages form a proper subset of the RE languages.

Proof: Follows immediately from Theorem 11.3, which identified an RE language whose complement is not RE, and Theorem 11.4, which tells us that any such language is not recursive. ■

Section 11.2: Unrestricted Grammars

The connection between automata/machines and grammars also applies to TM's in that it can be shown (see Theorems 11.6 and 11.7) that a language is RE iff it is generated by some unrestricted grammar, which is one in which the only restriction on the left-hand side of a production is that it not be λ.

Section 11.3: Context-sensitive Grammars and Languages

Section 11.4: Chomsky Hierarchy

See figures in Linz, pages 306 and 307.

Chapter 12: Limits of Algorithmic Computation

Interestingly, it is possible, in studying the limits of what computers can do, to focus on decision problems, as other kinds of problems (e.g., optimization) can be, in a sense, reduced to decision problems.

Let us consider the so-called Halting Problem: Given a (string encoding a) TM (or a computer program in some language) M and an input w, will M halt when given w as input?

Does a decision algorithm solving this problem exist? The naive non-solution is to simulate M on input w and see what happens. If the computation halts, we answer YES. The problem with this is that, if M fails to halt, our decision algorithm will never provide a verdict. Now, perhaps our algorithm is "smart enough" to detect when M's computation repeats the same configuration for a second time, in which case we "know" that it is in an infinite loop and an answer of NO can be given. But not every infinite loop results in the same configuration being repeated, and so there will be cases in which our decision algorithm never gives an answer, which means that it is not really an algorithm.
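
The shortcoming of the "smart" simulation can be seen in a Java sketch. Everything here is a simplifying assumption: a configuration is just a long, the computation is a function from one configuration to the next, the value HALT marks a halted computation, and a step budget is imposed so that the checker itself always returns.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongUnaryOperator;

public class NaiveHaltChecker {

    enum Verdict { HALTS, LOOPS, UNKNOWN }

    /** Hypothetical marker meaning "the computation has halted". */
    static final long HALT = -1;

    /** Simulates for at most `budget` steps, watching for a repeated configuration. */
    public static Verdict check(long start, LongUnaryOperator step, int budget) {
        Set<Long> seen = new HashSet<>();
        long config = start;
        for (int n = 0; n < budget; n++) {
            if (config == HALT) {
                return Verdict.HALTS;
            }
            if (!seen.add(config)) {
                return Verdict.LOOPS;      // same configuration seen a second time
            }
            config = step.applyAsLong(config);
        }
        return Verdict.UNKNOWN;            // budget exhausted without a verdict
    }

    public static void main(String[] args) {
        // Counts down to 0 and then halts.
        System.out.println(check(5, c -> c == 0 ? HALT : c - 1, 1000));
        // Cycles through 0, 1, 2 forever: a repeated configuration is detected.
        System.out.println(check(5, c -> (c + 1) % 3, 1000));
        // Counts up forever: an infinite loop that never repeats a configuration.
        System.out.println(check(5, c -> c + 1, 1000));
    }
}
```

In the third case no budget, however large, would let the checker answer NO, and with no budget at all it would simply run forever.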

The discussion above points out the difficulty of devising a solution to the Halting Problem, but it does not prove that no solution exists. What follows is such a proof.

Imagine the following Java program, where we assume that the body of the halts() method is a solution to the Halting Problem.

public class Halting1 {

   /* Prints YES if the Java program provided via the first command-line
   ** argument, when applied to the input provided via the second command-line
   ** argument, halts.  Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[1])) {
         System.out.println("YES");
      }
      else {
         System.out.println("NO");
      }
   }

   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) { ... }
}

Now suppose we modify the main() method of the program to get this one:

public class Halting2 {

   /* Goes into an infinite loop if the Java program provided via the
   ** first command-line argument, when applied to the input provided via
   ** the second command-line argument, halts.  Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[1])) {
         while (true) {
            // infinite loop!
         }
      }
      else {
         System.out.println("NO");
      }
   }

   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) { ... }
}

We can characterize the behavior of Halting2 as follows: Given as inputs a Java program M and a string w, Halting2 halts iff M, when applied to input w, would fail to halt.

Let's modify the program again, almost imperceptibly:

public class Halting3 {

   /* Goes into an infinite loop if the Java program provided via the
   ** first command-line argument, when applied to itself as input, halts.
   ** Otherwise, prints NO.
   */
   public static void main(String[] args) {
      if (halts(args[0], args[0])) {   // <--- change is here
         while (true) {
            // just keep iterating
         }
      }
      else {
         System.out.println("NO");
      }
   }

   /* Returns true if the Java program M, when given w as input, halts,
   ** and false otherwise.
   */
   public static boolean halts(String M, String w) { ... }
}

We can characterize the behavior of Halting3 as follows: Given as input a Java program M, Halting3 halts iff M, when applied to itself, would fail to halt. Let us define a Java program to be self-convergent if it halts when given its own source code as input. Meanwhile, we define a self-divergent program to be one that is not self-convergent. Then a more concise characterization of Halting3's behavior is this:

Given as input a Java program M, Halting3 halts iff M is self-divergent.

Now we can "prove" a contradiction, which shows that our assumption (which is that the method halts() exists) is false:

   Halting3 is self-divergent
=    < Halting3 halts iff it is given as input a self-divergent program >
   Halting3 halts when given Halting3 as input
=    < definition of self-convergent >
   Halting3 is self-convergent