CMPS 340 Fall 2008
Programming Assignment #2: LZW Compression/Decompression
Due: 9am, Monday, December 15 (absolute deadline)

Description

You are given the following Java classes, all but the last of which are complete: BitInputStream, BitOutputStream, NaturalNumBitIO, Log2Calc, LZWForTextTester, and LZWForText.

Here is javadoc-generated documentation for the classes listed above.

Other classes that you are recommended to use in completing development of LZWForText are java.util.HashMap and java.util.ArrayList. Specifically, an instance of HashMap<String,Integer> is well-suited for representing the dictionary (viewed as a mapping from strings to natural numbers) during execution of compress(), and an instance of ArrayList<String> is well-suited for representing the dictionary (viewed as a mapping from natural numbers to strings) during execution of decompress().
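The two views of the dictionary can be sketched as follows. This is only an illustration: the class name DictionarySketch, its method names, and the assumption that the dictionary initially holds the 256 single-character strings with codes 0 through 255 are not taken from the provided classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DictionarySketch {
    // Assumption: the initial dictionary holds one entry per character code 0..255.
    static final int ALPHABET_SIZE = 256;

    // Dictionary as compress() might view it: string -> natural number.
    static Map<String, Integer> newEncodingDictionary() {
        Map<String, Integer> dict = new HashMap<>();
        for (int code = 0; code < ALPHABET_SIZE; code++) {
            dict.put(String.valueOf((char) code), code);
        }
        return dict;
    }

    // Dictionary as decompress() might view it: natural number (list index) -> string.
    static List<String> newDecodingDictionary() {
        List<String> dict = new ArrayList<>();
        for (int code = 0; code < ALPHABET_SIZE; code++) {
            dict.add(String.valueOf((char) code));
        }
        return dict;
    }

    public static void main(String[] args) {
        Map<String, Integer> enc = newEncodingDictionary();
        List<String> dec = newDecodingDictionary();
        // A new entry receives the next unused number, which is the current size.
        enc.put("ab", enc.size());
        dec.add("ab");
        System.out.println(enc.get("ab")); // 256
        System.out.println(dec.get(256));  // ab
    }
}
```

In both views the next unused number is simply the current size of the collection, so no separate counter is strictly needed.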

Deliverables

Submit via e-mail (to mccloske@cs.uofs.edu) the completed version of LZWForText, together with any other classes that you introduced and that it needs. (You are discouraged from introducing such new classes, because none should be necessary.) If you modify any of the classes BitInputStream, BitOutputStream, NaturalNumBitIO, or Log2Calc, rename it and submit it. This is discouraged, however, as the methods in these classes should be adequate. The LZWForTextTester class is provided simply for your convenience, and you can use it (or ignore it) as you see fit. If you modify it, or develop a different class for doing testing, do not submit it.

The "Simulation" Option

You may find it easier to develop and debug your code if you begin by implementing only a simulation of LZW compression/decompression.

Specifically, design your compress() method so that, each time it writes a number, it does so in the form of text (e.g., "157") rather than in the form of a "raw" bit string. To be able to write text (as opposed to bytes) to compress()'s second argument, which is an instance of OutputStream, use that argument to construct a PrintWriter object, as is already done in the decompress() method. The resulting object's print() method can be used to write numbers in the form of strings. (Better yet, use println() so that only one number appears on each line, which will make the output easier for decompress() to read.)
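As a rough sketch of the text-writing half of the simulation, one might write something like the following. The class name TextCodeWriterSketch, the method writeCodes(), and the sample numbers are hypothetical; only PrintWriter and OutputStream come from the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintWriter;

class TextCodeWriterSketch {
    // Writes each number as decimal text, one per line, to the given stream.
    static void writeCodes(int[] codes, OutputStream output) {
        PrintWriter pw = new PrintWriter(output);
        for (int code : codes) {
            pw.println(code);
        }
        pw.flush(); // push buffered text through to the underlying stream
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeCodes(new int[] {97, 98, 256}, out);
        System.out.print(out.toString());
    }
}
```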

Analogously, design decompress() so that it "expects" its input to be in the form of integers, in textual form, one per line. To adapt the method's first argument, input (an instance of InputStream), to be able to read text from it, first construct an InputStreamReader object, and from that object construct a BufferedReader object. For example, one might write

BufferedReader br = new BufferedReader(new InputStreamReader(input));

Using br, we can read a line of text into a String by invoking br.readLine() and then determine the integer in that String by applying Integer.parseInt() to it.
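Putting these pieces together, the reading half of the simulation might look like the sketch below. The class name TextCodeReaderSketch and the method readCodes() are hypothetical, and collecting all of the numbers into a list is done here only to keep the example short; decompress() would more likely process each number as it is read.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class TextCodeReaderSketch {
    // Reads integers in textual form, one per line, until end of input.
    static List<Integer> readCodes(InputStream input) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(input));
        List<Integer> codes = new ArrayList<>();
        String line;
        // readLine() returns null once the end of the stream is reached.
        while ((line = br.readLine()) != null) {
            codes.add(Integer.parseInt(line.trim()));
        }
        return codes;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("97\n98\n256\n".getBytes());
        System.out.println(readCodes(in));
    }
}
```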

After getting compress() and decompress() to work in the manner just described (which amounts to a simulation of LZW compression/decompression), modify the former so that it writes numbers in binary form (using NaturalNumBitIO.writeNatural()) and modify the latter so that, each time it reads a number, it does so by using NaturalNumBitIO.readNatural().

If you are able to get the simulation version of LZWForText working, but not the "true" version, submit both.

Flushing Output

When no more output is to be produced, invoke the flush() method upon whatever object was being used for writing. (This empties the output buffer, so that any data still held there are written to their intended destination.)
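The effect of forgetting to flush can be seen in a small experiment such as the one below, which uses a PrintWriter as in the simulation option. The class name FlushSketch and the method bytesVisible() are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class FlushSketch {
    // Writes "157" through a PrintWriter and reports how many bytes
    // have actually reached the underlying stream.
    static int bytesVisible(boolean flush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(sink);
        pw.print("157");
        if (flush) {
            pw.flush();
        }
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println("without flush: " + bytesVisible(false));
        System.out.println("with flush:    " + bytesVisible(true));
    }
}
```

Without the call to flush(), the three characters remain in the writer's buffer and never reach the destination.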

Sample Data

Use small files to start with, but after you think your code is correct, try it out on a big file, such as one containing The Adventures of Tom Sawyer.