CMPS 134   Fall 2022
Programming Assignment #4: Calendar Date Lexer
Due: 11:59pm, Wednesday, October 26

Background: Lexical Analysis

In computer science, the term lexical analysis refers to the process of identifying, within a string of characters, its smallest meaningful component parts, called lexemes. When (the source code of) a Java class is provided as input to a Java compiler, the compiler's first action is to apply its lexical analyzer (or lexer, for short) for the purpose of identifying each occurrence of a keyword, identifier, comment, literal (numeric or otherwise), operator (arithmetic, relational, etc.), and so on.
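As a rough illustration of what a lexer does, the sketch below breaks a line of Java-like text into lexemes. The class name TinyLexerSketch and the method lexemesOf are made up for this example, and the three lexeme categories it recognizes are a simplifying assumption; a real Java lexer handles many more cases (comments, string literals, multi-character operators).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the assignment's code.
public class TinyLexerSketch {

    // One alternative per kind of lexeme: identifier or keyword, integer
    // literal, or any single non-space character (operators, punctuation).
    private static final Pattern LEXEME =
        Pattern.compile("[A-Za-z_][A-Za-z_0-9]*|[0-9]+|\\S");

    /** Returns the lexemes found in the given text, in left-to-right order. */
    public static List<String> lexemesOf(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = LEXEME.matcher(text);
        while (m.find()) {          // find() skips over the spaces between lexemes
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [int, total, =, count, +, 42, ;]
        System.out.println(lexemesOf("int total = count + 42;"));
    }
}
```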

The next phase of compilation is syntactic analysis, in which the sequence of lexemes identified during lexical analysis is processed in order to determine whether or not the original source code is free of syntax errors and, if so, to build an intermediate representation of the program that can be converted into executable code (called bytecode in the case of Java).

To make an analogy, lexical analysis in the context of analyzing sentences written in English would correspond to finding the individual words and punctuation symbols and determining each word's part of speech (e.g., noun, verb, adjective). Syntactic analysis would then correspond to diagramming each sentence (e.g., by identifying subjects, verbs, direct objects, prepositional phrases, etc.).

Background: Calendar Dates

Calendar dates are expressed in several different formats, but they all convey the same three pieces of information: month, day, and year. Several formats in common use are listed in Wikipedia's Calendar_date entry.

For this particular programming assignment, we will be interested in four calendar date formats, which we illustrate here using examples:

Format Name   Example 1            Example 2        Example 3
Y_MONTH_D     "2013 SEPTEMBER 5"   "55 MARCH 15"    "1964 APRIL 10"
M/D/Y         "11/29/2019"         "2/5/1915"       "7/4/576"
Y-M-D         "1974-04-08"         "0002-05-23"     "1678-12-19"
DMonY         "7Dec1954"           "26Apr674"       "12May1825"

The double quotes surrounding each date are not part of the date itself, but are there simply to emphasize that each date is a string with no leading or trailing spaces.

Notice that, among the four formats of interest to us, only Y-M-D requires all three components (year, month, and day) to be expressed using fixed-length fields, including leading zeros for padding if necessary. (We will assume that only years in the range 0..9999 are of interest to us, so that four digits are always sufficient.) The only other field of fixed length is the month within the DMonY format, which has length three.
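To see why every Y-M-D date has the same length, it may help to look at how such a string could be produced from three numbers. The sketch below is only an illustration (the class FixedWidthSketch and method toYMD are hypothetical and are not part of the assignment); it uses String.format to pad each field with leading zeros.

```java
// Hypothetical illustration only; not part of the assignment's code.
public class FixedWidthSketch {

    /** Builds a Y-M-D date string whose fields are zero-padded to 4, 2, and 2 digits. */
    public static String toYMD(int year, int month, int day) {
        return String.format("%04d-%02d-%02d", year, month, day);
    }

    public static void main(String[] args) {
        System.out.println(toYMD(2, 5, 23));     // prints 0002-05-23
        System.out.println(toYMD(1974, 4, 8));   // prints 1974-04-08
    }
}
```

Because each field has a fixed width, every string produced this way (for years in the range 0..9999) is exactly ten characters long, with the hyphens always at the same two positions.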

Given a calendar date in a particular format, identifying its component parts (i.e., lexically analyzing it) is fairly easy. This is especially so for the Y-M-D format, because each of its fields is fixed in length. The Y_MONTH_D and M/D/Y formats are easy, too, because in each one the fields are separated by a particular character that acts as a delimiter: space in the former and slash in the latter. The DMonY format is slightly more difficult, because there is no delimiter. Rather, the month field begins with the first occurrence of a letter and ends with the last. (Or, perhaps better: it begins with the first occurrence of a letter and has length three.)
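The three ideas described above (fixed positions, delimiters, and scanning for the first letter) can each be sketched in a few lines of Java. The class and method names below are hypothetical and deliberately differ from those in CalendarDateLexer; the sketch shows one field per technique, assumes its argument is a valid date in the stated format, and is not meant as the required solution.

```java
// Hypothetical illustration only; names do not match CalendarDateLexer.
public class DateFieldSketch {

    /** Fixed-length fields: in Y-M-D the month always sits at indices 5 and 6. */
    public static String monthOfFixedWidth(String ymd) {
        return ymd.substring(5, 7);
    }

    /** Delimited fields: in M/D/Y the day lies between the two slashes. */
    public static String dayOfSlashDelimited(String mdy) {
        int first = mdy.indexOf('/');
        int last = mdy.lastIndexOf('/');
        return mdy.substring(first + 1, last);
    }

    /** No delimiter: in DMonY the month starts at the first letter and has length 3. */
    public static String monthOfUndelimited(String dMonY) {
        int i = 0;
        while (!Character.isLetter(dMonY.charAt(i))) {
            i++;                      // skip over the digits of the day
        }
        return dMonY.substring(i, i + 3);
    }

    public static void main(String[] args) {
        System.out.println(monthOfFixedWidth("1974-04-08"));    // prints 04
        System.out.println(dayOfSlashDelimited("11/29/2019"));  // prints 29
        System.out.println(monthOfUndelimited("26Apr674"));     // prints Apr
    }
}
```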

Requirements

For this assignment, you are to finish the development of an incomplete Java class, called CalendarDateLexer. As its name suggests, its purpose is to act as a lexer (i.e., lexical analyzer) on strings that are (assumed to be) calendar dates.

Specifically, the class includes twelve public methods, three for each of the four calendar date formats introduced above. The three methods devoted to dates in the Y-M-D format are yearFromYMD(), monthFromYMD(), and dayFromYMD(). The methods devoted to the other formats are similarly named, with YMD replaced in an obvious way.

As the method names suggest, the purpose of each one is to return the appropriate substring of the calendar date that it receives via its formal parameter. For example, the call monthFromYMD("1974-04-08") should evaluate to "04".

In order to complete the class, you will have to develop the bodies of several of the methods. Each one is well marked by a comment that says "STUB!". (The term "stub" refers to a method whose body is essentially empty.)

Keep in mind that the methods in CalendarDateLexer are not responsible for verifying the validity of the calendar dates passed to them as parameters. They are responsible for providing "meaningful" results only when given valid dates. Thus, when developing the code for each method, assume that the String passed to it via its formal parameter is a valid calendar date in the assumed format.

Input
M/D/Y 7/14/1975
Y_MONTH_D 2018 JANUARY 7
DMonY 25Dec1856
Y-M-D 0987-05-11

Output
M/D/Y: 7/14/1975
  Year: 1975
  Month: 7
  Day: 14

Y_MONTH_D: 2018 JANUARY 7
  Year: 2018
  Month: JANUARY
  Day: 7

DMonY: 25Dec1856
  Year: 1856
  Month: Dec
  Day: 25

Y-M-D: 0987-05-11
  Year: 0987
  Month: 05
  Day: 11

Testing Your Work

To make it convenient to test your work, a Java application program, CalendarDateLexerTester, is provided along with an accompanying data file. The program reads data from a file, under the assumption that each line of data contains a calendar date format identifier followed by a calendar date in that format. For each line of input data, the program echoes the indicated format and date, followed by (on successive lines) the year, month, and day components of the date. Of course, it obtains the year, month, and day components by calling the appropriate methods of the CalendarDateLexer class.
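One detail of this input format deserves attention: a Y_MONTH_D date itself contains spaces, so a line cannot simply be split at every space. The sketch below shows one way a line could be separated into its format identifier and its date, by splitting at the first space only. How the provided tester actually does this is not described here; the class InputLineSketch and method splitLine are hypothetical.

```java
// Hypothetical illustration only; the provided tester may work differently.
public class InputLineSketch {

    /**
     * Splits a line such as "Y_MONTH_D 2018 JANUARY 7" into a two-element
     * array: the format identifier, then the date (which may contain spaces).
     */
    public static String[] splitLine(String line) {
        String trimmed = line.trim();
        int firstSpace = trimmed.indexOf(' ');
        String formatId = trimmed.substring(0, firstSpace);
        String date = trimmed.substring(firstSpace + 1).trim();
        return new String[] { formatId, date };
    }

    public static void main(String[] args) {
        String[] parts = splitLine("Y_MONTH_D 2018 JANUARY 7");
        // prints Y_MONTH_D: 2018 JANUARY 7
        System.out.println(parts[0] + ": " + parts[1]);
    }
}
```

With the two pieces in hand, a program would then choose which three CalendarDateLexer methods to call based on the format identifier.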

Shown above, under the headings Input and Output, are sample input data (as could appear in the data file) and the output that the program should produce if fed that input data. The output that it actually produces will depend upon the methods in the CalendarDateLexer class that you are responsible for completing.

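How does the application program "know" from which data file to read its input? Answer: The name of the data file is supplied to the program as a run argument (also known as a command-line argument).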

The next likely question is: How is a run argument specified when a Java program is run via jGrasp? Answer:

  1. Make the application program to be run the current class (by clicking on the window containing its source code).
  2. Click on Build (in the Menu Bar) so that the associated menu appears.
  3. Click on the Run Arguments box on the menu so that it has a checkmark in it. A text box should immediately appear above the application program's source code.
  4. Enter into that text box one or more run arguments, separated by spaces. For this program, you would enter cal_dates.txt (or the name of whatever file you want to use as input).
  5. Click on the Run icon.
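
Inside a Java program, run arguments arrive in the String[] parameter of main. The sketch below shows, under that standard convention, how a program might use its first run argument as the name of a file to read. The class RunArgumentSketch and the method linesOf are hypothetical and are not taken from CalendarDateLexerTester.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical illustration only; not the provided tester program.
public class RunArgumentSketch {

    /** Reads every line of the named file into a list. */
    public static List<String> linesOf(String fileName) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        try (Scanner input = new Scanner(new File(fileName))) {
            while (input.hasNextLine()) {
                lines.add(input.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length == 0) {
            // No run argument was given, so there is no file name to use.
            System.out.println("Usage: java RunArgumentSketch <data file>");
            return;
        }
        // args[0] is the first run argument, e.g., cal_dates.txt
        for (String line : linesOf(args[0])) {
            System.out.println(line);
        }
    }
}
```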

A sample input data file is provided, as indicated above. But you are encouraged to make up your own input data, or to modify that which was given. A good choice of a text editor for this purpose is the one provided by jGrasp. To make a new file, click on File, then follow the path through New, Other, and Plain Text.


Program Submission

Submit your source code file (CalendarDateLexer.java) from the course web page using the usual "dropbox" utility. (Again, submit only the CalendarDateLexer.java file, not the corresponding .class file and not any other .java file.) Make sure to augment the comments in the given program so that you identify yourself, acknowledge any persons who aided you in developing your solution, and point out any flaws that you know about.

Be aware that you can submit more than one time. Hence, if, after submitting, you improve your program (e.g., by fixing logic errors), you should submit the newer version.