COBOL

Overview

History, Significance, Purpose

The original version of COBOL was designed in 1959 (by a committee) and was intended to be used for data processing applications (as opposed to scientific applications). Hence, its name: COmmon Business Oriented Language. In 1968, the organization that would later become known as ANSI (the American National Standards Institute) certified the first official COBOL standard. Revised standards appeared in 1974 and 1985. Object-oriented versions of COBOL have appeared in recent years, and in 2002 ISO approved a COBOL standard supporting OO features.

In view of its intended purpose, one must conclude that COBOL has been wildly successful. It has been estimated that, of all the program code ever written (i.e., during the past 55 years), over half is in COBOL! That much of this code is still in use is evidenced by the fact that, during the recent Y2K "crisis", thousands of programmers were hired to patch old COBOL programs in order to make them "Y2K-compliant". We see, then, that COBOL is still an important programming language. See the March/April 2000 issue of IEEE Software (Vol. 17, No. 2) for several articles that speculate upon what the future holds for COBOL.

Despite COBOL's long-standing success in industry, it long ago became passé among the snobbish element of the computer science community, who view it (not without reason) as a "dinosaur". It is not clear, however, with respect to building data processing applications, whether any of the more modern languages (such as Ada, Java, or C++) is superior to, or even the equal of, COBOL. Robert Glass argues precisely this point in his Practical Programmer column (titled "COBOL -- A Contradiction and an Enigma") in the September 1997 issue of Communications of the ACM. Also see Glass's Loyal Opposition column (titled "COBOL: A Historic Past, a Vital Future?") in the July/August 1999 issue of IEEE Software (Vol. 16, No. 4).

One of the persons often cited, inaccurately, as a co-designer of COBOL is Grace Murray Hopper (1906-1992), who rose to the rank of admiral in the U.S. Navy. It is true that Flow-Matic, the first programming language geared towards business data processing, whose development Hopper led, served as a model to those who designed COBOL, but she had no direct participation in the creation of COBOL. (See "The Real Creators of COBOL", by Jean E. Sammet, IEEE Software, March/April 2000, pp. 30-32.) Also, Hopper's work on compilers in the 1950s was instrumental in convincing those in the computing industry of the viability of using high-level languages (rather than only assembly or machine languages), which helped to pave the way for COBOL to become widely used.

For more information, see pages 1-4 of Comprehensive COBOL, by Philippakis and Kazmier.

Syntactic Characteristics (and Oddities)

Given your programming experience (probably with languages such as Java), you might find COBOL's syntax a bit strange, in at least two ways. One is that its statements tend to resemble (imperative) sentences written in English (e.g., ADD Num-Items TO Total-Items). This makes COBOL code readable, to some degree, by non-programmers. It also tends to make COBOL programs annoyingly verbose. Another aspect of COBOL syntax that seems odd, at least to young programmers, is that, unlike more modern languages, it is not "free format". That is, there are strict rules regarding where, on each line, code may be placed. For example, an executable statement may not begin before the fifth position on a line. In earlier versions of the language, the first six positions on each line were reserved for line numbers, and executable statements could not begin before the 12th position on a line.

You should keep in mind that COBOL's syntax reflects the typical computing environment that existed at the time it was designed. Back then, programmers did not enter source code into a file using text editors (e.g., vi, emacs, MS Notepad) or word processors (e.g., MS Word), but rather they used a card punch machine to punch holes in cards. Each card corresponded to a single line of code in a program. To compile/execute a program, the programmer had to take a stack of cards (comprising the program) to the computer operator, who would then feed them into a card reader, which was the standard input device by which computers were given the programs they were to execute. If you had accidentally dropped your stack of cards onto the floor, you would have had a very hard time putting them back into the correct order were it not for the line numbers encoded on the (first six columns of the) cards!
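To see why those line numbers mattered, here is a small sketch (in Python rather than COBOL, purely for illustration) of how a shuffled deck could be put back in order using the six-digit number at the start of each card. The card contents and the function names sequence_number and restore_order are made up for this example.

```python
import random


def sequence_number(card):
    """Return the line number stored in the first six columns of a card."""
    return int(card[:6])


def restore_order(cards):
    """Return the cards sorted by their six-digit line numbers."""
    return sorted(cards, key=sequence_number)


# A hypothetical four-card deck; the first six columns hold the line number.
deck = [
    "000100 IDENTIFICATION DIVISION.",
    "000200 PROGRAM-ID. PAYROLL.",
    "000300 PROCEDURE DIVISION.",
    "000400     STOP RUN.",
]

# Simulate dropping the deck on the floor, then recover the original order.
dropped = deck[:]
random.Random(0).shuffle(dropped)
restored = restore_order(dropped)
print(restored == deck)
```

Without the numbers, the only way to recover the order would be to read and understand every card.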

The hierarchy of syntactic units in COBOL is, from highest to lowest: (sub)program, division, section, paragraph, sentence, statement, clause, word, and character. That is, (sub)programs are composed of divisions, divisions are composed of sections, sections are composed of paragraphs, etc., etc. Luckily, you needn't memorize much of this!

There are approximately 300 reserved words in COBOL (e.g., IF, ADD, PERFORM, ZERO, etc.). Our convention will be to write these entirely in upper case. Words introduced by the programmer, such as names of data items (called data-names), will usually be written as a hyphenated sequence of words, with the first letter in each word in upper case. Examples are Gross-Pay, Counter, Num-Overtime-Hours. A data-name may be up to 30 characters in length and may include alphabetic and numeric characters, as well as hyphens. A data-name may not begin with a hyphen, however. (Note: Some compilers do not distinguish between hyphens and underscores used within a data-name.)
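The rules for data-names given above can be summarized in a short checker. This is only a sketch, written in Python for illustration, and it covers just the rules stated here: at most 30 characters, only letters, digits, and hyphens, and no leading hyphen. The name is_valid_data_name is invented for this example, and a real compiler enforces more than this (for instance, it rejects reserved words). The sketch also rejects underscores, although, as noted, some compilers accept them.

```python
import re

MAX_LENGTH = 30  # maximum data-name length, as stated in the notes

# Letters, digits, and hyphens only (underscores are deliberately excluded).
_ALLOWED = re.compile(r"[A-Za-z0-9-]+")


def is_valid_data_name(name):
    """Check a candidate data-name against the rules listed in the notes."""
    if not name or len(name) > MAX_LENGTH:
        return False
    if name.startswith("-"):
        return False
    return _ALLOWED.fullmatch(name) is not None


for candidate in ["Gross-Pay", "Counter", "Num-Overtime-Hours", "-Total", "Gross Pay"]:
    print(candidate, is_valid_data_name(candidate))
```

The three example names from the notes pass, while "-Total" (leading hyphen) and "Gross Pay" (contains a space) do not.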

For more information, see pages 26-28 of Comprehensive COBOL.

The Four Divisions of a COBOL (Sub)Program

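Every COBOL (sub)program is composed of four divisions, occurring in this order: IDENTIFICATION, ENVIRONMENT, DATA, and PROCEDURE. Their roles are as follows: the IDENTIFICATION division names the program; the ENVIRONMENT division describes the computing environment, including the association between the program's files and external devices; the DATA division describes the data items (files, records, and variables) that the program uses; and the PROCEDURE division contains the executable statements.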

For more information, see page 19 of Comprehensive COBOL.
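As a rough illustration of the required ordering, the following Python sketch scans program text for division headers and checks that all four appear in the order IDENTIFICATION, ENVIRONMENT, DATA, PROCEDURE. The function names and the tiny sample program are invented for this example. The scan is a simplification: it recognizes only headers of the plain form "NAME DIVISION." on a line by themselves, and it ignores the column rules discussed earlier.

```python
DIVISIONS = ["IDENTIFICATION", "ENVIRONMENT", "DATA", "PROCEDURE"]


def division_headers(source):
    """Return the division names in the order their headers appear."""
    found = []
    for line in source.splitlines():
        words = line.strip().rstrip(".").split()
        if (len(words) == 2
                and words[1].upper() == "DIVISION"
                and words[0].upper() in DIVISIONS):
            found.append(words[0].upper())
    return found


def has_four_divisions_in_order(source):
    """True if exactly the four divisions appear, in the required order."""
    return division_headers(source) == DIVISIONS


# A hypothetical minimal program, used only to exercise the check.
SAMPLE = """\
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.
ENVIRONMENT DIVISION.
DATA DIVISION.
PROCEDURE DIVISION.
    DISPLAY "Hello".
    STOP RUN.
"""

print(has_four_divisions_in_order(SAMPLE))
```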