SE 507
Notes on Deriving Specifications from Requirements: An Example
by Michael Jackson and Pamela Zave

This web page follows the material presented in the paper mentioned in the title, hereafter referred to as DSR, which appeared on pages 15-24 of the Proceedings of ICSE'95 (ACM's 1995 International Conference on Software Engineering). It also draws some information from other papers co-authored by Jackson and Zave, including

First we review some terms that Jackson uses, which are summarized on page 24 of FDC.

Given a requirement, we progress to a specification by purging the requirement of all features that would preclude implementation by the machine. What features? Any that refer to environment phenomena that are not accessible to the machine or that impose constraints upon phenomena that the machine does not control.


Phenomena: Issues of Visibility/Sharing and Control

In Jackson's way of looking at things (see RMRS), phenomena are partitioned into four categories:

  1. eh: controlled (or initiated) by the environment and hidden from (i.e., invisible to) the machine.
    Example: the property of there being 4 people on the elevator car, or the "event" in which a person enters the elevator car.
  2. ev: controlled by the environment but visible to the machine.
    Example: the state of the sensor that indicates whether the elevator car is at the 3rd floor, or the event of someone pushing the DOWN button on floor 8.
  3. sv: controlled by the machine but visible to the environment.
    Example: the action of turning on the elevator's motor.
  4. sh: controlled by the machine and hidden from the environment.
    Example: the value of program variable currentFloor.

The phenomena in ev and sv, being those that are visible to both environment and machine, are at the interface between the two. (See Venn diagram on page 38 of RMRS.)
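The four categories can be sketched as a small data type. This is only an illustration: the names `Phenomenon`, `controller`, `shared`, and `EXAMPLES` are assumptions introduced here, and the entries paraphrase the elevator examples above.

```python
from enum import Enum


class Phenomenon(Enum):
    """The four categories, encoded as (who controls it, is it shared?)."""
    EH = ("environment", False)  # environment-controlled, hidden from the machine
    EV = ("environment", True)   # environment-controlled, visible to the machine
    SV = ("machine", True)       # machine-controlled, visible to the environment
    SH = ("machine", False)      # machine-controlled, hidden from the environment

    @property
    def controller(self):
        return self.value[0]

    @property
    def shared(self):
        return self.value[1]


# Hypothetical labels for the elevator examples given in the list above.
EXAMPLES = {
    "four people are in the car": Phenomenon.EH,
    "DOWN button pushed on floor 8": Phenomenon.EV,
    "motor turned on": Phenomenon.SV,
    "value of currentFloor": Phenomenon.SH,
}

# The interface consists of exactly the shared phenomena (ev and sv).
interface = sorted(name for name, p in EXAMPLES.items() if p.shared)
```

Here `interface` contains only the button push and the motor action, matching the statement that ev and sv lie at the boundary between environment and machine.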

Domain knowledge involves only phenomena in e (which is shorthand for the union of eh and ev).

Requirements resulting from the "requirements analysis" phase of software development typically involve only phenomena from eh, although they might mention some from ev as well.

A specification describes the intended behavior of the machine at its interface with the environment, and hence mentions only phenomena in ev and sv. That behavior must be sufficient to ensure that the requirements are satisfied, provided that the domain knowledge that has been established is accurate. Furthermore, any constraints that the specification imposes must be on phenomena in sv; otherwise the machine would not be capable of implementing the specification, because only phenomena in s (shorthand for the union of sv and sh) are directly controlled by the machine.

From the point of view of the machine, a specification is a starting point for programming. That is, it describes how a program must behave. From the point of view of the environment, a specification is a special case of a requirement.

The main purpose of DSR is to illustrate how a specification is derived from a requirement by purging the requirement of all features that would preclude implementation by the machine.

Doing this relies upon domain knowledge, which allows us to reason about constraints that the machine can impose indirectly upon environment-controlled phenomena (i.e., those in e) by constraining those in sv.

The Example: Zoo Turnstile

Our problem involves a zoo to which customers gain entry by inserting a coin/token into a slot, which allows them to pass through a turnstile. This mechanical apparatus has already been chosen, setting the limits of the environment for which the requirements must be stated. Hence, the requirements cannot be about the profitability of the zoo, or the best use of the real estate that it occupies, or what price should be charged to enter the zoo, or whether the owner would be better off if he sold the zoo and took up yoga. All of these are outside the scope of the problem.

The goal is to develop software that controls the turnstile in such a way that no one can enter without paying and anyone who has paid is allowed to enter. (More precisely, the number of people who enter must not exceed the number of payments.) It is assumed that there is an electrical interface through which the machine can sense coins being dropped into the coin slot and through which it can control (i.e., lock and unlock) the turnstile.

Designating Environment Phenomena

Jackson emphasizes (especially in MR) the idea that a collection of ground terms, as small as possible, should be designated to enable us to write descriptions of (and constraints upon) environment phenomena. A designation associates a ground term (typically a predicate) with an informal, but precise, description of the environment phenomenon to which it refers.

Ground terms serve to fix the relationship between the description and what it describes.

All other terms will be defined in terms of the ground terms. Definitions are simply for the purpose of providing shorter and simpler descriptions, but they add nothing in terms of descriptive capabilities.

Example from pages 13-14 of MR paper: the set of ground terms {male, female, parent} is sufficient for describing family relationships (aunt, sister, 2nd cousin, etc.). (Note: if we assume that male(x) = !female(x), only one among male and female is needed as a ground term, as either one can be defined in terms of the other!)
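As a minimal sketch of how defined terms reduce to ground terms, the family example can be written out over a small set of facts. The people and facts below are invented for illustration, `male` is defined from `female` under the assumption made in the note above, and `aunt` covers only aunts by blood.

```python
# Ground terms, given as finite (hypothetical) facts.
PEOPLE = {"ann", "bob", "cat", "dan", "eve"}
FEMALE = {"ann", "cat", "eve"}
PARENT = {("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "eve")}  # (parent, child)


def female(x):
    return x in FEMALE


def parent(p, c):
    return (p, c) in PARENT


# Everything below is a definition: it adds convenience, not descriptive power.
def male(x):
    return not female(x)  # relies on the assumption male(x) = !female(x)


def sibling(x, y):
    return x != y and any(parent(p, x) and parent(p, y) for p in PEOPLE)


def sister(x, y):
    """x is a sister of y."""
    return female(x) and sibling(x, y)


def aunt(x, y):
    """x is an aunt of y (a sister of one of y's parents)."""
    return any(sister(x, p) and parent(p, y) for p in PEOPLE)
```

Every defined predicate bottoms out in `female` and `parent`, so removing the definitions would lengthen descriptions without making anything inexpressible.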

For the Zoo problem, we use the common technique of designating actions/events by predicates. That is, for each different kind of action/event, we designate a predicate that recognizes precisely that kind of action/event.

These all happen to be one-argument predicates; in general, the designated terms will tend to include multi-argument predicates, too.

Classifying the just-designated terms, we have that

Because the occurrence of events through the passage of time is what is at issue here, we also make the following designations, which assume that events and intervals between events are "individuals" that strictly alternate. That is, each event "ends" some interval and "begins" another. This is suggested by the following "timeline" picture, in which vertical bars indicate events and dashed line segments are the intervals between them.

-----------|---|------|----------|---|--|------|--- ...

The assumption is that Earlier() defines a total ordering on events, meaning that, viewed as a binary relation on events, Earlier is transitive and has the property that, for every pair of distinct events e and f, exactly one of Earlier(e,f) and Earlier(f,e) is true.
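On a finite set of events, the two properties required of Earlier can be checked directly. The sketch below is an illustration only: `is_total_order` is a name introduced here, and representing events by integer timestamps is an assumption.

```python
from itertools import permutations


def is_total_order(events, earlier):
    """Check transitivity and the exactly-one-direction property on a finite set."""
    transitive = all(
        earlier(a, c)
        for a, b, c in permutations(events, 3)
        if earlier(a, b) and earlier(b, c)
    )
    exactly_one = all(
        earlier(a, b) != earlier(b, a)  # true when exactly one direction holds
        for a, b in permutations(events, 2)
    )
    return transitive and exactly_one


# Events represented by (hypothetical) distinct timestamps.
timestamps_ok = is_total_order([3, 1, 2], lambda a, b: a < b)

# A cyclic relation relates every distinct pair in exactly one direction
# but is not transitive, so it is not a total ordering.
CYCLE = {("a", "b"), ("b", "c"), ("c", "a")}
cycle_ok = is_total_order(["a", "b", "c"], lambda x, y: (x, y) in CYCLE)
```

The cyclic relation shows why both properties are needed: it satisfies the second property but fails transitivity.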

Note that if we were not assuming events to be instantaneous, we'd have to introduce phenomena relating to the delay between, for example, the coin being inserted into the slot and the machine sensing it. (In that case, a coin being inserted would not be an event visible to the machine, but the "coin entered" signal that reaches it later would be.)


IND1

Let's begin by describing an indicative safety property of the environment, which is an example of domain knowledge. (A safety property is one that prohibits certain events from happening in certain states.)

Note: The names of properties in boldface are taken from DSR; the ones in parentheses are from FDC, which also discusses the zoo problem, but not in its entirety. (The Buchi automata appearing in FDC are better insofar as final states are indicated.)

IND1 (G1) says that Push and Enter events must strictly alternate, beginning with a Push. This is a property of the turnstile mechanism. (See Buchi automaton on page 18 of DSR or page 14 of FDC.)

Note that the states PE0 and PE1 appearing in the automaton are, in effect, being defined in terms of (the designated terms) Push and Enter. Informally, we can give the definitions

PE0(v) = among all events to occur prior to interval v, either none was a Push or else an Enter occurred sometime after the last Push.

PE1(v) = among all events to occur prior to interval v, at least one was a Push and no Enter occurred after the last occurrence of Push.

We leave it to the reader to provide formal definitions of the predicates PE0 and PE1 in terms of the designated terms. Having defined these, we can express IND1 as follows:

IND1: (∀e,v | PE0(v) ∧ Ends(e,v)  ⇒  ¬Enter(e))   ∧   (∀e,v | PE1(v) ∧ Ends(e,v)  ⇒  ¬Push(e))

Notice that the formal version of IND1 says that, if the turnstile is in state PE0 (respectively, PE1), the next event to occur cannot be an Enter (respectively, Push). IND1 does not say that, if the turnstile is in state PE0 (respectively, PE1), the next event to occur must be a Push (respectively, Enter). Why not?
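The formal statement of IND1 can be read as a check over a finite trace of events. The sketch below assumes a trace is a list of event names and that events other than Push and Enter leave the PE state unchanged; the function name `satisfies_ind1` is introduced here for illustration.

```python
def satisfies_ind1(trace):
    """Return True if no Enter ends a PE0 interval and no Push ends a PE1 interval."""
    state = "PE0"  # before any event, no Push has occurred
    for event in trace:
        if state == "PE0" and event == "Enter":
            return False
        if state == "PE1" and event == "Push":
            return False
        # Update the state for the interval that this event begins.
        if event == "Push":
            state = "PE1"
        elif event == "Enter":
            state = "PE0"
    return True
```

Note that a trace such as `["Push", "Coin"]` passes the check: the property only forbids certain events and never demands that one occur.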


IND2

The second indicative property, IND2 (G4) (which is also a safety property), says that if Lock and Unlock events have alternated in the past, beginning with Unlock, then a Push cannot occur if a Lock occurred more recently than the most recent occurrence of Unlock.

The reason for stating this property as an implication (if ... then ...) is that we don't know how the turnstile will behave (i.e., whether or not it will allow a Push to occur) if it ever happens that two Lock events occur without an intervening Unlock or two Unlock events occur without an intervening Lock.

In order to simplify the description of IND2, we first define the states LU0 (locked), LU1 (unlocked), and LU2 (unknown) in terms of the sequence of Lock and Unlock events that have occurred. This definition is called DEF1 and is done by means of a Buchi automaton. (See bottom right of page 18 in DSR or the automaton labeled G4 on page 14 of FDC.)

Notice that DEF1 is purely definitional, in the sense that it asserts no safety or liveness constraints. Its only purpose is to define the three states LU0, LU1, and LU2. We leave it to the reader to provide formal definitions of the predicates corresponding to these states.
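One way to picture DEF1 is as a transition table. The sketch below is reconstructed from the prose rather than from the automaton itself, so two points are assumptions: that the turnstile starts locked (LU0), and that LU2, once entered, is never left.

```python
# Transitions among the three states; any pair not listed leaves the state unchanged.
TRANSITIONS = {
    ("LU0", "Unlock"): "LU1",  # locked -> unlocked
    ("LU1", "Lock"): "LU0",    # unlocked -> locked
    ("LU0", "Lock"): "LU2",    # two Locks without an intervening Unlock
    ("LU1", "Unlock"): "LU2",  # two Unlocks without an intervening Lock
}


def lu_state(trace):
    """Return the LU state of the interval following the last event in trace."""
    state = "LU0"  # assumed initial state
    for event in trace:
        state = TRANSITIONS.get((state, event), state)
    return state
```

Because the table only names states and never rejects a trace, it mirrors the purely definitional character of DEF1.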

Using the states defined in DEF1, we can express IND2 as follows:

IND2 (G4): When the turnstile is locked (i.e., is in state LU0), a Push cannot occur.

More formally,

IND2: (∀e,v | Event(e) ∧ Interval(v) : LU0(v) ∧ Ends(e,v)  ⇒  ¬Push(e))


Note: Interestingly, from the indicative statements in DSR it is not possible to conclude that every Push is followed (at some later time) by an Enter. (Or is it? Explore this.) On the other hand, FDC includes an indicative statement, G2, which says that, assuming that Push and Enter events have alternated so far, beginning with a Push, every Push must be followed (at some point in time) by either a Push or an Enter.

From IND1 (G1) and G2, one can conclude that every Push is followed (at some point in time) by an Enter, giving us a liveness property. End of note.

The Requirements

There are basically two:

  1. No one should enter without paying.
  2. Anyone who has paid should be allowed to enter.

It would be too constraining to require that Coin and Enter events strictly alternate. (Imagine a scenario in which one person drops several coins into the slot in order to pay for a group of people behind her in the turnstile queue.) Hence, the machine should allow payments to be made "in advance". A more precise way to state the two requirements, then, is

  1. At any moment, the number of Enter events to have occurred must not exceed the number of Coin events to have occurred.
  2. At any moment at which the number of Coin events to have occurred exceeds the number of Enter events to have occurred, an Enter event must be allowed to occur.

To make the descriptions of the requirements more concise, Jackson first defines predicates Push#(v,n), Enter#(v,n), and Coin#(v,n), meaning that the number of Push, Enter, and Coin events, respectively, occurring before interval v was n.

It's not clear to me why we should prefer to use these three predicates rather than the integer functions #Push(v), #Enter(v), and #Coin(v) having the property

(#Push(v) = n)  =  Push#(v,n)

and similarly for #Enter and #Coin.

Using these integer functions, we can state requirement 1 as OPT1. (See the bottom of the left column on page 19 of DSR and the Buchi automaton G5 on page 14 of FDC.)

OPT1: (∀v | Interval(v) : #Enter(v) ≤ #Coin(v))
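Read over a finite trace, OPT1 says that the running count of Enter events never overtakes the running count of Coin events. A minimal sketch, assuming a trace is a list of event names and using the hypothetical name `satisfies_opt1`:

```python
def satisfies_opt1(trace):
    """Return True if #Enter(v) <= #Coin(v) holds in every interval v of the trace."""
    enters = coins = 0  # counts before the first event: 0 <= 0 holds
    for event in trace:
        if event == "Enter":
            enters += 1
        elif event == "Coin":
            coins += 1
        # Check the interval that begins after this event.
        if enters > coins:
            return False
    return True
```

A trace that begins with two Coin events and then two entries passes, which reflects the decision to allow payment in advance.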

Requirement 2, OPT2, is that a visitor whose entry fee has been paid will not be prevented from entering. Partially formally, this can be stated as

OPT2: (∀v | Interval(v) : #Enter(v) < #Coin(v)  ⇒  Enter event is allowed to occur)

Later we will make OPT2 more precise.


Specifications

Both requirements and specifications are expressed in terms of environmental phenomena. A requirement describes a desired relationship among environmental phenomena (typically ones that are of direct interest to customers/users); a specification describes a desired, implementable behavior of the machine in the environment (for the purpose of ensuring the satisfaction of some requirement). Hence, all specifications are requirements, but not all requirements are specifications.

As Jackson explains in the right column on page 19 of DSR, to be a specification, a requirement must be such that

  1. all environment phenomena mentioned are shared with the machine (i.e., are in either ev or sv).
  2. all phenomena that are constrained are machine controlled (i.e., in sv).
  3. all constraints upon events are expressed in terms of preceding events or states in preceding intervals. That is, the conditions for causing, or preventing from happening, an event can be evaluated in the current state, without the need for making use of any (possible) future state.

By this definition, neither OPT1 nor OPT2 is a specification!

Why? Because both are expressed in terms of Enter events, which are hidden from the machine (i.e., are in eh rather than in ev or sv). Hence, (1) is violated.

To realize OPT1, the machine must either force Coin events to occur or prevent Enter events from occurring. If we adopt the former interpretation, (2) is violated, as we are constraining a type of event (namely, Coin) that is not machine-controlled.
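Conditions (1) and (2) can be checked mechanically once each kind of event has been classified; condition (3) concerns the form of the statement and is not covered here. In the sketch below, the classification of Coin as ev is an assumption based on the machine being able to sense coins, and deciding which phenomena a statement "mentions" and "constrains" is left to the reader, as in the text.

```python
# Classification of the turnstile events (Coin as ev is an assumption).
CLASS = {"Coin": "ev", "Push": "ev", "Enter": "eh", "Lock": "sv", "Unlock": "sv"}


def violated_conditions(mentioned, constrained):
    """Return which of conditions (1) and (2) are violated."""
    violated = []
    if any(CLASS[p] not in ("ev", "sv") for p in mentioned):
        violated.append(1)  # some mentioned phenomenon is not shared
    if any(CLASS[p] != "sv" for p in constrained):
        violated.append(2)  # some constrained phenomenon is not machine-controlled
    return violated


# OPT1 under the interpretation in which the machine forces Coin events.
opt1 = violated_conditions(mentioned={"Enter", "Coin"}, constrained={"Coin"})

# A statement that mentions only shared events and constrains only Unlock.
shared_only = violated_conditions(
    mentioned={"Push", "Coin", "Lock", "Unlock"}, constrained={"Unlock"}
)
```

Under these assumptions `opt1` reports both (1) and (2), while the second statement passes both checks.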

Jackson explains that OPT1 also fails to satisfy (3):

    "OPT1 also constrains the state in every interval, including those that are still in the future. When the machine executes, or refrains from executing, any event, it must ensure that OPT1 will hold afterwards. A requirement based in this way on a future state ... cannot be a specification..."

Exactly how one determines whether a requirement violates (3) is not clear to me. Jackson says nothing about OPT2 violating (3), yet I see no particular difference in the forms of OPT1 and OPT2 that distinguishes them with respect to this issue.

Refining Requirements to Obtain Specifications

To obtain a specification from the requirements, we make use of the indicative environment properties (i.e., domain knowledge).

Letting R be the requirements and E the indicative environment properties, we want to find S (specifications) (meeting (1), (2), and (3) above) such that

S, E ⇒ R

meaning that in any state in which both S and E hold, so must R. This is actually an oversimplification of what we really need, a topic that is explored in some depth in RMRS.

Refinement of OPT1

Let's try to refine OPT1, which says that the number of Enter events can never exceed the number of Coin events. As there is no way for the machine to force Coin events to occur, it must act to prevent Enter events from occurring in certain situations. The machine does have some (indirect) control over Enter events, because of the properties asserted in IND1 and IND2. Specifically, IND2 says that Push cannot occur if the turnstile is in the locked state, LU0, which the machine is able to cause (if necessary) by instigating a Lock event. (This assumes that the machine is careful never to cause the turnstile to enter the "unpredictable" state LU2 by sending two Lock commands without an intervening Unlock, or vice versa.) Also, IND1 says that Pushes and Enters alternate strictly. Hence, if Push is prevented by the turnstile being locked, so is Enter!

Because we need to ensure (as pointed out in the previous paragraph) that Locks and Unlocks alternate strictly, in order to be sure that each Lock brings the turnstile to state LU0 (locked, from which Push is impossible), we state it as an optative property:

OPT3 (G3): Lock and Unlock events strictly alternate, beginning with the latter.

We leave it to the reader to provide a formal definition of OPT3.

From IND1 we obtain

IND3: (∀v | Interval(v) : #Push(v)-1 ≤ #Enter(v) ≤ #Push(v))

This property allows us to express OPT1 in terms of #Push rather than #Enter:

OPT1a: (∀v | Interval(v)  :  #Push(v) ≤ #Coin(v))
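The step from OPT1 to OPT1a rests on the chain #Enter(v) ≤ #Push(v) ≤ #Coin(v), where the first inequality comes from IND3 and the second is OPT1a. As a sanity check (not a proof), the sketch below searches all traces over Coin, Push, and Enter up to a bounded length for one that satisfies IND1 and OPT1a but violates OPT1. The helper names are introduced here for illustration.

```python
from itertools import product


def never_exceeds(trace, low, high):
    """True if the count of `low` events never exceeds that of `high` events."""
    n_low = n_high = 0
    for event in trace:
        n_low += event == low
        n_high += event == high
        if n_low > n_high:
            return False
    return True


def satisfies_ind1(trace):
    """Push and Enter strictly alternate, beginning with a Push."""
    awaiting_enter = False
    for event in trace:
        if event == "Push":
            if awaiting_enter:
                return False
            awaiting_enter = True
        elif event == "Enter":
            if not awaiting_enter:
                return False
            awaiting_enter = False
    return True


def counterexamples(max_len):
    """Traces satisfying IND1 and OPT1a but not OPT1, up to length max_len."""
    found = []
    for n in range(max_len + 1):
        for trace in product(("Coin", "Push", "Enter"), repeat=n):
            if (satisfies_ind1(trace)
                    and never_exceeds(trace, "Push", "Coin")        # OPT1a
                    and not never_exceeds(trace, "Enter", "Coin")):  # OPT1
                found.append(trace)
    return found
```

Calling `counterexamples(7)` examines a few thousand traces and finds none. Without IND1 the implication fails: the single-event trace `("Enter",)` satisfies OPT1a vacuously but violates OPT1, which shows that the domain knowledge is doing real work.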

Jackson notes that, of the three conditions that a requirement must satisfy to be a specification (recall discussion above), OPT1a satisfies conditions (1) and (2), but not (3), because it refers to future states. Again, it's not clear to me exactly why.

Specifically to obtain requirements that satisfy (3), Jackson refines OPT1a into a safety property (OPT4) and a liveness property (OPT7). OPT4 says that Unlock must not occur when the turnstile is locked (i.e., in state LU0) and the next visitor has not yet been paid for. This, of course, will prevent the number of Enter events from exceeding the number of Coin events, at least when the turnstile is in state LU0.

OPT4: (∀v,e  |  LU0(v)  ∧  #Push(v) = #Coin(v)  :  Ends(e,v)  ⇒  ¬Unlock(e))
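Unlike OPT1a, OPT4 can be evaluated event by event using only what has happened so far. The sketch below illustrates this over a finite trace, assuming the turnstile starts locked and that Lock and Unlock alternate (so LU2 is never reached); `satisfies_opt4` is a hypothetical name.

```python
def satisfies_opt4(trace):
    """Return True if no Unlock ends an interval that is locked with #Push = #Coin."""
    locked = True  # assumed initial state LU0
    pushes = coins = 0
    for event in trace:
        # The condition is evaluated on the interval preceding the event.
        if locked and pushes == coins and event == "Unlock":
            return False
        if event == "Unlock":
            locked = False
        elif event == "Lock":
            locked = True
        elif event == "Push":
            pushes += 1
        elif event == "Coin":
            coins += 1
    return True
```

The decision to reject an Unlock depends only on the current values of `locked`, `pushes`, and `coins`, all of which are determined by preceding events.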

It is not clear to me exactly why OPT4 satisfies condition (3) whereas OPT1a did not.

OPT7 says that, in certain situations, a Lock event must occur in time to prevent a subsequent Push (which of course inevitably leads to an Enter), namely when the turnstile is unlocked (i.e., in state LU1) and the next visitor has not yet been paid for. These situations are formally defined as follows:

(DEF2)   ReqLock(v)  :  LU1(v)  ∧  #Push(v) = #Coin(v)

Investigation of the (documentation of the) turnstile mechanism reveals that hydraulic damping guarantees a delay of at least 750 msec between a Push and an Enter and an additional 10 msec between an Enter and the next Push. That is,

IND4: Duration[PE0] ≥ 10 msec   ∧   Duration[PE1] ≥ 750 msec

Hence, between each Push event and the next, a minimum of 750 + 10 = 760 msec will pass. This means that the machine has less than this much time to lock the turnstile whenever ReqLock holds. This gives us OPT7:

OPT7: Duration[ReqLock] < 760 msec
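The timing bound can be illustrated on a timed trace. The sketch below makes several assumptions that are not in DSR: a trace is a list of `(time_ms, event)` pairs in increasing time order, the turnstile starts locked, Lock and Unlock alternate, and Duration[ReqLock] is interpreted as the length of each maximal period during which ReqLock holds.

```python
MIN_PUSH_TO_ENTER_MS = 750  # from IND4: Duration[PE1] >= 750 msec
MIN_ENTER_TO_PUSH_MS = 10   # from IND4: Duration[PE0] >= 10 msec
LOCK_DEADLINE_MS = MIN_PUSH_TO_ENTER_MS + MIN_ENTER_TO_PUSH_MS  # 760


def reqlock_durations(timed_trace):
    """Lengths (msec) of the completed periods during which ReqLock holds."""
    locked = True  # assumed initial state LU0
    pushes = coins = 0
    since = None   # time at which the current ReqLock period began
    durations = []
    for time_ms, event in timed_trace:
        if event == "Unlock":
            locked = False
        elif event == "Lock":
            locked = True
        elif event == "Push":
            pushes += 1
        elif event == "Coin":
            coins += 1
        req_lock = (not locked) and pushes == coins  # DEF2
        if req_lock and since is None:
            since = time_ms
        elif not req_lock and since is not None:
            durations.append(time_ms - since)
            since = None
    return durations


def satisfies_opt7(timed_trace):
    return all(d < LOCK_DEADLINE_MS for d in reqlock_durations(timed_trace))


prompt = [(0, "Coin"), (100, "Unlock"), (200, "Push"), (950, "Enter"), (955, "Lock")]
late = [(0, "Coin"), (100, "Unlock"), (200, "Push"), (950, "Enter"), (1000, "Lock")]
```

In `prompt` the Lock arrives 755 msec after the Push that used up the last payment, so the bound is met; in `late` it arrives after 800 msec, by which time a second Push could already have occurred.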

Refinement of OPT2

Recall that OPT2 says, informally, that if the next customer's fee has already been paid (i.e., the number of Coin events exceeds the number of Enter events), the machine should not prevent another Enter event. It, like OPT1, is refined into a safety property and a liveness property. (See page 21.)

The safety property, OPT5, which isn't absolutely necessary but which is intended to reduce wear and tear on the turnstile locking mechanism, says that a Lock event should not occur in the case that the turnstile is not locked (i.e., is in state LU1) and the next customer's fee has been paid:

OPT5: (∀v,e  |  LU1(v)  ∧  #Push(v) < #Coin(v)  :  Ends(e,v)  ⇒  ¬Lock(e))

The liveness property, OPT6, says that, if the turnstile is locked but the next customer's fee has been paid, an Unlock event must occur, and soon! This set of states is formalized by:

(DEF3)   ReqUnlock(v)  :  LU0(v)  ∧  #Push(v) < #Coin(v)

Now we can formally give OPT6:

OPT6: Duration[ReqUnlock] < 250 msec

Jackson gives no particular reason for choosing the number 250; I suppose it is based upon a judgement that any delay larger than that would give customers the impression that the turnstile is annoyingly sluggish.

Summary of properties, grouped by the class of the constrained action:

Class of
constrained action   Indicative                           Optative
------------------   ----------------------------------   ---------------------------------
eh                   IND1 (G1) (safety) (Enter)           OPT1 (G5) (safety) (Enter)
                     IND1 (G1) ∧ G2 (liveness) (Enter)    OPT2 (liveness) (Enter)

ev                   IND1 (G1) (safety) (Push)
                     IND2 (G4) (safety) (Push)

sv                                                        OPT3 (G3) (safety) (Lock, Unlock)
                                                          OPT4 (G7) (safety) (Unlock)
                                                          OPT5 (G7) (safety) (Lock)
                                                          OPT6 (G6) (liveness) (Unlock)
                                                          OPT7 (G6) (liveness) (Lock)