Derivation of Distributed Programs in the Theory of Owicki and Gries: An Example

This paper describes the derivation of a program for the propagation of information over a network, with acknowledgement (feedback) when the computation is complete. The derivation is carried out in the theory of Owicki and Gries. The paper therefore illustrates the use of this theory for the derivation, as opposed merely to the verification, of distributed multiprograms. Notable is that the derivation, while calculational in style, is carried out with a minimum of formal machinery, e.g., there is no temporal logic. The derivation also serves as a concrete illustration of program reuse. A theory that is based on a shared variable model of communication is shown to manage the design of distributed multiprograms quite well.


INTRODUCTION
This paper applies the theory of Owicki and Gries [6,2,3] to a problem of distributed multiprogramming. The purpose is to show how the theory can be used as a method of distributed multiprogram derivation. In this respect it is a companion to [4], which was written to highlight the usefulness of the theory for deriving multiprograms, as opposed to merely verifying them. The theory is simple (proofs are carried out within the programming language in the assertional Hoare style) and old (it was developed in the 1970s). However, recent developments by Feijen and van Gasteren [3] have shown us how to derive multiprograms, rather than merely to verify them. What is more, the design heuristics used in these derivations are attractively simple. This exercise is motivated by [5], where Petri nets are used as a means to verify distributed programs. Important differences with [5] are that: here a program is derived and not merely verified; reasoning is applied to program statements, rather than to some other program model such as a Petri net; and temporal logic is avoided. Noteworthy too is that the exercise shows how distributed programs can be derived within a programming model based on communication through shared variables.
Section 2 describes the problem: to design a program for the propagation of information over a network, with acknowledgement when the computation is complete. Sections 3 and 4 describe the derivation, and Sections 3.1 and 4.1 draw conclusions from the exercise. Finally, Section 4.3 defines the atomicity assumptions of the programming model in a way that confirms the foregoing derivation. The rest of this section briefly describes the programming model and the theory of Owicki and Gries. This is meant to be sufficient to follow the derivation, but see [3] for an impeccable treatment of the theory of Owicki and Gries.
The Sixth International Workshop in Formal Methods (IWFM'03)

A multiprogram consists of more than one component program (henceforth called a component) to be executed at the same time. Instead of saying that more than one component is 'executing at the same time', it is equivalent (and simpler) to say that at most one component is executing an atomic action at a time, but the choice of executing component is not determined by the multiprogram text (see, for example, [7](p3) and [3](p3)). However, the choice is assumed to be fair, in the sense that no component that can execute an action is forever prevented from doing so. The programming language is Dijkstra's guarded commands [1], which supports condition synchronisation with the blocking conditional if B → skip fi: program control reaches skip if B evaluates to true, and remains at the guard if B evaluates to false. The instruction must itself be fair, in the sense that a guard cannot remain unevaluated forever.
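As a minimal sketch (my own names, not from the paper), the blocking conditional `if B → skip fi` can be modelled in a threaded setting as a wait on a condition variable: control passes the guard only once B holds, and a component that makes B true wakes the waiter.

```python
import threading

# Hypothetical model of `if B -> skip fi`: `await_` blocks until another
# component has made the named condition true via `set`.
class Guard:
    def __init__(self):
        self._cv = threading.Condition()
        self._flags = {}

    def set(self, name):
        # another component makes B true
        with self._cv:
            self._flags[name] = True
            self._cv.notify_all()

    def await_(self, name):
        # `if B -> skip fi`: remain at the guard while B is false
        with self._cv:
            self._cv.wait_for(lambda: self._flags.get(name, False))

g = Guard()
log = []
t = threading.Thread(target=lambda: (g.await_("go"), log.append("after guard")))
t.start()
log.append("before set")   # happens while the other component is blocked
g.set("go")                # B becomes true; the blocked component proceeds
t.join()
```

The waiting thread cannot append to `log` before `set("go")` runs, so `log` always ends up as `["before set", "after guard"]`, illustrating that control does not pass the guard while B is false.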
The theory of Owicki and Gries is based on the idea of a correct multiprogram annotation (in the style of Hoare logic). An annotation is correct when it is both locally correct and globally correct. For action S in component p annotated as a Hoare triple {P} S {Q}, postcondition Q is locally correct when precondition P establishes the weakest liberal precondition of S and Q [1], i.e., when

  P ⇒ wlp.S.Q

Furthermore, assertion Q is globally correct when no action A with precondition R in another component q is capable of falsifying Q, i.e., when

  R ∧ Q ⇒ wlp.A.Q

and similarly for P: P in component p is globally correct when

  R ∧ P ⇒ wlp.A.P

In the derivation to come, familiarity with Hoare logic will be assumed and the proof of local correctness will largely be left to the reader. This allows greater attention to be paid to proofs of global correctness, the novel aspect of the theory.
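For a very small multiprogram, correctness of an annotation can be checked by brute force: explore every interleaving of atomic actions and demand that each component's current assertion holds in every reachable state. This combines local and global correctness in one sweep. The sketch below (function and variable names are my own, not the paper's) checks the classic two-incrementer example.

```python
# Exhaustively check an annotation over all interleavings of atomic actions.
# progs[i]  : list of atomic actions for component i, each dict -> dict
# annots[i] : list of assertions; annots[i][pc] is the assertion at point pc
def check_annotation(progs, annots, state, pcs=None, seen=None):
    pcs = pcs or tuple(0 for _ in progs)
    seen = seen if seen is not None else set()
    key = (tuple(sorted(state.items())), pcs)
    if key in seen:
        return True
    seen.add(key)
    for i, pc in enumerate(pcs):
        if not annots[i][pc](state):        # assertion at pc must hold now
            return False
    for i, pc in enumerate(pcs):
        if pc < len(progs[i]):              # component i takes one atomic step
            nxt = tuple(pc + 1 if j == i else p for j, p in enumerate(pcs))
            if not check_annotation(progs, annots, progs[i][pc](dict(state)),
                                    nxt, seen):
                return False
    return True

# Two components each perform x := x + 1 on shared x, initially 0.
inc = lambda s: (s.update(x=s['x'] + 1) or s)
progs = [[inc], [inc]]
# {x >= 0} x := x + 1 {x >= 1} in both: globally correct, since the other
# component can only increase x ('widening').
annots = [[lambda s: s['x'] >= 0, lambda s: s['x'] >= 1]] * 2
print(check_annotation(progs, annots, {'x': 0}))       # True
# The stronger postcondition x == 1 is locally correct but not globally
# correct: the other component's increment falsifies it.
annots_bad = [[lambda s: s['x'] >= 0, lambda s: s['x'] == 1]] * 2
print(check_annotation(progs, annots_bad, {'x': 0}))   # False
```

The failing second check is exactly a violation of the global correctness rule: the action x := x + 1 in the other component does not preserve x == 1.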
A distinction important to the derivation to come is that between shared and private variables. A private variable is a variable written by just one component, and so private to that component. A shared variable is a variable written by more than one component. Private variables, as we shall see, have the attractive property of the Rule of Private Variables [3](p43): An assertion in a component that depends only on private variables of that component is globally correct.
Since no other component can change the variables, no other component can falsify the assertion that depends upon them.

SPECIFICATION OF A MULTIPROGRAM TO PROPAGATE INFORMATION WITH FEEDBACK (PIF)
The program will be derived in an abstract way, and it will later be shown (Section 3.1) how to interpret it as propagating information with feedback. The program is distributed over a fixed, connected network of N > 0 nodes, of which one node is distinguished as the starter, all others being followers. The requirements are:
(0) Each node can only communicate with its neighbours.
(1) The starter starts before any follower starts.
(2) The starter terminates after all followers terminate.
(3) The starter does terminate.
Requirements (0) and (1) are requirements of program topology. (0) may even seem too obvious to write down. However, in the theory of Owicki and Gries, components communicate using program variables, and we do well to include an explicit check that these variables are shared only between network neighbours. Requirement (1) is also a requirement on the shape of a solution: the multiprogram starts at the one starter node. All the work of the derivation lies in requirements (2) and (3). (2) is the safety requirement that the starter cannot terminate until all other components have terminated. It is a safety property because it does not say that the other components do terminate. Requirement (3) is the progress requirement that the starter does terminate, from which it follows, by (2), that the multiprogram as a whole terminates.

DERIVATION 1
Let relation n define the network of N nodes, such that

  n.p.q = node p is connected to node q

n is symmetric and antireflexive.
The derivation begins with a program ST to compute a network spanning tree. The program is taken from [3](p303). Component R is the code of the root node. The other N − 1 components are symmetric and are represented by component p. Note that each component q has a shared variable v.q, which is used to communicate with its neighbours, and a private variable f.q, which records its father in the spanning tree. The types of v.q and f.q are 'node identifier' plus ⊥, a constant distinct from the node identifiers. For example, since the node identifiers range over the set 0 ≤ q < N, ⊥ can take the value N.
Note that a quantified assertion such as ∀q : P : Q is read 'Every q that satisfies P satisfies Q'. Further, since all quantified variables q range over the set 0 ≤ q < N, we abbreviate ∀q : 0 ≤ q < N : Q to ∀q :: Q. Finally, the program itself is sugared so that, for example, the code 'do v.q := p for all neighbours q of p' is abbreviated to (∀q : n.q.p : v.q := p).
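The behaviour of ST can be sketched as a simulation under a randomly interleaved scheduler. This is my own hypothetical rendering (not the program text of [3]): each component is a list of atomic actions, and a 'guard' action that evaluates to false leaves control at the guard, modelling `if B → skip fi`. The root propagates its identity to its neighbours; every other node waits until claimed, records its father, and propagates in turn.

```python
import random

BOT = None  # plays the role of the constant ⊥

def run_st(edges, N, R, seed=0):
    rng = random.Random(seed)
    adj = {p: set() for p in range(N)}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)          # relation n is symmetric
    v = {q: BOT for q in range(N)}            # shared variable v.q
    f = {q: BOT for q in range(N)}            # private variable f.q (father)

    def component(p):
        acts = []
        if p != R:
            # if v.p /= bot -> skip fi ; f.p := v.p
            acts.append(('guard', lambda p=p: v[p] is not BOT))
            acts.append(('step', lambda p=p: f.__setitem__(p, v[p])))
        for q in sorted(adj[p]):              # (forall q : n.q.p : v.q := p)
            acts.append(('step', lambda p=p, q=q: v.__setitem__(q, p)))
        return acts

    prog = {p: component(p) for p in range(N)}
    pc = {p: 0 for p in range(N)}
    while any(pc[p] < len(prog[p]) for p in range(N)):
        p = rng.choice([q for q in range(N) if pc[q] < len(prog[q])])
        kind, act = prog[p][pc[p]]
        if kind == 'guard':
            if act():
                pc[p] += 1                    # guard true: pass to skip
        else:
            act()
            pc[p] += 1
    return f

f = run_st([(0, 1), (1, 2), (2, 3), (0, 3)], 4, 0)
```

Whatever the interleaving, any writer of v.q has already recorded its own father (or is the root), so following fathers from any node reaches R without cycles: f defines a spanning tree rooted at R.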
Note that ST already satisfies three requirements of the PIF program. (0) is met because all communication is through array v, and assignment by p to v is restricted to the neighbours of p. (1) is met if Comp.R is the starter. And (3) is met because of (5). This leaves only (2). Anticipating later needs, and by way of illustrating the idea of a correct annotation, Comp.p is annotated

Pre: ∀q :: v.q = ⊥ ∧ f.q = ⊥
Comp.R: (∀q : n.q.R : v.q := R)
Comp.p: if v.p ≠ ⊥ → skip fi ; {v.p ≠ ⊥ ∧ f.p = ⊥} f.p := v.p ; {f.p ≠ ⊥} (∀q : n.q.p : v.q := p)

The annotation is made locally correct by strengthening the ST precondition with f.q = ⊥. Assertion v.p ≠ ⊥ is globally correct by the rule of widening (v.p is never assigned ⊥), and f.p = ⊥ and f.p ≠ ⊥ are globally correct by the Rule of Private Variables. Note that no changes have been made to the ST code, and no changes will be made to it, so the code can now be abbreviated

Comp.R: RST
Comp.p: pST

Requirement (2) is now met by introducing a private variable t to each component, under invariant T

Inv: T : ∀q :: q ≠ R ⇒ (t.q ≡ Comp.q has terminated)

The annotation now shows Comp.R waiting for all other nodes to terminate

Pre: ∀q :: v.q = ⊥ ∧ f.q = ⊥ ∧ ¬t.q
Comp.R: RST ; if (∀q : q ≠ R : t.q) → skip fi {∀q : q ≠ R : t.q}
Comp.p: pST ; t.p := true {t.p}

Local correctness of T is ensured by the location of t.p := true. Global correctness is automatic because T is invariant. Global correctness of the assertion in Comp.R is by the rule of widening (t.q is never falsified). Or, in other terms, by T, because a terminated component cannot become unterminated!

This change to the code introduces a common situation in multiprogramming whereby one requirement is met at the cost of another. Topology requirement (0) now holds only if every non-root node is a neighbour of Comp.R. However, given that the ST code computes a spanning tree,

  (2): Comp.R terminates after all other nodes
⇐   { (4,5): Comp.R is the root of a spanning tree }
  Each node terminates after its children

it is enough to arrange for each node to terminate after its children (a set of its neighbours). This leads to investigating the conditions under which the assertion 'q is a child of p that has terminated' can be computed. It is true when q has a father and, if the father is p, then q has terminated:

CT.p : ∀q :: CT.q.p
CT.q.p : (n.q.p ∧ q ≠ R) ⇒ (f.q ≠ ⊥ ∧ (f.q = p ⇒ t.q))

This gives a new annotation for Comp.p

Comp.p: pST ; if CT.p → skip fi ; {CT.p} t.p := true

and CT.p is globally correct when CT.q.p is globally correct for all q, which follows from the following instance of the global correctness rule (refer to Section 1), where the action is the assignment f.q := v.q in Comp.q, and its precondition is f.q = ⊥:

  f.q = ⊥ ∧ CT.q.p
= f.q = ⊥ ∧ ((n.q.p ∧ q ≠ R) ⇒ (f.q ≠ ⊥ ∧ (f.q = p ⇒ t.q)))
⇒ ¬(n.q.p ∧ q ≠ R)
⇒ ((n.q.p ∧ q ≠ R) ⇒ (v.q ≠ ⊥ ∧ (v.q = p ⇒ t.q)))
= wlp.(f.q := v.q).(CT.q.p)

In words, if a father is assigned to q, it cannot falsify CT.q.p, because q is not in the neighbourhood of p.

= { (4,5): f defines a finite tree }
  false
In words, the assumption that Comp.R does not terminate allows one to construct an infinite branch in a tree. Since the network is finite, Comp.R must terminate.
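The refined program, in which each node terminates only after its children in the spanning tree, can be simulated in the same style as before. The sketch below is my own hypothetical rendering under stated assumptions: followers run pST, wait at the CT.p guard, then set t.p; the root is assumed to wait at the analogous guard CT.R after RST. The simulation records the order in which components terminate, so that requirement (2) — the root last — can be observed.

```python
import random

BOT = None  # plays the role of ⊥

def run_pif(edges, N, R, seed=0):
    rng = random.Random(seed)
    adj = {p: set() for p in range(N)}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)
    v = {q: BOT for q in range(N)}            # shared: v.q
    f = {q: BOT for q in range(N)}            # private: f.q (father)
    t = {q: False for q in range(N)}          # private: t.q (terminated)
    events = []                               # termination order

    def CT(p):
        # CT.p: every non-root neighbour q has a father, and
        # if that father is p (q is a child) then q has terminated
        return all(f[q] is not BOT and (f[q] != p or t[q])
                   for q in adj[p] if q != R)

    def component(p):
        acts = []
        if p != R:                            # pST for a follower
            acts.append(('guard', lambda p=p: v[p] is not BOT))
            acts.append(('step', lambda p=p: f.__setitem__(p, v[p])))
        for q in sorted(adj[p]):
            acts.append(('step', lambda p=p, q=q: v.__setitem__(q, p)))
        acts.append(('guard', lambda p=p: CT(p)))   # if CT.p -> skip fi
        if p != R:                            # t.p := true
            acts.append(('step', lambda p=p: (t.__setitem__(p, True),
                                              events.append(p))))
        else:                                 # root terminates last
            acts.append(('step', lambda: events.append('R')))
        return acts

    prog = {p: component(p) for p in range(N)}
    pc = {p: 0 for p in range(N)}
    while any(pc[p] < len(prog[p]) for p in range(N)):
        p = rng.choice([q for q in range(N) if pc[q] < len(prog[q])])
        kind, act = prog[p][pc[p]]
        if kind == 'guard':
            if act():
                pc[p] += 1
        else:
            act()
            pc[p] += 1
    return events

events = run_pif([(0, 1), (1, 2), (2, 3), (0, 3)], 4, 0)
```

Because t.q is set only after q's own CT.q guard has passed, termination propagates from the leaves of the spanning tree towards the root, so the root's entry appears last in `events`: exactly requirement (2), with progress (3) witnessed by the simulation terminating at all.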

Conclusions from derivation 1
The goal was to derive a PIF program, in which a starter node propagates a message M over a network and then waits for acknowledgement that all other nodes have received it. Comp.R is the starter with M. Associate the action 'p sends M to neighbour q' with v.q := p. All nodes receive M because Comp.p does not send a message before receiving M. Associate the action 'acknowledge M' with t.p := true. Comp.p does not acknowledge M before receiving acknowledgement from its children. When Comp.R, at the root of the tree, receives acknowledgement from its children, all other nodes have acknowledged M.
Choosing ST as the starting point of the derivation is not cheating.Rather, it is an example of what software engineers call 'program reuse' and what mathematicians call 'using a lemma'.
Given that a PIF program must reach all nodes in a network, the computation of a spanning tree is a natural starting point. Indeed, the PIF program reflects a very effective separation of concerns: in a program to (a) propagate information over a network and (b) wait for an acknowledgement, it makes good sense to tackle these two aspects in isolation from each other. Aspect (a) was solved at once with an 'off the shelf' solution courtesy of [3], and the strategy has paid off, particularly in proving the progress requirement (3), which is the area where the theory of Owicki and Gries is at its weakest. Given that Program ST terminates, individual deadlock can only arise at the CT.p guard.

Conclusions from derivation 2
Initial effort at this exercise was spent in trying to analyse the echo program in [8](p194) in order to construct a verifying proof of its correctness in the theory of Owicki and Gries. Failure at this is one piece of subjective evidence that multiprograms are easier to derive than to verify. In this derivation, Program (1) is used quite effectively as a stepping stone, and the key to this is the knowledge that, in the solution, the termination of a node is preceded by the receipt of a message from all of its neighbours. This knowledge paves the way for the coordinate transformation that replaces variables v and t by s, and replaces guard CT.p by NR.p. It remains to check Requirements (2) and (3), for which purpose it will be assumed that the first half of Program (2) computes a spanning tree. (This is proved in the postscript below.) Call this code STb. Under this assumption, Program (2) meets Requirement (2). Comp.R also terminates (3).