Object-Oriented Refinement and Proof using Behaviour Functions

This paper proposes a new calculus for expressing the behaviour of object-oriented systems. The semantics of the calculus is given in terms of operators from computational category theory. The calculus aims to span the gulf between abstract specification and concrete implementation of object-oriented systems using mathematically verifiable properties and transformations. The calculus is compositional and can be used to express the behaviour of partial system views. The calculus is used to specify, analyse and refine a simple case study.


Introduction
In [Gog75], [Ehr91] and [Gog90] Goguen et al. propose an abstract model of object systems based on standard constructions in Category Theory. They show how to use the constructions to build systems but do not propose a calculus for expressing and reasoning about them. In [Cla99a], [Cla99b] and [Cla99c] a calculus is proposed for expressing object systems based on Goguen's work. That calculus was shown to support incremental system development based on features of Computational Category Theory [Ryd88]. It does not have a formal semantics, however, and therefore its link to the abstract object model is weak. This paper develops a formal semantics for the λo-calculus. The semantics encodes the required categorical constructions as built-in operators and then uses them to express a number of features of object-oriented systems development including (under-)specification, refinement, encapsulation and invariant properties. This work contributes to the area of object-oriented systems development by providing a rigorous framework within which aspects of development can be defined and explored. In particular the λo-calculus aims to span the gulf between abstract specification and concrete implementation using mathematically verifiable refinement transformations. The λo-calculus can express partial views of a system and is therefore suitable as the basis for a semantics of UML [UML98]; as such it can be seen as an extension of, or complementary to, [Cla97], [Eva98], [Eva99] and [Lan98].
The built-in operators of the λo-calculus arise from Computational Category Theory. The reader is directed to [Bar90], [Ryd88] and [Gog89] for definitions of the appropriate constructs and to [Cla99a] for a discussion of how these constructs are used in the development of object-oriented systems. This work differs from other approaches with similar aims. The Object Calculus [Bic97] uses similar categorical constructs but uses a logic rather than a λ-calculus to express models. Following Goguen, we propose that the behaviour of a system is a limit on a diagram of behaviours; diagrams are also used in [Ken99] and [Ken97], where the aim is to express logical properties of data. Other calculi have been proposed as the basis for object-oriented systems, notably those defined in [Aba98]. The λo-calculus differs from other calculi in that it can express partial views of a system, incorporates non-determinism, solves constraints via equalizers and has a built-in notion of refinement via refinement morphisms.
The paper is structured as follows: section 2 gives an overview of the semantic model used for object-oriented systems; section 3 defines the λo-calculus used to express the model; section 4 defines a simple set of system requirements that is used to demonstrate features of object-oriented system development using the λo-calculus in the rest of the paper; section 5 shows how a system invariant is verified; section 6 shows how mutual constraints can be achieved by composing sub-systems; section 7 shows how object-oriented encapsulation can be achieved using refinement; finally sections 8 and 9 show how refinement achieves concrete data representation and an implementation in Java.
Rigorous Object-Oriented Methods 2000

Behavioural Object-Oriented Model
Systems are constructed as a collection of objects. Each object is a separate computational system with its own state modified in response to handling messages. A message is a package of information sent from one object to another. The computation performed when a message is handled by an object depends on the object's current state and causes the object to change state and produce output messages. If we observed an object over a period of time we would see a sequence of messages and state changes: $\cdots\ \sigma_1 \xmapsto{(I_1, O_1)} \sigma_2 \xmapsto{(I_2, O_2)} \sigma_3\ \cdots$ where each $\sigma_j$ is an object state, $I_j$ are input messages, and $O_j$ are output messages. Such a sequence is an object calculation and describes a single object in state $\sigma_j$ receiving messages $I_j$, causing a state change to $\sigma_{j+1}$ and producing output messages $O_j$.
A message consists of a source object, a target object and some message data. The source and target objects are identified by their object identity tags. For a given object system, the data items which can be passed as messages will be defined for each type of target object. A message, whether input or output, is represented as $(t_1, t_2, v)$ where $t_1$ identifies the source object, $t_2$ identifies the target object and $v$ is the message data. Where any of the message components may be inferred from context they are elided.
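The shape of an object calculation can be illustrated with a small executable sketch. It is not part of the calculus: the `counter` behaviour, the tag names and the use of Python lists for message sets are assumptions made purely for illustration. A behaviour is modelled as a function from a state and a collection of input messages to a new state and a collection of output messages, and `calculation` records the steps $(\sigma_j, I_j, O_j, \sigma_{j+1})$.

```python
from typing import List, Tuple

# A message is (source tag t1, target tag t2, data v), as in the text.
Message = Tuple[str, str, int]

def counter(state: int, inputs: List[Message]) -> Tuple[int, List[Message]]:
    """Hypothetical object behaviour: add every payload to the state and
    reply to each sender with the running total."""
    total = state + sum(v for (_src, _tgt, v) in inputs)
    outputs = [(tgt, src, total) for (src, tgt, _v) in inputs]
    return total, outputs

def calculation(behaviour, state, input_sets):
    """Record the steps (sigma_j, I_j, O_j, sigma_j+1) of an object calculation."""
    steps = []
    for inputs in input_sets:
        next_state, outputs = behaviour(state, inputs)
        steps.append((state, inputs, outputs, next_state))
        state = next_state
    return steps

steps = calculation(counter, 0, [[("a", "c", 1)], [("b", "c", 2), ("a", "c", 3)]])
```

Each recorded step starts in the state reached by the previous one, which is the chaining property that the sequence notation above expresses.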
Object systems are constructed from multiple objects interacting by passing messages. The state of an object system is a set of object states $S$. Computation in an object system occurs when the messages in set $I$ are sent to the objects in $S$, producing a new set of object states $S'$ and a collection of output messages $O$: $\cdots \longmapsto S \xmapsto{(I, O)} S' \longmapsto \cdots$ Object-oriented designs represent non-deterministic computational systems. Object calculations are represented as a calculation graph where the nodes of the graph are labelled with sets of states and the edges are labelled with pairs of input and output message sets.
Object system calculations can be transformed by graph homomorphisms. Such transformations can be used as the basis of system composition operations based on graph products and coproducts. Equalizers can be used to constructively find equivalence proofs expressed in terms of graph homomorphisms. The behaviour of a system is expressed as a limit on a diagram consisting of calculation graphs and graph homomorphisms. System properties can be expressed by adding the required behaviour to the diagram and then showing that the limit is preserved.
The rest of this paper uses these features as the semantic basis of a calculus for expressing, verifying and transforming object-oriented system designs.

The λo-Calculus
The λo-calculus is a notation for expressing object-oriented system designs. It is a standard normal order λ-calculus [Han94] [Plo75] extended with built-in operators [Lan64] for constructing behaviour functions in terms of behaviour products, coproducts, equalizers and morphisms.
The syntax of the λo-calculus is given in figure 1. The semantics of the basic calculus is given as a convertibility relation between terms in appendix A. All λo-terms have a type given by the type theory defined in appendix A. The sugar $e_1\ \mathbf{whererec}\ v = e_2$ is translated as an application of $\lambda v.\,e_1$ to the recursively defined value of $v$ given by $e_2$. The sugar $\mathbf{case}\ e_1\ \mathbf{of} \ldots \mathbf{else}\ e_2\ \mathbf{end}$ is translated as $\mathbf{case}\ e_1\ \mathbf{of} \ldots v \to e_2\ \mathbf{end}$.

Object Calculations and Morphisms
An object calculation is a sequence of object state transitions caused as a result of a collection of objects receiving messages, changing state and sending messages. Given an object with identity $t$, state $v$ and behaviour $e$, if the object receives messages $I$, changes state and behaviour to $v'$ and $e'$, and produces output messages $O$ then $e\,\{(t, v)\}\,I = (e'\,\{(t, v')\},\ O)$.
Figure 3: System Construction Operators
A pre-system behaviour, of type P, is a function that expects to be supplied with a set of states. The result is a system behaviour of type O that expects a set of input messages and produces a replacement system behaviour and a set of output messages. A system state is either a single object state or a pair of system states. A system message of type M is either a set of object messages or a pair of system messages. System behaviour types are defined below:
Object calculations are represented by the transition relation $\longmapsto$ which is defined in figure 2. Each transition is labelled with sequences of trees of input/output messages. The operator env associates all atomic state values with a name relative to a given behaviour function; the result is a partial function from names to values. An object calculation $e\,S \xmapsto{m} e'\,S'$ is well formed when all output messages produced by each transition are input messages in the next transition.
Object calculation morphisms are pairs of functions $(\varphi_1, \varphi_2)$ such that $\varphi_1$ is a mapping between object states and $\varphi_2$ is a mapping between sequences of input/output messages. Such a morphism can be applied to a behaviour function $e$ in λo to produce a new function $(\varphi_1, \varphi_2)\,e$ whose behaviour is given in terms of a mapping on $e$-calculations as shown in figure 2. Composition of object calculation morphisms is defined component-wise as follows: $(\varphi_1, \varphi_2) \circ (\varphi_3, \varphi_4) = (\varphi_1 \circ \varphi_3,\ \varphi_2 \circ \varphi_4)$. The type of a calculation morphism pairs a function between sets of system states with a function between sequences of system messages, $[M] \to [M]$.
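The component-wise definition of composition can be sketched directly. The following Python fragment is an illustration only: the representation of a calculation as a list of (state, messages, next state) steps and the example morphisms `count` and `parity` are assumptions, not constructs of the calculus.

```python
def compose(phi, psi):
    """Component-wise composition: (phi1, phi2) o (psi1, psi2)."""
    (phi1, phi2), (psi1, psi2) = phi, psi
    return (lambda s: phi1(psi1(s)), lambda m: phi2(psi2(m)))

def apply_morphism(phi, steps):
    """Map a calculation given as (state, messages, next_state) steps."""
    phi1, phi2 = phi
    return [(phi1(s), phi2(m), phi1(s2)) for (s, m, s2) in steps]

# Hypothetical morphisms: `count` forgets which widgets are held and keeps
# only how many; `parity` keeps the parity of a count and erases messages.
count = (lambda s: len(s), lambda m: m)
parity = (lambda n: n % 2, lambda m: [])

steps = [({"w1", "w2"}, ["get"], {"w1"}), ({"w1"}, ["get"], set())]
both = compose(parity, count)
```

Applying `both` in one pass gives the same calculation as applying `count` and then `parity`, which is what the composition law requires.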

Constructing Systems
The state of a system of objects is a set of binary trees. The leaves of each tree are labelled with object states. Views of the same object may occur at different leaves in the tree providing that they are consistent. A system state $S$ is consistent, written $\vdash S$, in the following cases. When it is a set of possible states for the same object $t$: $\vdash \bigcup_{i=1}^{n} \{(t, \sigma_i)\}$. When it is the composition of two different object states: $\vdash \bigcup_{i=1}^{n} \{(t_1, \sigma_i)\} \times \bigcup_{j=1}^{m} \{(t_2, \sigma_j)\}$ such that $t_1 \neq t_2$. When it is the composition of two views of the same object such that attribute names occurring in both have the same values: $\vdash \bigcup_{i=1}^{n} \{(t, \sigma_i)\} \times \bigcup_{j=1}^{m} \{(t, \sigma_j)\}$ when $\sigma_i(n) = \sigma_j(n)$ for all $i$, $j$ and $n \in \mathrm{dom}(\sigma_i) \cap \mathrm{dom}(\sigma_j)$. Finally, when pair-wise decomposition of the state is well defined: $\vdash S_1 \times S_2 \times S_3$ when $\vdash S_1 \times S_2$, $\vdash S_1 \times S_3$ and $\vdash S_2 \times S_3$.
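The consistency conditions on views can be made concrete with a short sketch. It is a simplification under stated assumptions: each view is a single pair of an object tag and a dictionary from attribute names to values (sets of alternative states are not modelled), and the function names are hypothetical.

```python
from itertools import combinations

def consistent_pair(view1, view2):
    """Two views are consistent if they describe different objects, or if
    they agree on every attribute name that occurs in both."""
    (t1, env1), (t2, env2) = view1, view2
    if t1 != t2:
        return True
    shared = env1.keys() & env2.keys()
    return all(env1[name] == env2[name] for name in shared)

def consistent(views):
    """Pair-wise decomposition: every pair of views must be consistent."""
    return all(consistent_pair(a, b) for a, b in combinations(views, 2))

a = ("t1", {"x": 1, "y": 2})
b = ("t1", {"y": 2, "z": 3})   # agrees with a on the shared name y
c = ("t1", {"y": 5})           # disagrees with a on y
d = ("t2", {"y": 5})           # a different object, so no conflict
```

Here `consistent([a, b, d])` holds, while adding `c` makes the state inconsistent because two views of `t1` disagree on `y`.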
Systems are constructed from objects using the operators $\times$, $+$ and eq. The semantics of these operators is given in terms of system calculations. Operator $\times$ is used to construct a system from its components, operator $+$ is used to construct alternative possible behaviours and eq is used to express system constraints. System construction is defined using the operators in figure 3, where state sets are combined by disjoint set union and zip merges pairs of sequences to produce sequences of pairs. The operators $\mathit{prod} : O \to O \to O$ and $\mathit{coprod} : O \to O \to O$ are used to construct products and coproducts consisting of behaviour functions and associated behaviour morphisms. They are defined by extending the λo-convertibility relation as follows:
$$\pi_1(\mathit{prod}\ e_1\ e_2) = e_1 \qquad \pi_2(\mathit{prod}\ e_1\ e_2) = e_2 \qquad e_1 \times e_2 = (\mathit{prod}\ e_1\ e_2,\ \pi_1,\ \pi_2)$$
$$\iota_1\, e_1 = \mathit{coprod}\ e_1\ e_2 \qquad \iota_2\, e_2 = \mathit{coprod}\ e_1\ e_2 \qquad e_1 + e_2 = (\mathit{coprod}\ e_1\ e_2,\ \iota_1,\ \iota_2)$$
The theory λo is extended with equivalences for the underlying operators prod, coprod and eq. In each case a one step transition defines term equivalence, for example:
$$\frac{(\mathit{prod}\ e_1\ e_2)\,S \xmapsto{[(I, O)]} e_3\,S'}{\mathit{prod}\ e_1\ e_2\ S\ I = (e_3\,S',\ O)}$$
Products and coproducts must observe some simple algebraic properties given in the following theorems.
The following proof shows that there exists an isomorphism between $e$ and $e \times e_t(t)$ (and equivalently between $e$ and $e_t(t) \times e$).

System Refinement
System development through step-wise refinement is attractive since it allows abstract models to be developed early in the life-cycle and then refined to concrete implementations through a series of verified transformations. Consider two behaviour functions $e_1$ and $e_2$ such that $e_2$ is a more concrete version of $e_1$. Typically, the states of $e_2$ will be related to those of $e_1$ but will involve more components and inter-relationships. For example, object-oriented design promotes the use of encapsulation whereby structured data is implemented as a collection of objects whose detail is hidden behind method interfaces. The calculations of $e_1$ will be more abstract than those of $e_2$; $e_1$ may perform complex tasks in a single computation step whereas $e_2$ must observe implementation constraints imposed by the target system.
If $e_1$ is an abstract version of the required system behaviour and $e_2$ is a (relatively) concrete version then $e_2$ must do everything that $e_1$ can do subject to an appropriate transformation on states and calculations. Furthermore, if $e_1$ is complete then $e_2$ must not introduce any behaviour that is inconsistent with that defined by $e_1$.

Message Passing
Computation occurs in an object-oriented system in terms of message passing. A behaviour is expressed in the design notation as a function which maps incoming messages to a pair $(e\,S,\ O)$ where $e\,S$ is a replacement behaviour and $O$ is a set of outgoing messages. Once the messages $O$ have been produced, the behaviour is immediately ready to handle new incoming messages as specified by $e$.
The basic model of message handling is therefore asynchronous. This decision arises because object-oriented design notations can express both synchronous and asynchronous message passing. Typically there are different notations to express "send message and wait for reply" and "send message without waiting for reply".
Basing the semantic model on asynchronous message passing does not preclude synchronous message passing since an asynchronous model which incorporates replacement behaviours can implement synchronous messages [Agh86] [Agh91]. A message $m \in O$ is sent synchronously when $e$ is a behaviour that waits for an incoming message $m'$ such that $m'$ is the response to $m$. When $m'$ is received the behaviour reverts to its original functionality.
Variations on the synchronous model described above are possible. For example, the waiting behaviour may permit a sub-set of the functionality, or may implement a priority based interrupt mechanism, or may allow the behaviour to send messages to itself.
The example program development described in this paper uses a form of synchronous message passing. It is convenient to add syntactic sugar to the design notation capturing this form of message passing. The sugar is a form of let expression occurring in the context of a behaviour function. The behaviour agent may carry on handling messages. Any incoming message matching $p_2$ is a response to the messages $e_1$; the response of agent is defined by $e_2$.
The semantics of let is defined by a syntax translation to the basic design notation. The locally created behaviour wait is used to extend agent with a handler for the response to messages $e_1$. Typically, when the response occurs, $e_2$ will revert back to the original behaviour agent.
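The idea of implementing a synchronous send with a replacement behaviour can be sketched as follows. This is an illustration under stated assumptions rather than the translation itself: a behaviour is modelled as a Python function from a message to a pair of replacement behaviour and output messages, and the message kinds `request`, `query`, `reply`, `result`, `ping` and `pong` are hypothetical.

```python
def agent(msg):
    """Hypothetical behaviour: answers pings and, on a request, sends a
    query and is replaced by a waiting behaviour."""
    kind, payload = msg
    if kind == "ping":
        return agent, [("pong", None)]
    if kind == "request":
        # Corresponds to sending the messages e1 and installing `wait`.
        return make_wait(), [("query", payload)]
    return agent, []

def make_wait():
    def wait(msg):
        kind, payload = msg
        if kind == "reply":
            # Corresponds to e2: use the response, then revert to agent.
            return agent, [("result", payload * 2)]
        # Any other message is handled as agent would handle it, but the
        # handler for the reply stays installed (a simplification).
        _, outputs = agent(msg)
        return wait, outputs
    return wait

def run(behaviour, messages):
    """Feed messages one at a time, following replacement behaviours."""
    sent = []
    for msg in messages:
        behaviour, outputs = behaviour(msg)
        sent.extend(outputs)
    return sent

sent = run(agent, [("request", 21), ("ping", None), ("reply", 21), ("ping", None)])
```

The agent keeps answering `ping` while the reply is outstanding, and once the reply arrives the original behaviour is restored.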

Requirements and Initial Specification
Software to control a simple machine (see figure 4) for dispensing widgets is required. The machine consists of a store of widgets, 4 buttons, an output tray and a two-tone beeper. The buttons are labelled 0 to 3. In order to dispense a widget the operator must press the buttons 1, 2 and 3 in order. At any time the operator may cancel the operation by pressing 0. Widgets are removed from the store and delivered to the output tray when they are dispensed. If the operation succeeds the beeper makes a high beep, otherwise the beeper makes a low beep. Each widget has a unique identity.
An initial attempt at the required behaviour is shown in figure 5. The behaviour function $M$ has two state components $s$ and $o$ that are sets of widgets representing the store and output respectively. Input messages 0 to 2 cause no state change and no output messages. Input message 3 from source object $t'$ causes a widget to be dispensed and added to the output tray $o$ if one is available in the store $s$. A boolean reply is sent to the source of the message, causing a high beep (true) if successful and a low beep (false) if the operation failed. The initial behaviour is underspecified since it includes the required behaviour, but also permits illegal sequences of buttons.
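The prose description of $M$ can be read as the following executable sketch. It is not the definition from figure 5: the Python encoding, the reply format `(source, flag)` and the use of `min` to pick a widget (standing in for a non-deterministic choice) are assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

State = Tuple[FrozenSet[str], FrozenSet[str]]   # (s, o): store and output tray
Reply = Tuple[str, bool]                        # (target of reply, success)

def M(state: State, source: str, button: int) -> Tuple[State, List[Reply]]:
    """One step of the underspecified machine: (new state, output messages)."""
    s, o = state
    if button in (0, 1, 2):
        return (s, o), []                       # no state change, no output
    if button == 3:
        if not s:
            return (s, o), [(source, False)]    # low beep: nothing to dispense
        w = min(s)                              # assumption: deterministic pick
        return (s - {w}, o | {w}), [(source, True)]   # high beep
    raise ValueError(f"unknown button {button}")

start: State = (frozenset({"w1", "w2"}), frozenset())
after, replies = M(start, "op", 3)
```

Note that button 3 dispenses a widget even when buttons 1 and 2 have not been pressed, which is exactly the sense in which this behaviour is underspecified.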

A Simple System Invariant
A simple system property is that the number of widgets available in both the store and the output tray is an invariant, i.e. pushing buttons cannot cause widgets to be introduced or lost. This can be expressed as a behaviour $I$, together with a behaviour morphism from $M$ to $I$ that translates an $M$ state $(s, o)$ to an $I$ state $\#s + \#o$ and is the identity everywhere else. In order to show that $I$ is an invariant we show that the limit on the diagram $M$ is the same as the limit on the diagram $M \to I$.
The proof shows that there exists a total behaviour morphism $\varphi : M \to I$ such that a limit on the diagram containing $M$ is unchanged by (isomorphic to) a limit on the diagram containing $\varphi : M \to I$. The mapping $\varphi = (\varphi_1, \varphi_2)$ takes an $M$ state $(s, o)$ to the $I$ state $\#s + \#o$ and is the identity on messages. The following proof is by induction on the length of object calculations. Consider any $M$ transition $M\{(t, (s, o))\} \xmapsto{m} M\{(t, (s', o'))\}$ and proceed by case analysis on the message sequence $m$. Note that we omit any message information that is not relevant or can be inferred from context. When $m = [(0, \emptyset)]$, $\varphi_2(m) = m$, $\{(t, (s, o))\} = \{(t, (s', o'))\}$ and therefore $\varphi_1\{(t, (s, o))\} = \varphi_1\{(t, (s', o'))\}$.

The behaviour $L$ is a limit on the diagram and contains just those states that are legal. The limit $L$ is not exactly the same as $M$ but there exists an isomorphism between them. Therefore we conclude that $\varphi : M \to I$ is a property of $M$. QED
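The invariant can also be checked mechanically on a small instance. The sketch below is a bounded sanity check and not a substitute for the proof: the encoding of $M$ states as pairs of frozen sets, the function names and the exhaustive search over reachable states are all assumptions made for illustration. Unlike a deterministic simulation, `successors` enumerates every widget that button 3 might dispense.

```python
def successors(state, button):
    """All possible next states of M: button 3 may dispense any widget."""
    s, o = state
    if button == 3 and s:
        return [(s - {w}, o | {w}) for w in s]
    return [(s, o)]

def phi1(state):
    """State part of the morphism M -> I: total number of widgets."""
    s, o = state
    return len(s) + len(o)

def check_invariant(initial):
    """Explore every reachable state and check phi1 is preserved by each step."""
    seen, frontier = {initial}, [initial]
    while frontier:
        state = frontier.pop()
        for button in range(4):
            for nxt in successors(state, button):
                if phi1(nxt) != phi1(state):
                    return False
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return True

initial = (frozenset({"a", "b", "c"}), frozenset())
holds = check_invariant(initial)
```

The search terminates because the set of reachable states is finite, and every transition it visits leaves the image under `phi1` unchanged.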

Removing Illegal Message Sequences
The behaviour $M$ is underspecified since it permits buttons on the machine to be pressed in illegal sequences. Object-oriented design notations such as UML restrict behaviours such as $M$ using state transition models that impose orderings on sequences of permitted message calls. The machine consists of three states referenced as 1, 2 and 3.
The initial state for the machine is 1. Button 1 may only be pressed in state 1, causing a state change to 2. Button 2 may only be pressed in state 2, causing a state change to 3. Button 3 may only be pressed in state 3, causing a state change to 1; a widget is dispensed as a side effect. Button 0 may be pressed in any state, causing a change to state 1.
The state transition machine is expressed as a behaviour function $P$ in figure 6. The behaviour describes a single state component whose value is the machine state. The behaviour handles all machine messages, making appropriate state changes, but otherwise does nothing.
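The transition rules for the machine states can be sketched as a small function. This is an illustration and not the definition in figure 6; in particular, leaving the state unchanged when a button is pressed out of sequence is an assumption about how the constraint treats illegal presses.

```python
from functools import reduce

def P(state: int, button: int) -> int:
    """Next machine state (1, 2 or 3) after pressing a button (0 to 3)."""
    if button == 0:
        return 1                    # cancel from any state
    if button == state:
        return state % 3 + 1        # 1 -> 2, 2 -> 3, 3 -> 1
    return state                    # assumption: out-of-sequence press ignored

def run_P(buttons, state: int = 1) -> int:
    """Fold a sequence of button presses over the initial state."""
    return reduce(P, buttons, state)
```

For example, `run_P([1, 2, 3])` returns to state 1 after a complete dispense sequence, while `run_P([1, 3, 2])` ends in state 3 because the early press of button 3 has no effect.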
Behaviour $M$ includes all correct behaviour but also includes incorrect behaviour. Applying the constraint $P$ to $M$ will produce just the required behaviour. The constraint is applied by combining $M$ and $P$ using the behaviour combination operator $\times$. This produces a new behaviour $M' = M \times P$ shown in figure 7. The behaviour $M'$ has a state that combines the components $s$ and $o$ from $M$ with the machine state from $P$. Message dispatch has been combined so that the conditions from both $M$ and $P$ are taken into account. The multiple patterns for message 0, one for each machine state, have been combined into a single pattern since the same transition occurs in all cases. Notice that the machine simply ignores buttons that are pressed out of sequence and that if an operator gets into trouble they may always press 0 to reset the machine.
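The combined behaviour can be sketched by merging the two dispatch tables by hand. The fragment below is an illustration, not the term in figure 7: the triple `(s, o, p)` as the combined state, the reply format and the deterministic choice of widget with `min` are assumptions.

```python
def M_prime(state, source, button):
    """One step of the constrained machine on a combined state (s, o, p)."""
    s, o, p = state
    if button == 0:
        return (s, o, 1), []                     # reset from any machine state
    if button == p and p in (1, 2):
        return (s, o, p + 1), []                 # legal press of button 1 or 2
    if button == 3 and p == 3:
        if not s:
            return (s, o, 1), [(source, False)]  # store empty: low beep
        w = min(s)                               # assumption: deterministic pick
        return (s - {w}, o | {w}, 1), [(source, True)]
    return (s, o, p), []                         # out-of-sequence press ignored

def press(state, source, buttons):
    """Apply a sequence of button presses and collect the replies."""
    replies = []
    for button in buttons:
        state, out = M_prime(state, source, button)
        replies.extend(out)
    return state, replies

start = (frozenset({"w1"}), frozenset(), 1)
```

With this encoding, pressing 1, 2, 3 dispenses the single widget, whereas pressing 3 on its own or cancelling with 0 before 3 produces no reply and leaves the store untouched.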

Object-Oriented Encapsulation
The behaviour $M'$ does not observe the principle of encapsulation since it directly accesses the components $s$ and $o$ of its state. In order to be object-oriented, the behaviour $M'$ must reference both $s$ and $o$ as objects. Accessing and changing state in both $s$ and $o$ must occur via message passing. Although both $s$ and $o$ are represented as sets of widgets, they are used in different ways. These differences may result in radically different implementation strategies and so we implement each as a separate behaviour in figure 8.
A store behaviour $S$ handles messages empty and get. The former replies with true when the store is empty and the latter replies with an element of the store selected at random. An output tray behaviour $O$ handles a single message store containing a widget $x$. The widget is added to the output tray.
The behaviour function $M'$ must be refined in order to use references to the store and output tray. The result is a new behaviour function $M''$ shown in figure 9. When $M''$ receives a message 3 from object $t'$ in state 3 it sends a message empty to the store and waits for the reply. If the store is not empty then $M''$ sends it another message get and waits for a widget to be returned. On receiving the widget $x$, $M''$ sends a store($x$) message to the output tray. Finally, $M''$ replies to $t'$ with true or false.
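The message exchange between the refined controller, the store and the output tray can be sketched with three small classes. This is an illustration under stated assumptions, not the behaviours of figures 8 and 9: the class names, the `handle` method and the modelling of "send and wait for the reply" as a direct method call are all hypothetical.

```python
import random

class Store:
    """Sketch of S: handles the messages `empty` and `get`."""
    def __init__(self, widgets):
        self.q = set(widgets)
    def handle(self, msg):
        if msg == "empty":
            return not self.q
        if msg == "get":
            w = random.choice(sorted(self.q))   # an element selected at random
            self.q.remove(w)
            return w
        raise ValueError(msg)

class OutputTray:
    """Sketch of O: handles the single message store(x)."""
    def __init__(self):
        self.q = set()
    def handle(self, msg):
        kind, widget = msg
        if kind != "store":
            raise ValueError(kind)
        self.q.add(widget)

class Controller:
    """Sketch of M'': reaches the store and tray only through messages."""
    def __init__(self, store, tray):
        self.store, self.tray, self.p = store, tray, 1
    def handle(self, button):
        """Return the reply to the sender, or None when there is no reply."""
        if button == 0:
            self.p = 1
            return None
        if button == self.p and self.p in (1, 2):
            self.p += 1
            return None
        if button == 3 and self.p == 3:
            self.p = 1
            if self.store.handle("empty"):
                return False
            x = self.store.handle("get")
            self.tray.handle(("store", x))
            return True
        return None                              # out-of-sequence press ignored

store, tray = Store({"w1", "w2"}), OutputTray()
controller = Controller(store, tray)
first = [controller.handle(b) for b in [1, 2, 3]]
```

The controller never inspects the widget sets itself; it only observes the replies to empty and get, which is the encapsulation property the refinement is meant to establish.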
Theorem 7: $S \times M'' \times O$ is a refinement of $M'$.

A
Figure 4: A Widget Dispensing Machine

Figure 12 defines a semantics for the λo-calculus using a convertibility relation between terms. Figure 13 defines a type theory for λo-terms.
Figure 12: The Theory λo
It remains to show that $M$ is a limit on the diagrams containing $M$ and $\varphi : M \to I$ respectively.
Proposition 2: $M$ is a limit on the diagram containing $M$ and $\mathit{Id}_M : M \to M$. The proof follows directly from the properties of the identity morphism.
Now consider the second diagram. Firstly construct a product $M \times I$ in which nodes are labelled with states from the free product $\mathit{states}(M) \times \mathit{states}(I)$. Note that the product contains states that are legal, $\{((t_1, (s, o)), (t_2, \#s + \#o))\}$, and those that are not. Now construct an equalizer $e : L \to M \times I$ such that $\varphi \circ \pi_1 \circ e = \pi_2 \circ e$.
Figure 8: Store and Output Behaviour