British Computer Society BCS Deriving Two-Phase Modules for a Multi-Target Hardware Compiler

This paper adopts the CSP framework for deriving a compilation scheme from a simple imperative language to two-phase modules. Two-phase modules are processes that communicate with one another using two-phase handshake protocols. The two-phase modules generated by our compilation scheme can be implemented as asynchronous or clocked circuits. The derivation techniques have been applied to a concurrent language which is a superset of the language discussed.


Introduction
A verified compiler is essential to a structured approach for verifying designs: if its users are confident of the correctness of the translation from the source language to the target description, they can focus on getting the source program right. The verification of the compiler should itself be structured to make it simple, modular and flexible.
We have been investigating a method for verifying a compilation scheme for occam-like languages that targets both asynchronous [2] and clocked [8] hardware. The method involves two steps; the first is to translate the constructs of the source language systematically into an intermediate form known as two-phase modules, which interact with one another using simple two-phase protocols. Figure 1 shows the hardware modules implementing the following program:
$$\mathbf{var}\ x : \mathbf{do}\ \mathit{true} \rightarrow x := \neg x\ \mathbf{od}$$
The second step of our method is to use appropriate protocol converters to derive the implementation of the two-phase modules in a particular target technology. The second step has been described elsewhere [3]; the purpose of this paper is to describe the first step.
The verification of asynchronous and clocked realisations of occam-like languages has been undertaken by a number of researchers, including He et al. [5], Smith and Zwarico [11], van Berkel [12], and Weber et al. [14]. There are three principal differences between our work and theirs. First, our approach provides a means of capturing and reasoning about asynchronous and clocked systems within a unified framework; it can also deal with systems containing both asynchronous and clocked elements. Second, our verification strategy is structured into a number of stages to improve modularity and reusability of proofs. Finally, many of our proofs can be automated; we shall identify some of them later.
Figure 1: Two-phase modules for a simple program. Note that each connection actually contains several wires: the assignment module for instance, labelled ':=' above, has a top 2-wire control interface, a right 3-wire read interface and a left 3-wire write interface. The definition of these interfaces can be found in Figure 2.
Our derivation of the two-phase modules includes several novelties. The use of the CSP framework [6,10] makes our descriptions concise and simplifies our presentation. Using the CSP algebra, we can systematically refine source programs into a collection of interacting two-phase modules; program variables are modelled by data modules, while control constructs result in control modules. We also exploit the regularity of the handshake protocols which govern the interaction between these modules; this leads to a notion of refinement for such modules which greatly simplifies our derivation.
The rest of the paper is organised as follows. Section 2 provides an overview of our derivation strategy. Section 3 introduces the source language for our compiler and the two-phase handshake protocols adopted by the intermediate form. Section 4 reviews the algebra of CSP and shows how it can be used to capture handshake protocols. Sections 5, 6 and 7 are devoted to translating variables, expressions and statements from the source program into two-phase modules. Section 8 describes the specification of two-phase modules, and Section 9 outlines the derivation of implementations. Concluding remarks are presented in Section 10.

Overview of strategy
This section provides an overview of our derivation strategy for the sequential subset of our source language; the CSP notation used here will be summarised in Section 4. The network of two-phase modules implementing the source program $P$ is specified as ${}^{r}_{a}(P)$, where the request signal $r$ activates $P$ and the acknowledgement signal $a$ indicates that $P$ has terminated. ${}^{r}_{a}(P)$ satisfies
$${}^{r}_{a}(P) = r \rightarrow P ; a \rightarrow {}^{r}_{a}(P).$$
Thus ${}^{r}_{a}(P)$ can be activated multiple times, while $P$ cannot. We define a compilation function to be the parallel composition of an activatable master control process ${}^{r}_{a}(M(P))$ and a data process $D(P)$:
$$C^{r}_{a}(P) \stackrel{\mathrm{def}}{=} {}^{r}_{a}(M(P)) \parallel D(P). \qquad (1)$$
The master control process $M(P)$ involves only synchronised communications with $D(P)$, which maintains the program state. To avoid deadlock on the internal links between $M(P)$ and $D(P)$, we shall ensure that the communications on the added channels satisfy the related two-phase handshake protocols.
Our task is to verify that the compiled design is at least as good as its specification derived from the source program:
$${}^{r}_{a}(P) \sqsubseteq C^{r}_{a}(P).$$
The verification task can be broken down into two stages. In the first stage, we demonstrate the correct refinement of a program $P$ into the master process $M(P)$ and the data process $D(P)$ executing in parallel:
$$P \sqsubseteq M(P) \parallel D(P).$$
This result will be used in the second stage, the main challenge of which is to show that ${}^{r}_{a}$ is a homomorphism. Taking sequential composition as an example, calling ${}^{r}_{a}$ a homomorphism means that the activatable control of the composite can be implemented by those of its components together with the control module $SEQ$:
$${}^{r}_{a}(M(P ; Q)) \sqsubseteq {}^{rp}_{ap}(M(P)) \parallel SEQ \parallel {}^{rq}_{aq}(M(Q))$$
(see Equation 15). We can then establish that
$${}^{r}_{a}(P ; Q) \sqsubseteq {}^{rp}_{ap}(M(P)) \parallel SEQ \parallel {}^{rq}_{aq}(M(Q)) \parallel D(P ; Q).$$
An outline of these two verification stages can be found in the appendix. Once we check that the specification of modules like $SEQ$ satisfies the above formula, various asynchronous and clocked implementations can be developed using the technique of protocol conversion [3]. An example is included in Section 9.

Joy and two-phase modules
To simplify the presentation we shall focus on the sequential subset of Joy [3, 9, 14], our source language. Its syntax is given by BNF rules in which $v$ stands for a program variable of Boolean type, $B$ for a Boolean expression and $P$ for a process. Note that this subset of Joy deals only with Boolean variables and expressions. A Joy program may include $\mathit{skip}$, which does nothing except terminate successfully, and assignment, conditional, iteration and sequential composition statements. The Boolean guarded process $B \rightarrow P$ is executed when the Boolean guard $B$ is true; its execution completes when $P$ completes execution. Two guarded processes may be composed using $\Box$, the choice operator. The statement $\mathbf{if}\ BG$ executes its Boolean guarded command set until one succeeds; $\mathbf{do}$ repeatedly evaluates its Boolean guarded command set until execution fails. The translation of Joy into its intermediate form is performed in a purely syntax-directed manner. The networks that the Joy compiler generates are delay-insensitive in the following sense: wires with arbitrary (bounded) delay can be introduced between any two primitive components without affecting the functional behaviour of the system.
The idea of using modules communicating with handshake protocols to represent the main components of the source language has been explored by van Berkel [12]. Our work takes an 'indirect approach' in which our language is given a denotation in an existing notation (CSP) and a mapping from CSP to two-phase modules. In contrast, van Berkel takes a direct approach in which each command of the source language is defined by a corresponding handshake process. Furthermore, van Berkel focused on a trace-based model, while CSP provides a more sophisticated failures/divergences model and a rich algebraic system. The signalling interfaces used by the two-phase components are shown in Figure 2. Each handshake interface requires an active partner (marked A in the figure) that begins the handshake by sending a request, and a passive partner (marked P) that responds to the handshake by sending an acknowledgement.
The simplest interface, shown in Figure 2(a), is the two-wire control interface, which consists of one request and one acknowledgement signal. A handshake begins when A sends an event to P along the wire marked r (for request). A then awaits a response on the wire marked a (for acknowledgement). When A has received an acknowledgement from P, the handshake is complete. Figure 3 shows a module that composes two programs in sequence and has both passive and active interfaces. When triggered through the top input req port, the SEQ module starts the first program in the sequence by dispatching the event through the bottom left output P0.req. When the first program signals termination via the event P0.ack, the second program is started by a P1.req output. SEQ returns ack after it receives P1.ack.
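The event ordering of the SEQ module described above can be sketched as a small state machine. This is an illustrative Python model only, not the paper's definition: the class name `SeqModule`, the method `step`, the helper `run_once` and the event strings are assumptions chosen to mirror the port names in the text.

```python
class SeqModule:
    """Illustrative model of the SEQ control module (all names are assumptions).

    Inputs:  req, P0.ack, P1.ack
    Outputs: P0.req, P1.req, ack
    """

    # For each state: the input event expected and the output event it triggers.
    _TRANSITIONS = [
        ("req", "P0.req"),     # activation starts the first program
        ("P0.ack", "P1.req"),  # first program has finished, start the second
        ("P1.ack", "ack"),     # second program has finished, acknowledge upwards
    ]

    def __init__(self):
        self.state = 0

    def step(self, event):
        """Consume one input event and return the output event it causes."""
        expected, output = self._TRANSITIONS[self.state]
        if event != expected:
            raise ValueError(
                f"protocol violation: got {event!r}, expected {expected!r}"
            )
        # After the final acknowledgement the module is ready to be activated again.
        self.state = (self.state + 1) % len(self._TRANSITIONS)
        return output


def run_once(module):
    """Drive one complete activation and return the interleaved event trace."""
    trace = []
    for event in ("req", "P0.ack", "P1.ack"):
        trace.append(event)
        trace.append(module.step(event))
    return trace
```

Calling `run_once` twice on the same `SeqModule` gives the same trace both times, which reflects the fact that the module returns to its initial state after each complete handshake.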
Some interfaces pass data as well as control information. Figure 2(b) shows two interfaces for passing Boolean data encoded on two wires, sometimes known as dual-rail encoding. The read interface is used in expression and guard evaluation. The active partner in the handshake requests a value by sending an event on $r$; the passive partner sends an acknowledgement on $a_0$ or $a_1$, according to the value it wishes to return.
The three-wire interface in Figure 2(c) is called the write interface, and is used to assign values to variables. Writing a value begins with a request on either $r_0$ or $r_1$, depending upon the value to be written; completion is signalled by an acknowledgement event on $a$.
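The read and write interfaces can be illustrated by modelling the passive partner of each handshake. The sketch below is a minimal Python model under stated assumptions: the class name `DualRailVariable`, its methods and the wire names written as strings (`"r"`, `"a0"`, `"a1"`, `"r0"`, `"r1"`, `"a"`) are hypothetical and only mirror the wire labels used in the text.

```python
class DualRailVariable:
    """Passive partner for the three-wire read and write interfaces (a sketch)."""

    def __init__(self, value=False):
        self.value = value

    def read(self, event):
        """Read interface: a request on r is acknowledged on a0 or a1."""
        if event != "r":
            raise ValueError(f"unexpected read request {event!r}")
        return "a1" if self.value else "a0"

    def write(self, event):
        """Write interface: a request on r0 or r1 is acknowledged on a."""
        if event not in ("r0", "r1"):
            raise ValueError(f"unexpected write request {event!r}")
        # The wire that carries the request encodes the value being written.
        self.value = (event == "r1")
        return "a"
```

In both directions the Boolean value is carried by the choice of wire rather than by a level on a single wire: the acknowledgement wire encodes the value on a read, and the request wire encodes it on a write.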

CSP
We shall regard the Joy language as a subset of CSP. To describe the operation of the target architecture, we need to include a few more operators. $P \sqcap Q$ represents the nondeterministic choice between $P$ and $Q$ in which the environment plays no part, while $P \mathbin{\Box} Q$ stands for the external choice between $P$ and $Q$ where the environment decides which branch is selected to execute. The pattern $\mathbf{if}\ B \rightarrow P \mathbin{\Box} \neg B \rightarrow Q$ will be abbreviated as $P \lhd B \rhd Q$. The alphabet of $P$, $\alpha P$, corresponds to the set of events that the process $P$ can engage in. $\mathbf{var}\ w : P$ declares $w$ to be a local variable of $P$. $\bot$ denotes the chaotic process. Recursion is handled by the $\mu$ operator. For instance the handshake protocol for the two-wire interface shown in Figure 2(a), which always performs the event $r$ before the event $a$ and never engages in two consecutive events $r$ before the occurrence of $a$, is given by:
$$HP(r, a) \stackrel{\mathrm{def}}{=} \mu X :: ((r \rightarrow a \rightarrow X) \mathbin{\Box} \mathit{skip}).$$
The presence of $\mathit{skip}$ as a possible choice allows $HP(r, a)$ to terminate when its partner in a parallel composition terminates. The modelling of handshake protocols forms an important part of our derivation techniques. What follows elaborates on this example.
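To make the two-wire handshake protocol concrete, the following Python sketch checks whether a finite sequence of events is consistent with it. It is a trace-level approximation only and says nothing about failures or divergences; the function names `is_hp_trace` and `can_terminate` are assumptions introduced for illustration.

```python
def is_hp_trace(trace, r="r", a="a"):
    """Return True if `trace` alternates r, a, r, a, ... starting with r."""
    expected = r
    for event in trace:
        if event != expected:
            return False
        # After a request we expect an acknowledgement, and vice versa.
        expected = a if expected == r else r
    return True


def can_terminate(trace, r="r", a="a"):
    """Termination (the skip branch) is only offered between complete handshakes."""
    return is_hp_trace(trace, r, a) and len(trace) % 2 == 0
```

For example, `["r", "a", "r"]` is accepted as a trace but cannot terminate, because the second handshake is still awaiting its acknowledgement, while `["r", "r"]` is rejected because two requests occur without an intervening acknowledgement.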
Using the CSP algebra, one can show that a sequence of $HP(r, a)$ is still a handshake protocol:
$$HP(r, a) ; HP(r, a) = HP(r, a).$$
Parallel composition of $P$ and $Q$, represented by $P \parallel_A Q$, synchronises on the set of events $A$ common to both $P$ and $Q$; the $A$ will usually be dropped since it can be deduced from context. Sometimes we abuse the notation by using $P \parallel Q$ to represent a parallel program where the events in the set $\alpha P \cap \alpha Q$ are not hidden. The synchronised communication events between components of a parallel composition are sometimes called channels: $c?$ is an input channel while $c!$ is an output channel. Channels may pass values: $c!e$ sends the value of the expression $e$ to the output channel $c$, and $c?x$ reads a value from the input channel $c$ and assigns it to the variable $x$. $InputChan(P)$ and $OutputChan(P)$ represent the sets of input channel names and output channel names respectively, and $Chan(P) \stackrel{\mathrm{def}}{=} OutputChan(P) \cup InputChan(P)$.
A process $Q$ with $a$ and $b$ in its alphabet satisfies the two-phase handshake protocol on $(a, b)$ if
$$(Q \parallel HP(a, b)) = Q.$$
This condition will be signified by $\vdash_{a,b} Q$. Given this condition, one can show that for any process $R$, $HP(a, b)$ distributes into $(Q ; R)$ to give
$$(Q ; R) \parallel HP(a, b) = (Q \parallel HP(a, b)) ; (R \parallel HP(a, b)) = Q ; (R \parallel HP(a, b)).$$
A similar law will be used later in showing that $P ; Q \sqsubseteq M(P ; Q) \parallel D(P ; Q)$. We need to generalise $HP(a, b)$ in order to pass data in the handshake protocol, as shown in Figure 2(b) and Figure 2(c). One way to achieve this is to let $I$ be a finite set, $B$ be an $I$-indexed family of finite sets of events, and
$$A = \{(a(i), B(i)) \mid i \in I\}.$$
The two-phase handshake protocol $HP(A)$ on $A$ is then defined by generalising $HP(a, b)$ accordingly, and a process $Q$ is said to obey the two-phase handshake protocol on the set $A$ if
$$Q \parallel HP(A) = Q.$$
This condition will be referred to as $\vdash_A Q$.
We can now define the two-phase handshake refinement operator $\sqsubseteq_A$ as follows:
$$R \sqsubseteq_A S \iff (R \parallel HP(A)) \sqsubseteq (S \parallel HP(A)),$$
where $\sqsubseteq$ adopts the refinement ordering in the failures-divergences model of CSP. In other words, $R \sqsubseteq_A S$ means that $S$ behaves better than $R$ in any environment which obeys the handshake protocol $HP(A)$. One can show that, if $R \sqsubseteq_A S$ and $\vdash_A Q$, then
$$(Q \parallel R) \sqsubseteq (Q \parallel S).$$
Later we shall illustrate how this property can be used in replacing $D(P)$ by a sequential version, $SD(P)$, which has useful algebraic laws to simplify our derivation.
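The step from $R \sqsubseteq_A S$ and $\vdash_A Q$ to $(Q \parallel R) \sqsubseteq (Q \parallel S)$ can be sketched as a short calculation. The sketch assumes that $R \sqsubseteq_A S$ is read as $(R \parallel HP(A)) \sqsubseteq (S \parallel HP(A))$, in line with the informal description above, and that parallel composition is associative, symmetric and monotonic with respect to $\sqsubseteq$ for the processes involved. These are assumptions of the sketch rather than statements taken from the text.

```latex
\begin{align*}
Q \parallel R
  &= (Q \parallel HP(A)) \parallel R
     && \text{since } \vdash_A Q \\
  &= Q \parallel (R \parallel HP(A))
     && \text{associativity and symmetry of } \parallel \\
  &\sqsubseteq Q \parallel (S \parallel HP(A))
     && \text{assumed reading of } R \sqsubseteq_A S,\ \text{monotonicity of } \parallel \\
  &= (Q \parallel HP(A)) \parallel S
     && \text{associativity and symmetry of } \parallel \\
  &= Q \parallel S
     && \text{since } \vdash_A Q
\end{align*}
```

The calculation shows why the condition $\vdash_A Q$ matters: it is what allows a copy of $HP(A)$ to be introduced next to $R$ and removed again next to $S$.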

Variables
This section and the next two show how to construct communicating processes to model program variables and expressions of the source program $P$, and to transform $P$ into a network of communicating processes such that $P \sqsubseteq M(P) \parallel D(P)$. The master process $M(P)$ should retain the control structure of the source program $P$, while the data process $D(P)$ can be expressed as $Var(P) \parallel Exp(P)$, where $Var(P)$ and $Exp(P)$ implement respectively the variables and expressions of $P$.
Given that $VAR(P)$ denotes the set of all program variable names for $P$, we define $Var(P)$ as the parallel composition of all the processes $Var(x)$ representing program variables in the source program $P$:
$$Var(P) \stackrel{\mathrm{def}}{=} \parallel \{x : x \in VAR(P) : Var(x)\}.$$
$Var(x)$ models a program variable $x$ by providing a pair of channels $(x.req, x.val)$ for read access: after $x.req$ is activated, the value of $x$ should be available from the channel $x.val$. More precisely, under the side condition relating $Chan(Var(x))$ and $Chan(Q)$, $Var(x)$ should satisfy
$$(x.req! \rightarrow x.val?v \rightarrow Q) \parallel Var(x) = (v := x) ; (Q \parallel Var(x)).$$
Similarly, assigning a new value to $x$ can be achieved by communicating with the channels $(x.write, x.ack)$. Formally, under the same side condition, one should be able to show that
$$(x.write!v \rightarrow x.ack? \rightarrow Q) \parallel Var(x) = (x := v) ; (Q \parallel Var(x)).$$
Designing Correct Circuits, 1996
The application of these lemmas in proofs will be demonstrated in the appendix.
We use three components to implement $Var(x)$. $Cell(x)$ is the basic storage cell, and $RMux(x)$ and $WMux(x)$ are multiplexors allowing multiple users to connect to $Cell(x)$:
$$Var(x) \stackrel{\mathrm{def}}{=} Cell(x) \parallel RMux \parallel WMux.$$
Let us consider each of these components in turn. As explained in [6], a Boolean variable $x$ can be modelled by a communicating process $Cell(x)$ which models a storage cell. The process $Read(x)$ describes how the user can read the value of the variable $x$ using communications. The local variable $x$ in the process $Cell$ is used to hold the current value of the program variable $x$. It is required that the user of the process $Read$ must obey the handshake protocol on $(x.req, x.val)$ in order to avoid deadlock on these two channels.
The process $Write(x)$ describes how the value of $x$ is updated by its user process. Users of $Write(x)$ are also required to obey the handshake protocol on the channels $(x.write, x.ack)$.
As a state holder, the process $Cell(x)$ should be able to communicate with a number of users. To serve multiple-user requests, we treat the reading and writing actions as atomic. For this purpose we introduce multiplexors $RMux$ and $WMux$, where the index sets $I$ and $J$ are both finite. It is clear that the process $RMux$ is a legal user of the process $Read(x)$ since it obeys the handshake protocol on the channels $(x.req, x.val)$. Furthermore, the process $RMux$ complies with the handshake protocol on $(x.req_i, x.val_i)$ for every $i \in I$.
$Var(x)$ is associated with the alphabet
$$InputChan = \{x.req_i \mid i \in I\} \cup \{x.write_j \mid j \in J\}, \qquad OutputChan = \{x.val_i \mid i \in I\} \cup \{x.ack_j \mid j \in J\}.$$
Each user process of $Read(x)$ is allocated a pair $(x.req_i, x.val_i)$ of channels for accessing the variable $x$ via the multiplexor $RMux$, and in turn each user process is required to satisfy the handshake protocol over the corresponding channels. The users of $Write(x)$ can be treated in a similar way.

Expressions
Let $Exp(P)$ be the parallel composition of the expression processes required by the program $P$. The characterisation of an expression process is similar to that for a variable process: for any Boolean expression $b$ in $P$, and for any process $Q$ satisfying the side condition relating $Chan(Q)$ to $Chan(Exp(P) \parallel Var(P))$,
$$(b.req_i! \rightarrow b.val_i?v \rightarrow Q) \parallel D(P) = (v := b) ; (Q \parallel D(P)).$$
A Boolean expression can be modelled by a communicating two-phase module in the same way as a program variable.
For example, the evaluation of the expression $x \lor y$ can be described by the process
$$OR(x, y) \stackrel{\mathrm{def}}{=} \mathit{skip} \mathbin{\Box} \mathbf{var}\ w : (req? \rightarrow x.req_i! \rightarrow x.val_i?w \rightarrow ((val!w \rightarrow OR(x, y)) \lhd w \rhd (y.req_j! \rightarrow y.val_j?w \rightarrow val!w \rightarrow OR(x, y)))).$$
The module operates as follows. It receives an event from the $req$ channel, and it activates $x.req_i$, the $req$ channel of the two-phase module evaluating the $x$ operand. The result of the evaluation will be received from the $x.val_i$ channel, and its value will be assigned to the local variable $w$. If $w$ has the value true, then this value will be passed to the $val$ channel; otherwise the two-phase module evaluating the $y$ operand will be activated, and the value returned from it will be passed to the $val$ channel. Taking the multiple-user issue into account, we adopt the following definition for $(x \lor y)$:
$$Exp(or(x, y)) \stackrel{\mathrm{def}}{=} OR(x, y) \parallel RMux(req, val)$$
where $RMux(req, val)$ becomes the process $RMux(x.req, x.val)$ after proper channel renaming:
$$RMux(req, val) \stackrel{\mathrm{def}}{=} \mathit{skip} \mathbin{\Box} (\Box_{i \in I} (req_i? \rightarrow req! \rightarrow val?w \rightarrow val_i!w \rightarrow RMux(req, val))).$$
In general, a composite expression $b = b_1 \lor b_2$ can be defined in the same way as the expression $(x \lor y)$ except that the former will communicate with the processes $Exp(b_1)$ and $Exp(b_2)$ rather than $Var(x)$ and $Var(y)$. To avoid channel name clash among the expression processes, we will rename the channels $req$ and $val$ used in the process $Exp(b)$ to $b.req$ and $b.val$ respectively.
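One activation of the OR module can be sketched in Python to show the order of communications and the fact that the second operand is only requested when the first is false. This is an illustration under stated assumptions: the function name `or_module`, the callables `read_x` and `read_y` standing in for the operand modules, and the event strings appended to `log` are all hypothetical, and the channel indices are omitted for brevity.

```python
def or_module(read_x, read_y, log):
    """Simulate one activation of OR(x, y), recording events in `log`.

    `read_x` and `read_y` are zero-argument callables standing in for the
    two-phase modules that evaluate the operands.
    """
    log.append("req?")            # the module is activated
    log.append("x.req!")          # request the value of the first operand
    w = read_x()
    log.append(f"x.val?{w}")      # receive it into the local variable w
    if not w:
        # Only when x is false is the second operand evaluated.
        log.append("y.req!")
        w = read_y()
        log.append(f"y.val?{w}")
    log.append(f"val!{w}")        # return the result to the requester
    return w
```

When `read_x` returns `True` the log contains no events on the `y` channels, so the module evaluating the second operand is never activated in that case.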
To be able to execute the expression processes in parallel with the master control process, we must make sure that the processes representing expressions have disjoint sets of channels. In particular, since most expression processes may need to access the variable processes, the allocation of the channels $x.req_i, x.val_i$ turns out to be an important issue for the hardware compiler.
For simplicity we assume that there are index functions $RIdx$ and $WIdx$. For each variable process $Var(x)$, in addition to the channels used by the expression processes, the following set of channels
$$\{x.req_i, x.val_i \mid i \in RIdx(x)\} \cup \{x.write_j, x.ack_j \mid j \in WIdx(x)\}$$
is available at the disposal of the master process. For the expression process we adopt the similar convention that the set $\{b.req_i, b.val_i \mid i \in RIdx(b)\}$ of channels can be used by the master process.
To conclude this section, note that the set $V$ of two-phase handshake channels (equation 7) consists of data-read channels, data-write channels and expression evaluation channels.

Master control
This section will complete the first stage of our derivation of two-phase modules for Joy programs: to show that $P \sqsubseteq M(P) \parallel D(P)$. The task can be simplified if we exploit the regularity of handshake protocols to construct $SD(P)$, a sequential version of $D(P)$, which does not contain parallel composition. Since all users of $D(P)$ must follow the handshake protocol $HP(V)$, where $V$ is defined in equation 7, from lemma 3 we can replace $D(P)$ by $SD(P)$ within a handshake environment:
$$D(P) =_V SD(P). \qquad (8)$$
$SD(P)$ has two properties which are used extensively in our derivation:
$$SD(P) ; SD(P) = SD(P), \qquad (9)$$
$$(M(P) ; Q) \parallel SD(P) = (M(P) \parallel SD(P)) ; (Q \parallel SD(P)). \qquad (10)$$
The second equation shows the synchronised termination of $M(P)$ and $SD(P)$, a simpler version of which is given in lemma 2.
The construction of the master process is based on a translator M whose task is to replace each direct evaluation of the expression b by communicating with the process Exp(b) using the multiplexor RMux(b:req; b:val), and to replace every assignment to the variable x by communicating with the variable process using the multiplexor WMux(x:write; x:ack).
The master processes of the primitive commands have straightforward definitions:
$$M(\mathit{skip}) \stackrel{\mathrm{def}}{=} \mathit{skip}, \qquad (11)$$
$$M(x := e) \stackrel{\mathrm{def}}{=} \sqcap_{i \in RIdx(e),\, j \in WIdx(x)}\ \mathbf{var}\ v : (e.req_i! \rightarrow e.val_i?v \rightarrow x.write_j!v \rightarrow x.ack_j? \rightarrow \mathit{skip}).$$
The definition of $M(x := e)$ suggests that the choice of channels used to communicate with $Var(x)$ and $Exp(e)$ is irrelevant. This nondeterminism allows us later to allocate a specific pair $(i, j)$ of channel indices for implementing $M(x := e)$.
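The communication pattern of the master process for an assignment can be sketched in Python by pairing it with a very simple stand-in for the data process. The sketch rests on assumptions that are not part of the paper: the function name `master_assign`, the dictionary `store` playing the role of the program state held by the data process, and the callable `expr_eval` playing the role of the expression module are all hypothetical.

```python
def master_assign(expr_eval, store, x, i=0, j=0):
    """Simulate the master process for x := e using channel indices (i, j).

    `expr_eval(store)` stands in for the expression module for e, and
    `store` stands in for the variable modules. Returns the event trace.
    """
    trace = [f"e.req_{i}!"]              # ask the expression module for e
    v = expr_eval(store)                 # evaluated in the current state
    trace.append(f"e.val_{i}?{v}")       # receive the value into local v
    trace.append(f"x.write_{j}!{v}")     # send it to the variable module
    store[x] = v                         # the variable module updates its cell
    trace.append(f"x.ack_{j}?")          # wait for the write to complete
    return trace
```

Running the function with different index pairs changes only the channel names in the trace, not the resulting state, which is one way of seeing why a specific pair of indices can be fixed later without affecting the behaviour of the assignment.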
The master process of a composite program is formed by those of its individual components:
$$M(P ; Q) \stackrel{\mathrm{def}}{=} M(P) ; M(Q).$$
The master process for a conditional statement evaluates its Boolean guard by interacting with the related expression process, using a local variable $w$ that is not used in $M(P)$. Conditional statements with multiple branches are treated similarly, with a local variable $w$ that is not used in either $M(P)$ or $M(\mathbf{if}\ BG)$. An outline of the proof will be given in the appendix.

Specification of two-phase modules
For a source program $P$, we define the specification of its target circuit with request channel $r$ and acknowledgement channel $a$ as
$${}^{r}_{a}(P) = \mu X :: r? \rightarrow P ; a! \rightarrow X.$$
The associated correctness condition is
$${}^{r}_{a}(P) \sqsubseteq {}^{r}_{a}(M(P)) \parallel D(P).$$
The right-hand side of this formula is defined to be $C^{r}_{a}(P)$, the compilation function mentioned in definition 1. Note that $C^{r}_{a}(P)$ involves $M(P)$, a purely communication-based process without program variables or assignments.
Next, we shall demonstrate how to implement the process ${}^{r}_{a}(M(P))$ by a network of two-phase modules within an environment obeying both $HP(r, a)$ and $HP(V)$, where $V$ is the set of channels for variable-read, variable-write and expression evaluation (equation 7). We define
$$(R \sqsubseteq_{Env} S) \iff (R \parallel HP(r, a) \parallel HP(V)) \sqsubseteq (S \parallel HP(r, a) \parallel HP(V)).$$
The main objectives of our design are to preserve the modular structure of the source program, and to use a small number of two-phase modules.
We have already introduced a number of two-phase modules, such as Cell for implementing variables and OR for expression evaluation. The following describes three further examples; the implementation of two of these will be outlined in the next section.
First, the skip statement can be implemented by the $SKIP$ module, which has the property that
$$SKIP = r \rightarrow a \rightarrow SKIP.$$
Second, consider the assignment statement. Let $i \in RIdx(e)$ so that the channels $e.req_i$ and $e.val_i$ can be used for communicating with the expression evaluation module for $e$, and let $j \in WIdx(x)$ so that the channels $x.write_j$ and $x.ack_j$ can be used for communicating with the variable module for $x$. One can then show that the assignment statement can be implemented by a module $ASGN$, which should satisfy
$$ASGN = r? \rightarrow e.req_i! \rightarrow e.val_i?w \rightarrow x.write_j!w \rightarrow x.ack_j? \rightarrow a! \rightarrow ASGN.$$
The third example is sequential composition. If $VAR(P0) \cap VAR(P1) = \emptyset$, then
$${}^{r}_{a}(P0 ; P1) \sqsubseteq_{Env} {}^{r0}_{a0}(P0) \parallel SEQ \parallel {}^{r1}_{a1}(P1)$$
where $SEQ$ should satisfy
$$SEQ = r? \rightarrow r0! \rightarrow a0? \rightarrow r1! \rightarrow a1? \rightarrow a! \rightarrow SEQ.$$
This CSP description of the $SEQ$ module matches exactly the state diagram shown in Figure 3. Other two-phase modules, such as those for the conditional and iteration statements (Figure 1), can be developed in a similar way.

Implementation
The CSP description of two-phase modules serves two purposes. First, we use it as a behavioural specification for further refinement into circuits. Second, it can be regarded as a normal form which provides a basis for hardware/software partitioning in a codesign environment. Most of our work has been focused on hardware synthesis. To illustrate this approach, we outline an implementation within the CSP model. In practice, we use a separate hardware model [9] which is formally linked to CSP. Let us first introduce a CSP description $Wire(a, b)$ for a wire with input $a$ and output $b$. This description captures the behaviour of a wire, which becomes chaotic if it receives a second input before the first signal has propagated to the output. As expected, two wires can be connected into a single one: $Wire(a, b) \parallel_b Wire(b, c) = Wire(a, c)$. Since the two-phase modules communicate with two-phase protocols, the two-phase implementations of these modules are relatively straightforward [2]. For instance, it can be shown that the two-phase module $SKIP$ can be implemented by $Wire(r, a)$.
The two-phase implementation of $SEQ$ involves three wires:
$$SEQ \stackrel{\mathrm{def}}{=} Wire(r, r0) \parallel Wire(a0, r1) \parallel Wire(a1, a),$$
and one can show that this definition of $SEQ$ satisfies formula 15. This proof is an example which can be checked using an automatic tool such as FDR [4]. A four-phase implementation of $SEQ$ can be obtained from the two-phase specification as follows. Given that 4-2 and 2-4 are respectively converters that transform a four-phase protocol to a two-phase protocol and vice versa, we first generate a specification for the four-phase sequential composition operator by connecting converters to the two-phase version as shown in Figure 4. Automatic tools are then used to check that the four-phase implementation shown in Figure 5 satisfies this specification. Further details of this method can be found in [3].
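The three-wire implementation of SEQ can be illustrated with a small Python simulation in which each wire simply forwards an event from its input to its output. This is a deliberately simplified sketch: it ignores the chaotic behaviour of a wire that receives a second input too early, it does not model the four-phase version, and the names `make_wire`, `compose`, `run`, `SEQ` and `ENV` are assumptions introduced for illustration. The environment `ENV` stands in for two subprograms that acknowledge as soon as they are requested.

```python
def make_wire(inp, out):
    """A wire forwards an event on `inp` to an event on `out`."""
    return {inp: out}


def compose(*wires):
    """Put wires side by side; each input event may drive only one wire."""
    net = {}
    for wire in wires:
        for inp, out in wire.items():
            if inp in net:
                raise ValueError(f"input {inp!r} is driven twice")
            net[inp] = out
    return net


# SEQ as three wires, following the definition in the text.
SEQ = compose(make_wire("r", "r0"), make_wire("a0", "r1"), make_wire("a1", "a"))

# Assumed environment: each subprogram acknowledges its own request.
ENV = compose(make_wire("r0", "a0"), make_wire("r1", "a1"))


def run(network, env, start, stop):
    """Propagate `start` through the network and environment until `stop`."""
    trace = [start]
    event = start
    while event != stop:
        event = network[event] if event in network else env[event]
        trace.append(event)
    return trace
```

Under these assumptions, `run(SEQ, ENV, "r", "a")` visits the events in the same order as the CSP description of SEQ: the request, then the handshake with the first subprogram, then the handshake with the second, then the acknowledgement.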
A clocked implementation can be obtained using protocol converters that transform between two-phase handshaking and clocked control. This approach requires a clocked circuit model [9] based on the theory proposed by Verhoeff [13], and we have established a formal link between CSP and this clocked model using a Galois connection. The details are beyond the scope of this paper.

Concluding remarks
We have presented in this paper a systematic approach for deriving two-phase modules, an intermediate form for a compilation scheme that translates a simple imperative language into asynchronous and clocked implementations. The use of CSP enables us to structure our derivation into several stages, making the proofs modular and reusable. Our derivation is simplified by algebraic laws governing the operation and refinement of two-phase modules.
There are two significant extensions to our framework which have been developed. The first involves adapting our derivation to accommodate the extension of the source language to cover parallelism and communication [3]. This can be accomplished by including processes that implement channels in the source language, and by using a further translator, in addition to $M$, that introduces communication between the master control processes and the channel processes. The second extension involves optimising the hardware produced by our method. For instance, we have developed protocol converters for clocked implementations which can generate conventional clocked circuits [8] from designs with dual-rail encoded data.
Figure 5: A four-phase implementation for sequential composition. The element labelled 'D' is described in [7].