Verification of an Optimized Fault-Tolerant Clock Synchronization Circuit
Designing Correct Circuits, Båstad 1996

In previous work, we explored the interaction between different formal hardware development techniques in the implementation of a fault-tolerant clock synchronization circuit. This case study presents a clever optimization of the earlier design and illustrates how we have extended our framework to support incremental design refinement. The primary design tool represents circuits as systems of stream equations, where each stream corresponds to a signal within the circuit. These signals are annotated with invariants that can be established using proof by co-induction. The invariants are exploited to verify localized design refinements. This study lays groundwork for a more formal integration of disparate reasoning tools.


Introduction
A significant amount of effort within the formal methods community has been focused on how to verify hardware using particular verification systems. Not as much time has been spent asking what type of reasoning support is needed to formalize the design process. The case study presented here is subject to the same criticism. The exercise began as an attempt to derive a clock synchronization circuit using the Digital Design Derivation (DDD [1]) system. When it became clear that DDD could not handle all the development steps, the effort evolved into exploring how a mechanical theorem proving system (PVS [8]) could support the derivation process. Lost in the effort was the question of how best to use formal techniques in the hardware development process. Much of this process is tedious and does not require sophisticated reasoning support. Our mission as researchers in this field should be to identify the types of reasoning that are useful to this process, and to develop support for them. These efforts toward developing a verified synchronization circuit have led to the position that a formal design process should be as simple as possible, but should allow the designer sufficient flexibility to make aggressive optimizations to the design.
Early research in applied formal methods was concerned with basic questions of design and implementation correctness. We are now beginning to explore how formal reasoning interacts with intelligent design processes. This case study addresses a clever refinement to a design that is already well along the path toward realization. The goal is to preserve or reestablish the implementation's correctness while sustaining a secure verification path. The cleverness of the refinement lies in its exploitation of implicit properties of both the implementation and the environment in which it is to operate. Consequently, the formal characterizations of both the design and the implementation are extended.
Effective automated support of formal methods must accommodate multiple distinct modes of reasoning. One motive of this study is to explore heterogeneous reasoning and to contribute to the growing experience with it. Derivation-based formalisms (reasoning systems that employ transformations rather than logical inference) are relatively effective for routine design refinement. However, because they are dedicated to preserving specific refinement relations, they are not as general as deduction-based systems, where the implementation relation can be expressed within the formalism. The developers of the DDD system were confronted with this limitation when contrasting formal derivations of the FM8502 [4] and FM9001 [1,2] microprocessors with Hunt's proofs of correctness in the theorem prover Nqthm.

In particular, Hunt's implementation of a functional memory model by an explicitly synchronized process exposed a gap in the derivation path. While this particular kind of problem has been addressed [17,10], we believe that derivation gaps are an inevitable consequence of creativity in design and engineering.
On the other hand, generality hardly justifies the use of a theorem prover for all verification tasks. Even if one has somehow incorporated automatic provers and rewriters for lower-level tasks, we believe that reasoning environments should support a variety of reasoning formalisms. At some point, such a system may employ the more unified view of a logical framework, but, for the present, experience is needed in the coordinated use of multiple interactive systems.
The technique presented here to augment DDD-style derivation with PVS theorem proving support is not restricted to either DDD or PVS. Essentially, we present an effective means to establish invariants on signals within a circuit so that we can verify context-dependent optimizations of a circuit design. These invariants (assertions on signals) are established using co-induction, as is the verification of the optimization. Our goal is to develop a formalized design environment that supports annotation of signals with invariants (and handles all the associated bookkeeping), so that a designer can explore various optimizations in a rigorous manner.

Related Work and Prior Developments
In the study described here, the DDD system provides mechanized support for behavioral and structural transformations, while PVS supports theorem proving activities. The interaction between these systems requires manual support at present, although we are laying the groundwork for mechanizing it. We view DDD and PVS as examples of autonomous reasoning peers. They are each useful tools in a formal hardware development process.
A case study by O'Leary, Leeser, Hickey and Aagaard, outlining the verification of a binary non-restoring square root implementation, reflects a contrasting perspective on heterogeneous reasoning [3]. The design and its proof of correctness develop through several stages of program transformation before a structural description emerges. These structures are then refined in several stages toward a realizable hardware description. Their study, like ours, exhibits both derivational and deductive reasoning processes, but within the unified framework of Nuprl. We believe the authors would argue in favor of such a framework as a prerequisite for heterogeneous reasoning.
Miner presents a verified class of fault-tolerant clock synchronization algorithms and an informal sketch of a hardware realization [5]. The general algorithm was verified using the mechanized proof system EHDM [11]. A hardware realization of the verified algorithm was developed and tested, but the hardware was not formally verified with respect to the algorithm [6,14]. In [7], a core circuit design was developed using a variety of formal reasoning systems. The circuit was developed using a combination of the Prototype Verification System (PVS) developed at SRI [8], the DDD system developed at Indiana University [1], and a BDD-based tautology checker. This formal development identified a small improvement over the original design. While that work was in progress, Torres-Pomales identified a more significant improvement [15]. This paper explores how Torres-Pomales' optimization can be incorporated into the formal design framework proposed in [7]. This optimization trades space for time in a clever manner and led us to the conclusion that a formal design environment should be flexible enough to accommodate engineering insight. The argument justifying the optimization involves arithmetic reasoning, including integer division, so it requires more than simple transformational techniques and also places the optimization outside the realm of model-checking approaches.
Specifically, we look at how the optimization can be incorporated with minimal impact on the surrounding proof. In essence, we want a transformation rule in DDD that allows a subsystem to be replaced by a behaviorally equivalent variant, as established in PVS. To accomplish this, we need the means to transfer expressions between DDD and PVS, a theory of streams built into PVS, and a mechanism for sanctioning ad hoc transformations in DDD.
Since the goal of our verification activities is developing working hardware, a VLSI implementation of the formally developed circuit design has been fabricated and tested. The circuit layout was manually generated using conventional design tools, so the link between the fabricated circuit and the design is not completely formal. The VLSI realization has been incorporated into the full fault-tolerant clock synchronization system described by Torres-Pomales [15]. The circuit worked perfectly on all tests.

Verification Strategy
In order to carry out the verification reported here, we needed to identify what role each formal system would play in the development process. Ideally, the hardware development should not depend directly upon a general purpose proof system. Most theorem proving systems require a great deal of experience before they can be used effectively. However, such a system is indispensable in exploring the reasoning required for a formalized design process.
The principal design tool is DDD, and we wish to use it in a manner that minimizes the application of general purpose verification activities. For the effort reported here, we used a shallow embedding of DDD's representation of hardware in PVS. Our primary motivation was the justification of custom refinements, not reasoning about the DDD approach to hardware design.
Figure 1 depicts our view of the design hierarchy and illustrates where each tool contributes to the design process. At the top-most level are mathematical properties that the resulting design must satisfy. A general purpose mechanical theorem proving system supports the verification of algorithms that ensure these requirements. The verified algorithm should be as general as is reasonably possible, so that the design space is not unduly restricted. The verified algorithm is (manually) translated to a DDD specification. This specification is refined within DDD and then transformed into an architecture. A sequence of basic DDD transformations, augmented with localized PVS verifications, determines the final architecture. At the final stage, DDD maps the abstract representations of data into boolean representations. Additional transformations may be applied prior to using conventional design tools to produce a physical realization of the design.

Overview of DDD
DDD implements a formal design algebra for developing correct digital circuit descriptions. The designer interactively transforms high level behavioral specifications into a description suitable for entry into hardware synthesis tools. The top level describes the intended behavior of the circuit using a collection of mutually recursive function definitions in tail-form. Each function corresponds to a control state, and arguments to the functions represent the visible storage elements in the design. Transformations at this level allow the designer to modify the control structure of an architecture while preserving functional correctness, relative to synchronization constraints. Once the control structure is determined, DDD automatically transforms the behavioral specification into an initial architectural level description.
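To illustrate the flavor of such a specification, the following is a hypothetical Python rendering (not DDD syntax) of a tiny tail-form behavioral description: each function models a control state, its parameters model the visible storage elements, and each tail call consumes one input, modeling one clock step.

```python
# Hypothetical sketch of a DDD-style tail-form specification in Python:
# a two-state countdown machine with control states LOAD and RUN.

def load(inputs, count):
    """Control state LOAD: latch the next input into the counter."""
    if not inputs:
        return count
    x, rest = inputs[0], inputs[1:]
    return run(rest, x)            # tail call models a state transition

def run(inputs, count):
    """Control state RUN: count down; return to LOAD when exhausted."""
    if not inputs:
        return count
    rest = inputs[1:]
    if count == 0:
        return load(rest, count)   # counter exhausted: back to LOAD
    return run(rest, count - 1)

# Load the value 3, then count down over the remaining three steps.
print(load([3, None, None, None], 0))
```

Behavior-preserving transformations at this level correspond to restructuring the control states and tail calls without changing the input/output relation.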
DDD represents the structure of a digital system using a system of mutually recursive stream equations. A stream in DDD is an infinite sequence of uniformly typed values, X = [x0, x1, x2, ...]. The stream constructor cs adds an element to the front of the sequence: cs(z, X) = [z, x0, x1, x2, ...]. Function cs models a delay element with initial value z. Functions extend to sequences, so that f(X, Y) denotes [f(x0, y0), f(x1, y1), ...]. DDD uses a system of equations to define a network of streams; recursion in such systems represents feedback in the circuit. For example, the following stream equation defines a loadable counter circuit:

  COUNT = cs(i, MUX(S, L, INC(COUNT)))
where i is the initial integer value of the counter, INC and MUX are the increment and selection functions lifted to streams, and streams S and L are the multiplexor select signal and load input, respectively. Within DDD, free variables in a system of stream equations are bound by a system-level abstraction. This abstraction defines the input signals for the circuit. The circuit's output is a subset of the named streams in the system of equations.

Overview of PVS
PVS is a general purpose verification system developed at SRI International [8]. It consists of an expressive specification language coupled with a powerful mechanical theorem prover. The PVS specification language is based on higher-order logic. The base types include the booleans and the real numbers. The language includes predicate sub-types. The other numeric types are defined as sub-types of the reals. One consequence of introducing predicate sub-types is that the resulting type system is undecidable. Thus, PVS automatically generates proof obligations called type-correctness conditions (TCCs) when it type-checks a theory. Theories can be parameterized, providing some support for parametric polymorphism. PVS also allows for dependent types and several standard computer science type constructors such as records, tuples, and lists. PVS includes a prelude theory that defines a large collection of useful results. User defined libraries provide a mechanism to extend PVS with domain-specific theories.
PVS provides an interactive theorem proving environment using a sequent calculus presentation of the proof goals. The prover includes decision procedures for ground linear arithmetic and equality. There is a strategy language similar to LCF-style tactics; thus, the user can define high-level proof procedures. There are several powerful strategies distributed with PVS that automatically verify a large number of results. PVS allows the user to prove lemmas in any order. It maintains a proof dependency analysis to ensure that all obligations have been discharged. Included in the analysis is an enumeration of all axioms used by the proof chain.

Reasoning about Streams in PVS
Recursive stream definitions are not directly supported by PVS; it was necessary to identify a mechanism to allow such objects to be defined. Although streams over a type T can be represented as functions from the natural numbers to T, this representation does not lend itself to direct definition by stream equations. The equational style of definition illustrates that streams can be viewed as a co-inductive type. Just as inductively generated types give rise to recursive function definitions and proofs by induction, co-inductive types allow for definition by co-recursion and proofs by co-induction [9]. Co-induction is a categorical dual of induction: induction principles are justified using least fixed point arguments, while co-induction principles are justified using greatest fixed point arguments. Although the underlying formal basis of co-induction is an interesting area of study, our work is primarily concerned with the application of these techniques to hardware verification.

Stream Definition
Streams in PVS are defined as a parameterized uninterpreted type constrained by a set of axioms. For X, Y, S of type Stream[T], and a : T, the following axioms hold:

  hd_cs: AXIOM hd(cs(a, X)) = a
  tl_cs: AXIOM tl(cs(a, X)) = X

In the definition of the counter COUNT(S, L, i), the state of the co-recursion is instantiated using a tuple type and the stream element type is instantiated with type integer. The following two facts are easily proven about COUNT.
  hd_COUNT: LEMMA hd(COUNT(S, L, i)) = i
  tl_COUNT: LEMMA tl(COUNT(S, L, i)) =
              COUNT(tl(S), tl(L), mux(hd(S), hd(L), inc(i)))

The proofs consist of expanding the definition of COUNT followed by rewriting with the stream axioms. To simplify subsequent proofs, we adopt the convention that for every stream defined in PVS, we introduce lemmas simplifying its hd and tl. The next section introduces a proof principle that simplifies proofs of stream equality.
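The two lemmas can be checked on finite prefixes with a small executable model (a Python sketch of our own; finite lists stand in for the infinite streams, and mux is the obvious two-way selector, not a PVS definition):

```python
def mux(s, l, c):
    """Two-way selector: the load value l when s is asserted, else c."""
    return l if s else c

def COUNT(S, L, i):
    """List-based model of the loadable counter stream."""
    out, v = [], i
    for s, l in zip(S, L):
        out.append(v)              # current value is emitted first
        v = mux(s, l, v + 1)       # then loaded or incremented
    return out

S = [False, True, False, True]
L = [0, 7, 0, 3]

# hd_COUNT: the head of the counter stream is its initial value.
assert COUNT(S, L, 5)[0] == 5
# tl_COUNT: the tail is a counter over the tails of the inputs, started
# from mux(hd(S), hd(L), inc(i)).
assert COUNT(S, L, 5)[1:] == COUNT(S[1:], L[1:], mux(S[0], L[0], 5 + 1))
```

On infinite streams the same two facts are exactly what the PVS lemmas state; the finite check only illustrates them.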

Stream Equivalence
Definition of streams using co-recursion enables a useful technique for proving two streams equal. A stream bisimulation R is a sub-relation of the equality relation such that, for any two streams x and y, if x R y then hd(x) = hd(y) and tl(x) R tl(y). In PVS, the type of bi-simulations between streams over T is declared as a predicate sub-type, and PVS automatically generates proof obligations for any object declared to be of this type. The theorem co_induct provides a tool for proving stream equivalence by exhibiting a suitable bi-simulation: if some bi-simulation relates streams X and Y, then X = Y. Given a candidate relation, all that remains is to show that the relation satisfies the type constraints of a bi-simulation. Take an arbitrary pair that is in the relation.

Heads:
The PVS proof for this case consists of rewriting with lemma hd_COUNT and axiom hd_cs.

□
We have written a PVS strategy named (CO-INDUCT-AND-SIMPLIFY) that completely automates the above proof steps. This strategy suffices to automatically discharge several standard stream identities.
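An executable analogue conveys the idea (a Python sketch under our own modeling choices, not the PVS strategy itself): co-induction proves two infinite streams equal by relating their states; observationally, two machines related by a bisimulation agree at every step. Here we relate the counter as a state machine to one unrolling of its fixpoint equation COUNT = cs(i, MUX(S, L, INC(COUNT))).

```python
from itertools import islice

def counter_direct(S, L, i):
    """The counter as a state machine: emit the state, then update it."""
    v = i
    for s, l in zip(S, L):
        yield v
        v = l if s else v + 1

def counter_unrolled(S, L, i):
    """One unrolling of cs(i, MUX(S, L, INC(COUNT))): emit i, then the
    muxed increment of the previous output."""
    yield i
    prev = i
    for s, l in zip(S, L):
        prev = l if s else prev + 1
        yield prev

S = [False, True, False, True]
L = [0, 9, 0, 2]
# The two formulations agree on every common prefix, as the
# bisimulation argument guarantees for the infinite streams.
n = len(S)
assert list(islice(counter_direct(S, L, 0), n)) == \
       list(islice(counter_unrolled(S, L, 0), n))
```

The prefix check is only an illustration; the co-inductive proof establishes equality of the full infinite streams in one step.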

Signal Invariants
In order to justify some refinements, it is necessary to establish invariants on the input signals. The following PVS theory fragment defines the predicate Invariant to be true for any boolean-valued stream that is true at every finitely accessible point. At first glance, this appears to be a useless definition. However, when used in conjunction with PVS's dependent type mechanism, it provides a useful means to define an invariant relating a collection of signals.

S(R): TYPE = {A| Invariant(IF R THEN NOT tl(A) ELSE A => tl(A) ENDIF)}
A stream of type S(R) corresponds to a signal that once asserted remains asserted, unless it is reset by boolean stream R.
Theorem co_induct provides a mechanism for proving that an arbitrary boolean-valued stream is always true. To establish an invariant property about a stream, it is sufficient to show that the property is contained in some co-inductive assertion. The PVS strategy (CO-INDUCT-AND-SIMPLIFY) automatically verifies many invariant properties.
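The pointwise reading of the S(R) type can be checked on finite traces (an illustrative Python sketch of our own): wherever R is asserted, the signal must be deasserted at the next step; otherwise an asserted signal stays asserted.

```python
def satisfies_S(A, R):
    """Check, pointwise on a finite trace, the body of the S(R) type:
    IF R THEN NOT tl(A) ELSE A => tl(A)."""
    for t in range(len(A) - 1):
        if R[t]:
            if A[t + 1]:                 # a reset must deassert the signal
                return False
        elif A[t] and not A[t + 1]:      # once asserted, stays asserted
            return False
    return True

assert satisfies_S([False, True, True, True], [False] * 4)       # latches
assert not satisfies_S([False, True, False, True], [False] * 4)  # drops without reset
assert satisfies_S([True, True, False, False], [False, True, False, False])
```

In the PVS development the same condition is stated once over the infinite streams and discharged by co-induction rather than by enumeration.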

Fault-Tolerant Clock Synchronization
In a fault-tolerant computer architecture, the clocks of the redundant computing elements need to be synchronized to ensure that they operate in a coordinated manner. The synchronization algorithm must also tolerate a bounded number of failures. The property that a synchronization algorithm must ensure is that, for any two clocks C_p and C_q that are nonfaulty at time t,

  |C_p(t) - C_q(t)| <= delta

for some fixed skew bound delta. Clock synchronization algorithms are designed so that, by periodically exchanging values of clocks and executing a fault-tolerant averaging function, the above property is guaranteed.
Schneider [12] demonstrates that many fault-tolerant clock synchronization algorithms can be treated as refinements of a general protocol. Shankar [13] and Miner [5] have provided mechanically checked proofs of Schneider's paradigm. The protocol involves N clocks and tolerates up to F faults; usually, N > 3F. We use Theta to denote a collection of readings from clocks in the system. In the mechanically verified theory, Theta is a function from clock indices to clock readings.

R - the nominal duration of a synchronization interval.
cfn - a convergence function that must satisfy three properties:
Translation Invariance: The function depends only on the relative magnitudes of the readings, not their absolute magnitudes.
Precision Enhancement: For any two good clocks with similar estimates of other clocks' values, the results of computing the convergence function are similar.
Accuracy Preservation: If the readings from good clocks are sufficiently similar, then the computed value of the convergence function is close to all good clocks.

The fault-tolerant midpoint

  cfn(Theta) = floor((Theta^(F+1) + Theta^(N-F)) / 2),

where Theta^(m) denotes the mth largest value in collection Theta, employed in the Welch and Lynch [16] clock synchronization algorithm, possesses the required properties of a convergence function [5].
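A short executable sketch of this function (Python, with illustrative readings; integer floor division models the floor):

```python
def cfn(readings, F):
    """Fault-tolerant midpoint: average of the (F+1)th and (N-F)th
    largest readings, which discards the F highest and F lowest."""
    s = sorted(readings, reverse=True)       # s[m-1] is the mth largest
    N = len(readings)
    return (s[F] + s[N - F - 1]) // 2

readings = [10, 12, 11, 100, 3]              # one fast outlier, one slow
print(cfn(readings, 1))                      # midpoint of 12 and 10 -> 11

# Translation invariance: shifting every reading shifts the result.
assert cfn([r + 5 for r in readings], 1) == cfn(readings, 1) + 5
```

Note that with F = 1 the outliers 100 and 3 have no influence on the result, which is the source of the Byzantine fault tolerance.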
In previous work [7], we developed a hardware realization of this verified algorithm using a combination of formal design techniques. The verified algorithm was manually translated into a DDD behavioral level specification. A standard technique was chosen for exchanging values between the redundant clocks. At a fixed offset into each synchronization interval, a signal is broadcast to the other participants in the protocol. The estimate for a remote clock's value is computed by determining the difference between the expected offset for receiving this signal and the actual offset when it is received. Using standard DDD transformations, an ad hoc refinement verified using PVS, and BDD-based tautology checking, we developed a hardware description suitable for realization using a field-programmable gate array.
In a separate effort, Torres-Pomales [15] discovered a more efficient realization of the core synchronization circuit. We were faced with the problem of how to incorporate this optimization into our existing verification. We isolated the sub-circuit affected by the optimization, and then verified a localized refinement with respect to the existing design description. The two registers capture the current value of the counter when the surrounding hardware receives signals from appropriately selected remote clocks. The verification discussed in [7] establishes that capturing just two readings in each synchronization interval is sufficient for correct execution of the algorithm. The proof technique employed for the ad hoc refinement in the earlier effort was not easily generalized. The difficulties encountered led us to refine our proof technique for verifying these custom transformations.

Optimization
Torres-Pomales discovered that the convergence function has a much more efficient realization [15]. He recognized that he could exploit the time interval between the (F+1)th and (N-F)th signals to partially compute the convergence function. His optimization consists of capturing the (F+1)th reading as before, but then incrementing the captured value every other clock tick until the signal from the (N-F)th clock arrives. At that point the stored value is exactly the required value of the convergence function.
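The arithmetic behind the optimization can be checked in a few lines (a Python sketch under our reading of the scheme; which tick parity receives the increments is what the circuit's parity input selects, and one choice is shown here):

```python
def optimized(theta_f1, theta_nf):
    """Capture the (F+1)th reading, then increment it every other tick
    until the (N-F)th signal arrives."""
    v = theta_f1
    for tick in range(theta_nf - theta_f1):
        if tick % 2 == 1:                  # increment on alternate ticks
            v += 1
    return v

# The incrementally computed value equals the floor midpoint of the two
# captured readings, i.e. the convergence function restricted to them.
for a in range(20):
    for b in range(a, 20):
        assert optimized(a, b) == (a + b) // 2
```

The stored value thus finishes the division by two for free: half of the elapsed ticks between the two signals is exactly the distance from the first reading to the midpoint.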
The next section outlines a technique to transform our previous design into Torres-Pomales' design. The optimized convergence function is depicted in Figure 3. This optimization requires assumptions about the input signals.

Verification
The original circuit (Figure 2) is described by the following collection of stream equations:

  THETA-F1 = cs(i, MUX(F1, RD, THETA-F1))
  THETA-NF = cs(i, MUX(NF, RD, THETA-NF))

The optimized circuit (Figure 3) is described by these stream equations:

  HOLD = cs(b, F1 AND NOT HOLD)
  CIN  = HOLD AND NOT NF

The input streams are constrained: stream NF cannot be asserted until after F1 is asserted. These type constraints are the same as presented in the statement of Theorem Optimize_correct. Also, boolean variable b is constrained to equal odd?(i + j) whenever hd(F1) is asserted and hd(NF) is not. Under the same conditions, j + 1 = hd(RD). These restrictions on b and j in the bi-simulation essentially state invariants about HOLD and THETA-NF during the sub-interval i2 depicted in Figure 4. In addition, the current state of the optimized sub-circuit is functionally related to the current state of the original circuit. These invariant properties constitute the primary reason that this refinement is correct. All of these constraints are used in the following proof by co-induction.
Proof: (of Optimize_correct) A co-inductive proof subdivides into two major cases. The first case consists of showing that the pair of streams to be proven equal is included in the candidate bi-simulation. The second case consists of showing that the given relation is indeed a bi-simulation. To show that this pair of streams is in B, we instantiate the existentially quantified variables of B appropriately.
The rest of the proof involves satisfying the type correctness conditions generated by PVS when we instantiated the variables constrained by dependent type declarations. The correctness conditions for RD, F1, and NF follow from the fact that these streams satisfy an invariant. The correctness of the instantiations for b and j depends on the fact that RD is the output of a counter and that F1 is asserted before (or simultaneously with) NF.

□
The above argument may seem difficult. However, the only real difficulty lies in determining the appropriate invariants for the input streams and state variables. Once these are correctly chosen, the proof of stream equality by exhibiting a bi-simulation is mostly mechanical. This approach to verifying a local replacement forced us to focus directly on the mathematical justification for the replacement. The routine aspects of the verification are discharged in a mechanical fashion.

Establishing Invariants
The verification presented above is only valid if the input signals satisfy the corresponding invariants. These can also be established using a co-inductive proof. The signal RD is generated by the sub-circuit shown in Figure 5, and its behavior is described by a stream equation. The invariant on RD is proven automatically using the PVS strategy (CO-INDUCT-AND-SIMPLIFY). The requirements on F1 and NF are discharged in a similar manner. Since these proof obligations are often discharged automatically, it would be more productive to add a function to the derivational system that attempts a simple co-inductive proof prior to generating the necessary PVS theories. There is no need to use the power of a general purpose prover when a simple function added to the design tool can provide the same level of assurance.

Concluding Remarks
Optimizations of hardware designs often exploit implicit properties of the surrounding system. Approaches to formal hardware development need to include an effective means to represent and reason about changes in an evolving hardware design. Derivation-based formalisms provide a suitable framework for managing routine design refinements, but cannot be expected to cover the entire design space. General purpose theorem proving systems, on the other hand, provide sufficient generality to capture arbitrary design refinements, but can be cumbersome for the more routine aspects of design. Formal design environments need to strike a balance between the two extremes.
In this paper, we presented a scenario where the generality of a general purpose proof system is necessary to complete the verification. However, the verification effort also identified a plausible extension to a derivational reasoning system. It is possible that a trivial bi-simulation is sufficient to justify an ad hoc refinement; in this case, it is not necessary to use the full power of a mechanized proof system. A simple extension to the derivation system could automatically attempt a trivial co-inductive proof: if it succeeds, the refinement is allowed; otherwise, the necessary proof obligation is generated. In cases of design modifications resulting from clever engineering insight, the verification strategy presented here focuses effort on the mathematical justification for the refinement. The mundane aspects of the verification are handled automatically.

Figure 2:
Figure 2 illustrates the core circuit computing the convergence function presented in [7]. The signal RD is the output of a counter. The signals F1 and NF are boolean-valued signals that indicate receipt of a synchronization signal from at least F+1 and N-F distinct participants in the protocol, respectively. Here, N represents the number of clocks in the protocol and F represents the number of physical faults tolerated by the system. By discarding readings from the F fastest and F slowest clocks, this protocol can tolerate F Byzantine failures.