A Framework for the Specification of Active Rule Language Semantics

We present a formal framework that can be used to specify and study a number of different semantics for rule execution in active databases. We shall consider the core of several active rule languages that are already available (e.g., Ariel, Starburst and HiPAC) but whose rule execution is specified only by informal descriptions. The framework is based on a generic active rule language and relies on a transaction rewriting technique. This technique takes a user defined transaction, which is viewed as a sequence of basic database updates forming a semantic unit, and translates it into a new transaction that explicitly includes the additional updates due to active rule triggering. We show that this framework provides a basis for the theoretical analysis and the comparison of different execution models of active rules. Moreover, it allows us to formally investigate a number of important issues related to active rule processing, such as transaction equivalence, confluence and optimization, independently of a specific rule execution model.


Introduction
Active rules allow the user to specify actions to be taken automatically when certain events occur and some conditions are met. It is now widely recognized that active rule processing is a powerful mechanism for the management of several important database activities (e.g., constraint maintenance and view materialization [3,4]), and for this reason active databases have been extensively investigated and experimented with in recent years (see [10,13] for extensive bibliographies). A number of active database systems are already available: among others we recall Ariel [9], Starburst [15], Postgres [14] and HiPAC [5,11]. Each of these systems uses a different semantics for rule application. The main differences stem from the choice of when rules should be fired (e.g., within the original transaction, at the end of the transaction, or in another transaction), how they should be fired (e.g., in parallel, sequentially, or according to a fixed rule order), and the granularity of the computation (e.g., set-oriented or tuple-oriented). Other differences concern the complexity of specifiable events, conditions and actions, as well as the binding mechanisms (if any) between the various components of an active rule. However, a crucial point is that in the various approaches active rule execution is generally specified only by informal, natural-language descriptions. As a result, when the number of rules increases, active rule processing very often becomes complex and unpredictable [10].
In this paper we present a formal framework, based on a new approach to active rule processing [12], that allows us to formally describe a variety of different active rule semantics. We show that this framework can be used as a basis for the theoretical analysis and the comparison of different execution models for active rules. Moreover, it allows us to formally investigate a number of important issues related to active rule processing, such as transaction equivalence, confluence and optimization, independently of a specific rule execution model.
The framework is based on a rewriting technique, to be performed at compile time, that takes as input a user defined transaction $T$ and a set of active rules expressed in a generic language, and produces a new transaction $T'$ that "embodies" active rule semantics, in that $T'$ explicitly includes the additional updates due to active processing. It follows that the execution of the new transaction in a passive environment corresponds to the execution of the original transaction within the active environment defined by the given rules. The compilation procedure has two parameters that allow us to specify: (1) how the triggered rules are interleaved with the updates of the user's transaction, and (2) the order in which triggered rules are selected for execution. We show that the active rule execution strategies of most active database systems can be described by just adjusting these two parameters. Other approaches consider rewriting techniques [8,14], but they usually apply in a restricted context or are not formal. By contrast, we believe that this formal and simple approach can improve the understanding of several active concepts and make it easier to establish results. In fact, the execution model of our transactions is based on a relational transaction model which has been extensively investigated [1]. This allows us to use known results on transaction equivalence for a formal investigation of important properties of active rule processing. First, we can check whether two transactions are equivalent in an active database. Then, due to the results on transaction equivalence, we are also able to provide results on confluence [2]. Finally, optimization issues can be addressed. We point out that with this approach the various results on active rule processing are independent of the specific rule execution model. As a final remark, we note that the proposed technique does not require any specific run-time support, and therefore it can be easily implemented on top of a traditional database management system.
The work of this author has been partially supported by the EU Human Capital and Mobility grant N. ERBCHBGCT930365 "Compulog-group".
The rest of this paper is organized as follows. In Section 2 we define the basic framework that refers to a generic active database language. In Section 3 we show how this framework can be used to specify a number of active rule language semantics. In Section 4 we report on general results that can be derived for active rule processing. Finally, in Section 5, we summarize some conclusions. Because of space limitations, several technical details are omitted in this paper.

The basic framework

Preliminary notions
Let us fix a generic database scheme $R = \{R_1(X_1), \ldots, R_n(X_n)\}$, where the $R_i$'s are relational schemes and the $X_i$'s are sets of attributes $A_1, \ldots, A_k$. We then denote by $s$ an instance over $R$, that is, a set of relations $s = \{r_1, \ldots, r_n\}$ over $R_1, \ldots, R_n$ respectively, and by $Inst(R)$ the set of all possible instances over $R$.
A relational atom over a scheme $R(A_1, \ldots, A_k)$ is an object of the form $R(t_1, \ldots, t_k)$ where $t_i$, for $i = 1, \ldots, k$, is a term, that is, a variable or a constant. A built-in atom is an object of the form $t_1 \,\theta\, t_2$, where $t_1$ and $t_2$ are terms and $\theta$ is a comparison operator. A literal is an atom (positive literal) or a negated atom (negative literal). A relational atom $R_i(t_1, \ldots, t_k)$ without variables is true in a database instance $s = \{r_1, \ldots, r_n\}$ if $R_i(t_1, \ldots, t_k) \in r_i$. A condition is a set of literals. A valuation of a condition $C$ is a substitution of the variables in $C$ with constants that makes all the atoms in $C$ true.
An action is a relational atom preceded by one of the symbols $\{+, -\}$. An update $U$ has the form $A[C]$, where $C$ is a condition and $A$ is an action such that all the variables occurring in it also occur in $C$. An update is executed for those tuples (if any) that verify the specified condition. Specifically, the effect of an update $U$ is a function $EFF(U)$ that maps instances over $R$ to instances over $R$. Note that, for the sake of simplicity, we do not consider modify operations here.
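As an illustration of how an update $A[C]$ acts on an instance, the following Python sketch evaluates the valuations of the condition and then inserts or deletes the corresponding tuples. It is only a minimal sketch under our own assumptions: the tuple encodings of literals and updates, the Datalog-like convention that capitalized strings are variables, and the names `valuations` and `eff` are all hypothetical, and every variable is assumed to be bound by a positive relational literal.

```python
import operator

# Hypothetical encoding (not from the paper):
#   ("rel", name, args)  positive relational literal
#   ("not", name, args)  negated relational literal
#   ("cmp", op, t1, t2)  built-in atom
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def is_var(term):
    # Assumption: capitalized strings are variables, anything else is a constant.
    return isinstance(term, str) and term[:1].isupper()

def match(args, tup, binding):
    """Extend `binding` so that `args` equals `tup`, or return None."""
    new = dict(binding)
    for a, v in zip(args, tup):
        if is_var(a):
            if new.setdefault(a, v) != v:
                return None
        elif a != v:
            return None
    return new

def subst(args, binding):
    return tuple(binding[a] if is_var(a) else a for a in args)

def valuations(condition, instance):
    """All substitutions that make every literal of the condition true."""
    bindings = [{}]
    for lit in condition:
        if lit[0] == "rel":
            _, name, args = lit
            extended = []
            for b in bindings:
                for tup in instance[name]:
                    m = match(args, tup, b)
                    if m is not None:
                        extended.append(m)
            bindings = extended
    for lit in condition:
        if lit[0] == "not":
            _, name, args = lit
            bindings = [b for b in bindings if subst(args, b) not in instance[name]]
        elif lit[0] == "cmp":
            _, op, t1, t2 = lit
            bindings = [b for b in bindings if OPS[op](*subst((t1, t2), b))]
    return bindings

def eff(update, instance):
    """Return a new instance; the condition is evaluated on the old one."""
    sign, (name, args), condition = update
    new = {r: set(ts) for r, ts in instance.items()}
    target = new.setdefault(name, set())
    for b in valuations(condition, instance):
        if sign == "+":
            target.add(subst(args, b))
        else:
            target.discard(subst(args, b))
    return new
```

For example, the update `("-", ("emp", ("N", "D")), [("rel", "emp", ("N", "D")), ("not", "dept", ("D",))])` deletes every `emp` tuple whose department does not appear in `dept`.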

Transactions and active rules
We define a user transaction as a collection of updates forming a semantic atomic unit for recovery and concurrency purposes: $T = U_1; \ldots; U_k$. The effect of $T$ over an instance $s$ is then defined as follows: $EFF(T)(s) = EFF(U_1) \circ \cdots \circ EFF(U_k)(s)$. An event is a relational atom preceded by one of the symbols $\{\oplus, \ominus\}$. An event denotes the fact that a certain update operation has been performed on a database instance. Thus, the execution of the update $+R_i(X, Y)[X = a]$ generates the event $\oplus R_i(a, Y)$.
5th International Workshop on Database Programming Languages, Gubbio, Italy, 1996
An active rule has the form: $L : (E_1 \lor \cdots \lor E_x)\, C \to A_1, \ldots, A_y$, where $L$ is the label that is used to express priority, $(E_1 \lor \cdots \lor E_x)$ is a disjunction of events, $C$ is a condition and $A_1, \ldots, A_y$ is a sequence of actions such that: (1) each variable that occurs in a negative literal in the condition also occurs in a positive literal or in the event part, and (2) each variable that occurs in the action also occurs in the condition part. An active program $P$ is a set of active rules. An active database is a pair $(s, P)$ where $s$ is a database state and $P$ is an active program. The above rule language includes the core of the languages used in the main active database systems (Ariel, Postgres, Starburst and HiPAC).
The intuitive semantics of the above active rule is: "if any of the events $E_1, \ldots, E_x$ has occurred and there is no triggered rule with priority higher than $L$, evaluate the condition $C$ and, if it holds, perform the actions $A_1, \ldots, A_y$ using the bindings of the event and the condition parts." There are some important points that should be further specified: (1) the temporal relationship between the execution of the various components of a rule, (2) the parameters that can be passed from the event and the condition parts to the action part, (3) the order in which the rules are fired when a conflict occurs. The various systems may vary considerably on these points. We will show, however, that with the above generic language and with the transformation technique presented in the next section we are able to express the main cases.

Transaction transformation
The key point in our approach is that the behavior of an active database with respect to a transaction $T = U_1; \ldots; U_k$ is defined in terms of the execution in a passive environment of a new transaction $T'$, induced by $T$, that "embodies" active rule semantics. We now present the technique that allows us to transform $T$ into $T'$. We will consider the main active rule execution models that have been proposed in the literature: immediate, deferred and decoupled. The immediate modality reflects the intuition that rules are processed as soon as they are triggered, while the deferred modality suggests that a rule is evaluated and executed after the end of the original transaction [11]. Finally, the decoupled modality suggests that the triggered rules are all executed in a different transaction. According to these execution models, three different rewriting procedures will be given. Specifically, consider a user transaction as a sequence of updates: $T = U_1; \ldots; U_k$. This transaction is transformed under the immediate modality into an induced one: $T^I = U_1; U_1^P; \ldots; U_k; U_k^P$, where $U_i^P$ denotes the sequence of updates computed as the immediate reaction of the update $U_i$ with respect to a set of active rules $P$. This reaction can be derived by "unifying" the update $U_i$ with the event part of the active rules. Clearly the obtained updates can themselves trigger other rules, hence this reaction is computed recursively. Note that under the immediate modality the induced transaction is an interleaving of user defined updates with rule actions.
Under the deferred modality, the induced transaction has the form: $T^{DE} = U_1; \ldots; U_k; U_1^P; \ldots; U_k^P$. Hence the reaction is deferred (or postponed) until the end of the user transaction. Here again the induced updates can themselves trigger other rules, and so the reactions of the original updates are recursively computed, but using the immediate modality.
Under the decoupled modality, there are two induced transactions. The former is $T^{DC}_1 = U_1; \ldots; U_k$ and corresponds to the user transaction, while the reaction is decoupled into a different transaction: $T^{DC}_2 = U_1^P; \ldots; U_k^P$. Again, the reactions of the original updates are recursively computed. In the rest of this section we will formalize the above notion of reaction.
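The three rewritings differ only in where the reactions are placed, which can be made concrete with a short Python sketch. It is a minimal illustration under our own assumptions: updates are opaque strings, the function names are hypothetical, and the reaction of an update is supplied as a parameter `react` that already returns the recursively computed sequence.

```python
def immediate(transaction, react):
    """Interleave each user update with its reaction."""
    out = []
    for u in transaction:
        out.append(u)
        out.extend(react(u))
    return out

def deferred(transaction, react):
    """User updates first, then all reactions, in the order of the inducing updates."""
    return list(transaction) + [v for u in transaction for v in react(u)]

def decoupled(transaction, react):
    """Two transactions: the user updates and, separately, the reactions."""
    return list(transaction), [v for u in transaction for v in react(u)]

# Hypothetical toy reaction: U1 induces A, which in turn induces B.
INDUCES = {"U1": ["A"], "A": ["B"]}

def toy_react(update):
    out = []
    for v in INDUCES.get(update, []):
        out.append(v)
        out.extend(toy_react(v))  # induced updates may trigger further rules
    return out
```

With `toy_react` and the user transaction `["U1", "U2"]`, the immediate rewriting interleaves `A` and `B` right after `U1`, whereas the deferred rewriting appends them after `U2`.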
Let $U$ be an update $A[C]$ and $R$ be an active rule of the form $L : (E_1 \lor \cdots \lor E_x)\, C' \to A'_1, \ldots, A'_y$ that does not share any variable with $U$ (this can be easily enforced by means of variable renaming). Then, we say that $U$ triggers $R$ because of $E_i$ if: (1) there is an $E_i$, with $1 \le i \le x$, such that $E_i = \oplus B'$ and $A = +B$, or $E_i = \ominus B'$ and $A = -B$, and (2) there is a substitution $\sigma$, called unifier, such that $B = \sigma(B')$. If an update $U$ triggers a rule $(E_1 \lor \cdots \lor E_x)\, C' \to A'_1, \ldots, A'_y$ with $\sigma$ as unifier, then we say that $U$ induces the sequence of updates $\sigma(A'_1[C']), \ldots, \sigma(A'_y[C'])$.
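The triggering test and the construction of the induced updates can be sketched in Python as follows. This is an illustrative sketch under our own assumptions: atoms are encoded as `(name, args)` pairs, events carry the same `"+"`/`"-"` tags as actions, capitalized strings are variables, conditions contain only positive relational atoms, and the rule is assumed to be already renamed apart from the update. The names `match`, `triggers` and `induced` are hypothetical.

```python
def is_var(term):
    # Assumption: capitalized strings are variables.
    return isinstance(term, str) and term[:1].isupper()

def match(event_atom, action_atom):
    """Return a unifier mapping the event atom onto the action atom, or None."""
    (ename, eargs), (aname, aargs) = event_atom, action_atom
    if ename != aname or len(eargs) != len(aargs):
        return None
    sigma = {}
    for te, ta in zip(eargs, aargs):
        if is_var(te):
            if sigma.setdefault(te, ta) != ta:
                return None  # same rule variable bound to two different terms
        elif te != ta:
            return None      # constant in the event does not match
    return sigma

def triggers(update, rule):
    """Return (i, unifier) for the first event E_i triggered by the update, else None."""
    sign, atom, _condition = update
    for i, (ev_sign, ev_atom) in enumerate(rule["events"], start=1):
        if ev_sign == sign:
            sigma = match(ev_atom, atom)
            if sigma is not None:
                return i, sigma
    return None

def induced(update, rule):
    """Updates induced by `update` through `rule` (empty if not triggered)."""
    hit = triggers(update, rule)
    if hit is None:
        return []
    _, sigma = hit

    def apply(atom):
        name, args = atom
        return (name, tuple(sigma.get(t, t) for t in args))

    condition = [apply(lit) for lit in rule["condition"]]
    return [(sign, apply(atom), condition) for sign, atom in rule["actions"]]
```

For instance, an insertion into `emp` with first argument `ann` triggers a rule whose event is an insertion into `emp(N1, D1)`, and the unifier binds `N1` to `ann` in the induced action and condition.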
During translation we need to keep track of the relationship between inducer and induced updates. This is done by subscripting the induced updates in order to encode the inducer, the inducer of the inducer, and so on. For instance, the update $U_{3,2,1}$ was induced by the update $U_{3,2}$, which in turn was induced by $U_3$. Thus the original update $U_3$ induces $U_{3,2}$, which induces $U_{3,2,1}$.
The following is a recursive algorithm that computes the reaction of a single update to a rule program which allows disjunctive events and sequences of actions in the rules. In the algorithm a rule $L : (E_1 \lor \cdots \lor E_x)\, C \to A_1, \ldots, A_y$ is simply denoted by $R$. Note that the algorithm is parametric with respect to the selection policy of the triggered rules, which can be expressed through several SELECT functions. In general, different outputs can be generated by the algorithm depending on the order in which rules are selected by the SELECT function. Clearly the algorithm can be generalized in such a way that all the possible different induced sequences of updates are generated. Moreover, syntactic restrictions can be given to guarantee its termination [12].
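To show how the selection policy affects the output, the recursion can be sketched in Python. This is a simplified illustration, not the algorithm itself: unification and conditions are elided (updates and events are plain strings), the rule set is assumed to be acyclic so that the recursion terminates, and the index of an induced update is formed as the index of its inducer followed by a counter, which is our reading of the indexing scheme. The names `react`, `by_highest` and `by_lowest` are hypothetical.

```python
def react(program, update, select, index):
    """Return [(index, action), ...] for the updates induced by `update`."""
    triggered = [r for r in program if update in r["events"]]
    out, i = [], 1
    while triggered:
        rule = select(triggered)          # pluggable selection policy
        triggered.remove(rule)
        child = index + (i,)              # encodes the chain of inducers
        out += [(child, a) for a in rule["actions"]]
        for a in rule["actions"]:
            out += react(program, a, select, child)  # induced updates may trigger rules
        i += 1
    return out

# Hypothetical program: two rules react to "+a", one reacts to "+b".
PROGRAM = [
    {"priority": 1, "events": {"+a"}, "actions": ["+b"]},
    {"priority": 2, "events": {"+a"}, "actions": ["-c"]},
    {"priority": 1, "events": {"+b"}, "actions": ["+d"]},
]

def by_highest(rules):
    return max(rules, key=lambda r: r["priority"])

def by_lowest(rules):
    return min(rules, key=lambda r: r["priority"])
```

Calling `react(PROGRAM, "+a", by_highest, (1,))` and `react(PROGRAM, "+a", by_lowest, (1,))` yields the same induced updates in different orders, which is exactly why confluence becomes an issue.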

Active Rule Execution Semantics
In this section we describe how the framework defined in the previous section can be used to specify different proposed semantics for rule execution in active databases. In particular, we shall consider the core of the semantics of rule execution in the Ariel, Starburst and HiPAC systems.

Ariel
The basic features of Ariel are: sequences of updates in events and actions, set-oriented reaction, immediate modality, and numeric priorities among rules, which need not be unique. This can be characterized with the following algorithm that specifies the execution model of Ariel rules.

Algorithm ARIEL
Input: An Ariel active program $P$ and a user transaction $T = U_1; \ldots; U_k$.

Output: A transaction $T^A$ induced by $T$.
begin
  $T^A := U_1; REACT(P, U_1); \ldots; U_k; REACT(P, U_k)$;
end.
To each Ariel rule is assigned a numeric priority, but the assignment need not be unique and therefore conflicts may occur. Conflict resolution in Ariel can be described by the following procedure.
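The conflict resolution cascade for Ariel (highest priority, then most recently matched, then most selective condition, then a random choice) can be sketched in Python. The rule fields `priority`, `last_matched` and `selectivity` are hypothetical stand-ins for information the system would maintain; in particular we assume that a smaller `selectivity` value means a more selective condition.

```python
import random

def narrow(rules, key):
    """Keep only the rules that maximize `key`."""
    best = max(key(r) for r in rules)
    return [r for r in rules if key(r) == best]

def ariel_select(triggered, rng=random):
    """Pick one rule from a non-empty list of triggered rules."""
    out = narrow(triggered, lambda r: r["priority"])
    if len(out) > 1:
        out = narrow(out, lambda r: r["last_matched"])   # most recently matched
    if len(out) > 1:
        out = narrow(out, lambda r: -r["selectivity"])   # most selective condition
    return rng.choice(out)                               # random among remaining ties
```

Each step is applied only when the previous one leaves more than one candidate, so the random choice is reached only for rules that tie on all three criteria.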

Starburst
The basic features of Starburst are: disjunction of updates as events, set-oriented reaction, deferred modality and relative priorities among rules. This can be characterized with the following algorithm that specifies the execution model of Starburst rules. In Starburst, for any two rules, one rule can be specified as having higher priority than the other rule. It follows that rules are partially ordered. Conflict resolution in Starburst can be described by the following procedure.
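To illustrate what selection under a partial order may look like, the Python sketch below picks a triggered rule that no other triggered rule dominates. It is only an assumption-laden illustration and not Starburst's actual procedure: priorities are given as hypothetical `(higher, lower)` pairs of rule names, the order is closed transitively, and remaining ties are broken by name purely to make the example deterministic.

```python
def higher_than(pairs):
    """Transitive closure of the declared (higher, lower) priority pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def maximal(triggered, pairs):
    """Triggered rules with no triggered rule of higher priority, sorted by name."""
    order = higher_than(pairs)
    return sorted(r for r in triggered
                  if not any((other, r) in order for other in triggered))

def select_maximal(triggered, pairs):
    # Assumption: ties among incomparable rules are broken by name.
    return maximal(triggered, pairs)[0]
```

Because the order is partial, several rules may be eligible at once; any policy that picks among them is compatible with the declared priorities.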

Note that the above "induced by" relationship between updates can be easily derived on the basis of the indexes associated with the updates.

Transaction Equivalence.
Two induced transactions $T_1$ and $T_2$ are equivalent if $EFF'(T_1) = EFF'(T_2)$. Note that this notion is parametric with respect to the immediate, deferred or decoupled modalities as well as to the selection policy. Under certain hypotheses, we have defined in [12] a method for testing equivalence of induced transactions which derives from an equivalence result for passive transactions [1]. The method is based on a set of transformation rules R that allows us to transform an induced transaction into an equivalent one [12].

Confluence.
We say that an active program $P$ is confluent in a certain execution model $M$ with respect to a user transaction $T$ if each pair of induced transactions $T^M_1$ and $T^M_2$ that can be generated by the algorithm describing $M$ have the same effect on any active database over $P$. Moreover, we say that an active program $P$ is strongly confluent if it is confluent with respect to any user transaction $T$. In [12] we have provided a practical method to test for confluence based on the results on equivalence. Specifically, we have shown that an active program $P$ is confluent in a certain execution model $M$ with respect to a user transaction $T$ if an induced transaction generated by the algorithm describing $M$ can be transformed, using the above rules R, into each of the other transactions that can be generated by this algorithm. We have also defined and characterized a notion of local confluence that applies to the active program and provides a necessary and sufficient condition for strong confluence.

Optimization.
Optimizing induced transactions is particularly important since, with our approach, it yields a method for optimizing the overall activity of active rule processing. The notion of optimality that we use is based on syntactic aspects, that is, the length and complexity of a transaction, but it turns out to be appealing also for operational criteria such as the number of atomic updates performed by a transaction. Let $|T|$ denote the number of insertion/deletion operations involved in the transaction $T$. Let $T$ and $T'$ be two equivalent induced transactions. Then, we say that $T$ is simpler than $T'$ if $|T| < |T'|$. The set of transformation rules R mentioned above can be grouped into two classes. The former contains simplification rules, which always yield a strictly simpler transaction, whereas the latter consists of commutativity rules, which do not affect the complexity of the transaction but are useful in order to apply simplification rules. A reduction of a transaction $T$ based on R is the transaction obtained from $T$ by applying rules in R, alternating simplification and commutativity rules, as long as some simplification rule can be applied. In [12] we have shown that the reduction process (1) generates a simpler transaction, (2) always terminates (in polynomial time), and (3) is essentially deterministic regardless of the order of application of the rules.

Conclusions
We have presented a framework based on transaction transformation techniques that allows us to reduce active rule processing to passive transaction execution. We have shown that many problems are easier to understand and to investigate from this point of view, as they can be tackled in a formal setting that naturally extends an already established framework for relational transactions. We have also shown that this framework can be used for the specification and the theoretical analysis of a number of different execution models of active rule languages. We believe that the approach is also appealing from a practical point of view, since a lot of work can be done statically, at compile time, and its implementation does not require any specific run-time support.

Algorithm REACT
Input: An active program $P$ and an update $U_j$.
Output: A sequence $U_j^P$ of updates induced by $U_j$.
begin
  $U_j^P := \langle\rangle$; $i := 1$; $index := \langle j \rangle$;
  Triggered $:= \{R \in P : R$ is triggered by $U_j$ because of $E_k\}$;
  while Triggered is not empty do
    $R :=$ SELECT(Triggered);
    $\sigma :=$ the unifier of $E_k$ and $U_j$;
    $index := append(index, i)$;
    $U_{index} := [\sigma(A_1[C]), \ldots, \sigma(A_y[C])]$;
    $REACT(P, \sigma(A_1[C])); \ldots; REACT(P, \sigma(A_y[C]))$;
    $i := i + 1$;
    Triggered $:=$ Triggered $- \{R\}$;
end.

Procedure SELECT
Input: A set of triggered rules $I_R$.
Output: One rule in $I_R$.
begin
  $I_{out} := \{R \in I_R$ with highest priority$\}$;
  if $I_{out}$ is not a singleton then $I_{out} := \{R \in I_{out}$ most recently matched by changes$\}$;
  if $I_{out}$ is not a singleton then $I_{out} := \{R \in I_{out}$ whose condition is the most selective$\}$;
  if $I_{out}$ is not a singleton then $I_{out} :=$ a random rule $R \in I_{out}$;
  return $I_{out}$
end.