Admissible Record-Oriented Evaluation Plans for Declarative Updates

Efficient evaluation strategies for declarative updates have rarely been investigated. Due to possible dependencies between the resulting database state and the order in which records (objects) are processed, declarative updates are usually evaluated in a set-oriented way in order to ensure a deterministic evaluation. In this paper, we show that such dependencies can be detected by exploiting knowledge about conflicts between the operations that are used to access the database during the update evaluation. Thus most declarative updates can also be evaluated deterministically, and in some cases more efficiently, in a record-oriented way. We show that some of the detected conflicts can be relaxed or even be ignored, while a deterministic evaluation can still be guaranteed.


Introduction
Different approaches for optimizing and evaluating declarative queries have been proposed so far (comprehensive overviews can be found in [10] and [6]). Many query languages like SQL [9], QUEL [15] or POSTQUEL [13] also provide means for a declarative specification of database updates. However, particular strategies for the optimization and especially for the efficient evaluation of declarative updates have hardly been investigated. Most of the optimization strategies that have been developed for queries, e.g., algebraic optimization using equivalences based on heuristics, can also be applied to declarative updates with minor modifications. Unfortunately, this is not possible for the corresponding evaluation strategies, since in some cases the result of a declarative update depends on the order in which records (objects) are processed, as we will show later.
We assume that the evaluation of queries and declarative updates is realized by executing several algorithms which correspond to, e.g., join, selection and projection. Each algorithm produces a result and/or consumes the result(s) of the previously executed algorithm(s). One can distinguish the following two basic processing strategies for these algorithms:
set-at-a-time (set-oriented): An algorithm processes sets of records. The result set(s) of the previous algorithm(s) is (are) processed completely within an algorithm before its own result set is propagated to the next algorithm.
record-at-a-time (record-oriented): An algorithm processes a single record. A result record computed from the input record(s) is immediately propagated to the next algorithm.
If additional algorithms for switching between set-oriented and record-oriented processing and vice versa exist, both set-oriented and record-oriented algorithms can be combined in the evaluation. If only set-oriented (record-oriented) algorithms are used, we say that the evaluation is set-oriented (record-oriented), or the query/update is evaluated in a set-oriented (record-oriented) way. Otherwise we say the evaluation is mixed. In certain cases only the
Advances in Databases and Information Systems, 1997
assume that each relevant record is selected separately using the index defined on salary, and that index entries are read in ascending order. In order to keep the index consistent with the database state, changes of salary values are immediately stored within the corresponding index entries. Then, instead of a single salary increase, Smith and Brown get an infinite number of raises since their index entries are swapped with each update, and the records are selected and updated again.
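The effect described in Example 1 can be reproduced with a small Python sketch. It rests on assumptions: the salaries of Smith, Brown and Jones are chosen to agree with the figures quoted in Example 2, Miller's salary is hypothetical, the threshold of 30.000 and the raise of 10% are taken from Example 3, and the function names and the `max_steps` bound (which cuts off the otherwise endless loop) are purely illustrative.

```python
import bisect

def set_oriented_update(salaries, threshold, pct):
    """Select all qualifying employees first, then update each exactly once."""
    selected = [name for name, s in salaries.items() if s > threshold]
    result = dict(salaries)
    for name in selected:
        result[name] = result[name] * (100 + pct) // 100
    return result

def index_driven_update(salaries, threshold, pct, max_steps):
    """Read the salary index in ascending order and update each hit at once.

    The index is maintained immediately, so an updated entry moves ahead of
    the scan position and is found again. max_steps bounds the loop.
    """
    salaries = dict(salaries)
    index = sorted((s, name) for name, s in salaries.items())
    raises = {name: 0 for name in salaries}
    last = None  # last index entry that was read
    for _ in range(max_steps):
        entry = next((e for e in index
                      if e[0] > threshold and (last is None or e > last)), None)
        if entry is None:
            break
        old_salary, name = entry
        new_salary = old_salary * (100 + pct) // 100
        index.remove(entry)                       # keep the index consistent
        bisect.insort(index, (new_salary, name))
        salaries[name] = new_salary
        raises[name] += 1
        last = entry
    return salaries, raises

# Assumed sample data; Miller's salary is hypothetical.
EMPLOYEES = {"Smith": 35000, "Brown": 32000, "Jones": 30000, "Miller": 28000}
once = set_oriented_update(EMPLOYEES, 30000, 10)
looped, raises = index_driven_update(EMPLOYEES, 30000, 10, max_steps=6)
```

With these values the set-oriented variant raises Smith and Brown once each, while the index-driven variant keeps alternating between the two until the step bound is reached.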
Example 2 (given in a similar form in [15])
The relation Employees is given as in Example 1. Now all employees whose managers earn at least $33.000 should get a salary increase of 10%.
With a set-oriented evaluation, obviously only Smith's and Brown's salaries are increased. But if we follow the record-oriented strategy and examine records in the order given in Example 1, the following happens:
Smith meets the selection condition; his salary is increased to $38.500.
Brown also meets the selection condition; his salary is increased to $35.200.
Jones now also meets the selection condition since his manager's (= Brown's) salary is greater than $33.000. His salary is changed to $33.000.
For the same reason, Miller also gets a salary increase.
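A minimal Python sketch of Example 2 follows. The relation itself is an assumption: only the salaries implied by the figures above and the fact that Brown is Jones's manager come from the text. The manager Clark, the remaining manager links and Miller's salary are hypothetical values chosen so that the described behavior occurs.

```python
# (name, salary, manager); everything except Jones -> Brown is assumed.
EMPLOYEES = [
    ("Smith", 35000, "Clark"),
    ("Brown", 32000, "Smith"),
    ("Jones", 30000, "Brown"),
    ("Miller", 28000, "Jones"),
    ("Clark", 40000, None),
]

def qualifies(name, salary, manager):
    """Selection condition: the employee's manager earns at least 33000."""
    boss = manager[name]
    return boss is not None and salary[boss] >= 33000

def set_oriented(rows):
    salary = {n: s for n, s, _ in rows}
    manager = {n: m for n, _, m in rows}
    # The complete selection is computed before any update is applied.
    selected = [n for n, _, _ in rows if qualifies(n, salary, manager)]
    for n in selected:
        salary[n] = salary[n] * 110 // 100
    return salary

def record_oriented(rows):
    salary = {n: s for n, s, _ in rows}
    manager = {n: m for n, _, m in rows}
    # Each record is tested and updated before the next one is examined.
    for n, _, _ in rows:
        if qualifies(n, salary, manager):
            salary[n] = salary[n] * 110 // 100
    return salary

set_result = set_oriented(EMPLOYEES)
rec_result = record_oriented(EMPLOYEES)
```

Under these assumptions the two strategies agree on Smith and Brown but differ on Jones and Miller, whose managers only pass the threshold after having been updated themselves.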
In this paper we consider only one special type of declarative updates, namely the application of an update operation to a set of objects that is determined by a query. Furthermore, we assume that the declarative updates we investigate can be evaluated deterministically, i.e., they give a unique result when evaluated using the set-oriented strategy ([2] and [11] describe approaches for detecting and handling declarative updates which cannot be evaluated deterministically).
A non-deterministic evaluation of a declarative update is caused by conflicts between the database access operations that are executed during the evaluation. In this paper we investigate whether a record-oriented evaluation is still correct, although conflicts exist. First, we introduce a query processing model for an object-oriented DBMS as a basis for the investigation of declarative updates. We think that the development of concepts for dealing with declarative updates is even more important in the context of object-oriented databases, since updates can be performed by database methods, which may occur (at least syntactically) anywhere in queries formulated in a declarative object-oriented query language. Then we develop a formal model which is based on tracing the database access operations during query evaluation. Since our concept is based on an object-oriented data model, we represent all database access operations as system-defined and user-defined methods, respectively. The former represent, e.g., scans on class extensions or index scans (then the respective class object is the receiver), while the latter are application-specific methods defined in a database schema. The trace model is particularly suited to analyze conflicts that may occur during the evaluation of declarative updates. For conflict detection, we exploit conflict specifications that are given together with the methods. We identify two frequently occurring non-trivial cases where declarative updates can be evaluated deterministically in a record-oriented way, although conflicting methods occur within the evaluation. The concept we describe in this paper allows us to consider alternative, possibly more efficient, evaluation strategies for declarative updates and thus can be regarded as a special optimization technique. To the best of our knowledge, this is the first paper that investigates this optimization potential for declarative updates.
The remainder of the paper is organized as follows. Section 2 gives an overview of related work. In Section 3 we briefly describe the logical query algebra declarative updates are mapped to, introduce the physical query algebra which is used to build up evaluation plans, and illustrate how its operators are mapped to concrete algorithms. We examine the application of the record-oriented strategy to declarative updates in Section 4, and show that some of the detected conflicts can be relaxed or even ignored, while a deterministic evaluation can still be guaranteed. In Section 5, perspectives for the applicability of our concept within a rule-based query optimization framework are sketched and some performance considerations are given. Section 6 concludes the paper.

Related Work
About twenty years ago, non-determinism in evaluating declarative updates was first recognized. As a consequence, declarative updates in, e.g., INGRES [16] and POSTGRES [13] are only evaluated in a set-oriented way. This restriction is mostly used as a straightforward solution to this problem (e.g., [6]). In fact, it does not even solve the problem in general, as is shown in two recent publications that deal explicitly with the non-deterministic evaluation of declarative updates [11] [2]. Both concentrate on the problem of applying a sequence of update operations, or an update method, respectively, to a set of receivers. Conflicts between update operations/update methods that are applied to different receivers are examined. In [11] state-independent conflict specifications, which are also utilized for concurrency control, are used to detect non-determinism. If a conflict occurs at run time (which is detected by the concurrency control), the execution of the declarative update is either stopped and rolled back, or the execution is continued with non-deterministic semantics.
In [2] a more theoretical approach is introduced. The authors concentrate on the analysis of declarative updates on the basis of so-called schema colorings which are used to classify all update methods in a schema according to their update behavior (update, creation and deletion of objects). They show that these colorings can be used to decide whether the application of an update method to a set of receivers leads to a non-deterministic evaluation or not. It is proven that for a so-called key-order independent method, a sequential application of the method to a set of receivers is equal to a parallel application.
In the following we concentrate on the examination of conflicts between update and read-only methods. As mentioned before, we assume that the declarative updates we look at can be evaluated deterministically in a set-oriented way. This can be proved by using the techniques introduced in [11] and [2].

The Query Processing Model
In general, declarative updates are formulated using a declarative language. We do not rely on a specific language, but assume that a declarative language like ODMG-93 OQL [4] or VQL [1][5] is used. The main requirement is that methods, including update methods, can be used in the formulation of queries and declarative updates, provided that the semantic correctness of the query/update is given.
We follow an algebraic approach with the distinction of a logical and a physical query algebra as introduced in [7]. Queries and declarative updates are translated into logical algebra expressions. The query optimizer transforms these logical expressions, e.g., pushes selections down as far as possible, maps logical to physical algebra expressions using so-called implementation rules, and chooses the cheapest physical expression as the evaluation plan.

Logical Query Algebra
The logical algebra we use in this paper has been introduced in [1]. It is not complete with regard to query languages like ODMG-93 OQL. For the sake of simplicity the expressiveness of the algebra is restricted to the class of declarative updates to which the results of this paper apply.
The operators of the logical as well as the physical query algebra are applied to complex values of type $\{[a_1 : D_1, \ldots, a_n : D_n]\}$, where $D_1, \ldots, D_n$ are complex data types. We assume that the record components are unordered. Operator arguments of this type are denoted by $S$. The operator parameters are enclosed in $\langle\,\rangle$. We define $\mathit{Ref}(S) := \{a_1, \ldots, a_n\}$ for $\mathit{Type}(S) = \{[a_1 : D_1, \ldots, a_n : D_n]\}$ and refer to $a_1, \ldots, a_n$ as the references of $S$. In the following, $p$ and $m$ denote property and method identifiers, respectively. The invocation of a method $m$ with parameters $p_1, \ldots, p_k$ and receiver $o$ is represented by $o \to m(p_1, \ldots, p_k)$.
The logical algebra consists of the operators select, project, join, natural_join, union and diff as well as the operators get, map_const, map_op, map and flat. The former are defined in analogy to the commonly known operators from relational algebras. The latter are provided for representing class extensions, constants, operations on the built-in data types and method invocations, respectively (flat flattens set-valued method results).
The declarative updates we consider in this paper are restricted to the form

select $rec \to m(p_1, \ldots, p_k)$ from $x_1$ in $e_1, \ldots, x_n$ in $e_n$ where $pred$

(given in the language of ODMG-93 OQL), where $m$ is an update method, and the other expressions for the method receiver $rec$, the method parameters $p_1, \ldots, p_k$, the domains $e_1, \ldots, e_n$ for query variables, and the query predicate $pred$ contain exclusively invocations of read-only methods.
Declarative updates are then mapped to a logical algebra expression of the form $\mathit{map}\langle a, m, a_{rec}, \langle a_{p_1}, \ldots, a_{p_k} \rangle\rangle(E)$, where $E$ is the logical algebra expression which represents the from and the where clause, and $m$ is the update method which is invoked in the select clause.

Example 3
We assume that a class Emp with the method raiseSalary(p : INT) : BOOL is defined in analogy to the relation Employees. The declarative update from Example 1 can be formulated as

select $e \to \mathit{raiseSalary}(10)$ from $e$ in $\mathit{Emp}$ where $e.\mathit{salary} > 30.000$

Note that $e.\mathit{salary}$ is a shorthand notation for the invocation of a system-defined method salary() which is provided for reading property values. The representation in the logical algebra is then given by

$\mathit{map}\langle a_5, \mathit{raiseSalary}, a_1, \langle a_4 \rangle\rangle($
$\mathit{map\_const}\langle a_4, 10 \rangle($
$\mathit{select}\langle a_2 > a_3 \rangle($
$\mathit{map\_const}\langle a_3, 30.000 \rangle($
$\mathit{map}\langle a_2, \mathit{salary}, a_1 \rangle($

Physical Query Algebra
The physical query algebra contains operators which are associated with certain cost functions and evaluation algorithms. For each operator in the logical query algebra, there exists (at least) one corresponding operator of the physical query algebra. For some logical operators, there exist alternative physical operators, e.g., for join we provide the physical operators nested_loop_join, hash_join and merge_join, which represent different join algorithms. Additionally, the physical algebra contains the operator sort for sorting the input sets of a merge_join, the operator $\mathit{select\_index}\langle a, C, p, v \rangle$ for selecting the instances of the class $C$ using the index defined on property $p$ with value $v$, and the operator collect to accumulate (intermediate) result sets.
The optimizer generates several physical algebra expressions which we refer to as evaluation plans (EP). The overall cheapest EP is then chosen to evaluate the query/declarative update. We refer to an EP $P$ where each operator $op$ is covered by a collect operator, i.e., $P = \mathit{collect}(op_1(\ldots \mathit{collect}(op_k) \ldots))$, as a set-oriented EP. The corresponding record-oriented EP $P' = op_1(\ldots (op_k) \ldots)$ is obtained from $P$ by removing the collect operators. If some, but not all, operators in an EP are covered by a collect operator, we refer to the EP as a mixed EP.
In order to analyze the correct execution of declarative updates, we give the concrete algorithms for some of the physical algebra operators. For each physical operator except sort, we provide algorithms for a record-oriented evaluation. The corresponding set-oriented processing strategy for an operator $op$ can then be obtained by covering the operator with a collect operator, i.e., $\mathit{collect}(op)$. For sort, a set-oriented algorithm is provided.
Since record-oriented algorithms can be thought of as iterators on streams of records [6], we provide for each operator the algorithms OPEN and NEXT, for opening the stream and obtaining the next element of the stream, respectively. In the following algorithms P P
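The distinction between set-oriented, record-oriented and mixed EPs can be sketched in Python as follows. The tree representation `Op` and the helper names are illustrative assumptions and not part of the algebra; operator parameters are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Op:
    """A physical operator with its input plans (parameters omitted)."""
    name: str
    inputs: List["Op"] = field(default_factory=list)

def collect(op: Op) -> Op:
    return Op("collect", [op])

def strip_collect(plan: Op) -> Op:
    """Derive P' from P by removing every collect operator."""
    if plan.name == "collect":
        return strip_collect(plan.inputs[0])
    return Op(plan.name, [strip_collect(i) for i in plan.inputs])

def classify(plan: Op) -> str:
    """Classify an EP by how many operators are covered by a collect."""
    covered = uncovered = 0

    def visit(node: Op, under_collect: bool) -> None:
        nonlocal covered, uncovered
        if node.name == "collect":
            for child in node.inputs:
                visit(child, True)
            return
        if under_collect:
            covered += 1
        else:
            uncovered += 1
        for child in node.inputs:
            visit(child, False)

    visit(plan, False)
    if uncovered == 0:
        return "set-oriented"
    if covered == 0:
        return "record-oriented"
    return "mixed"

def show(plan: Op) -> str:
    if not plan.inputs:
        return plan.name
    return f"{plan.name}({', '.join(show(i) for i in plan.inputs)})"

P = collect(Op("map", [collect(Op("map_const", [collect(Op("select_index"))]))]))
P_prime = strip_collect(P)
P_mixed = Op("map", [collect(Op("map_const", [Op("select_index")]))])
```

Here `classify(P)` yields "set-oriented", `classify(P_prime)` yields "record-oriented", and `P_mixed` is reported as "mixed".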

OPEN($\mathit{get}\langle a, C \rangle$)
{ $C \to \mathit{open\_scan}(a)$ }
NEXT($\mathit{get}\langle a, C \rangle$) : $s$
{ $s$ :
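The iterator interface with OPEN and NEXT can be sketched in Python as follows. This is not the authors' algorithm: the classes `Get` and `Collect`, the log entries and the driver `run` are illustrative assumptions, a class extension is modelled as a plain list, and `None` plays the role of NULL.

```python
class Get:
    """Record-oriented scan over a class extension (here a Python list)."""
    def __init__(self, ref, extension, log):
        self.ref, self.extension, self.log = ref, extension, log

    def open(self):
        self.log.append("open_scan")
        self._pos = 0

    def next(self):
        self.log.append("scan")
        if self._pos >= len(self.extension):
            return None                      # end of stream
        obj = self.extension[self._pos]
        self._pos += 1
        return {self.ref: obj}

class Collect:
    """Set-oriented wrapper: OPEN drains the input, NEXT replays the buffer."""
    def __init__(self, child):
        self.child = child

    def open(self):
        self.child.open()
        self._buffer = []
        while True:
            rec = self.child.next()
            if rec is None:
                break
            self._buffer.append(rec)
        self._pos = 0

    def next(self):
        if self._pos >= len(self._buffer):
            return None
        rec = self._buffer[self._pos]
        self._pos += 1
        return rec

def run(plan, log):
    """Call OPEN once and then NEXT until the stream is exhausted."""
    plan.open()
    log.append("opened")
    results = []
    while True:
        rec = plan.next()
        if rec is None:
            break
        log.append("record")
        results.append(rec)
    return results

log_rec, log_set = [], []
out_rec = run(Get("a1", ["o1", "o2"], log_rec), log_rec)
out_set = run(Collect(Get("a1", ["o1", "o2"], log_set)), log_set)
```

Both plans deliver the same records, but with `Collect` all scan calls happen before the entry `opened`, whereas without it they are interleaved with the delivered records.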

Record-Oriented Evaluation of Declarative Updates
Non-determinism in the evaluation of declarative updates is caused by conflicts between the methods that are executed during the evaluation of a record-oriented EP. In general, two methods $a$ and $b$ are said to commute if their execution order can be switched without causing any changes: $a$ can be executed before as well as after $b$, and the results of both methods and of any subsequent method $c$, which is executed on a database state that has possibly been modified by $a$ and $b$, do not change. Otherwise $a$ and $b$ are said to be in conflict.
We utilize so-called state-independent commutativity specifications, where only information about the method itself (i.e., its name) and its actual parameters which are known at compile time are used for a conflict test [18]. Note that since we do not consider information about the actual database state, on the one hand we can perform a conflict test at compile time. On the other hand, we can only detect possible conflicts which might, but do not necessarily, occur.
Commutativity specifications have mainly been exploited in semantic concurrency control [3][12][18] to achieve a higher degree of parallelism. In this context conflict tests are performed at run time. It is assumed that methods which are invoked for different receiver objects commute (local atomicity property), since a method can only directly manipulate the state of its receiver. If conflicts occur due to nested method invocations, they are detected at run time [12]. In the case of declarative updates, a conflict test should preferably be performed at compile time, or at least before the evaluation starts, in order to avoid a non-deterministic evaluation or a rollback of all method invocations executed so far during the evaluation. Since conflicts due to nested method invocations cannot be detected then, methods invoked for different receivers cannot be considered as commuting by default, as is done in semantic concurrency control. For our purposes, we extend the notion of commutativity with respect to the receiver objects of methods as follows:
Definition 1 (Total and partial commutativity) Two methods $m_1$ and $m_2$ commute totally iff for any database state $DB_0$ the execution sequences ($o_1 \to m_1$ :

Traces
To investigate conflicts between methods that are invoked during the evaluation, we need a precise description of the order of method invocations. For this purpose, we introduce the notion of traces of method invocations. In order to allow a recursive computation of traces for EPs, we also include the intermediately generated records that are passed on to the other operators during evaluation, since these records lead to further method invocations. The trace for an EP $P$ displays all method invocations that are executed and all records that are generated when the sequence of calls $\mathit{OPEN}(P)$, $\mathit{NEXT}(P) : t_1$, $\ldots$, $\mathit{NEXT}(P) : t_n$, $\mathit{NEXT}(P) : \mathit{NULL}$ is performed for $P$. We define the trace of the execution of an EP $P$ against a given database state as follows:
The trace $\mathit{TRACE}(P)$ of an EP $P$ is the sequence consisting of method invocation sequences $M_i$, $i = 0, \ldots, n+1$, records $t_i$, $i = 1, \ldots, n$, and the entry $\mathit{opened}$, denoted as

$\mathit{TRACE}(P) = \langle M_0, \mathit{opened}, M_1, t_1, \ldots, M_n, t_n, M_{n+1} \rangle$

such that for the execution of $\mathit{OPEN}(P)$, $\mathit{NEXT}(P) : t_1$, $\ldots$, $\mathit{NEXT}(P) : t_n$, $\mathit{NEXT}(P) : \mathit{NULL}$ all method invocations before the entry $\mathit{opened}$ correspond to those executed in $\mathit{OPEN}(P)$, the sequence $M_i$, $i = 1, \ldots, n$, corresponds to the method invocations executed in $\mathit{NEXT}(P) : t_i$, and the sequence $M_{n+1}$ corresponds to the method invocations executed in $\mathit{NEXT}(P) : \mathit{NULL}$.
According to this definition, we can derive the traces for the physical algebra operators given in Section 3.2 from the corresponding algorithms.

$\mathit{TRACE}(\mathit{nested\_loop\_join}\langle a_1\ a_{i+1} \rangle(P, P')) := \langle M'_0, M'_1, \ldots, M'_m, M'_{m+1}, M_0, \mathit{opened}, M_1, v_{1,1}(t_1, u_1)?, \ldots, v_{1,m}(t_1, u_m)?, \ldots, M_n, v_{n,1}(t_n, u_1)?, \ldots, v_{n,m}(t_n, u_m)?, M_{n+1} \rangle$

where $v_{i,j}(t_i, u_j)?$ means that the record $v_{i,j}$ is generated from $t_i$ and $u_j$ and may not be contained in the trace, and where $\mathit{TRACE}(P') = \langle M'_0, \mathit{opened}, M'_1, u_1, \ldots$
The sequence of method invocations in the trace corresponds exactly to the sequence of method invocations in the execution of the EP, and the sequence of records in the trace corresponds exactly to the result computed by the EP.

Example 4
The set-oriented and the corresponding record-oriented EP for the declarative update described in Example 1 can be formulated as follows:

$P = \mathit{collect}(\mathit{map}\langle a_3, \mathit{raiseSalary}, a_1, \langle a_2 \rangle\rangle(\mathit{collect}(\mathit{map\_const}\langle a_2, 10 \rangle(\mathit{collect}(\mathit{select\_index}\langle a_1, \mathit{Emp}, \mathit{salary}, 30.000 \rangle)))))$
$P' = \mathit{map}\langle a_3, \mathit{raiseSalary}, a_1, \langle a_2 \rangle\rangle(\mathit{map\_const}\langle a_2, 10 \rangle(\mathit{select\_index}\langle a_1, \mathit{Emp}, \mathit{salary}, 30.000 \rangle))$

The traces $T$ and $T'$ of $P$ and $P'$, respectively, are then

$T = \mathit{TRACE}(P) = \langle \mathit{Emp} \to \mathit{open\_scan\_index}(a_1, \mathit{salary}, 30.000),\ \mathit{Emp} \to \mathit{scan\_index}(a_1, \mathit{salary}, 30.000),\ \ldots,\ \mathit{Emp} \to \mathit{scan\_index}(a_1, \mathit{salary}, 30.000),\ \mathit{Emp} \to \mathit{scan\_index}(a_1, \mathit{salary}, 30.000) \rangle$

where $t_i = [a_1 : o_i, a_2 : 10]$ and $s_i = [a_1 : o_i, a_2 : 10, a_3 : b_i]$, $o_i \in \mathit{extension}(\mathit{Emp})$, $b_i \in \{\mathit{true}, \mathit{false}\}$ for $i = 1, \ldots, n$.
By comparing $T$ and $T'$ we see that removing the collect operator changes the execution order of invocations of raiseSalary and scan_index. These methods are in conflict if the latter is invoked for the class Emp and the property salary. Thus, $P'$ is not an admissible or valid EP since it cannot be executed deterministically.
Footnote 2: For $t \in T = \{[a_1 : v_1, \ldots, a_n : v_n]\}$, $\mathit{Ref}(t) = \mathit{Ref}(T)$.
It follows that if all methods which occur in the trace commute, the evaluation of a record-oriented EP $P'$ leads to the same result and terminal database state as the evaluation of its corresponding set-oriented EP $P$, and $P'$ is then a valid EP. However, commutativity between all methods is not necessary in order to guarantee a deterministic evaluation. This will be investigated more closely in the following.
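The comparison of two traces can be sketched in Python as follows. The sketch rests on assumptions: the two traces are written by hand for $n = 2$ result records following the trace definition, records and the entry opened are left out, each invocation is a hypothetical triple (receiver, method, sequence number), and conflicts are given as unordered pairs of method names.

```python
from itertools import combinations

def conflict_order(trace, conflicts):
    """Ordered pairs (x, y) of conflicting invocations with x before y."""
    pairs = set()
    for (_, x), (_, y) in combinations(enumerate(trace), 2):
        if frozenset((x[1], y[1])) in conflicts:
            pairs.add((x, y))
    return pairs

def same_effect(t1, t2, conflicts):
    """True if t2 contains the same invocations as t1 and no pair of
    conflicting invocations has changed its relative order."""
    return (sorted(t1) == sorted(t2)
            and conflict_order(t1, conflicts) == conflict_order(t2, conflicts))

# Hand-written method invocations for the set-oriented plan (all scans first).
T = [("Emp", "open_scan_index", 0),
     ("Emp", "scan_index", 1), ("Emp", "scan_index", 2), ("Emp", "scan_index", 3),
     ("o1", "raiseSalary", 0), ("o2", "raiseSalary", 0)]

# Hand-written method invocations for the record-oriented plan (interleaved).
T_prime = [("Emp", "open_scan_index", 0),
           ("Emp", "scan_index", 1), ("o1", "raiseSalary", 0),
           ("Emp", "scan_index", 2), ("o2", "raiseSalary", 0),
           ("Emp", "scan_index", 3)]

CONFLICTS = {frozenset(("scan_index", "raiseSalary"))}
```

With the conflict between scan_index and raiseSalary declared, `same_effect(T, T_prime, CONFLICTS)` is false; with an empty conflict set the two traces count as equivalent.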

Identifying Admissible Record-Oriented and Mixed EPs
We implicitly referred to commutativity as total commutativity when we observed that an EP is executed deterministically if all methods which occur in its trace commute. However, total commutativity between all methods is not necessary, as we will show with the following example.

Example 5
The following two EPs can be formulated for the declarative update in Example 1:

$P = \mathit{map}\langle a_5, \mathit{raiseSalary}, a_1, \langle a_4 \rangle\rangle(\mathit{map\_const}\langle a_4$

The methods salary and raiseSalary in $T$ are executed for the same receiver. They are in conflict, since the former reads the property salary of an employee, while the latter changes this property. Thus $T'$ cannot be derived from $T$ by commuting non-conflicting method invocations. However, salary and raiseSalary do not commute totally, but partially, because changing the salary of an employee $e$ does not have an influence on reading the salary of another employee $e'$. Since each employee's salary is read and written only once in this example, conflicts will not actually occur. $P'$ can be evaluated deterministically, although not all invoked methods commute totally.
This example shows that partial commutativity can be sufficient. Nevertheless we have to make sure that the methods in question are executed only once for the same receiver. This is the case if the record column which holds the identifiers of the receiver objects contains only unique values in the evaluation of $P$. We say that a component $a$ is unique in the evaluation of an EP $P$ if the execution of $P$ yields the set of records $S$ with $a \in \mathit{Ref}(S)$ and

$\forall s_1, s_2 \in S,\ s_1 \neq s_2 : s_1.a \neq s_2.a$

Obviously $a$ is unique in the evaluation of $P$ if it has been generated by the operator get or select_index, and if $P$ does not contain a join operator or the operator flat. The former condition guarantees that $a$ is created with unique values, while the latter ensures that uniqueness is maintained since records are not duplicated during subsequent processing. If these conditions are not fulfilled, we cannot decide whether $a$ is unique or not, unless we have further information about the result of the evaluation of operators.
We have seen that in some cases partial commutativity is sufficient to ensure a deterministic evaluation. However, there exist other types of conflicts that do not prevent a deterministic evaluation, as the following example illustrates.
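The sufficient condition for uniqueness can be checked syntactically on a plan, as the following Python sketch suggests. The representation `Op`, the function name `provably_unique` and the convention that the first parameter of get and select_index is the generated reference are assumptions made for illustration. A result of `False` only means that uniqueness cannot be established this way.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Op:
    name: str
    params: Tuple = ()
    inputs: Tuple["Op", ...] = ()

# Operators that may duplicate the values of a component.
DUPLICATING = {"join", "natural_join", "nested_loop_join",
               "hash_join", "merge_join", "flat"}
# Operators that create a component with unique values.
GENERATORS = {"get", "select_index"}

def operators(plan):
    """Yield every operator of the plan, top-down."""
    yield plan
    for child in plan.inputs:
        yield from operators(child)

def provably_unique(plan, a):
    """Conservative test: a is generated by get/select_index and the plan
    contains neither a join operator nor flat."""
    ops = list(operators(plan))
    generated = any(op.name in GENERATORS and op.params[:1] == (a,) for op in ops)
    duplicated = any(op.name in DUPLICATING for op in ops)
    return generated and not duplicated

emp_scan = Op("get", ("a1", "Emp"))
simple = Op("map", ("a2", "salary", "a1"), (emp_scan,))
joined = Op("nested_loop_join", ("a5", "a4"),
            (Op("get", ("a5", "Department")), simple))
```

For the plan `simple` the reference a1 is reported as unique, while for `joined` the test fails because of the join operator.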
Example 6
Assume that the class Emp has an additional property department which holds the OID of an instance of a class Department. This class provides a property floor which indicates where a department is located. We want to increase the salary of all employees who work in a department on the third floor and earn more than $30.000. For this declarative update, the following EP $P$ can be given:

$P = \mathit{map}\langle a_9, \mathit{raiseSalary}, a_1, \langle a_8 \rangle\rangle(\mathit{map\_const}\langle a_8, 10 \rangle(\mathit{nested\_loop\_join}\langle a_5 > a_4 \rangle(\mathit{select}\langle a_6 == a_7 \rangle(\mathit{map\_const}\langle a_7, 3 \rangle(\mathit{map}\langle a_6, \mathit{floor}, a_5 \rangle(\mathit{map}\langle a_4, \mathit{department}, a_1 \rangle(\mathit{select}\langle a_2 > a_3 \rangle(\mathit{map\_const}\langle a_3, 30.000 \rangle(\mathit{map}\langle a_2, \mathit{salary}, a_1 \rangle(\mathit{get}\langle a_1, \mathit{Emp} \rangle))))))$

$\mathit{TRACE}(P)$ contains invocations of the methods raiseSalary and salary, which partially commute, but the component $a_1$ which holds the receiver objects for these operations may not be unique in the evaluation of $P$ due to the join in the query. However, if we take a look at the algorithm given for nested_loop_join in Section 3.2, we see that the inner input (in analogy to the outer and inner relation of a join), i.e., $P_2$, is completely evaluated before the first pair of matching records is actually computed. Thus all method invocations in the evaluation of $P_2$ are executed before the update method is first invoked. An invocation of the update method cannot have an influence on the execution of any of the methods that occur in $P_2$, and conflicts between the update method and a method in the evaluation of $P_2$ can be ignored.
Since invocations of salary occur in the evaluation of the inner input of the nested_loop_join, the conflict between raiseSalary and salary can be ignored, and thus $P$ can be evaluated deterministically. This observation also holds for the operator diff and, if the operator sort is used for sorting the inputs, for merge_join. For diff, the inner input has to be computed completely before the first result record can be computed. For merge_join, both inputs have to be sorted, and none of the inputs is computed completely before the first pair of matching records is computed. If sorting is performed explicitly using the operator sort, we have to distinguish whether the conflicting method is executed before or after the sorting. In the former case, a conflict can be ignored, since due to the set-oriented evaluation of sort all invocations of the conflicting method are executed before the update method is first executed. In the latter case, the methods have to commute totally, because, due to the join, the uniqueness of columns in the resulting records cannot be guaranteed.
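The observation about the inner input of a nested loop join can be illustrated with a Python sketch. It is a simplified stand-in for the algorithm of Section 3.2, not a reproduction of it: generators replace OPEN and NEXT, and the sample employees, departments and field names are hypothetical.

```python
def nested_loop_join(outer, inner, match):
    """Join that evaluates the inner input completely before pairing."""
    inner_buffer = list(inner)          # all inner method calls happen here
    for t in outer:
        for u in inner_buffer:
            if match(t, u):
                yield {**t, **u}

def run_example(emps, depts):
    log = []
    salary = {e["name"]: e["salary"] for e in emps}

    def floor3_departments():           # outer input
        for d in depts:
            if d["floor"] == 3:
                yield {"dept": d["id"]}

    def well_paid_employees():          # inner input, reads salary
        for e in emps:
            log.append(("salary", e["name"]))
            if salary[e["name"]] > 30000:
                yield {"emp": e["name"], "emp_dept": e["dept"]}

    joined = nested_loop_join(floor3_departments(), well_paid_employees(),
                              lambda t, u: t["dept"] == u["emp_dept"])
    for rec in joined:                  # update applied record by record
        log.append(("raiseSalary", rec["emp"]))
        salary[rec["emp"]] = salary[rec["emp"]] * 110 // 100
    return salary, log

# Hypothetical sample data.
EMPS = [{"name": "Smith", "salary": 35000, "dept": "D1"},
        {"name": "Brown", "salary": 32000, "dept": "D1"},
        {"name": "Jones", "salary": 30000, "dept": "D1"}]
DEPTS = [{"id": "D1", "floor": 3}, {"id": "D2", "floor": 2}]

final_salary, log = run_example(EMPS, DEPTS)
```

In the resulting log every salary read precedes the first raiseSalary call, so the update cannot affect the selection in the inner input even though no collect operator separates them.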
We generalize the observations of Examples 5 and 6 in the following two theorems. The first theorem describes the conditions under which a stepwise transformation of a set-oriented to a record-oriented EP is possible:

Theorem 1 Let $P$ and $P'$ be EPs of the form

$P = \mathit{map}\langle a, m, a_{rec}, \langle a_{p_1}, \ldots, a_{p_l} \rangle\rangle(op_1(\ldots(op_k(\mathit{collect}(E))) \ldots))$
$P' = \mathit{map}\langle a, m, a_{rec}, \langle a_{p_1}, \ldots, a_{p_l} \rangle\rangle(op_1(\ldots(op_k(E)) \ldots))$

where $op_1, \ldots, op_k$ are physical operators, but none of these operators is a collect operator. If $P$ is executed deterministically, then $P'$ is also executed deterministically and yields the same result, i.e., the same sequence of records is generated and the same database state is reached, if one of the following conditions is satisfied:

where $Q$ and $R$ are arbitrary subplans, or
(ii) $E$ is $\mathit{map}\langle a', m', a_{rec}, \langle a'_{p_1}, \ldots, a'_{p_{l'}} \rangle\rangle(\mathit{collect}(Q))$, where $Q$ is an arbitrary subplan, $a_{rec}$ is a unique component in the evaluation of $Q$, $m$ and $m'$ commute partially, and each $op_i$, $i = 1, \ldots, k-1$, is one of the operators map, map_const, or select, or
(iii) $E$ is $\mathit{map}\langle a', m', a'_{rec}, \langle a'_{p_1}, \ldots, a'_{p_{l'}} \rangle\rangle(\mathit{collect}(Q))$, where $Q$ is an arbitrary subplan, and $m$ and $m'$ commute totally, or
(iv) $E$ is $\mathit{get}\langle a', C \rangle$ or $\mathit{select\_index}\langle a', C, p, v \rangle$, and $m$ and $C \to \mathit{scan}(a')$, or $m$ and $C \to \mathit{scan\_index}(a', p, v)$, respectively, commute totally.

Proof:
The removal of a collect operator permutes the sequence of method invocations in the trace. We have to show that this permutation does not change the invocation order of conflicting methods.

Case (i):
For these operators, removing the collect operator has no impact on the trace.

Case (ii):
Removing the collect operator leads to two kinds of exchanges. Those between invocations of $m'$ and method invocations induced by $op_1, \ldots, op_k$ are uncritical as only read-only methods are involved. The critical permutations take place between invocations of $m'$ and $m$. Since we assume that the values of $a_{rec}$ are unique in the evaluation of $Q$, i.e., for the same receiver object $m'$ is always executed before $m$, partial commutativity between $m$ and $m'$, as required in the theorem, is sufficient in this case.

Case (iii) and (iv):
In case of total commutativity, removing the collect operator is always possible, since conflicts do not occur between methods whose invocations are interchanged.
Remark: flat and join operators were excluded from the possible intermediate operators $op_1, \ldots, op_k$ as these operators may lead to duplication of the values of $a_{rec}$ in several records. In this case total commutativity is required. However, we can include flat and join operators in case we have additional information, e.g., from database integrity constraints. These constraints may indicate that the execution of the methods which are invoked during the evaluation of these operators does not lead to a duplication. That is, in the case of flat the resulting sets of the method call have at most size 1, while in the case of joins a record from the inner input matches only a single record from the outer input, and vice versa.
The second theorem shows that also within the inner inputs of binary operators a stepwise transformation from the set-oriented to the record-oriented strategy is possible.
Theorem 2 Let $P$ and $P'$ be EPs with

$P = \mathit{map}\langle a, m, a_{rec}, \langle a_{p_1}, \ldots, a_{p_l} \rangle\rangle(op_1(\ldots op_i(bop(Q, op_{i+1}(\ldots op_{k-1}(\mathit{collect}(op_k(\mathit{collect}(S)))) \ldots))) \ldots))$
$P' = \mathit{map}\langle a, m, a_{rec}, \langle a_{p_1}, \ldots, a_{p_l} \rangle\rangle(op_1(\ldots op_i(bop(Q, op_{i+1}(\ldots op_{k-1}(op_k(\mathit{collect}(S))) \ldots))) \ldots))$

where $op_1, \ldots, op_k$ are physical operators (including binary operators with a constant second argument), $bop$ is the operator nested_loop_join or diff, and $Q$ and $S$ are subplans of $P$. If $P$ is executed deterministically, then $P'$ is also executed deterministically and yields the same result, i.e., the same sequence of records is generated and the same database state is reached.

Proof:
The removal of the collect operator only permutes the sequence of method invocations which are executed during $\mathit{OPEN}(P)$, i.e., the order of the method invocations which occur before the entry opened in the trace is changed.
Each of these method invocations is executed before the first pair of matching records in the binary operator is computed, and thus before the update method $m$ is first invoked in the evaluation of $P$. Thus an invocation of $m$ cannot have an influence on the results of the execution of these method invocations. If a conflict between $m$ and a method induced by an operator $op_{i+1}, \ldots, op_k$ is detected, the conflict can be ignored.
These theorems show that total commutativity between all methods is not required in order to ensure a deterministic evaluation. Regarding Theorem 2, it directly follows that choosing the outer and inner input of the binary operators nested_loop_join and diff does not only have an influence on the evaluation cost and thus on the selection of the cheapest EP, but is also important for generating admissible record-oriented and mixed EPs. E.g., if a conflict between a method which is invoked in the outer input of a nested_loop_join and the update method is detected (this would prevent a deterministic record-oriented evaluation), the conflict can be ignored if outer and inner input are exchanged.

Application of the Results
In our approach, the task of specifying conflicts explicitly, e.g., together with the database schema, might be rather expensive, since conflicts have to be specified between user-defined methods, and between user-defined methods on the one hand and system-defined methods on the other hand. However, conflicts can be derived automatically by exploiting traditional write/write and read/write conflicts, e.g., during the semantic analysis of the database schema at compile time. This is possible since we only utilize state-independent information for the specification of conflicts. The schema designer may then correct those cases where the methods semantically commute, although a conflict has been detected. A similar approach has already been suggested in [11].
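A minimal Python sketch of such an automatic derivation is given below. The read and write sets, the table `ACCESS` and all function names are hypothetical; in particular, the sets are assumed to be known per method from schema analysis, and designer corrections are modelled as a set of overridden pairs.

```python
# Hypothetical read/write sets per method: sets of (class, property) pairs.
ACCESS = {
    "salary":      {"reads": {("Emp", "salary")}, "writes": set()},
    "department":  {"reads": {("Emp", "department")}, "writes": set()},
    "raiseSalary": {"reads": {("Emp", "salary")}, "writes": {("Emp", "salary")}},
}

def index_scan_access(cls, prop):
    """Access sets of an index scan, derived from its compile-time parameters."""
    return {"reads": {(cls, prop)}, "writes": set()}

def in_conflict(a, b):
    """A read/write or write/write overlap is treated as a possible conflict."""
    return bool(a["writes"] & (b["reads"] | b["writes"])
                or b["writes"] & (a["reads"] | a["writes"]))

def derive_conflicts(access, overrides=frozenset()):
    """Derive conflicting method pairs; overrides lists the pairs the schema
    designer has declared to commute semantically."""
    pairs = set()
    names = sorted(access)
    for i, m1 in enumerate(names):
        for m2 in names[i:]:
            if in_conflict(access[m1], access[m2]) and (m1, m2) not in overrides:
                pairs.add((m1, m2))
    return pairs

derived = derive_conflicts(ACCESS)
corrected = derive_conflicts(ACCESS, overrides={("raiseSalary", "raiseSalary")})
```

The derived set contains the pair (raiseSalary, salary) and the pair of raiseSalary with itself; the second call shows how a designer correction removes the latter.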

Integration of the Conflict Test
The conflict test has to take place before the evaluation actually starts, such that its results can influence the choice of the appropriate physical operators for the final EP. We briefly describe how the results of this paper will be integrated into the query optimizer of the object-oriented DBMS VODAK [17].
In VODAK we follow a rule-based approach for query optimization based on the Volcano optimizer generator [1, 7]. The optimizer may enforce constraints in the generation of (sub)plans by using so-called physical properties (for example sortedness). According to Theorem 1, such constraints may occur when omitting the collect operator in a subexpression of an expression with a map operator on top. Theorem 2 identifies cases where the collect operator can be safely omitted. In order to satisfy the constraints introduced by Theorem 1, we include information about methods that lead to potential conflicts in the physical properties. By default, the implementation rules map the operator to an expression covered by collect. If we omit the collect operator and one of the cases (ii)-(iv) of Theorem 1 can potentially occur, we include this information in the physical properties of the generated plan. Later, when a map operator tries to use such a physical expression as a subplan, the conflict test can be executed by comparing the required physical properties with the provided physical properties. Theorem 2 allows us to identify cases where we can abandon some of the constraints in the physical properties, namely whenever they occur in the second argument of a physical join or diff operator. In this way the optimizer can choose record-oriented and mixed plans whenever they are safe alternatives. Thus we can take advantage of record-oriented processing strategies, which can be more efficient than the corresponding set-oriented ones, even though conflicts exist between the methods which are invoked during the evaluation.
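The comparison of required and provided physical properties can be illustrated with a small sketch. It is not the Volcano or VODAK interface: the class `PhysicalPlan`, the property `unshielded_methods`, and the functions `method_scan`, `nested_loop_join` and `map_accepts` are hypothetical names, and the conflict relation is assumed to be given. The sketch collapses cases (ii)-(iv) of Theorem 1 into a single "conflicting method not covered by collect" check.

```python
from dataclasses import dataclass

# Assumed conflict relation (unordered pairs); method names follow the salary example.
CONFLICTS = {frozenset({"salary", "raiseSalary"})}

def in_conflict(m1, m2):
    return frozenset({m1, m2}) in CONFLICTS

@dataclass(frozen=True)
class PhysicalPlan:
    description: str
    # Provided physical property: methods invoked in the plan without a covering collect.
    unshielded_methods: frozenset = frozenset()

def method_scan(description, method, use_collect=True):
    """Hypothetical operator that invokes `method` on each record it produces."""
    exposed = frozenset() if use_collect else frozenset({method})
    return PhysicalPlan(description, exposed)

def nested_loop_join(outer, inner):
    # Following the text's reading of Theorem 2: constraints stemming from the
    # second (inner) argument are abandoned, those of the outer input are kept.
    return PhysicalPlan(
        f"nlj({outer.description}, {inner.description})",
        outer.unshielded_methods,
    )

def map_accepts(update_method, subplan):
    """Required property of map: no unshielded method conflicts with the update method."""
    return not any(in_conflict(update_method, m) for m in subplan.unshielded_methods)

reads_salary = method_scan("sel_salary", "salary", use_collect=False)
plain = PhysicalPlan("get_dept")

rejected = map_accepts("raiseSalary", nested_loop_join(reads_salary, plain))
accepted = map_accepts("raiseSalary", nested_loop_join(plain, reads_salary))
```

In this sketch the plan with the conflicting method in the outer input is rejected, while exchanging outer and inner input yields an accepted plan, which mirrors the remark following Theorem 2.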

Performance
Since parallelism cannot be exploited in traditional database systems, the execution of a record-oriented EP P' takes as long as the execution of the corresponding set-oriented EP P, provided the database buffer is large enough to hold all records which have to be processed. However, if the database buffer is too small, the execution of P, in contrast to the execution of P', requires repeated replacement and reloading of records in the database buffer. We can estimate this effort for replacing and loading records for a particular class of queries as follows. Assume that P and P' consist of n algorithms (no selections or joins) which have to be executed consecutively. Each algorithm has to process k records. The database buffer can store i records, where $i < k$. Loading a record into the buffer needs l units of time, and replacing a record needs r units of time. Then the total time for loading and replacing records during the execution of P and P', respectively, is given as follows.

for P: $t = i \cdot l + (k - i) \cdot n \cdot (r + l)$

for P': $t' = i \cdot l + (k - i) \cdot (r + l)$

Thus, if $i \ll k$, we can achieve approximately a factor of n speed-up in the time spent for buffer management.
To substantiate these considerations we have compared the execution time of record-oriented and corresponding set-oriented EPs for queries in VODAK (for this particular experiment it did not matter whether we considered queries or updates). The queries were posed to a 180 MB protein database which was developed within the DOCKING-D project [8] at GMD-IPSI. The execution of queries and updates in VODAK is realized analogously to the description in Section 3. When the database buffer was large enough, the execution times of record-oriented and corresponding set-oriented EPs were, as predicted, nearly equal (difference ca. 1%). When the database buffer was too small, the execution times differed considerably: in the worst case, the execution of a record-oriented EP was 3.65 times faster than the execution of the corresponding set-oriented EP. In this case, 1069 of 1197 instances of a class were selected first, and 15 method calls were then applied to the selected objects consecutively. The database buffer could store 120 objects on average.
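As a rough illustration, the cost model for buffer management can be evaluated with the parameters of this experiment. The sketch below assumes that the 15 method calls play the role of the n algorithms, that k = 1069 and i = 120, and that loading and replacing each cost one unit of time (l = r = 1). The unit costs are not reported in the paper and are chosen here only for illustration, as are the function names.

```python
def set_oriented_buffer_time(n, k, i, l, r):
    """t = i*l + (k - i)*n*(r + l): every algorithm re-touches the k - i overflow records."""
    return i * l + (k - i) * n * (r + l)

def record_oriented_buffer_time(k, i, l, r):
    """t' = i*l + (k - i)*(r + l): each overflow record is replaced and loaded once."""
    return i * l + (k - i) * (r + l)

# Parameters from the experiment; l = r = 1 are assumed unit costs.
n, k, i, l, r = 15, 1069, 120, 1, 1

t = set_oriented_buffer_time(n, k, i, l, r)
t_prime = record_oriented_buffer_time(k, i, l, r)
ratio = t / t_prime

print(t, t_prime, round(ratio, 2))  # 28590 2018 14.17
```

Under these assumptions the model predicts a ratio close to n = 15 for the buffer management time alone. The measured factor of 3.65 refers to the total execution time, which also contains work that is unaffected by the processing strategy, so the two figures are not directly comparable.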

Conclusion
In this paper we have investigated non-determinism in the evaluation of record-oriented EPs for declarative updates. In the examined cases non-determinism is induced by conflicts between update and read-only methods that are invoked within the algorithms which are executed for the evaluation. We have developed a framework which allows us to identify admissible record-oriented and mixed EPs. We have shown that some of the detected conflicts can be relaxed or even completely ignored, while a deterministic evaluation can still be guaranteed. We have sketched a possible realization of our concept within a rule-based query optimization framework.
Generally speaking, we have introduced a new optimization potential with regard to the efficient evaluation of declarative updates. The contribution of our paper is twofold. First, and most important, we have shown that not all existing conflicts actually cause a non-deterministic evaluation. Second, we have illustrated that the generation of record-oriented and mixed EPs which are executed deterministically can be ensured if a conflict relation is available for the methods, or, in general, for the database access operations which are executed during the evaluation, and if appropriate conflict tests are performed before the actual execution starts, e.g., during the query optimization process.
The formal framework we have developed in this paper can also be applied to more complex declarative updates than those investigated here, e.g., updates containing several update methods, or containing update methods in the selection conditions. Investigating these cases will be part of our future work.

$\mathrm{TRACE}(\mathit{get}\langle a, C\rangle) := \langle\, C \to \mathit{open\_scan}(a)\;\; \mathit{opened}\;\; C \to \mathit{scan}(a)\;\; s_1\;\; \ldots\;\; C \to \mathit{scan}(a)\;\; s_n\;\; C \to \mathit{scan}(a) \,\rangle$

Let $\mathrm{TRACE}(P) := \langle\, M_0\;\; \mathit{opened}\;\; M_1\;\; t_1\;\; \ldots\;\; M_n\;\; t_n\;\; M_{i+1} \,\rangle$. Then

$\mathrm{TRACE}(\mathit{map}\langle a, m, a_1, \langle a_2, \ldots, a_k\rangle\rangle(P)) := \langle\, M_0\;\; \mathit{opened}\;\; M_1\;\; t_1.a_1 \to m(t_1.a_2, \ldots, t_1.a_k)\;\; s_1(t_1)\;\; \ldots\;\; M_n\;\; t_n.a_1 \to m(t_n.a_2, \ldots, t_n.a_k)\;\; s_n(t_n)\;\; M_{i+1} \,\rangle$

The expression $s(t)$ denotes a record s that is an extension of the record t, i.e., $\forall a \in \mathit{Ref}(t)$: $a \in \mathit{Ref}(s)$ and $t.a = s.a$. E.g., for map, t is extended by a component which holds the result of the method call.

[...] $(o_1 \to m_1 : r_1\;\; o_2 \to m_2 : r_2)$ and $(o_2 \to m_2 : r_2\;\; o_1 \to m_1 : r_1)$ lead to the same resulting database state $DB_1$, and the results $r_1$ and $r_2$ of the two method invocations are the same. They commute partially if additionally $o_1 \neq o_2$ holds for the receiver objects.

P' is a valid alternative EP if the traces T and T' of P and P', respectively, can be transformed into each other by [...]

[...] $.a_1 \to \mathit{salary}()\;\; t_n.a_1 \to \mathit{raiseSalary}(t_n.a_4)?\;\; s_n(t_n)?\,\rangle$ where $x_i = [a_1 : o_i]$ and $t_i$, $s_i$ are defined analogously to Example 4 for $i = 1, \ldots, n$.

Advances in Databases and Information Systems, 1997