IQL(2): A Model with Ubiquitous Objects*

Object-oriented databases have brought major improvements in data modeling by introducing notions such as inheritance and methods. Extensions in many directions are now being considered, introducing concepts such as versions, views, or roles. These features bring the risk of creating monster data models with a number of incompatible appendixes. We do not propose here any new extension or any novel concept. We show, more modestly, that many of these features can be formally and (we believe) cleanly combined in a coherent manner.


Introduction
We propose an extension of IQL [AK89], hence the name IQL(2)¹, that encompasses many extensions to the core OODB models that have been considered separately in the past. The model is based on two concepts, neither of them novel: (i) contexts, which are used to parameterize class and relation names; and (ii) views, which define intensional data. This brings two kinds of ubiquity to objects, i.e., the same object may belong, really or virtually, to several classes at the same time. We propose a first-order language with static type checking, under certain restrictions on the schemas. Most of the examples are given in a more convenient OQL-like syntax.
We briefly consider two technical issues: (i) quantification over contexts, and (ii) method resolution for ubiquitous objects. Quantification over contexts can be handled under some reasonable restrictions that we present. Uncontrolled ubiquity, together with inheritance, leads to severe problems with respect to type checking and conflict resolution. We advocate here the use of strong restrictions so that standard resolution techniques can be used.
As illustrated by examples, the model captures in a coherent framework many features that have been considered separately in the past: (i) a model with objects, classes, inheritance, and methods à la IQL or O2 [BDK92]; (ii) a view mechanism à la O2 Views [SAD94]; (iii) a versioning mechanism with linear versions and also alternatives (see, e.g., [KC88]); (iv) a mechanism for objects with several roles [BD77, RS91] à la Fibonacci [ABGO93]; (v) the means of specifying the distribution of data over several sites; (vi) a mechanism for data and schema updates (see, e.g., [Zic92]); (vii) the specification of access rights (see, e.g., [RBKW91]).
* Partially supported by Esprit Project GoodStep.
† On leave from Departamento de Informática, Universidade Federal de Pernambuco, Brazil. Partially supported by CNPq grant number 200.803-92.1.
¹ No, Guido, this does not imply that there will be an IQL(3).
The paper is organized as follows. In Section 2, we introduce some notation and auxiliary concepts. A restricted form of the model (without views and inheritance) is presented in Section 3. The language is presented in Section 4. Section 5 deals with inheritance and Section 6 with views. The last section is a conclusion. Additional examples are given in Appendix A.
To conclude this section, we present in an example some of the features of the model. At one extreme, we may decide that one context is completely virtual and that no data is stored there. At another extreme, we can view the database as duplicated in the contexts Paris and LA. Each object has a store in Paris and one in Los Angeles. An update method on an object o in the Paris context would modify the store in Paris. It may immediately call a method on object o in LA to propagate the change, or one may prefer to propagate updates in batches using a program that is called regularly.

Preliminaries
In this section, we introduce some notation and some auxiliary concepts.
We assume the existence of the following pairwise disjoint, countably infinite sets:
1. rel: relation names $R_1, R_2, \ldots$
2. class: class names $C_1, C_2, \ldots$
3. obj: object identifiers (oid's) $o_1, o_2, \ldots$
4. dom: data values $d_1, d_2, \ldots$

The set dom is typically many-sorted. It contains the sorts int, real, bool, string and a particular sort for context identifiers (cid's) that will be application dependent. The data sorts will be denoted $d_1, d_2, \ldots$. The values of sort $d_i$ are dom($d_i$). The set of cid's will be denoted cid.

Given a set $O$ of oid's, the set of values that one can construct is denoted val($O$):
1. val($O$) contains $O$ and dom;
2. val($O$) is closed under tupling and finite setting.

(Other constructors such as sequencing or multi-setting can be added in a straightforward manner and will not be considered here.)

The cid's will serve many purposes. If we take cid's in $[1..n]$, we model time versions. By organizing the cid's in a dag, we also model alternative versions. By taking cid's for instance in {London, Paris, LA, etc.}, we model distributed databases with the same object (with distinct repositories) possibly at many sites. By choosing cid's in {John, Peter, Max, etc.}, we model access rights for various users.
In practice, one may want to use cid's with a richer structure, i.e., use complex values or objects to denote contexts. For instance, in a versioned and distributed database, one would like the domain of cid's to be the set of pairs (timestamp, location). We ignore this aspect here since it would unnecessarily complicate the model, and view the cid's as atomic elements. Indeed, in most of the discussion, we assume that the domain cid of the cid's is an initial fragment of the integers. However, in examples, we sometimes use a richer structure for cid's.
We consider that the "names" of both the schema and the instance are indexed by the cid's. A class in our context is now $C(n)$ for some cid $n$, and a relation becomes $R(n)$. On the other hand, objects are not indexed by cid's. However, their values and behaviors depend on the roles that they are taking. For instance, a versioned object is the same object in all its different versions. Its value and behavior depend on the particular version that is considered.
Given a set C of classes and the set cid of cid's, C(cid) denotes C $\times$ cid. Starting from the sets C and cid, the types types(C(cid)) are defined by the following abstract syntax:

$$\tau ::= d_i \mid \mathbf{C}(\mathbf{cid}) \mid [A_1 : \tau, \ldots, A_n : \tau] \mid \{\tau\} \mid \tau + \tau \mid \bot$$

where $n \geq 0$, the $A_i$'s are distinct and "+" is the union of types.
An oid assignment $\pi$ is a mapping from C(cid) to $2^{\mathbf{obj}}_{fin}$ (the finite powerset of obj). It gives the population of each class in each context. (Note that class populations are not required to be disjoint and objects may be explicitly in many different classes.) The set of oid's occurring in $\pi$ is denoted $O$.
The semantics of types is given with respect to an oid assignment $\pi$:
3. finite setting and tupling are standard;
4. $[\![\tau_1 + \tau_2]\!]_\pi = [\![\tau_1]\!]_\pi \cup [\![\tau_2]\!]_\pi$;
5. $[\![\bot]\!]_\pi = \emptyset$.

Given an oid assignment $\pi$ and the corresponding finite set $O$ of objects, a value assignment $\nu$ is a mapping from $O \times$ C(cid) to val($O$); i.e., it associates a value to each triple (object, class, cid).

Remark 2.1 Observe that the value of an object depends on two parameters: the context and the class. Suppose that we have two contexts, business and personal, modeling respectively my business phone book and my private one. Suppose that we have two classes, Friend and Researcher. Suppose that Jones is a friend and a researcher. Then I may have phone information for Jones in both contexts and in both classes. The fact that some data is stored and some may be derived is irrelevant (so far).
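The way a value depends on both the class and the context can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the dictionary names `pi` and `nu`, the helper `value_of`, the oid `"jones"` and the phone numbers are all hypothetical and do not come from the model itself. It represents an oid assignment as a map from (class, cid) pairs to finite sets of oid's, and a value assignment as a map from (object, class, cid) triples to values.

```python
from typing import Dict, FrozenSet, Tuple

ClassInContext = Tuple[str, str]  # a pair (class name, cid), i.e. an element of C(cid)

# Oid assignment: the population of each class in each context.
# Populations are not required to be disjoint.
pi: Dict[ClassInContext, FrozenSet[str]] = {
    ("Friend", "personal"): frozenset({"jones"}),
    ("Friend", "business"): frozenset({"jones"}),
    ("Researcher", "personal"): frozenset({"jones"}),
    ("Researcher", "business"): frozenset({"jones"}),
}

# Value assignment: one value per (object, class, cid) triple.
# The phone numbers are made-up illustration data.
nu: Dict[Tuple[str, str, str], dict] = {
    ("jones", "Friend", "personal"): {"phone": "555-0101"},
    ("jones", "Friend", "business"): {"phone": "555-0102"},
    ("jones", "Researcher", "personal"): {"phone": "555-0103"},
    ("jones", "Researcher", "business"): {"phone": "555-0104"},
}


def value_of(oid: str, cls: str, cid: str) -> dict:
    """Return the value of `oid` seen as a member of class `cls` in context `cid`."""
    if oid not in pi.get((cls, cid), frozenset()):
        raise KeyError(f"{oid} is not in {cls}({cid})")
    return nu[(oid, cls, cid)]


# The same object has four independent phone entries.
phones = {value_of("jones", c, n)["phone"]
          for c in ("Friend", "Researcher")
          for n in ("personal", "business")}
```

The object identifier itself carries no context index; only the lookups into `pi` and `nu` do, which mirrors the remark that objects are not indexed by cid's.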

Database Schema and Instance
We define the schemas and the instances. We first ignore an important aspect, namely, the specification of the "virtual database" ($\mathcal{V}$ below), which is the topic of Sections 5 (inheritance) and 6 (views).
Definition 3.1 A database schema $S$ is a tuple $(\mathbf{R}, \mathbf{C}, \mathbf{cid}, T, \mathcal{V})$ where: (i) R and C are finite sets of relation and class names; (ii) cid is the finite set of contexts; (iii) $T : \mathbf{R}(\mathbf{cid}) \cup \mathbf{C}(\mathbf{cid}) \to$ types(C(cid)); (iv) $\mathcal{V}$ is a view program to be defined later.

This is a conservative extension of IQL. First, R is the set of names of roots of persistence, C the set of class names, cid (which is new) is the set of contexts, and $T$ is the typing constraint. In IQL, the view program is simply the inheritance hierarchy since there is no other mechanism for virtual data there.
It is important to observe that we associate types to pairs involving a name (relation or class) and a cid. This captures the fact that the same name may have different types in different contexts. For instance, if the contexts are versions, the type of a class is allowed to evolve in time. Observe also that the type of a class or a relation in some context may refer to a class in another context.
Example 3.2 We consider a database context Global that is the integration of the two local database contexts, LA and Paris.
The schema is as follows: Let $\mathbf{R} = \{R_p, R_{la}, R_g\}$, $\mathbf{C} = \{Employee\}$, $\mathbf{cid} = \{Paris, LA, Global\}$ and $T$ be defined by: class where $O$ is the set of oid's occurring in $\pi$.
Ignoring the view mapping, we now specify the notion of well-formed instance: Definition 3.4 Let $(\rho, \pi, \nu)$ be an instance over a schema $S$. The instance is well-formed if the following typing constraints are satisfied: Two well-formed instances are given in Figure 1. Intuitively, instance $I_2$ is obtained from instance $I_1$ by deriving some new data.
Figure 1: Two well-formed instances

Instance $I_1$                                   Instance $I_2$
$\pi(Employee(Paris)) = \{o_1, o_2\}$            $\pi(Employee(Paris)) = \{o_1, o_2\}$
$\pi(Employee(LA)) = \{o_1\}$                    $\pi(Employee(LA)) = \{o_1\}$
$\pi(Employee(Global)) = \emptyset$              $\pi(Employee(Global)) = \{o_1, o_2\}$
$\rho(R_p(Paris)) = \{o_1, o_2\}$                $\rho(R_p(Paris)) = \{o_1, o_2\}$
$\rho(R_{la}(LA)) = \{o_1\}$                     $\rho(R_{la}(LA)) = \{o_1\}$
$\rho(R_g(Global)) = \emptyset$                  $\rho(R_g(Global)) = \{o_1, o_2\}$

We now define a many-sorted first-order calculus, then give examples of queries in an OQL-like syntax. (As in IQL, we could have used a rule-based language here, but since recursion is not important here, we prefer to focus on a simpler language so as not to obscure the issue.) We first consider "fixed contexts" in the sense that we disallow quantification over cid's.

A Fixed Context Calculus
The calculus is defined as follows:

Terms The terms of the calculus are:
1. $d$ for each $d$ in dom;
2. $R(n)$ for $R$ in R and $n$ in cid ($R(n)$ denotes the value of relation $R$ in context $n$);
3. variables $x^\tau$ where the type $\tau$ does not refer to the sort cid (the type is omitted when clear from the context);
4. constructed terms with tupling ($[A_1 : t_1, \ldots, A_n : t_n]$), setting ($\{t_1, \ldots, t_n\}$), projection ($t.A$ for $A$ an attribute), and dereferencing ($*t$ for $t$ denoting an object).

The sorts of terms are defined in the straightforward manner.

Formulas, queries: Atoms are $t = t'$ and $t \in t'$ for terms $t, t'$ with compatible types, or $x \approx x'$ where $x, x'$ are of respective sorts $C(n)$, $C'(m)$. (This is interpreted as: $x$ and $x'$ are the same object in different contexts.) Formulas are atoms, or $L \vee L'$, $L \wedge L'$, $L \Rightarrow L'$, $\neg L$, $\exists x\,(L)$ or $\forall x\,(L)$ where $L, L'$ are formulas. A query is an expression of the form $\{x \mid \varphi\}$ where $\varphi$ is a formula whose only free variable is $x$.
Range-restriction As is standard, we restrict our attention to range-restricted formulas and queries. From this point of view, the only novelty is the use of $\approx$, which behaves exactly like equality for range-restriction. Contexts play no role for range-restriction since we assumed they are constant.

From a language viewpoint, the only (relative) novelty is also the use of $\approx$. We illustrate it with an example. Suppose that the cid's are timestamps and that the last two versions are denoted by the constants previous and now. Let Persons be a set of objects of class Person. We can obtain the phone numbers of persons who have not changed phone number since the last version:

    $\{P.phone \mid \exists P' \in Persons(previous)\,(P \in Persons(now) \wedge P \approx P' \wedge P.phone = P'.phone)\}$

or, using an OQL-like syntax:

    select P.phone
    from P in Persons(now)
    where P.phone in select P'.phone
                     from P' in Persons(previous)
                     where P' $\approx$ P

We could express the same query in a simpler manner if either (a) a field previous (possibly virtual, see below) contains the previous state of each object:

    select P.phone
    from P in Persons(now)
    where P.previous.phone = P.phone

or (b) using casting:

    select P.phone
    from P in Persons(now)
    where P.phone = P@Person(previous).phone

where P@Person(previous) denotes the casting of P to the same object in class Person(previous). Such casting can be viewed as syntactic sugaring. Another form of syntactic sugaring would be to permit testing whether an object is also in some different context. This allows us to rephrase the above query (more carefully):

    select P.phone
    from P in Persons(now)
    where P is also Person(previous)
      and P.phone = P@Person(previous).phone

Remark 4.1 To see a more complicated example with "structured" contexts, suppose that we are in a versioned database with one context for private data and one for professional data. To obtain the current home phone numbers of friends who worked on OQL in 1990, we use:

    select P.phone
    from P in Persons(private,now), P' in Persons(prof,1990)
    where "OQL" in P'.works_on and P $\approx$ P'

where the domain of cid's is a set of pairs (context, timestamp).
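To make the semantics of the "unchanged phone number" query concrete, here is a minimal Python sketch. It is an illustration only: the dictionary `persons`, the function `unchanged_phones` and the sample data are hypothetical. The populations of Persons in each context are modeled as maps from oid's to tuple values, so that the "same object in different contexts" test reduces to equality of oid's.

```python
from typing import Dict, Set

# persons[cid][oid] is the value of object `oid` in context `cid`.
# The sample data is made up for illustration.
persons: Dict[str, Dict[str, dict]] = {
    "previous": {
        "o1": {"phone": "111"},
        "o2": {"phone": "222"},
    },
    "now": {
        "o1": {"phone": "111"},   # unchanged
        "o2": {"phone": "999"},   # changed since the previous version
        "o3": {"phone": "333"},   # did not exist in the previous version
    },
}


def unchanged_phones(db: Dict[str, Dict[str, dict]],
                     old: str = "previous", new: str = "now") -> Set[str]:
    """Phones of persons in `new` that are the same object as some person in
    `old` (same oid) and carry the same phone number there."""
    return {
        value["phone"]
        for oid, value in db[new].items()
        if oid in db[old] and db[old][oid]["phone"] == value["phone"]
    }


result = unchanged_phones(persons)  # {"111"}
```

The membership test `oid in db[old]` plays the role of the "is also" sugar: object o3 is silently skipped because it has no counterpart in the previous context.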

Quantifying over Contexts
We start with two examples and then consider some difficulties that they raise. First, suppose that cid consists of two contexts, namely LA and Paris, and that we want to modify the salaries of employees by taking the maximum of the salaries in the two contexts. We may use one of the following programs:

Observe that the second one, although clearly more desirable (imagine 20 sites!), uses cid variables, i.e., Site1 and Site2, for specifying the context (whereas LA, for instance, is a constant). This is a quantification over some contexts.
From the example, it is clearly convenient to be able to quantify over contexts.However, this complicates the type checking of programs as illustrated by the following example.
Suppose that the context is $[1..now]$ and that in Version 15, we added an attribute to class Person, e.g., an email address. Consider the following queries asking for the names of persons whose stored value has been modified at least once (since Version 17):

Query 1:

    select P.Name
    from N in Contexts, P in Persons(N), P' in Persons(now)
    where P is P' and not (*P = *P')

Query 2:

    select P.Name
    from N in Contexts, P in Persons(N), P' in Persons(now)
    where P is P' and not (*P = *P') and N > 17

where Contexts is a relation containing the set of valid contexts.
Recall that "*" denotes dereferencing. Observe that Query 1 should raise an error since the types of a person now and, say, in Version 14 are different. The sorts of the values for a person now and at time 14 are not compatible, so $*P = *P'$ is incorrect. On the other hand, Query 2 should be acceptable as long as we test for N > 17 before testing the other conditions. However, type checking is also an issue for Query 2: because of the schema update, we cannot assign a type to $*P$. A first solution is to use dynamic type checking. Another one is to require that the quantification over N be outermost and to apply the restrictions on context variables during type checking (i.e., at compile time).
More formally, we require the formula to be of the form:

$$Q_1 \ldots Q_m\,(\varphi \;\theta\; \psi)$$

where $Q_1, \ldots, Q_m$ are quantifications over contexts, $\varphi$ is a (range-restricted) formula that has no quantification over contexts and whose only free variables are contexts ($\varphi$ restricts the range of the contexts), $\theta$ is $\wedge$ or $\Rightarrow$, and $\psi$ contains no quantification over contexts.
Query 2 can be expressed in this form:

$$\{P.Name \mid \exists N\,((Context(N) \wedge N > 17) \wedge \exists P, P'\,(Persons(N)(P) \wedge Persons(now)(P') \wedge P \approx P'))\}$$

Intuitively, this suggests the following evaluation. First, $\varphi$ is evaluated. Since it has no quantification over contexts, its evaluation raises no issue. Then, based on the results of $\varphi$, the global query is transformed into a boolean combination of queries with no quantification over contexts. Each of these queries can be type checked and executed separately.
Observe that this form is restrictive since it does not allow expressing queries of the form $\{\ldots \mid \forall x\,\exists n \ldots\}$ where the value of context $n$ depends on $x$. It is possible (although rather intricate) to find natural examples of such queries (for instance, see the example above where the field previous contains the previous state of each object).
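The two-phase evaluation suggested above (evaluate the context-restricting formula first, then run one fixed-context query per surviving context) can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's algorithm: `NOW`, `attrs`, `build_store`, `fixed_context_query` and `changed_since` are hypothetical names, types are reduced to tuples of attribute names, and the data is invented. An existential quantifier over contexts becomes a union of fixed-context results.

```python
from typing import Callable, Dict, Set

NOW = 20  # hypothetical current version

# Type of class Person in each version: the Email attribute appears in Version 15.
attrs = {n: ("Name", "Phone") if n < 15 else ("Name", "Phone", "Email")
         for n in range(1, NOW + 1)}


def build_store() -> Dict[int, Dict[str, dict]]:
    """Invented data: Bob's phone changes in Version 19, Ann never changes."""
    store = {}
    for n in range(1, NOW + 1):
        o1 = {"Name": "Ann", "Phone": "111"}
        o2 = {"Name": "Bob", "Phone": "222" if n < 19 else "999"}
        if n >= 15:
            o1["Email"] = "ann@example.org"
            o2["Email"] = "bob@example.org"
        store[n] = {"o1": o1, "o2": o2}
    return store


def fixed_context_query(store, n: int) -> Set[str]:
    """The query with the context variable N replaced by the constant n.
    It is type checked on its own before being executed."""
    if attrs[n] != attrs[NOW]:
        raise TypeError(f"*P = *P' is ill-typed: Person({n}) differs from Person(now)")
    return {p["Name"]
            for oid, p in store[n].items()
            if oid in store[NOW] and p != store[NOW][oid]}


def changed_since(store, phi: Callable[[int], bool]) -> Set[str]:
    contexts = [n for n in store if phi(n)]      # phase 1: evaluate the context formula
    result: Set[str] = set()
    for n in contexts:                           # phase 2: one fixed-context query per binding
        result |= fixed_context_query(store, n)  # existential quantification = union
    return result


store = build_store()
names = changed_since(store, lambda n: n > 17)   # behaves like Query 2
```

Calling `changed_since(store, lambda n: True)` mimics Query 1: the first context examined has a Person type different from the current one, so the corresponding fixed-context query is rejected with a `TypeError` before any comparison is attempted.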

Inheritance
In this section, we consider the addition of an inheritance relationship to the schema. Since classes in contexts play the role of standard classes, we need to consider statements such as $C(n)$ isa $C'(m)$ that may relate two distinct contexts. We assume that the inheritance hierarchy is a dag.

2. we access some method m. This is legal if for some $C_j(n_j)$ ($j$ : 1..i), the resolution of m in $C_j(n_j)$ is defined and is some class $C'$; and for each $C_k(n_k)$ ($k$ : 1..i), the resolution of m in $C_k(n_k)$ is also $C'$ or is not defined.

Multiple roles complicate the issue considerably. Consider a class $C(n)$ with $m$ subclasses. Then a variable of class $C(n)$ may denote an object o such that the set of subclasses of $C(n)$ to which o explicitly belongs may be any of the $2^m$ subsets of the subclasses of $C(n)$. This leads to two important issues:

Problem (1): At run time, given an object o and a role $C(n)$ for this object, quickly find the store for some attribute A and the code for a method m.
Problem (2): At compile time, statically type check a program.

Both will be time consuming. Both can be simplified if we specify a compatibility relation $\sim$ that states in which classes an object can explicitly be at the same time. More precisely, $\sim$ is an equivalence relation over C(cid), and $C(n) \sim C'(m)$ indicates that an object may belong explicitly to both classes concurrently, so that multiple instantiation is constrained to classes in the same partition w.r.t. $\sim$. Type checking can be eased if, in addition, we make $\sim$ antisymmetric by constraining the types of classes related by $\sim$ to be comparable w.r.t. standard subtyping. This would define a role hierarchy, but we adopt a more general approach where role hierarchies can be defined, if necessary, through a view.
To see an example, consider a database of boats and airplanes with three classes, Boat, AirPlane, Vehicle, and the schema:

    class Boat     : [Name : string, Price : integer, Propeller : string] isa Vehicle
          AirPlane : [Name : string, Price : integer, Speed : integer] isa Vehicle
          Vehicle  : [Name : string, Price : integer]

If we know that the compatibility relation is empty, an access to the price of a vehicle is legal.
Otherwise, there is a potential conflict since the same object may be in classes AirPlane and Boat explicitly.
The use of $\sim$ is investigated next.

A Trade-off
It is standard to prohibit (or at least control) multiple-inheritance in the context of single-roles.We now add a condition to handle multiple roles.
A schema is strict if for each $C(n)$, $C'(m)$ such that $C(n) \sim C'(m)$ and $C(n)$, $C'(m)$ are not comparable in the isa hierarchy, there is no $C''(p)$ such that $C(n)$ and $C'(m)$ are both subclasses of $C''(p)$ (i.e., $C(n)$ and $C'(m)$ have no common ancestor).
For strict schemas, the resolution issues above disappear: it is easy to see that for each object o and role $C(n)$, resolution reduces to standard resolution for o in the unique class below $C(n)$ to which it explicitly belongs. This is resolution with a parameter, the class $C(n)$ (i.e., Problem (1) disappears). For non-strict schemas, we can adopt multi-attribute resolution (to solve Problem (2)), and techniques such as multi-attribute dispatch tables [AGS94] can be used (to solve Problem (1)).
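The strictness condition is a simple graph property and can be checked mechanically. The sketch below is one possible reading of the definition, under stated assumptions: the names `ancestors` and `is_strict` are hypothetical, classes in contexts are represented as (class, cid) pairs, the isa hierarchy as a map from a class to its direct superclasses, the compatibility relation as an explicit set of compatible pairs, and the single context `"db"` is invented for the boats and airplanes example.

```python
from typing import Dict, Iterable, Set, Tuple

Cls = Tuple[str, str]  # (class name, cid)


def ancestors(isa: Dict[Cls, Set[Cls]], c: Cls) -> Set[Cls]:
    """All proper ancestors of c in the isa dag."""
    seen: Set[Cls] = set()
    stack = list(isa.get(c, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(isa.get(s, ()))
    return seen


def is_strict(isa: Dict[Cls, Set[Cls]],
              compatible: Iterable[Tuple[Cls, Cls]]) -> bool:
    """A schema is strict if compatible classes that are not comparable
    in the isa hierarchy have no common ancestor."""
    for a, b in compatible:
        if a == b:
            continue
        up_a, up_b = ancestors(isa, a), ancestors(isa, b)
        comparable = a in up_b or b in up_a
        if not comparable and (up_a & up_b):
            return False
    return True


# The boats and airplanes schema, placed in one hypothetical context "db".
boat, plane, vehicle = ("Boat", "db"), ("AirPlane", "db"), ("Vehicle", "db")
isa = {boat: {vehicle}, plane: {vehicle}}

strict_when_disjoint = is_strict(isa, [])                    # no object is in two classes
strict_when_amphibious = is_strict(isa, [(boat, plane)])     # Boat and AirPlane compatible
```

With an empty compatibility relation the schema is strict; declaring Boat and AirPlane compatible makes it non-strict, because both inherit from Vehicle and an access to Price on a vehicle could then be resolved in two incomparable classes.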

Views
In the previous section, we already considered the specification of view mappings, but we restricted our attention to a special class of view mappings related to inheritance only. In this section, we use the entire power of the first-order language presented earlier to define view mappings.
A view program allows one to specify, from the value of the database composed of explicit information (the instance $(\rho, \pi, \nu)$), a well-formed virtual database (a second instance, specified below).
Queries are first used to populate classes and relations, as in:

    $Employee(Global) \sqsupseteq \{x \mid Employee(Paris)\}$
    $Employee(Global) \sqsupseteq \{x \mid Employee(LA)\}$
    $R_g(Global) \sqsupseteq \{x@Employee(Global) \mid x \in R_p(Paris)\}$
    $R_g(Global) \sqsupseteq \{x@Employee(Global) \mid x \in R_{la}(LA)\}$

We use two queries to define Employee(Global) since a single one would be incorrectly typed. Note also that the above definition does not prevent the class Employee(Global) from having objects explicitly in it.

Remark 6.1 In the presentation so far, we have implicitly assumed that the extensions of base classes are given and used to compute the extensions of derived classes. It is argued in [SAD94] that in many applications, it is not desirable to maintain the extensions of classes. Furthermore, some systems (such as O2) do not provide extensions for base classes, and it would be unnatural to maintain those of derived classes in such a context. If class extensions are not maintained, the definition of Employee(Global) is not necessary and can be viewed as "derived".

Using such rules, it is easy to specify the relation values and the class populations of the virtual instance. For the specification of its value assignment, we can use two approaches.
In an explicit manner, we can specify or enrich the value of each object in its new class with rules of the form:

    var x : Employee(Global); x' : Employee(LA)
    define x.phone = unique $\{x'.phone \mid x' \approx x\}$

This can also be achieved implicitly. We assume that by default, the values of objects are transmitted via derivations. For instance, if an object is in Employee(Global) because of its presence in Employee(LA), then it "inherits" its structure from that of the employee in LA. This implies some constraints on the types that are similar to the constraints on types in the presence of inheritance. (Recall that inheritance is just a special case of view.) A problem is that the presence of an object in some class $C(n)$ may have its origin in the presence of the object in more than one other class. For instance, an object may be in Employee(Global) because it belongs to Employee(Paris) and also because it belongs to Employee(LA).
In such cases, the new value is obtained (a) by merging the values associated with the originating object/context pairs, and (b) by projecting (casting) to the type that is expected. More precisely, suppose that we define the population of class $C$ in context $n$ as the union of the $\varphi_i$ where, for each $i$, $\varphi_i$ returns a set of objects of type $C_i(n_i)$. Then the value of an object o for $C(n)$ is defined by:

$$\nu(o, C(n)) = \Pi_{T(C(n))}\bigl(\bowtie \{\nu(o, C_i(n_i)) \mid o \in \varphi_i\}\bigr)$$

where merge ($\bowtie$) and projection ($\Pi$) are defined next.
Definition 6.2 The merge of two data values is defined by:
1. $v \bowtie v = v$ for each $v$;
2. if $t_1, t_2$ are tuples, $t_1 \bowtie t_2$ is the tuple $t$ (if it exists) such that for each attribute $A$ of both $t_1$ and $t_2$, $t(A) = t_1(A) \bowtie t_2(A)$; for each $i, j$ with $j \neq i$, if $t_i$ has attribute $A$ and $t_j$ does not, $t(A) = t_i(A)$; and $t$ has no other attribute;
3. otherwise $v \bowtie v'$ is undefined.

Observe that two tuples with two non-mergeable values (e.g., the integers 4 and 5) for the same attribute are not mergeable. This does not prevent, for instance, an object o from having two distinct values, say 4 and 5, in two distinct classes. On the other hand, this cannot happen (in a correct instance) if these two versions of the same object are merged in a unique class.
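The merge operation translates almost directly into code. The following Python sketch is an illustration under stated assumptions: tuples are modeled as dictionaries from attribute names to values, undefinedness is signalled by a hypothetical exception `NotMergeable`, and the function name `merge` and the sample values are invented.

```python
class NotMergeable(Exception):
    """Raised when the merge of two values is undefined."""


def merge(v1, v2):
    # Case 1: a value merges with itself.
    if v1 == v2:
        return v1
    # Case 2: two tuples merge attribute by attribute.
    if isinstance(v1, dict) and isinstance(v2, dict):
        merged = {}
        for attr in v1.keys() | v2.keys():
            if attr in v1 and attr in v2:
                # Shared attribute: the two values must themselves be mergeable.
                merged[attr] = merge(v1[attr], v2[attr])
            else:
                # Attribute present on one side only: copy it.
                merged[attr] = v1[attr] if attr in v1 else v2[attr]
        return merged
    # Case 3: anything else is undefined.
    raise NotMergeable(f"cannot merge {v1!r} and {v2!r}")


# An employee seen from two sites, with partially overlapping attributes.
paris = {"name": "Dupond", "phone": "01 23", "address": {"city": "Paris"}}
la = {"name": "Dupond", "salary": 4, "address": {"city": "Paris", "zip": "75001"}}
merged = merge(paris, la)
```

Here `merged` carries the attributes of both sides, and the nested address tuples are merged recursively. Calling `merge({"salary": 4}, {"salary": 5})` raises `NotMergeable`, which corresponds to the observation that two tuples with distinct atomic values for the same attribute cannot be merged.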
The projection $\Pi_\tau(v)$ of a value $v$ on a type $\tau$ (given an oid assignment $\pi$) is defined recursively as follows:
1. if $\tau$ is $C(n)$ and $v = o$ is in $\pi(C(n))$, then $\Pi_\tau(v)$ is $o$;
2. if $\tau = [A_1 : \tau_1, \ldots, A_m : \tau_m]$ and $v = [A_1 : v_1, \ldots, A_n : v_n]$ for $m \leq n$ and for each $i \leq m$, $\Pi_{\tau_i}(v_i)$ is defined, then $\Pi_\tau(v) = [A_1 : \Pi_{\tau_1}(v_1), \ldots, A_m : \Pi_{\tau_m}(v_m)]$;
3. if $\tau = \tau_1 + \tau_2$ and either (i) $\Pi_{\tau_1}(v)$ or $\Pi_{\tau_2}(v)$ is defined and equal to $v'$ but not both; or (ii) they are both defined and equal to $v'$, then $\Pi_\tau(v) = v'$;
4. otherwise, $\Pi_\tau(v)$ is undefined.

To conclude this section on views, observe that we have two ways for an object to be virtually in a class. One is by inheritance and the other one is by the view mechanism. We advocated a strict policy for handling inheritance to simplify the treatment of inheritance conflicts. The view mechanism is handled differently. It may be more liberal at the price of being more costly.
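The recursive definition of projection can be sketched in Python as follows. Several choices here are assumptions made for the illustration and are not part of the definition: types are encoded as tagged tuples, tuples values as dictionaries, oid's as strings, undefinedness as a sentinel `UNDEFINED`, and the names `project`, `employee_type` and the sample data are hypothetical. In particular, the `"base"` case for atomic sorts is an addition of this sketch, included only so that the example has leaves to project on.

```python
UNDEFINED = object()  # sentinel for "the projection is undefined"


def project(tau, v, pi):
    """Project value v on type tau, given an oid assignment pi."""
    kind = tau[0]
    if kind == "base":
        # Assumption of this sketch (not in the definition): atomic sorts
        # accept values of the matching Python type.
        return v if isinstance(v, tau[1]) else UNDEFINED
    if kind == "class":
        # Case 1: an oid that belongs to the class in the given context.
        ok = isinstance(v, str) and v in pi.get(tau[1], frozenset())
        return v if ok else UNDEFINED
    if kind == "tuple":
        # Case 2: keep exactly the attributes of the type, projected recursively.
        if not isinstance(v, dict):
            return UNDEFINED
        out = {}
        for attr, sub in tau[1].items():
            if attr not in v:
                return UNDEFINED
            p = project(sub, v[attr], pi)
            if p is UNDEFINED:
                return UNDEFINED
            out[attr] = p
        return out
    if kind == "union":
        # Case 3: one branch defined, or both defined and equal.
        p1, p2 = project(tau[1], v, pi), project(tau[2], v, pi)
        if p1 is UNDEFINED:
            return p2
        if p2 is UNDEFINED:
            return p1
        return p1 if p1 == p2 else UNDEFINED
    return UNDEFINED  # Case 4


pi = {("Employee", "Global"): frozenset({"o1", "o2"})}
employee_type = ("tuple", {
    "name": ("base", str),
    "boss": ("class", ("Employee", "Global")),
})

value = {"name": "Dupond", "boss": "o1", "salary": 4}
projected = project(employee_type, value, pi)   # the salary attribute is dropped
```

Projecting `{"name": "Dupond", "boss": "o9"}` on the same type yields `UNDEFINED`, since `o9` is not in the population of Employee(Global).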

Conclusion
In this paper, we have presented a model with many features that are usually considered separately. Our discussion of methods has been quite brief, but we believe we covered the main issue, method resolution. Our treatment of views has also been rather short, and many features of [SAD94], such as imaginary objects, were not considered here. They would only have made the model more complicated, at the cost of clarity, and they do not present any new difficulties.

Figure 2: Inheritance and Conflicts

Example 1.1 Consider a distributed database with two sites: Paris and Los Angeles. Paris and Los Angeles are two contexts of a unique database. Suppose that the database deals with persons, friends and researchers, i.e., we have classes Person, Friend, Researcher. Classes Friend and Researcher are subclasses of Person in both contexts. Let Dupond be an object. First, suppose that in Paris, Dupond is considered a friend, and in LA both a friend and a researcher, i.e., Dupond belongs to classes Friend(Paris), Friend(LA) and Researcher(LA). By inheritance, Dupond is also in classes Person(Paris) and Person(LA) (with possibly different behaviors in each). Now, we may decide that the data on friends is recorded in LA. We therefore have a relation Friends(LA), and see relation Friends(Paris) as a view of Friends(LA). This would mean that the store for Dupond is in LA and that Dupond is only virtually in class Friend(Paris). This does not prevent Dupond from being really in Researcher(LA) with a specific store there.