Database Programming Languages (DBPL-5)

We will examine the problem of distinguishing between database instances and values in models which incorporate object-identities and recursive data-structures. We will show that the notion of observational distinguishability is intricately linked to the languages available for querying a database. In particular we will show that, given a simple query language incorporating a test for equality of object-identities, database instances are indistinguishable iff they are isomorphic, and that, in a language without any operators on object-identities, database instances are indistinguishable iff a bisimilarity relation holds between them. Further, such a bisimulation relation may be computed on values, but doing so requires the ability to recurse over all the object-identities in an instance. We will then show that systems of keys give rise to observational distinguishability relations which lie between these two extremes. We show that a system of keys satisfying certain restrictions provides us with an efficient means of comparing values, while avoiding the need to compare object identities directly.


Introduction
Suppose you were presented with two database instances and wished to find out whether or not the instances were different using some query interface. Using certain data-models and query languages this might be easy. For example, in a relational database system, simply printing out the two instances and comparing them would suffice. More succinctly, you could find a single query which would produce different results when applied to any two instances if the instances were different. Even if the instances and interface involved more complex but fixed depth types, such as in a nested relational model, as long as the query interface allowed you to "see" instances completely you could distinguish any two distinct instances. However, in a model allowing recursive or arbitrarily deeply nested data structures, such as a semantic or object-oriented data model [4,10], this technique will not work. In this case database instances must use some kind of reference mechanism, such as object identities, pointers, logical variables, or some other non-printable values, and so physically differing instances may give identical results on all possible queries. Of the various possible reference mechanisms, we will focus our attention on object identities since they offer the advantage of locational and data independence, and also afford efficient implementation techniques [11].
Suppose, for example, we had the two instances shown below, in which each object identity has a value associated with it consisting of an integer and another object identity.
This research was supported in part by the following grants: DE-FG02-94-ER-61923Sub 1, BIR94-02292PRIME, DAAH04-93-G0129, DE-AC03-76SF00098.
5th International Workshop on Database Programming Languages, Gubbio, Italy, 1996
If our query language allowed us only to print out the values on paths of any fixed depth, then we could not observe any differences between these two instances: they would both represent the infinite sequence of integers 1, 2, 3, 2, 3, 2, 3, .... However their representations are clearly different.
Though this hypothetical situation may appear unrealistic, it in fact represents a fundamental problem: in any query or database programming language it is necessary to have some means of comparing data values in an instance. Further, in order to reason about the expressive power of a data-model and query language, it is necessary to be able to compare distinct database instances and to communicate information between them. These issues are complicated by the presence of object identities in a data model: there may be many different ways of representing the same data using different choices, and possibly different structures and interconnections, of identities. Consequently we would like to regard object-identities as not directly observable, and equate any values which are observationally indistinguishable. We shall see that the notion of observational distinguishability is intricately linked to the languages and operations that are available for querying a database. An understanding of these issues is essential in the design of languages for such data-models.
In this paper we will make use of a data-model equivalent to that of [1] in order to examine these issues. We will define an isomorphism relation on database instances, representing when two instances differ only in their choice of object-identifiers, and a bisimulation relation, representing when two instances have the same set of paths. We will prove that, given a simple query language incorporating a test for equality on object-identities, two instances are indistinguishable if and only if they are isomorphic, and that, in a query language without any comparison operators available on object identities, two instances are indistinguishable iff they are bisimilar. However, in both of these cases, it is not possible to find a generic query to distinguish between instances: that is, it is not possible to find a finite set of queries, dependent only on a database schema, which will evaluate to the same values on two instances if and only if the instances cannot be distinguished with any query. We show that it is possible to compute the bisimulation relation on values of a database, but in order to do so it is necessary for our query language to allow recursion over the finite extents of object-identities in the database.
We conclude that isomorphism and bisimulation represent respectively the finest and coarsest possible observational equivalences on instances. An important class of observational equivalences, in between these two, can be obtained using systems of keys to determine object identities. We show that, given certain acyclicity restrictions on a system of keys, the resulting equivalences on values can be computed efficiently without resorting to recursion over the entire set of object identities. Consequently, by making such systems of keys primitive in a query language, we can obtain a value-oriented language while achieving much of the efficiency of an object-identity oriented language. Further, suitable systems of keys can be used to control the creation of object-identifiers in a manner similar to that of [9], so that we can have a query language which supports the creation of object identifiers, but avoids the potential for non-terminating computations present in languages that allow unconstrained creation of object identities, such as IQL [1].

A Data model with object identities and finite extents
The description of our data-model falls naturally into two parts: the definition of schemas and that of instances. The schemas are defined in terms of types, and consist of a type system which is dependent on a finite set of classes, and an association between these classes and types. The model presented here is equivalent to that of [1], and could also be considered to be a simplification of the models of [3,10].

Types and schemas
The types in our model are similar to the nested relational types of [2] with the additional feature of class types. These represent the extents present in a database, and therefore go beyond the structural information normally associated with a type system. In order to describe a particular database system it is necessary to state what classes are present, and also the types of (the values associated with) the objects of each class. We consider that these two pieces of information constitute a database schema. Note that, in many data-models, schemas may represent a wide variety of additional constraints; however we believe that this information represents the minimal information which must be present in the schemas of any data-model.
Assume a finite set of classes $\mathcal{C}$, ranged over by $C, C', \ldots$, and a countable set of attribute labels, $\mathcal{A}$, ranged over by $a, a', \ldots$. The types over $\mathcal{C}$, ranged over by $\tau, \ldots$, consist of base types $b$, class types $C$, where $C \in \mathcal{C}$, record types $(a_1 : \tau_1, \ldots, a_k : \tau_k)$, variant types $\langle| a_1 : \tau_1, \ldots, a_k : \tau_k |\rangle$, and set types $\{\tau\}$. We write $\mathit{Types}^{\mathcal{C}}$ for the set of types over $\mathcal{C}$. A schema consists of a finite set of classes, $\mathcal{C}$, and a mapping $S : \mathcal{C} \to \mathit{Types}^{\mathcal{C}}$, such that $S(C) = \tau^C$ where $\tau^C$ is not a class type. (Since $\mathcal{C}$ can be determined from $S$ we will also write $S$ for the schema.)
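The type grammar and the schema condition above can be made concrete with a small sketch. The Python encoding below is only an illustration: the class names (`Base`, `ClassT`, `Record`, `Variant`, `SetT`) and the helpers `classes_in`, `is_ground` and `is_schema` are hypothetical names chosen here, and the City/State schema is assumed from the fragments of the running example that survive in the text.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Base:                      # base type b, e.g. str
    name: str


@dataclass(frozen=True)
class ClassT:                    # class type C
    name: str


@dataclass(frozen=True)
class Record:                    # (a1 : t1, ..., ak : tk)
    fields: Tuple[Tuple[str, "Type"], ...]


@dataclass(frozen=True)
class Variant:                   # <| a1 : t1, ..., ak : tk |>
    cases: Tuple[Tuple[str, "Type"], ...]


@dataclass(frozen=True)
class SetT:                      # {t}
    elem: "Type"


Type = Union[Base, ClassT, Record, Variant, SetT]


def classes_in(t: Type) -> Set[str]:
    """Names of all classes occurring anywhere inside the type t."""
    if isinstance(t, ClassT):
        return {t.name}
    if isinstance(t, Record):
        return set().union(*(classes_in(s) for _, s in t.fields))
    if isinstance(t, Variant):
        return set().union(*(classes_in(s) for _, s in t.cases))
    if isinstance(t, SetT):
        return classes_in(t.elem)
    return set()


def is_ground(t: Type) -> bool:
    """A type is ground when it mentions no class at all."""
    return not classes_in(t)


def is_schema(s: Dict[str, Type]) -> bool:
    """S(C) must not itself be a class type and may only mention known classes."""
    return all(not isinstance(t, ClassT) and classes_in(t) <= set(s)
               for t in s.values())


STR = Base("str")
# Assumed shape of the City/State schema used as the running example.
SCHEMA = {
    "City": Record((("name", STR), ("state", ClassT("State")))),
    "State": Record((("name", STR), ("capital", ClassT("City")))),
}
```

Under this encoding `is_schema(SCHEMA)` holds, while a mapping that sends a class directly to another class type is rejected.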

Values and instances
The values that may occur in a particular database instance depend on the object identities of that instance. Consequently we must first define the domain of database values and the denotations of types for a particular choice of sets of object identities, and then define instances using these constructs.
Suppose, for each class $C \in \mathcal{C}$, we have a disjoint finite set $\sigma^C$ of object-identities of class $C$.
For each base type $b$, assume a domain $D^b$ associated with $b$. We define the domain of our model for the sets of object identities $\sigma^{\mathcal{C}}$, written $D(\sigma^{\mathcal{C}})$, to be the union of the following sets: $D^b$ for each base type $b$; $\sigma^C$ for each class $C \in \mathcal{C}$; partial functions with finite domains from $\mathcal{A}$ to $D(\sigma^{\mathcal{C}})$ for record types; pairs from $\mathcal{A} \times D(\sigma^{\mathcal{C}})$ for variants; and finite subsets of $D(\sigma^{\mathcal{C}})$ for set types.
This defines the instance illustrated in figure 3.

Isomorphism of instances
Two instances are said to be isomorphic if they differ only in their choice of object identities: that is, one instance can be obtained by renaming the object identities of the other instance. Since object identities are considered to be an abstract notion, and not directly visible, it follows that we would like to regard any two isomorphic instances as the same instance. In particular, any query when applied to two isomorphic instances should return isomorphic results. Isomorphism therefore provides the finest level of distinction between instances that we might hope to observe.
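If $I$ and $I'$ are two instances of a schema $S$, and $f^{\mathcal{C}}$ is a family of mappings, $f^C : \sigma^C \to \sigma'^C$, then we can extend $f^{\mathcal{C}}$ to mappings $f^{\tau}$ on the values of each type $\tau$:
$f^{b}\, c \equiv c$
$f^{(a_1 : \tau_1, \ldots, a_k : \tau_k)}\, u \equiv (a_1 \mapsto f^{\tau_1}(u(a_1)), \ldots, a_k \mapsto f^{\tau_k}(u(a_k)))$
$f^{\langle| a_1 : \tau_1, \ldots, a_k : \tau_k |\rangle}\, (a_i, u) \equiv (a_i, f^{\tau_i} u)$
$f^{\{\tau\}}\, \{v_1, \ldots, v_n\} \equiv \{f^{\tau} v_1, \ldots, f^{\tau} v_n\}$
An isomorphism of two instances, $I = (\sigma^{\mathcal{C}}, V^{\mathcal{C}})$ and $I' = (\sigma'^{\mathcal{C}}, V'^{\mathcal{C}})$, of a schema $S$ consists of a family of bijections, $f^C : \sigma^C \to \sigma'^C$, such that for each class $C \in \mathcal{C}$ and each object identity $o \in \sigma^C$, $V'^C(f^C(o)) = f^{\tau^C}(V^C(o))$. Two instances, $I$ and $I'$, are said to be isomorphic iff there exists an isomorphism $f^{\mathcal{C}}$ from $I$ to $I'$. We write $I \cong I'$ to mean $I$ is isomorphic to $I'$.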
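As an illustration of the definition, the following sketch checks isomorphism by brute force: it tries every family of bijections between the extents and tests whether renaming commutes with the value mappings. The encoding is an assumption made for the example (records as sorted tuples of label/value pairs, sets as `frozenset`, variants omitted), and the names `Oid`, `rec`, `rename` and `isomorphic` are hypothetical. The two Node instances are hypothetical stand-ins for the pair discussed in the introduction; both unfold to 1, 2, 3, 2, 3, ....

```python
from dataclasses import dataclass
from itertools import permutations, product


@dataclass(frozen=True, order=True)
class Oid:
    cls: str
    name: str


def rec(**fields):
    """A record value, encoded as a sorted tuple of (label, value) pairs."""
    return tuple(sorted(fields.items()))


def rename(f, v):
    """Extend a renaming f of object identities to arbitrary values."""
    if isinstance(v, Oid):
        return f[v]
    if isinstance(v, tuple):                      # record
        return tuple((a, rename(f, x)) for a, x in v)
    if isinstance(v, frozenset):                  # set
        return frozenset(rename(f, x) for x in v)
    return v                                      # base value


def isomorphic(I, J):
    """I and J map each class name to a dict {oid: value}. Exponential search."""
    if I.keys() != J.keys():
        return False
    classes = sorted(I)
    if any(len(I[c]) != len(J[c]) for c in classes):
        return False
    per_class = [[dict(zip(sorted(I[c]), p)) for p in permutations(sorted(J[c]))]
                 for c in classes]
    for choice in product(*per_class):
        f = {}
        for m in choice:
            f.update(m)
        # The renaming must commute with the value mappings of both instances.
        if all(rename(f, I[c][o]) == J[c][f[o]] for c in classes for o in I[c]):
            return True
    return False


def node(n):
    return Oid("Node", n)


I = {"Node": {node("o1"): rec(val=1, next=node("o2")),
              node("o2"): rec(val=2, next=node("o3")),
              node("o3"): rec(val=3, next=node("o2"))}}
# The same instance with renamed object identities.
I2 = {"Node": {node("x"): rec(val=1, next=node("y")),
               node("y"): rec(val=2, next=node("z")),
               node("z"): rec(val=3, next=node("y"))}}
# A longer representation of the same infinite sequence.
J = {"Node": {node("p1"): rec(val=1, next=node("p2")),
              node("p2"): rec(val=2, next=node("p3")),
              node("p3"): rec(val=3, next=node("p4")),
              node("p4"): rec(val=2, next=node("p5")),
              node("p5"): rec(val=3, next=node("p4"))}}
```

Here `I` and `I2` are isomorphic, whereas `I` and `J` are not, because their extents have different sizes. The search is factorial in the extent sizes and is meant only to make the definition executable on toy instances.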
We will show that, in a query language equipped with an equality test on object identities, isomorphism coincides exactly with observational indistinguishability of instances.

Bisimulations and correspondences between instances
The data model presented above captures our intuition about how databases with recursive values and extents are represented. We would also like a semantic model where two instances are considered to be different if and only if they are distinguishable, or equivalently, a way of grouping together those instances in our model which are indistinguishable. However to talk about whether two instances are distinguishable assumes some latent language for querying the databases, and of course the notion of distinguishability is dependent on this language and the predicates available in it.
It is clear that the isomorphism relation on instances is at least as fine as any possible observational equivalence relation: that is, it should not be possible to distinguish between two isomorphic instances using any reasonable queries over instances. However there may well be indistinguishable instances that are not isomorphic. We will construct a "bisimulation" relation on instances based on the idea that no comparisons on object identities are available, and that only base values are directly observable. Other complex values, such as sets and records, can be compared by comparing their component parts. In particular object identities are compared by dereferencing them and comparing their associated values. The equivalence classes of instances under this relation correspond to a regular tree or value based model of instances (see [1,12]). Since we believe that equality tests on base types are common to any query language, and hence that any complex values not containing object identities can be tested for equality by recursively applying type deconstructors and then base equality tests to the values, it follows that bisimulation is the coarsest possible observational equivalence relation on instances: if two instances are not in the bisimulation relation then any reasonable query system should be able to distinguish between them. We will first define correspondence relations between the object identities of two instances, and then define bisimulation to be the largest correspondence relation satisfying certain consistency conditions.
A correspondence between two families of object identifiers $\sigma^{\mathcal{C}}$ and $\sigma'^{\mathcal{C}}$ is a family of binary relations $\sim^C \subseteq \sigma^C \times \sigma'^C$. For each type $\tau$, we can extend $\sim^{\mathcal{C}}$ to a binary relation $\sim^{\tau}$ between the values of type $\tau$ over $\sigma^{\mathcal{C}}$ and those over $\sigma'^{\mathcal{C}}$, so that the $\sim^{\tau}$ are the smallest relations such that:
1. $c \sim^{b} c$ for $c \in D^b$,
2. $x \sim^{(a_1 : \tau_1, \ldots, a_k : \tau_k)} y$ if $x(a_i) \sim^{\tau_i} y(a_i)$ for $i = 1, \ldots, k$,
3. $(a_i, x) \sim^{\langle| a_1 : \tau_1, \ldots, a_k : \tau_k |\rangle} (a_j, y)$ if $i = j$ and $x \sim^{\tau_i} y$, and
4. $X \sim^{\{\tau\}} Y$ if for every $x \in X$ there is a $y \in Y$ such that $x \sim^{\tau} y$ and for every $y \in Y$ there is an $x \in X$ such that $x \sim^{\tau} y$.
A correspondence $\sim^{\mathcal{C}}$ is said to be consistent with instances $I = (\sigma^{\mathcal{C}}, V^{\mathcal{C}})$ and $I' = (\sigma'^{\mathcal{C}}, V'^{\mathcal{C}})$ if for each $C \in \mathcal{C}$ and all $o \in \sigma^C$, $o' \in \sigma'^C$, if $o \sim^C o'$ then $V^C(o) \sim^{\tau^C} V'^C(o')$. Note that the union of any family of consistent correspondences is also a consistent correspondence.
Let $I$, $I'$ be instances of a schema $S$. Then $\approx_{I,I'}$ denotes the largest consistent correspondence between $I$ and $I'$. We call $\approx_{I,I'}$ the bisimulation correspondence between $I$ and $I'$. Given any two instances $I$ and $I'$, we say $I$ and $I'$ are bisimilar and write $I \approx I'$ if and only if, for each $C \in \mathcal{C}$,
1. for each $o \in \sigma^C$ there is an $o' \in \sigma'^C$ such that $o \approx^C_{I,I'} o'$, and
2. for each $o' \in \sigma'^C$ there is an $o \in \sigma^C$ such that $o \approx^C_{I,I'} o'$.
Proposition 2.1: The relation $\approx$ is an equivalence relation on the set of all instances of a schema $S$.
Note that the relations $\approx$ and $\cong$ do not in general coincide: it is easy to construct two instances which are bisimilar but not isomorphic, for example by duplicating object identities. The instances illustrated in section 1 are an example of two instances that are bisimilar but not isomorphic.
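The largest consistent correspondence can be computed as a greatest fixpoint: start from the full relation between same-class object identities and repeatedly discard pairs whose values are not related. The sketch below does this under an assumed encoding (records as sorted tuples of label/value pairs, sets as `frozenset`, variants omitted); `Oid`, `rec`, `related`, `bisimulation` and `bisimilar` are hypothetical names, and the Node instances are hypothetical stand-ins for the pair in the introduction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oid:
    cls: str
    name: str


def rec(**fields):
    """A record value, encoded as a sorted tuple of (label, value) pairs."""
    return tuple(sorted(fields.items()))


def related(R, u, v):
    """Extension of a correspondence R on object identities to all values."""
    if isinstance(u, Oid) and isinstance(v, Oid):
        return (u, v) in R
    if isinstance(u, tuple) and isinstance(v, tuple):          # records
        return len(u) == len(v) and all(
            a == b and related(R, x, y) for (a, x), (b, y) in zip(u, v))
    if isinstance(u, frozenset) and isinstance(v, frozenset):  # sets
        return (all(any(related(R, x, y) for y in v) for x in u)
                and all(any(related(R, x, y) for x in u) for y in v))
    return u == v                                              # base values


def bisimulation(I, J):
    """Largest consistent correspondence between instances I and J."""
    val_i = {o: v for ext in I.values() for o, v in ext.items()}
    val_j = {o: v for ext in J.values() for o, v in ext.items()}
    R = {(o, p) for c in I for o in I[c] for p in J.get(c, {})}
    while True:
        keep = {(o, p) for (o, p) in R if related(R, val_i[o], val_j[p])}
        if keep == R:
            return R
        R = keep


def bisimilar(I, J):
    """Every object identity on either side must have a partner on the other."""
    if I.keys() != J.keys():
        return False
    R = bisimulation(I, J)
    return all(any((o, p) in R for p in J[c]) for c in I for o in I[c]) and \
        all(any((o, p) in R for o in I[c]) for c in J for p in J[c])


def node(n):
    return Oid("Node", n)


I = {"Node": {node("o1"): rec(val=1, next=node("o2")),
              node("o2"): rec(val=2, next=node("o3")),
              node("o3"): rec(val=3, next=node("o2"))}}
J = {"Node": {node("p1"): rec(val=1, next=node("p2")),
              node("p2"): rec(val=2, next=node("p3")),
              node("p3"): rec(val=3, next=node("p4")),
              node("p4"): rec(val=2, next=node("p5")),
              node("p5"): rec(val=3, next=node("p4"))}}
```

On these instances the fixpoint retains the pairs (o1, p1), (o2, p2), (o2, p4), (o3, p3) and (o3, p5), so `bisimilar(I, J)` holds even though the extents have different sizes.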
We will see that, for a query language which does not include any means of directly comparing object identities, observational indistinguishability coincides exactly with bisimulation of instances.

Querying the model
In this section we will present an adaptation of the query language SRI [5,6] to the model of section 2.2.
The language is based on the mechanism of structural recursion over sets which was described in [5] as a basis for a query language on the nested relational data-model. We choose this mechanism because its semantics are well understood and because it is known to be strictly more expressive than other formally developed query languages for the nested relational model, such as the calculus of [2]. Consequently most of the results on the expressivity of various operators in this language paradigm will automatically carry over to other query language paradigms. We will present two variants of the query language, SRI and SRI(=): the = representing the inclusion of the equality predicate on object identities.
The query language is described for a schema $S$, with classes $\mathcal{C}$, such that $S(C) = \tau^C$ for each $C \in \mathcal{C}$.
We expand our type system to allow object types, $\tau$, as defined in section 2.1, and rank 1 function types, $\tau \to T$, where $T$ is a (object or rank 1 function) type. We assume base types $\mathit{unit}$ and $\mathit{Bool}$, with associated domains $D^{\mathit{unit}} \equiv \{\emptyset\}$ and $D^{\mathit{Bool}} \equiv \{T, F\}$, in addition to other base types ranged over by $b$, with associated domains $D^b$. ($\mathit{Bool}$ is actually unnecessary since it is equivalent to a variant of units, but is included for convenience.) For each other base type $b$, and any value $c \in D^b$, we assume a corresponding constant symbol $c$. A ground type is an object type which contains no class types. Ground types are significant in that values of ground type are considered to be directly observable, while values of non-ground type will contain object identities, which do not have meaning outside of a particular instance. Further, the set of values associated with a ground type will not be dependent on a particular instance, so that expressions of ground type can be evaluated in different instances, and their results can be compared.
A query is a closed expression of ground type.
The syntax and typing rules for SRI are given in figure 4. In SRI(=) we assume an additional binary predicate $=^C$ for each class $C \in \mathcal{C}$, with the typing rule
$\dfrac{\vdash e_1 : C \qquad \vdash e_2 : C}{\vdash e_1 =^C e_2 : \mathit{Bool}}$
$=^C$ tests whether two terms of type $C$ evaluate to the same object identity. The semantics for SRI and SRI(=) are given in appendix A.

Distinguishability of instances in SRI(=)
Two instances $I$ and $I'$ are said to be indistinguishable in some query language $L$ iff, for any query $q$ expressed in the language $L$, evaluating the query $q$ for either of the two instances $I$ and $I'$ returns the same result.
For the only-if part of theorem 3.1, given an instance $I$ we construct an expression $e_I$ such that $\vdash e_I : \mathit{Bool}$ and $V[\![e_I]\!]I'$ is true iff $I' \cong I$. Details of the construction of $e_I$ are given in Appendix B.
Claim: For any reasonable query language $L$, such that $L$ supports an equality predicate on object identities, any two instances are indistinguishable in $L$ if and only if they are isomorphic.
Justification: We need to show that, in any natural query language we can think of for this model, it is possible to construct an expression equivalent to the expression $e_I$ from the proof of theorem 3.1. We observe that the constructors used in forming $e_I$ do not go beyond those found in the nested relational algebra of [7], the calculus of [2] without powerset, or what we would expect to find in any other query language.
The previous result tells us that, given any instance $I$, there is a query which distinguishes $I$ from any other non-isomorphic instance, but does not tell us how to find such a query without knowing exactly what the instance is already. Our next result tells us that, though any two non-isomorphic instances are distinguishable, it is not possible to find a single query or set of queries which are independent of the database instances, but which will distinguish between non-isomorphic instances. This means that, given two instances and a query interface or language such as SRI(=) for examining them, we cannot in general decide whether or not the two instances are isomorphic, or find a query which distinguishes between them. We must first define the notion of Z-internal functions on instances [8].
Suppose that $\phi$ is a function from instances of a schema $S$ to some set $D$, and $Z$ is a finite set of base values.
For each $v \in D$ we write $\mathit{Supp}(v)$ for the set of base values occurring in $v$ (that is, values in $D^b$ for some base type $b$). Also we write $\mathit{Supp}(I)$ for the set of base values occurring in an instance $I$. It follows by a simple cardinality argument that $e$ cannot distinguish between these instances. Note that this proof requires only that SRI(=) expressions be Z-internal for some finite $Z$. Consequently the result holds equally well for any other pure query language: that is, any language incorporating operators to extract, manipulate and compare data from an instance, but which cannot express general computations.

Computing bisimulation correspondence using SRI
Recall that the query language SRI is the same as SRI(=), only without the $=^C$ predicates on object identities. So SRI gives us no way of directly comparing object identities. The second part of the proof of proposition 3.4 is to show that, if $I$ and $I'$ are not bisimilar then they are distinguishable in SRI. Suppose $I \not\approx I'$. Then we can assume, without loss of generality, that there is a class $C$ and an object identity $o \in \sigma^C$ such that $o \not\approx^C o'$ for any $o' \in \sigma'^C$. We can then build a series of SRI functions, each of which unfolds object identities of class $C$ to successively greater depths, and show that, for any $o' \in \sigma'^C$, if $o \not\approx^C o'$ then there will eventually be an expression in this series which distinguishes between the two. For details of both parts of this proof see [12].
Claim: In any reasonably expressive query language, $L$, such that $L$ does not support any means of directly comparing object identities, observational indistinguishability of instances in $L$ will coincide precisely with bisimilarity. Justification: First note that SRI is at least as expressive as any other established query language which does not support comparisons of object identities. Consequently, if two instances are indistinguishable in SRI then they will also be indistinguishable in any other such language.
The proof of the second part of proposition 3.4 relies on being able to create queries which unfold nested values to any fixed finite height. We observe that any query language equipped with constructors and destructors for each of the basic types, basic logical operators and equality tests on each base type can express such finite unfoldings and tests of values. We claim that such operators will be present in any reasonable query language for nested or recursive data-structures.
Using SRI (or some other reasonably expressive query language) we can also test for the bisimulation correspondence relation described in section 2.4 on individual values. That is, for any type $\tau$, we can form a function expression $\mathit{Cor}^{\tau} : (\tau \times \tau) \to \mathit{Bool}$ such that, for any $u, v \in [\![\tau]\!]I$, $V[\![\mathit{Cor}^{\tau}]\!]I(u, v) = T$ iff $u \approx^{\tau} v$. This result tells us that SRI has the same expressive power as SRI($\approx$) (the language SRI augmented with predicates for testing $\approx$).
This result is a little surprising since our values are recursive, and we cannot tell how deeply we need to unfold two values in order to tell if they are bisimilar. We are saved by the fact that all our object identities come from a fixed set of finite extents. The cardinality of these extents provides a bound on the number of unfoldings that must be carried out: if no differences between two values can be found after $\sum \{ |\sigma^C| \mid C \in \mathcal{C} \}$ dereferencings of object identifiers, then the values are equivalent. Consequently we can implement $\mathit{Cor}$ by iterating over each class, and for each identifier in a class unfolding both values.
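The bounded-unfolding idea can be sketched directly, outside SRI. The Python code below compares two values of one instance by dereferencing object identities at most as many times as there are object identities in the instance. It is an illustration under an assumed encoding (records as tuples of label/value pairs, sets as `frozenset`, variants omitted), not the SRI expression itself; `Oid`, `values_of`, `equal_to_depth`, `cor` and `chain` are hypothetical names, and the Node chains are made-up test data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oid:
    cls: str
    name: str


def values_of(instance):
    """Flatten {class: {oid: value}} into a single {oid: value} map."""
    return {o: v for extent in instance.values() for o, v in extent.items()}


def equal_to_depth(vals, u, v, fuel):
    """Compare u and v, dereferencing object identities at most `fuel` deep."""
    if isinstance(u, Oid) and isinstance(v, Oid):
        if u.cls != v.cls:
            return False
        if fuel == 0:
            return True            # no difference found within the budget
        return equal_to_depth(vals, vals[u], vals[v], fuel - 1)
    if isinstance(u, tuple) and isinstance(v, tuple):          # records
        return len(u) == len(v) and all(
            a == b and equal_to_depth(vals, x, y, fuel)
            for (a, x), (b, y) in zip(u, v))
    if isinstance(u, frozenset) and isinstance(v, frozenset):  # sets
        return (all(any(equal_to_depth(vals, x, y, fuel) for y in v) for x in u)
                and all(any(equal_to_depth(vals, x, y, fuel) for x in u) for y in v))
    return u == v                                              # base values


def cor(instance, u, v):
    """Unfold up to the total number of object identities in the instance."""
    vals = values_of(instance)
    return equal_to_depth(vals, u, v, len(vals))


def chain(prefix, ints, back_to):
    """Nodes prefix0.. holding ints in order; the last node points to index back_to."""
    n = len(ints)
    return {Oid("Node", f"{prefix}{i}"):
            (("next", Oid("Node", f"{prefix}{i + 1 if i + 1 < n else back_to}")),
             ("val", x))
            for i, x in enumerate(ints)}


INSTANCE = {"Node": {**chain("o", [1, 2, 3], 1),
                     **chain("p", [1, 2, 3, 2, 3], 3),
                     **chain("q", [1, 2, 3, 2, 4], 3)}}
```

With this data, `o0` and `p0` are judged equivalent, while `o0` and `q0` are only told apart once five dereferences are allowed; a budget of four still reports them equal. The recursion can take exponential time on values with large sets, which is consistent with the efficiency concern raised in the text.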
Unfortunately this implementation of $\approx$ seems to go against our philosophy of the non-observability of object identities: if we can't observe object identities then should we be able to count them? From a more pragmatic standpoint, a method of comparing values which requires us to iterate over all the objects in a database is far too inefficient to be practical, especially when dealing with large databases. We would like to know if we can test for $\approx$ without iterating over the extents of an instance. The following subsection will show that this is not possible.

$N$-bounded values and SRI$^N$
A value $v$ is said to be $N$-bounded iff any set values occurring in $v$ have cardinality at most $N$. An instance $I$ is $N$-bounded iff for each class $C \in \mathcal{C}$ and every $o \in \sigma^C$, $V^C(o)$ is $N$-bounded. Note that, for any instance $I$ there is an $N$ sufficiently large that $I$ is $N$-bounded.
We now define a variant of the language SRI which has the same power as SRI when restricted to $N$-bounded values, but which will not allow recursion over sets of cardinality greater than $N$.
The language SRI$^N$ is the same as the language SRI except that an expression $\mathit{sri}(f, e, u)$ is not defined if $|V[\![u]\!]I|$ is greater than $N$.
Proposition 3.5: It is not in general possible to compute the correspondence relations $\approx^{\tau}$ on $N$-bounded instances using the language SRI$^N$. That is, there exists a schema $S$ and type $\tau$ such that there is no expression $\mathit{Cor}^{\tau}$ with $\vdash \mathit{Cor}^{\tau} : \tau \times \tau \to \mathit{Bool}$ such that $V[\![\mathit{Cor}^{\tau}]\!]I$ coincides precisely with $\approx^{\tau}$.
Proof: First note that for any SRI$^N$ expression $e$, there is a constant $k_e$, such that any evaluation of an application of $e$ will involve fewer than $k_e$ dereferences of objects. Consequently it is enough to construct a schema with a recursive structure such that, for any constant $k$, we can construct an instance containing two objects which require $k + 1$ dereferences in order to distinguish between them.
This tells us that we cannot hope to test if two values are equivalent using SRI, or any other reasonable query language, without making use of recursion over classes. We conclude that a more efficient mechanism for comparing values is needed.
We have seen that comparing database instances and values in instances involving object identities is problematic. On the one hand, we may consider only the values in a database to be significant and not wish to allow direct equality tests on object identities, since this would force us to distinguish between different representations of the same values. On the other hand, we have shown that computing bisimulation or value-based equivalence requires the ability to recurse over all the object identities in an instance. Such an equivalence relation is expensive to use in a query language over databases, and a more efficient means of comparing values is required.
A solution, common in many practical database systems, is to use keys: simple values that are associated with and used to compare object identities. Two object identities are taken to be equivalent iff their keys are equivalent. In a sense this can be thought of as computing an equivalence similar to $\approx$, but restricting the parts of the instance that are tested for comparison. However it is also possible to have external keys which depend not only on the value associated with a particular object in the database, but on other objects and values in the database as well.
In this section we will formalize the idea of keys, and show how they can determine equivalences on values that lie in between equality and bisimulation, as illustrated informally in figure 5. We show that, if a key specification satisfies certain acyclicity properties, then the resulting equivalence on values can be computed without resorting to recursion over the entire set of object identities.

Key specifications
Suppose we have a schema $S$ with classes $\mathcal{C}$. A key specification for $S$ consists of a type $\kappa^C$ for each class $C \in \mathcal{C}$, and for any instance $I = (\sigma^{\mathcal{C}}, V^{\mathcal{C}})$, a family of functions $K^C_I : \sigma^C \to [\![\kappa^C]\!]I$ for each $C \in \mathcal{C}$. We write $K^{\mathcal{C}}$ for such a key specification. The idea is that, for any instance $I$, $K^C_I$ will map object-identities of class $C$ to their keys, and that any two object identities will be considered to be equivalent iff they have the same, or equivalent, keys.
Example 4.1: Consider the schema described in example 2.1. We would like to say that a State is determined uniquely by its name, while a City is determined uniquely by its name and its state (one can have two Cities with the same name in different states). The types of our key specification are therefore $\kappa^{\mathit{City}} \equiv (\mathit{name} : \mathit{str}, \mathit{state} : \mathit{State})$ and $\kappa^{\mathit{State}} \equiv \mathit{str}$.
A key specification is said to be well-defined iff for any two instances, $I$ and $I'$, if $f^{\mathcal{C}}$ is an isomorphism from $I$ to $I'$, then for each $C \in \mathcal{C}$ and each $o \in \sigma^C$, $f^{\kappa^C}(K^C_I(o)) = K^C_{I'}(f^C(o))$. Well-definedness simply ensures that a key specification is not dependent on the particular choice of object identities in an instance, and will give the same results when applied to two instances differing only in their choice of object identities. We will assume that all key specifications we consider are well-defined.
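Two key specifications, $K^{\mathcal{C}}$ and $K'^{\mathcal{C}}$, are said to be equivalent iff, for any instance $I$, any $C \in \mathcal{C}$ and any ...
Proposition 4.1: For any key specification, $K^{\mathcal{C}}$, if the dependency graph $G(K^{\mathcal{C}})$ is acyclic then there is an equivalent key specification $K'^{\mathcal{C}}$ such that each type $\kappa'^C$ is ground (contains no classes).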
We will see that key specifications with acyclic graphs are particularly useful later.
For instances $I$ and $I'$, the relations $\sim^{\tau}_K$ satisfy the following conditions:
1. if $c^b \sim^{b}_K c'^b$ for $c^b, c'^b \in D^b$ then $c^b = c'^b$,
2. if $x \sim^{(a_1 : \tau_1, \ldots, a_k : \tau_k)}_K y$ then $x(a_i) \sim^{\tau_i}_K y(a_i)$ for $i = 1, \ldots, k$,
3. if $(a_i, x) \sim^{\langle| a_1 : \tau_1, \ldots, a_k : \tau_k |\rangle}_K (a_j, y)$ then $i = j$ and $x \sim^{\tau_i}_K y$,
4. if $X \sim^{\{\tau\}}_K Y$ then for each $x \in X$ there is a $y \in Y$ such that $x \sim^{\tau}_K y$ and for each $y \in Y$ there is an $x \in X$ such that $x \sim^{\tau}_K y$, and
5. for each $C \in \mathcal{C}$ and any $o \in \sigma^C$, $o' \in \sigma'^C$, if $o \sim^C_K o'$ then $K^C_I(o) \sim^{\kappa^C}_K K^C_{I'}(o')$.

Language
Note: For any schema $S$, if we take the key specification given by $\kappa^C \equiv \tau^C$ for each $C \in \mathcal{C}$, and for any instance $I = (\sigma^{\mathcal{C}}, V^{\mathcal{C}})$ and each $C \in \mathcal{C}$, $K^C_I \equiv V^C$, then the relations $\sim_K$ and $\approx$ are the same. In general $\sim_K$ may be finer than $\approx$ since we do not restrict the keys to be functions of the values associated with an object identity.
$\sim^{\mathcal{C}}_K$ is called the correspondence generated by $K^{\mathcal{C}}$. Proposition 4.2: If $K^{\mathcal{C}}$ is a key specification then, for any instance $I$ and each type $\tau$, $\sim^{\tau}_K$ is an equivalence relation.
An instance $I$ is said to be consistent with a key specification $K^{\mathcal{C}}$ ... Suppose $K$ is a key specification for a schema $S$. Given two instances of $S$, say $I$ and $I'$, we say $I$ is $K$-equivalent to $I'$, and write $I \sim_K I'$, iff
1. for each $C \in \mathcal{C}$, for each $o \in \sigma^C$ there is an $o' \in \sigma'^C$ such that $o \sim^C_K o'$, and for each $o' \in \sigma'^C$ there is an $o \in \sigma^C$ such that $o \sim^C_K o'$; and
2. for each ...

Keyed schema
A keyed schema is a pair consisting of a schema $S$ and a key specification $K^{\mathcal{C}}$ on $S$. A simply keyed schema is a keyed schema $(S, K^{\mathcal{C}})$ such that the dependency graph of $K^{\mathcal{C}}$ is acyclic. An instance of a keyed schema $(S, K^{\mathcal{C}})$ is an instance $I$ of $S$ such that $I$ is consistent with $K^{\mathcal{C}}$.
Lemma 4.3: For any instances $I$ and $I'$ of a keyed schema $(S, K)$, if $I \sim_K I'$ then $\sim_K$ is a consistent correspondence between $I$ and $I'$.
Proposition 4.4: For any two instances, $I$ and $I'$, of a simply keyed schema $(S, K)$, if $I \sim_K I'$ then $I \approx I'$.

Computing key correspondences
Given a keyed schema, $(S, K)$, we define the language SRI($K$) for the schema to be the language SRI extended with new operators $\mathit{key}^C$ for each $C \in \mathcal{C}$. The typing rules for these new operators are:
$\dfrac{\vdash e : C}{\vdash \mathit{key}^C e : \kappa^C}$
and the semantics are given in appendix A. Similarly we define the language SRI$^N$($K$) as an extension of SRI$^N$.
We get the same results for computability of key correspondences, $\sim_K$, as we did for bisimulation correspondence, namely
1. We can find a formula in SRI($K$) to compute $\sim^{\tau}_K$ for each type $\tau$.
2. We cannot in general find a formula to compute $\sim^{\tau}_K$ on $N$-bounded values in SRI$^N$ for any $N$.
However the following result goes some way towards justifying our earlier statement that key specifications with acyclic dependency graphs are of particular interest.
Proposition 4.5: For any simply keyed schema $(S, K)$ there is an $M$ such that for any $N \geq M$, and any type $\tau$, $\sim^{\tau}_K$ can be computed on $N$-bounded values using SRI$^N$($K$). That is, for each type $\tau$, there is a formula $\mathit{Cor}^{\tau}_K$ of SRI$^N$($K$) such that $\vdash \mathit{Cor}^{\tau}_K : \tau \times \tau \to \mathit{Bool}$ and for any two $N$-bounded values $u, v \in [\![\tau]\!]I$, $V[\![\mathit{Cor}^{\tau}_K]\!]I(u, v) = T$ iff $u \sim^{\tau}_K v$.
It follows that acyclic key specifications provide us with an efficient means of comparing recursive values which incorporate object identities, without having to examine the object identities directly.
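To see why acyclicity helps, consider the City/State keys of example 4.1. Because a City key mentions only a State, and a State key mentions no class, every key can be expanded into a ground value in a bounded number of steps, and two object identities can then be compared by comparing these ground values. The sketch below illustrates this in Python rather than in SRI$^N$($K$). The encoding and the names `Oid`, `rec`, `field`, `KEYS`, `ground_key` and `key_equivalent` are assumptions made for the example, and the second instance `J` is hypothetical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oid:
    cls: str
    name: str


def rec(**fields):
    """A record value, encoded as a sorted tuple of (label, value) pairs."""
    return tuple(sorted(fields.items()))


def field(record, label):
    return dict(record)[label]


# Key functions of example 4.1: a State is keyed by its name,
# a City by its whole value (name and state).
KEYS = {
    "State": lambda inst, o: field(inst["State"][o], "name"),
    "City": lambda inst, o: inst["City"][o],
}


def ground_key(inst, v):
    """Replace each object identity in v by the ground key of that object.

    Terminates only because the dependency graph of KEYS is acyclic.
    """
    if isinstance(v, Oid):
        return ground_key(inst, KEYS[v.cls](inst, v))
    if isinstance(v, tuple):                       # record
        return tuple((a, ground_key(inst, x)) for a, x in v)
    if isinstance(v, frozenset):                   # set
        return frozenset(ground_key(inst, x) for x in v)
    return v                                       # base value


def key_equivalent(inst_a, o, inst_b, p):
    """Compare two object identities through their ground keys only."""
    return o.cls == p.cls and ground_key(inst_a, o) == ground_key(inst_b, p)


NY, NYC, ALBANY = Oid("State", "NY"), Oid("City", "NYC"), Oid("City", "Albany")
I = {"State": {NY: rec(name="New York", capital=ALBANY)},
     "City": {NYC: rec(name="New York City", state=NY),
              ALBANY: rec(name="Albany", state=NY)}}

# A hypothetical second instance using different object identities.
S1, C1, C2 = Oid("State", "s1"), Oid("City", "c1"), Oid("City", "c2")
J = {"State": {S1: rec(name="New York", capital=C2)},
     "City": {C1: rec(name="New York City", state=S1),
              C2: rec(name="Albany", state=S1)}}
```

Here `NYC` in `I` and `c1` in `J` are key-equivalent although they are different object identities, and no iteration over the extents is needed to establish this. If the State key also included the capital, `ground_key` would loop, which is the situation the acyclicity restriction rules out.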

Conclusions
We have seen that there are a variety of different observational equivalences possible on recursive database instances using object identities, and that the observational equivalence relation generated by a particular query system is dependent on the means of comparing object identities available in that system. These range from equality tests on object identities, which in a suitable query language allow us to distinguish between non-isomorphic instances, to an absence of any means of comparing object identities, which leads to a minimal observational equivalence of bisimulation in any reasonable query system. These results are summarized in figure 6. Systems of keys generate various observational equivalences lying between these two. Use of keys, particularly acyclic key specifications, can provide an efficient method of comparing values in a query language without resorting to direct comparisons of object identities. We therefore believe that such systems of keys can play an important part in the development of practical languages for databases with object-identity. We also saw that, by making use of the knowledge that object identities arise from finite extents, we can compute whether two values in a database are bisimilar, or key-equivalent, though in general we cannot compute these relations without using recursion over the extents of object identities. This raises the interesting question of what other, more general functions on recursive values can be computed using the knowledge of these finite extents, and is a topic for further research.
That is, $V[\![e]\!]I = V[\![e]\!]I'$ for any query $e$. ($V[\![\cdot]\!]$ is the semantic operator on SRI expressions defined in appendix A.) The following result tells us that isomorphism of instances exactly captures indistinguishability in SRI(=), and is therefore an important result in establishing the expressive power of SRI(=).
Theorem 3.1: Two instances, $I$ and $I'$, are indistinguishable in SRI(=) if and only if they are isomorphic.

Figure 4: Typing rules for query language

Proposition 3.4: Two instances, $I$ and $I'$, are indistinguishable in SRI iff $I \approx I'$.
Proof outline: The proof consists of two parts. First we must prove that for any SRI query $e$, if $I \approx I'$ then $V[\![e]\!]I = V[\![e]\!]I'$. This proof proceeds by induction on SRI expressions.

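Figure 5: A spectrum of observational equivalence relations
$\kappa^{\mathit{City}} \equiv (\mathit{name} : \mathit{str}, \mathit{state} : \mathit{State})$
$\kappa^{\mathit{State}} \equiv \mathit{str}$
For an instance $I = (\sigma^{\mathcal{C}}, V^{\mathcal{C}})$ the mappings $K^C_I$ are given by
$K^{\mathit{City}}_I(o) \equiv V^{\mathit{City}}(o)$
$K^{\mathit{State}}_I(o) \equiv (V^{\mathit{State}}(o)).\mathit{name}$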

Figure 6: A summary of the operators considered and the resulting observational equivalences
$V^{\mathit{City}}(\mathit{NYC}) \equiv (\mathit{name} \mapsto \text{``New York City''}, \mathit{state} \mapsto \mathit{NY})$
$V^{\mathit{City}}(\mathit{Albany}) \equiv (\mathit{name} \mapsto \text{``Albany''}, \mathit{state} \mapsto \mathit{NY})$
and
$V^{\mathit{State}}(\mathit{PA}) \equiv (\mathit{name} \mapsto \text{``Pennsylvania''}, \mathit{capital} \mapsto \mathit{Harris})$
$V^{\mathit{State}}(\mathit{NY}) \equiv (\mathit{name} \mapsto \text{``New York''}, \mathit{capital} \mapsto \mathit{Albany})$
Then $\phi$ is said to be Z-internal iff for any instance $I$, $\mathit{Supp}(\phi(I)) \subseteq \mathit{Supp}(I) \cup Z$. That is, $\phi$ does not introduce any new base values, other than those in $Z$.
For any non-trivial schema $S$, it is not possible to build a generic expression in SRI(=) which tests whether two instances are isomorphic. In other words, given a schema $S$, it is not possible to construct a value $e_S$, depending only on $S$, such that for any two instances $I$ and $I'$, $V[\![e_S]\!]I = V[\![e_S]\!]I'$ iff $I$ and $I'$ are isomorphic.
Proof: Suppose there is such a query $e$, and $\vdash e : \tau$. Then there is a finite $Z$ such that $V[\![e]\!]$ is Z-internal. For any instances $I$ and $I'$, $[\![\tau]\!]I = [\![\tau]\!]I' = T$, where $T$ is a possibly infinite set of values. However we can choose a finite set of base values, say $W$, such that there exist instances $I$ with $\mathit{Supp}(I) \subseteq W$. So, for any instance $I$ with $\mathit{Supp}(I) \subseteq W$, $V[\![e]\!]I \in T$ and $\mathit{Supp}(V[\![e]\!]I) \subseteq W \cup Z$. The set $\{ v \in T \mid \mathit{Supp}(v) \subseteq W \cup Z \}$ is finite. However there are infinitely many non-isomorphic instances, $I$, with $\mathit{Supp}(I) \subseteq W$: given one such instance we can produce infinitely many more of them by introducing duplicates of object identities.
The dependency graph, $G(K^{\mathcal{C}})$, of a key specification $K^{\mathcal{C}}$ is a directed graph with nodes $\mathcal{C}$ such that $G(K^{\mathcal{C}})$ contains the edge $(C', C)$ if and only if the class $C'$ occurs in $\kappa^C$. For example, the dependency graph of the key specification described in example 4.1 would have two nodes, City and State, and a single edge from State to City.
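The dependency graph and the acyclicity test are simple to compute from the key types alone. The following sketch uses Python's standard `graphlib` module (Python 3.9 or later); the encoding of key types (strings for base types, `("class", name)` for class types, `("set", t)` for set types, dicts for record types) and the function names are assumptions made for this illustration.

```python
from graphlib import CycleError, TopologicalSorter


def classes_in(t):
    """Names of the classes occurring in a key type."""
    if isinstance(t, dict):                              # record type
        return set().union(*(classes_in(s) for s in t.values()))
    if isinstance(t, tuple) and t[0] == "class":         # class type
        return {t[1]}
    if isinstance(t, tuple) and t[0] == "set":           # set type
        return classes_in(t[1])
    return set()                                         # base type


def dependency_graph(key_types):
    """Map each class C to the classes C' with an edge (C', C)."""
    return {c: classes_in(k) for c, k in key_types.items()}


def evaluation_order(key_types):
    """An order in which keys can be made ground, or None if the graph is cyclic."""
    try:
        return list(TopologicalSorter(dependency_graph(key_types)).static_order())
    except CycleError:
        return None


# Key types of example 4.1.
EXAMPLE_KEYS = {
    "City": {"name": "str", "state": ("class", "State")},
    "State": "str",
}

# A hypothetical cyclic variant, in which a State is also keyed by its capital.
CYCLIC_KEYS = {
    "City": {"name": "str", "state": ("class", "State")},
    "State": {"name": "str", "capital": ("class", "City")},
}
```

For `EXAMPLE_KEYS` the graph has the single edge from State to City and the order is State, then City; for `CYCLIC_KEYS` no such order exists, so that schema would not be simply keyed.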