Facilitating Modular Property-Preserving Extensions of Programming Languages

We explore an approach to modular programming language descriptions and extensions in a denotational style. Starting from a language core, language features are added stepwise to the core. Language features can be described separately from each other in a self-contained, orthogonal way. We present an extension semantics framework consisting of mechanisms to adapt the semantics of a basic language to new structural requirements in an extended language while preserving the behaviour of programs of the basic language. Common templates of extension are provided. These can be collected in extension libraries accessible to and extendible by language designers. Mechanisms to extend these libraries are provided. A notation for describing language features embedding these semantics extensions is presented.


Introduction
When we are faced with the task of formally describing a programming language, we could start from a simple language core, which exhibits the main principles of the language, and add a number of language features onto that core in a modular, property-preserving way. This process of extending languages is similar to refinement approaches in software development. A language core and language features can be specified separately. Features will be specified as parameterised operators, thus allowing them to be applied to a number of basic languages. Features are specified as independent constructs that refer to the languages they extend only through a formal parameter interface. Languages and their extensions are presented in a denotational semantics style. Properties are expressed in an equational metalanguage, represented by equational theories.
Peter Mosses [1] criticises the lack of modularity in traditional denotational semantics. David Schmidt [2,3] suggests structuring semantics into units (semantic algebras) and basing extensions on these algebras. These ideas will be picked up and implemented in a framework called extension semantics. The approach will allow us to define languages in a formal and modular way. The general purpose of this framework of extension semantics is to support the exploration of and experimentation with semantics. The development of a new full-blown language, or the description of an existing one, is not our aim, but features and their interaction can be studied with our approach. Danvy and Hatcliffe have used the notion of a programming language workbench [4]. The development of a whole language is an industrial process, but certain prototypical aspects can be extracted a priori in order to be analysed formally in depth with appropriate tools. Sample case studies will illustrate these ideas.
The preservation of behaviour of programs is one of the central issues in language extensions. It guarantees that programs written in the basic language can be reused in the extended language, i.e. they are executable. Furthermore, they are executed in a way similar to their execution as programs of the base language. The similarity is formally captured by a notion of observability. We will provide a number of common templates for extensions which guarantee behaviour preservation. Behaviour preservation is of particular importance since it guarantees orthogonality of the new feature with the basic language. A new feature can be added to a language such that there are no conflicts and, thus, the semantics of the basic constructs is retained. Extending by simply adding elements is easy and should not cause difficulties, but consider redefinitions: these might have side-effects on other constructs which use the redefined one. We will concentrate in particular on this issue.
2nd Irish Workshop on Formal Methods, 1998

We are aiming at a usable framework for the extension of languages defined in the style of classical denotational semantics. We will provide a notation based on a library of predefined extension operators, called templates. The application of these operators will guarantee the preservation of properties. We will provide a mathematical framework which allows the language designer to extend the given library by adding new templates easily. The notation supports the extension by providing a variety of powerful constructs. This allows the language designer to focus on the specification of new features rather than on some tedious extension and reformulation of existing elements.
Our framework formalises ideas for modular development and presentation of denotational semantics suggested by David Schmidt [2,3] and Peter Mosses [1] and also provides notational support. A comprehensive framework for language engineering, i.e. the extension, restriction and combination of languages or language aspects, including a formal theory and applicable notational and methodological support, does not exist at the moment. Certain aspects are currently under investigation in the project DESTIJL [5].
Section 2 introduces basic ideas of language extension together with our formal framework. A more comprehensive explanation of the framework and more properties are presented in section 3. Section 4 introduces equational theories as a representation of a simple equational language used to express properties of the semantics. Effects of semantics extensions on these theories are investigated. This is followed by a sample extension in section 5. Some specific problems such as operator combinations are addressed in section 6. Section 7 presents three case studies which illustrate the area of application of our approach. We conclude by comparing similar approaches and with some summarising thoughts.

Extension of Semantics and Preservation of Properties
In this section, we will introduce the basic ideas of our approach, illustrated by the csh-language. The Unix C-shell csh is a shell command interpreter with a C-like syntax [6]. The interactive shell reads commands from the terminal. The input is parsed into words. Two variants of a csh-like language shall be investigated: the first, called core (figure 1), describes the principal idea of the language, namely processing the commands echo and pipe on input and output streams (essentially strings in this core). The second language, called extension, includes basic elements such as success values for commands. Other features that might be included are, for example, environments to store values in variables. This extended language shall be developed in the remainder using our approach of extension semantics.

[Figure 1: language description of the csh core language.]
Semantics of a language can be structured into semantic algebras. This approach was already proposed by Schmidt [2]. Structured extensions of the language will be carried out based on this semantical structuring, rather than the classical viewpoint of obtaining structure by syntax.
Definition 2.1 A semantic algebra is a heterogeneous algebra consisting of semantic domains $D_1, \ldots, D_n$ (possibly compound) and functions on these domains.
Semantic domains are sets. A typical semantic algebra would, for example, consist of a set of states and operations, such as an update on states (with the usual substitution or override semantics). Let $\mathit{Alg}$ be the collection of all semantic algebras. With each semantic algebra $A$ we associate a signature $\mathit{sig}(A)$, sometimes denoted $\Sigma$. The symbols for semantic entities are used as syntactic elements. An equational metalanguage based on $\Sigma$-terms will be introduced later (see section 4). For extensions to be applied to a core language, we might want to identify a domain of interest $S \in \{D_1, \ldots, D_n\}$. It serves to denote that part of the basic language semantics that shall be redefined.
The redefinition of a domain, which represents the structure of some language feature, is not always the aim but, as already mentioned, it is the complicated case and is therefore addressed here. Based on the domain $S$, a domain extension will be carried out.
Modularity is a key objective of our approach; thus complete features shall be added onto the basic language step by step. As in similar refinement approaches, the preservation of properties is essential. Behaviour (of programs) is the property we are interested in when languages are extended. Some of the principal ideas of extension shall be illustrated by using the echo command of the csh core language as a very primitive program whose behaviour has to be preserved in an extension of the notion of streams, where a mapping on Stream takes streams into the extension $T(\mathit{Stream})$. A stream $s \in \mathit{Stream}$ is mapped to a pair $(s, f)$ consisting of the stream and some element $f$ representing the file system. An injection shall be used to extend the domain Stream:

[Diagram: injection of Stream into $\mathit{Stream} \times \mathit{FileSys}$.]
Streams $s$ shall be refined to equivalence classes $[(s, f)]_E$, expressed by a mapping $r$, where $[(s, f)]_E \triangleq \{(a, b) \mid a = s\} \subseteq \mathit{Stream} \times \mathit{FileSys}$. We associate equivalence classes instead of single pairs $(s, f)$ with each stream $s$. We use an equivalence $E$ on $\mathit{Stream} \times \mathit{FileSys}$ to define relevant or observable behaviour. Two pairs are equivalent if their Stream-parts are equal. This equivalence defines a notion of observability. We can now relax our formulation of behaviour preservation. Extended functions are only expected to behave correctly with respect to these classes, i.e.

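[Commuting diagram: a lifted function agrees with the original function on Stream up to the equivalence $E$.]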
The mapping on Stream induced by $E$ maps to equivalence classes instead of mapping to single pairs. It is comparable to a retrieve function (known, e.g., from VDM [7]). The retrieve function would map all elements from one equivalence class to the associated element in Stream. We prefer the representation with explicit equivalence classes since the equivalence will be used later on in the construction of appropriate templates. Equivalence relations on the extension are used to define the property that certain elements are similar with respect to some observation. In particular, we will express behaviour preservation requirements by equivalence relations. A quotient algebra is obtained by applying the equivalence to a given algebra. The equivalence has to be a congruence.
A congruence relation $E$ on a set $S$ is an equivalence relation which satisfies the substitution property with respect to functions $f$ on that set (let $a, b \in S$): $a \mathrel{E} b \Rightarrow f(a) \mathrel{E} f(b)$. Let $A$ be an algebra and $E$ a congruence relation on the sets $D_i$ in $A$. The quotient algebra $A/E$ is defined by the domains $D_i/E = \{[a]_E \mid a \in D_i\}$ together with the functions induced on these classes, $f([a]_E) = [f(a)]_E$. We can say that the quotient structure is an abstract interpretation abstracting from specific implementation details of the extension. The quotient structure focuses on those properties which express the behaviour preservation. Note that properties of the basic algebra are preserved. The abstract interpretation is a homomorphic image of the extended algebra.
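The substitution property can be checked mechanically on a finite carrier. The following Python sketch is only an illustration under assumptions that are not taken from the text: the carrier (the integers 0 to 5), the parity relation, the two sample functions and the helper names `is_congruence` and `quotient` are all hypothetical.

```python
from itertools import product


def is_congruence(carrier, related, functions):
    """Check the substitution property: a E b implies f(a) E f(b).

    Assumes `related` is already an equivalence relation on `carrier`
    and that every function in `functions` is unary.
    """
    return all(
        related(f(a), f(b))
        for f in functions
        for a, b in product(carrier, repeat=2)
        if related(a, b)
    )


def quotient(carrier, related):
    """Partition `carrier` into its equivalence classes under `related`."""
    classes = []
    for a in carrier:
        for cls in classes:
            if related(a, cls[0]):
                cls.append(a)
                break
        else:
            classes.append([a])
    return [frozenset(cls) for cls in classes]


# Hypothetical example data: integers 0..5 related by parity.
carrier = range(6)
same_parity = lambda a, b: a % 2 == b % 2
add_two = lambda x: (x + 2) % 6   # respects parity
halve = lambda x: x // 2          # does not respect parity (0 and 2 map to 0 and 1)
```

With this data, parity is a congruence for `add_two` but not for `halve`, which is the kind of situation a quotient construction has to rule out.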
Let us now look at the existence of the mappings involved. The canonical mapping $c : (s, f) \mapsto [(s, f)]_E$ always exists; $p : s \mapsto (s, f)$ can be constructed from $c$ and $r$. These set-based mappings $r$, $c$, $p$ can be extended to morphisms on algebras by using the constructs congruence $E$ and quotient algebra $\mathit{State}/E$. The equality $=_E$ is the equality on the equivalence classes. Behaviour preservation requires only equivalence, not identity, of results on the extended level (cf. the commuting diagrams above). As we will see later, this construction will allow us to define orthogonal language extensions.
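The stream example can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the notation of the paper: streams are modelled as strings, the file system as a dictionary, and the names `DEFAULT_FS`, `inject`, `equivalent`, `lift`, `echo` and `preserves_behaviour` are hypothetical. The toy `echo` is only a stand-in for a core operation on streams.

```python
from typing import Callable, Tuple

Stream = str                      # streams are essentially strings in the core
FileSys = dict                    # hypothetical stand-in for a file system
Pair = Tuple[Stream, FileSys]

DEFAULT_FS: FileSys = {}          # assumed default second component


def inject(s: Stream) -> Pair:
    """p : s |-> (s, f), using the assumed default file system as f."""
    return (s, DEFAULT_FS)


def equivalent(x: Pair, y: Pair) -> bool:
    """E: two pairs are equivalent iff their Stream parts are equal."""
    return x[0] == y[0]


def lift(f: Callable[[Stream], Stream]) -> Callable[[Pair], Pair]:
    """Lift a core function to pairs; the file system is passed through."""
    def lifted(pair: Pair) -> Pair:
        s, fs = pair
        return (f(s), fs)
    return lifted


def echo(s: Stream) -> Stream:
    """A toy core operation on streams (appends a newline)."""
    return s + "\n"


def preserves_behaviour(f, samples) -> bool:
    """Check that lift(f)(inject(s)) is E-equivalent to inject(f(s))."""
    lifted = lift(f)
    return all(equivalent(lifted(inject(s)), inject(f(s))) for s in samples)
```

A call such as `preserves_behaviour(echo, ["", "hello"])` only tests the commuting property on the given samples; it is not a proof of behaviour preservation.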

Remark 2.1
The lifting of functions, the extension mapping, and the equivalence $E$ depend on each other. These dependencies will be further investigated when templates are introduced in section 3. Behaviour preserving extensions of equational theories are analysed in section 4. A set of operations usually has an associated set of operation combinators (higher-order operators). The definition of semantic algebras and behaviour preservation given above does not involve these combinators. We will argue that the separate treatment of composition operators will lead to an improvement of the presented extension techniques (see section 6).

The Construction of Extension Morphisms and Templates
A notation for defining languages has been implicitly introduced in the previous section in Figure 1. This section contains notation and formal background for extensions of languages. Before we dive into technical details, let us sketch the outline of our extension approach. A language description consists of syntax, semantic entities and semantic functions. We see semantics as the main aspect on which both a language definition and a language extension should be based. Thus, we will provide two major constructs for language definition, semantic algebras and language descriptions, and also two constructs for extension: semantics extensions on semantic algebras and language extensions on language descriptions.
[Diagram: base language; semantics extension (based on domain extension and equivalence); lifted base language; language extension (new and redefined elements).]
A language extension is the construct which serves to specify a new language feature in its semantics, and also in its syntactical interface. A language extension is a self-contained specification of a language feature. Before any new elements are added or existing elements are redefined, the semantics extension can be applied. Based on the domain construction and an equivalence relation specifying how behaviour has to be preserved, the semantics extension lifts the semantics of the basic language such that the domain constructions are adapted and behaviour is preserved. This is a complete redefinition of existing language elements. The semantics extension is a construct which does not depend on or refer to the basic language semantics directly. Extension templates will be introduced to facilitate the definition of such semantics extensions. The language extensions are based on semantics extensions. The latter would be executed first when applied to a basic language. Then additions and further redefinitions on the lifted base language can be carried out.
This two-tiered approach is one of the essential characteristics of our approach. It provides strong support for the extension and allows the language designer to concentrate on the description of the new feature.

Semantics Extensions
A language description based semantically on semantic algebras can be extended to another language defined by other semantic algebras. This involves defining the extending morphism, describing the relevant behaviour and proving that the defined morphism is behaviour preserving with respect to the relevant behaviour expressed by the equivalence. This procedure shall be facilitated for the language designer by providing operators working on semantic algebras which adapt the basic semantic algebra to some extended domain construction. These operators will take as argument some basic information about the intended extension. They will allow the construction of an extension morphism from this information.
A semantics extension is an operator on semantic algebras. The operator has to satisfy some properties (congruence, behaviour preservation) in order to allow us to construct a behaviour preserving morphism (see definition 2.2). A semantics extension shall allow us to provide a morphism on algebras which lifts a basic algebra according to a domain extension, e.g. from $\mathit{Stream}$ to $\mathit{Stream} \times \mathit{FileSys}$, including a lifting of existing functions. New functionality, e.g. operations using $\mathit{FileSys}$, is not added by this construct.
Definition 3.1 A semantics extension from semantic algebra $A$ to $B$ is a 5-tuple consisting of a type constructor $T$, an equivalence $E$, a mapping on algebras induced by $E$, a function lifting, and a collection $d$ of defaults, where
$T : A \mapsto B$ is a type constructor on semantic algebras,
the mapping induced by $E$ takes $\mathit{Alg}$ to $\mathit{Alg}/E$,

$d$ is a collection of default values, one for each equivalence class in $TS/E$.
For all domains $D$ not equal to $S$, $E_D$ is assumed to be the equality (i.e. a congruence) and the mapping on $D$ is assumed to be the identity mapping. Using the result from section 2 we can, for instance, derive the set-based mapping $S \to TS$ from $E$ and $d$.

Extension Templates
In this subsection, some properties shall be investigated allowing the derivation of behaviour preserving semantics extensions from a reduced amount of information in some particular situations. It will be proven for each particular case that, using a canonical construction for the equivalence $E$ and for the extended functions, an extension morphism can be derived such that behaviour preservation is guaranteed. In general, all elements of a semantics extension are necessary to define an extension on semantic algebras. A semantics extension provides a frame to derive behaviour preserving semantics extensions. The process of constructing semantics extensions shall be looked at: properties simplifying this process shall be elaborated and a library of common semantics extensions, called templates, shall be introduced.

Definition 3.2 Let $S \mapsto TS$ be the domain extension. Based on the given type extension $T$ of a semantics extension, we can use standard constructions for the remaining elements of the 5-tuple. A template allows the generation of a behaviour preserving semantics extension based on predefined constructions for type extensions.
Examples of these templates based on type extensions are injection $S \mapsto S \times T$ (see figure 2) or indexing $S \mapsto (I \to S)$. These templates can form a library of extension operators for the language designer. A sample application of the injection template was already presented in section 2 to inject streams into a product of streams and file systems. The extension morphism is constructed from $E$ and a default value $r_0$. The equivalence $E$ expresses that only the first component, e.g. a stream, is relevant for behaviour preservation.
The idea behind templates is to reduce the amount of information that a language designer has to give as a parameter for an extension. The operator $T$ (domain type extension) is essential, but then, based on the type constructor, we can start using standard constructions.
It will be shown first that the function lifting can always be defined in a way such that behaviour preservation is guaranteed. Once the equivalence classes exist, the defaults can be obtained just by selecting one for each class.

Proof:
The definition of the lifted function is partial, but total on the required subset of $TA$. This definition guarantees that applying the lifted function to the image of $a$ yields exactly the image of $f(a)$, i.e. equality as a particular equivalence (more than the behaviour preservation criterion requires).

□
Common templates based on a particular domain extension are the following (for each $T$ on a domain $S$ we give the equivalence $E$, the mapping into equivalence classes, and the defaults):
1. $T : S \mapsto S \times R$: $(s, r) \mathrel{E} (s', r')$ iff $s = s'$; the mapping is $s \mapsto [(s, r)]_E$; $(s, r_0)$ is the default for $[(s, r_0)]_E$.
2. $T : S \mapsto S + R$: $x \mathrel{E} x'$ iff $x = x'$; the mapping is $s \mapsto [s]_E$; $s$ is the default for $[s]_E$.
3. $T : S \mapsto (I \to S)$: $t \mathrel{E} t'$ iff $\forall i \in I : t(i) = t'(i)$; the mapping is $s \mapsto [t]_E$ with $t(i) = s$ for all $i \in I$; $f_0$ with $f_0(i) = s$ is the default for $[f_0]_E$.
4. $T : S \mapsto \mathcal{P}(S)$: $p \mathrel{E} p'$ iff $p = p'$; the mapping is $s \mapsto [\{s\}]_E$; $\{s\}$ is the default for $[\{s\}]_E$.
The first case is an injection INJECT which has already been used in section 2 (see also figure 2, where the template is represented in our extension notation). All templates follow classical ways of injecting or embedding simple values into more complex domain constructions. For a more thorough investigation, more general constructions such as indexed products could be considered.
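The four templates can be sketched as small Python values that pair an embedding of $S$ into $TS$ (choosing the default representative) with the equivalence $E$. This is an illustrative sketch under assumptions: the class `Template`, the constructor functions and the name POWERSET are hypothetical, the disjoint union is modelled with tagged pairs, and indexed values are modelled as dictionaries over a finite index set.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Template:
    name: str
    embed: Callable[[Any], Any]          # s |-> default representative in TS
    equiv: Callable[[Any, Any], bool]    # the equivalence E on TS


def inject_template(r0):
    """S -> S x R: only the first component is observable."""
    return Template("INJECT", lambda s: (s, r0), lambda x, y: x[0] == y[0])


def union_template():
    """S -> S + R: disjoint union modelled with tagged pairs; E is equality."""
    return Template("UNION", lambda s: ("left", s), lambda x, y: x == y)


def index_template(index_set):
    """S -> (I -> S): constant functions as defaults; E is pointwise equality."""
    idx = tuple(index_set)
    return Template(
        "INDEX",
        lambda s: {i: s for i in idx},
        lambda t, u: all(t[i] == u[i] for i in idx),
    )


def powerset_template():
    """S -> P(S): singletons as defaults; E is equality of sets."""
    return Template("POWERSET", lambda s: frozenset({s}), lambda p, q: p == q)
```

Keeping the embedding and the equivalence as separate fields mirrors the separation of associating and partitioning discussed in the text; the function lifting is deliberately left out of this sketch.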
The templates so far are purely set-based, but using the function lifting, the congruence property for the equivalence can be shown and, thus, the set-based mappings can be extended to homomorphisms.
Proposition 3.2 Given a template based on a set-based mapping into equivalence classes, we can obtain a set-based mapping $S \to TS$ and extend it to a homomorphism on algebras, where $S$ is mapped to $TS$ and each function $f$ to its lifted version.
Proof: The mapping $S \to TS$ is obtained by using the defaults $d$ for each equivalence class. The mapping on set and function symbols ($T$ and the function lifting) is a signature morphism. The function lifting guarantees the substitution property (i.e. congruence) with respect to the signature morphism.

□
All suggested templates allow us to derive behaviour preserving semantics extensions. The templates have to satisfy a number of constraints: the relation $E$ is an equivalence, and there is a default value for each equivalence class.

Proposition 3.3 The templates allow us to obtain behaviour preserving semantics extensions.
Proof: The well-formedness of the template components for the four templates is easy to see, since classical injections or embeddings are used.

□
Some knowledge from universal algebra is required for the construction of $E$ and of the mapping into equivalence classes. A retrieval function from $TS$ to $S$ can be seen as the central function. This could be used to derive an equivalence $E$ on $TS$. Given such a function from $TS$ to $S$, we define the equivalence relation $E$ of this function on $TS$ as the set of all pairs $(s, s') \in TS \times TS$ that the function maps to the same element of $S$. We have chosen to separate $E$ and the mapping into equivalence classes (the inverse of the retrieval function) to distinguish the activities of partitioning (obtaining $E$) and associating (obtaining the mapping) in the process of constructing a template. Universal algebra shows the equivalence of representations. For example, the mapping into equivalence classes can be derived from the retrieval function and $E$ or, the other way round, the mapping $S \to TS$ can be derived from the mapping into equivalence classes and the default values. Analysing the powerset template shows that the range of the mapping $S \to TS$ lies within the domain of the retrieval function, which in turn lies within $TS$. Partiality has to be considered (the retrieval function for the powerset, for example, is only total on singleton sets, but is certainly onto).
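Deriving the equivalence from a retrieval function amounts to taking its kernel. The following Python sketch illustrates this under assumptions: the names `kernel`, `retrieve_pair` and `retrieve_singleton` are hypothetical, and partiality of the powerset retrieval is modelled by raising an exception outside singleton sets.

```python
def kernel(retrieve):
    """Equivalence induced by a retrieval function: x E y iff retrieve(x) == retrieve(y)."""
    return lambda x, y: retrieve(x) == retrieve(y)


def retrieve_pair(pair):
    """Retrieval for the injection template: (s, r) |-> s."""
    return pair[0]


def retrieve_singleton(p):
    """Retrieval for the powerset template; only defined on singleton sets."""
    if len(p) != 1:
        raise ValueError("retrieval is only defined on singleton sets")
    (s,) = p
    return s


# The equivalence of the injection template, obtained as a kernel.
E_inject = kernel(retrieve_pair)
```

Here `E_inject` relates exactly those pairs with equal first components, which matches the equivalence given for the first template.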

Language Extensions
In section 3.1, we have introduced semantics extensions on semantic algebras. Now, we will introduce operators on whole language descriptions, called language extensions, including syntax, semantic algebras and semantic functions. Figure 3 contains an example which introduces variables and an environment in which these variables can be stored into the csh-language. String expressions now consist of either strings or variables. To evaluate variables, string expressions and streams are indexed with environments. Here, we have used the templates UNION, INDEX, and MAP INJECT to express these extensions. Due to the lack of space, their full definition will not be presented. The application of templates is explained in detail in section 5. UNION is based on the disjoint union of two domains; INDEX indexes a given domain by an index domain (elements of the new function space are considered equivalent if they map to the same element). MAP INJECT is an adaptation of INJECT for domain $S$ to functions mapping from $S$ to $S$. UNION is used to allow variables as string expressions. String expressions, now including variables, are indexed by environments in order to substitute variables by their values during execution. MAP INJECT defines an extension which makes environments modifiable. The application of these templates within the language extension guarantees that the resulting language description preserves the behaviour of the basic language description. Some specific templates for the extension of operation combinators, such as pipe (see section composition in figure 3), are used. They are explained in section 6.
The application of templates prepares the definition of the new feature (see New Feature in the example). The semantics of se includes the interpretation of variables; setenv modifies (overwrites) an environment.
A language extension is divided into two parts. The first part, 'Extension', deals with a potential base language and its adaptation to the requirements. The second part, 'New Feature', contains the definition of the new feature. The first part is divided into three subsections:
Syntax: A renaming operator for syntactical identifiers might be applied.
Semantic algebra: Semantic algebras of a basic language can be lifted to the extension level, normally by applying templates. Templates can be used in a constrained form applicable to particular semantic entities and in an unconstrained form.
Composition: Application of specific extension templates for higher-order operators.
The second part is also structured into three, slightly different, parts:
Syntax: Syntactical constructions for the new feature can be specified. When the extension is applied to a basic language, these elements will be added.
Semantics: Semantic algebras can be provided. A simple semantic domain is considered as a semantic algebra without functions. Note that domains can always be named, i.e. specified by an equation.
Semantic functions: New semantic functions can be defined; existing ones can be redefined.

[Figure 3: language extension ABSTRACTION, introducing variables and environments into the csh-language.]
Two orthogonal operators are provided for the notation: ',' and ';'. The operator ',' is a separator for denoting independent specifications; its semantics is union. The operator ';' is a sequencing operator; its semantics is override. The override semantics allows the redefinition of elements. The full semantics of this language extension notation will not be presented formally.
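The difference between the two operators can be illustrated by modelling a specification as a Python dictionary from identifiers to definitions. This is a sketch under assumptions: the functions `union` and `override` are hypothetical names, and rejecting clashing identifiers in `union` is one possible reading of "independent specifications" rather than something the text prescribes.

```python
def union(spec1, spec2):
    """',' : combine independent specifications (assumed to share no identifiers)."""
    clash = spec1.keys() & spec2.keys()
    if clash:
        raise ValueError(f"specifications are not independent: {sorted(clash)}")
    return {**spec1, **spec2}


def override(spec1, spec2):
    """';' : sequencing; definitions in spec2 replace those in spec1."""
    return {**spec1, **spec2}
```

For instance, `override({"echo": "old"}, {"echo": "new"})` yields a specification in which `echo` has been redefined, whereas `union` on the same arguments is rejected in this sketch.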
The semantics extensions underlying the extension have already been presented. The next section will investigate an equational metalanguage which is in fact the basis for the description of language operators.
The following diagram shows the main constituent parts of language extensions and their dependencies. One of the objectives of our approach is a modular presentation of semantics. Separating core and extensions is one way of achieving this. An aim of our approach is to allow language features to be specified as entities of their own which can be reused in different contexts. We have here increased the degree of modularity by introducing a parameterisation concept. Identifiers of the feature specification will be associated with identifiers in the basic language, thus forming formal parameters of the operator. This allows the definition of language features independently of concrete core languages. David Schmidt uses the notion of orthogonality of language features to indicate the aim of having standardised language parts that can be assembled into a larger language in a predictable way.

An Equational Metalanguage for Expressing Properties
In this section, some foundations underlying the extension notation, which was presented in the previous section, shall be investigated. We will discuss an equational metalanguage to express algebraic theories, i.e. to specify and to reason about language properties such as equivalence of programs. Programs can be specified in the metalanguage; they are interpreted in the corresponding semantic algebras. We will discuss theories represented as equivalences on terms. We will investigate the effect of extensions on these theories. Each algebra $A$ with signature $\Sigma$, such as our semantic algebras, gives rise to a term algebra over $\Sigma$. An equivalence on the set of terms can be obtained by considering all terms as equivalent which are equal under some interpretation. An algebra has properties, possibly stated in an algebraic theory, here an equational theory. Equations hold between equivalent terms; thus, a theory can be represented by an equivalence on terms.
A description of an operation such as $C[\![\mathtt{echo}\ se]\!] : \mathit{Stream} \to \mathit{Stream}$ with $C[\![\mathtt{echo}\ se]\!]\, i \triangleq S[\![se]\!]$ can be seen as an equational specification of $C[\![\mathtt{echo}\ se]\!]$, i.e. the notation that has been implicitly introduced for language descriptions and language extensions is based on an equational language. A metalanguage is introduced to express theories and to reason about language specifications. Algebraic theories are presented as quotients of term algebras, where the equivalence $\equiv$ expresses equal interpretation and $\equiv^E$ behaviourally equivalent interpretation of terms.
Specifications of the basic language (figure 1) can be seen as equational specifications in a metalanguage. We will now discuss the effect of extending algebras on the specifications. How do the respective term algebras and their quotients relate if the basic algebra $A$ is extended to $B$? Assuming that we have a semantics extension from $A$ to $B$, let us investigate the relation between the term algebras of $A$ and $B$ and also between their quotients with respect to $\equiv_A$ and $\equiv_B$. Let us assume a syntactic extension from $\Sigma_A$ to $\Sigma_B$ derived from the semantics extension using the type extension $T$ on domains and the lifting of functions $f : A \to B$ to functions from $TA$ to $TB$. It can easily be seen that this syntactic extension is a signature morphism. The criterion for a correct extension is that, for any terms $t_1$ and $t_2$, the interpretations of equivalent terms $t_1$ and $t_2$ have to be behaviourally equivalent with respect to $E$ in the extension.
Instead of $\equiv_B$, we have required a weakened equivalence $\equiv^E_B$ based on the equivalence $E$ on a certain domain (as described in the previous sections). Based on $\Sigma$-terms (as usual, variables and operation applications), equations can be introduced using the equivalence.
Let $A$ be an algebra with signature $\Sigma_A$. $v_A$ is an interpretation of $\Sigma_A$-terms $t$ in $A$, i.e. $v_A(t) \in A$ is a value. $t_1 \equiv_A t_2$ iff $v_A(t_1) = v_A(t_2)$ for all variables in the two terms. Analogously for an algebra $B$. Let $B$ now be constructed from $A$ via the type constructor $T$ with $B = TA$. Consider the term algebra over $\Sigma_A$ and its quotient with respect to equal interpretation of terms, $\equiv_A$. The signatures of $A$ and $B$ can be related by a signature morphism from $\Sigma_A$ to $\Sigma_{TA}$ such that $\Sigma_A$ is a subsignature of $\Sigma_B$ as described above. As a result of the application of the signature morphism, the set of $\Sigma$-terms changes. The signature morphism resembles a renaming if $T$ is applied to all domains $A$ and $TA$ is considered as a new symbol for $A$. The problem with this interpretation is that, in general, new terms are introduced and old terms are preserved (consider e.g. the introduction of a product), i.e. the signature morphism embeds $\Sigma_A$ in $\Sigma_B$. Thus, the corresponding term algebras are in general not isomorphic. We can define $B$ as a homomorphic image of $A$ if the extension is based on a semantics extension (these were introduced in section 2 and are formally defined in section 3.1).
The relation between the two quotients depends on how the interpretation $v$ on $A$ is adapted to an extended interpretation on $B$. The extended interpretation has to preserve the equivalence of terms, i.e. $t_1 \equiv_A t_2 \Rightarrow t'_1 \equiv_B t'_2$ or, put differently, equal interpretations of $t_1$ and $t_2$ in $A$ imply equal interpretations of $t'_1$ and $t'_2$ in $B$, if $t'_i$ is the extended term for $t_i$. This property is called interpretation preservation. The interpretation can be adapted by a canonical mapping: the extended interpretation of the extended term for $f(x)$ is the lifted version of $v(f)$ applied to the image of $v(x)$ under the extension morphism from $A$ to $B$.
Proposition 4.1 The interpretation adapted by this canonical construction is interpretation preserving.
Proof: Consider the syntactically extended term for $f(x)$. The new interpretation of the extended term is constructed from the extended semantics: the original semantics $v(f)$ is semantically lifted, and the value $v(x)$ of the argument $x$ is mapped into the extended domain. Since semantical equality is the criterion for the equivalence, and the extension of $v$ is defined via the semantics, the equivalence is preserved.

□
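Interpretation preservation can be tried out on a small term language. The Python sketch below is an illustration under assumptions: terms are nested tuples, variables are strings, the core operations `twice` and `concat`, the sample terms `t1` and `t2`, and all helper names are hypothetical. Values are extended to pairs as in the injection template, and lifted operations keep the second component of their first argument, which is one arbitrary choice among several.

```python
def evaluate(term, ops, env):
    """Interpret a term: a variable name (str) or a tuple (operation, *arguments)."""
    if isinstance(term, str):
        return env[term]
    op, *args = term
    return ops[op](*[evaluate(a, ops, env) for a in args])


def lift(f):
    """Lift f to pairs (value, extra): apply f to the values, keep the first extra."""
    def lifted(*pairs):
        return (f(*[p[0] for p in pairs]), pairs[0][1])
    return lifted


def extend_ops(ops):
    """Canonical extension of the interpretation of operation symbols."""
    return {name: lift(f) for name, f in ops.items()}


def extend_env(env, default):
    """Map each variable value into the extended domain using a default."""
    return {x: (value, default) for x, value in env.items()}


def equivalent(x, y):
    """E on pairs: only the first component is observable."""
    return x[0] == y[0]


# Hypothetical core algebra on strings and two terms that are equal in it.
core_ops = {"twice": lambda s: s + s, "concat": lambda a, b: a + b}
t1 = ("twice", "x")
t2 = ("concat", "x", "x")


def preserved(samples, default=0):
    """Check on samples that t1, t2 agree in the core and up to E after extension."""
    ext_ops = extend_ops(core_ops)
    for s in samples:
        env = {"x": s}
        if evaluate(t1, core_ops, env) != evaluate(t2, core_ops, env):
            return False
        ext_env = extend_env(env, default)
        if not equivalent(evaluate(t1, ext_ops, ext_env), evaluate(t2, ext_ops, ext_env)):
            return False
    return True
```

The check is sample-based and therefore only a sanity test of the canonical construction, not a replacement for the proof.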
The interpretation determines the equivalence of terms. If we adapt the interpretation, then the equivalence is also adapted. It is obvious that the canonical extension of $v$ is interpretation preserving. The equivalence is preserved for a special case, but remember that only equivalence with respect to $E$, i.e. agreement of the lifted function with $f$ up to $E$, is required for arbitrary behaviour preservation based on $E$. For the general case, the notion of interpretation preservation has to be adapted appropriately. Still, $v$ is preserved by the extended interpretation or, reformulated, $\equiv_A$ is preserved by $\equiv^E_B$.

Proof: The criterion of interpretation preservation was relaxed such that equivalences can be dealt with.

□
Based on $t_1 \equiv_A t_2$ iff $v(t_1) = v(t_2)$, each term $t_i$ can be uniquely extended to $t'_i$ such that $t_1 \equiv_A t_2$ implies $t'_1 \equiv^E_B t'_2$. This shows that we have a homomorphism on term algebras, but not a bijective one (similar to the results on algebras themselves).
We also have preservation of equivalence on the term algebra quotients (based on a suitable redefinition of the interpretation function).
We have neglected a satisfaction relation for our equational language so far, since the preservation of a satisfaction relation is a straightforward implication from previous results. The equation $t_1 = t_2$ is satisfied in an algebra $A$ if $t_1 \equiv t_2$ holds for a given interpretation on $A$. Using the equivalence $E$, we have to weaken our statement: $t_1 =_E t_2$ is satisfied if $t_1 \equiv_E t_2$. As we have seen above, interpretations (and thus the corresponding equivalences) are preserved.

A Sample Extension
We now look at a simple example of an extension. The extension by abstraction mechanisms presented in figure 3 is closer to what we might expect of a self-contained feature, but for lack of space we explain only a simpler example in detail (figure 4). Exit values shall be added to the core (figure 1), indicating whether a command was executed successfully or not. The abstract situation can be described as follows. An algebra $A$ shall be extended to $B$ by using the technique of injection for the semantic domains, $S \mapsto S \times T$. The injection template INJECT, summarising the definitions from section 3.2, is presented in figure 2. The proof obligation of behaviour preservation is fulfilled by using a template.
The application of the template INJECT in a language extension is presented in figure 4. A new domain Exit is injected by INJECT into the result domain of commands; we have numbered the syntactical occurrences of Stream, without any semantical relevance. By using the template, we map streams to pairs of streams and exit values. The template specifies that, with respect to behaviour preservation of operations, only the behaviour on the stream component is relevant, not the exit component.
As explained above, identifiers in the operator description are only formal parameters which have to be substituted by actual ones when the language extension is applied to a language description. For the sake of simplicity, we will omit the explicit application of the extension operator and assume an application with a one-one correspondence between the names of formal and actual parameters.
The conditional command cond is newly introduced; thus, there is no proof obligation with respect to behaviour preservation. The other proof obligations concerning existing commands are discharged by using the template, which guarantees behaviour preservation. For instance, the resulting definition for echo after applying the template would be $\mathcal{C}[\![echo\ s]\!]\ i = (\mathcal{S}[\![s]\!], true)$, which preserves the original behaviour (observe the first component).
The feature specified here by a language extension is an exit value concept with one operator cond. This operator is an operation combinator whose final result depends on the exit value of its first argument. Using the INJECT template (which allows us to derive a semantics extension), the semantics of a basic language, to which the language extension might be applied, is lifted according to the pattern $Stream_2 \mapsto Stream_2 \times Exit$ such that behaviour is preserved.
The identifier $Stream_2$ is a formal parameter when applied to a basic language. It might be matched, for example, with a domain State of a basic language when applied to that language. The semantics extension derived from a template adapts any argument semantics to the extended domain construction as specified by the language designer. The base language is now available in the extended language. Any proof obligations are automatically discharged. Its extended definition is consistent and behaviour preserving. On top of this lifted base language, the language designer can specify a command cond with syntax and semantics as is done in figure 4. If such a command already exists in the base language, it is overridden; otherwise it is a new definition.
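The lifting described above can be illustrated with a small executable sketch. It is not the notation of figure 4: the encoding of streams as tuples, the helper names `inject`, `project` and `fail`, and in particular the chosen reading of `cond` (run the second command only if the first exits successfully) are assumptions made for illustration.

```python
from typing import Callable, Tuple

Stream = Tuple[str, ...]                         # a stream of characters
Exit = bool                                      # exit value, True means success
Cmd = Callable[[Stream], Stream]                 # core:     Stream -> Stream
CmdX = Callable[[Stream], Tuple[Stream, Exit]]   # extended: Stream -> Stream x Exit


def inject(c: Cmd, t0: Exit = True) -> CmdX:
    """Lift a core command: keep its stream result, pair it with the default exit value t0."""
    return lambda i: (c(i), t0)


def project(result: Tuple[Stream, Exit]) -> Stream:
    """Forget the exit component; behaviour is compared on the stream component only."""
    return result[0]


def echo(s: str) -> Cmd:
    """Hypothetical core command: ignores its input and emits the characters of s."""
    return lambda i: tuple(s)


def fail(i: Stream) -> Tuple[Stream, Exit]:
    """Hypothetical command of the extended language that signals failure."""
    return ((), False)


def cond(c1: CmdX, c2: CmdX) -> CmdX:
    """Assumed reading of cond: run c2 only if c1 exits successfully."""
    def run(i: Stream) -> Tuple[Stream, Exit]:
        out1, ok = c1(i)
        return c2(i) if ok else (out1, False)
    return run


echo_x = inject(echo("hi"))
# Behaviour preservation: the stream component of the lifted command
# agrees with the result of the original command.
print(project(echo_x(())) == echo("hi")(()))
print(cond(fail, echo_x)(()))
```

In this sketch the proof obligation for existing commands reduces to the single observation that `project(inject(c)(i)) == c(i)` for every core command `c`, which is what using a template is meant to discharge once and for all.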

Figure 4: Extension by exit values
We have provided two distinct extension constructs. The first, language extension, is dedicated to the full specification of the properties of the new feature. The second, semantics extension, is dedicated to the behaviour preserving lifting of the basic language to some extended domain construction necessary for the new feature. The language designer shall be freed from adapting the definitions of the basic language explicitly and from proving the preservation of properties, and should instead be allowed to focus on the specification of the new feature.

Typing and Higher-Order Operators
Some more conceptual issues shall be looked at in this section. The first one concerns the adaptation of functions whose types have been modified. Then, we address a particular, but very important, kind of function: higher-order functions, which appear in our approach in the form of operation combinators.

Typing
The key concept of denotational semantics is the compositionality of its definitions of semantic functions, i.e. semantic functions are applied within definitions of other semantic functions. An example of an operation definition containing applications of semantic functions is the semantic function $\mathcal{E}$ for binary addition expressions in a stateless setting:

$\mathcal{E}[\![e_1 + e_2]\!] = \mathcal{E}[\![e_1]\!] + \mathcal{E}[\![e_2]\!]$

In an extension by state we would expect the definition to be parameterised by a state variable $s : State$:

$\mathcal{E}[\![e_1 + e_2]\!] = \lambda s.\ \mathcal{E}[\![e_1]\!]\,s + \mathcal{E}[\![e_2]\!]\,s$

The type of the semantic function changes, due to the application of the signature morphism based on the type operator $T$ and the function lifting $\Phi$. Each occurrence of a modified function in the body, here the call $\mathcal{E}[\![e]\!]$, has to be substituted by a call of the extended version $\mathcal{E}[\![e]\!]\,s$. This applies to every function, not only the semantic functions. The form of the substitution depends on the domain extension; here $Expr \to Val$ is extended to $Expr \to Store \to Val$ by indexing $Val$ with $Store$. Let $\sigma$ in the following be the function which performs this syntactic substitution in expressions defining functions. All applications of functions which have changed their types are syntactically modified. A canonical construction of $\sigma$ based on the applied domain extension is possible. The canonical function extension $f \mapsto_c \Phi f$ has to be adapted by this substitution, e.g. $\Phi f\,s = \sigma(f\,s)$. $\sigma$ is an extension of the signature morphism which works on metalanguage specifications.
A question that can also be asked here is whether an argument (such as $s$ on the outer level in the example) has to be passed unmodified to all of the subordinate calls. The example could have been extended in another way; e.g. adding side-effects in expressions would require different states to be used by $\mathcal{E}[\![e_1]\!]$ and $\mathcal{E}[\![e_2]\!]$. This imposes a new evaluation order on a binary expression. This issue is investigated below.
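The substitution of calls can be made concrete with a minimal sketch of the addition example, assuming a hypothetical abstract syntax (`Num`, `Add`) and a store modelled as a dictionary; neither is taken from the figures of this paper. The stateless evaluator corresponds to the original definition, the second one to the version indexed by a state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Store = Dict[str, int]   # hypothetical model of the state


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def eval_core(e: Expr) -> int:
    """Stateless semantics: E[e1 + e2] = E[e1] + E[e2]."""
    if isinstance(e, Num):
        return e.value
    return eval_core(e.left) + eval_core(e.right)


def eval_state(e: Expr) -> Callable[[Store], int]:
    """Extended semantics: every call E[e] in a body is replaced by E[e] s."""
    if isinstance(e, Num):
        return lambda s: e.value
    return lambda s: eval_state(e.left)(s) + eval_state(e.right)(s)


example = Add(Num(1), Add(Num(2), Num(3)))
# The extended function agrees with the original one for every state,
# because no construct of the basic language inspects the state.
print(eval_state(example)({}) == eval_core(example))
```

Here the same state `s` is handed unmodified to both subordinate calls, which is exactly the choice questioned in the text; an extension with side-effects would have to thread the state from the first call to the second.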

Operation Combinators
Templates describe transformations on domains and on operators on these domains. Higher-order operators define the composition of operators: operators of the language are combined into non-primitive ones. If basic operators are extended, e.g. in their argument or result type, the composition of these operators also has to be adapted. It will turn out that there are certain variants in which operation combinators are defined. These variants will lead to some templates for operator combination and extension. Let us assume an operation combinator $\oplus$ on two basic operators $c_1$ and $c_2$ and a semantic function $\mathcal{C}$. For an abstract form of a definition for $\mathcal{C}$, the arguments and results of the functions shall be looked at. Arguments $x$ to the argument functions $c_1$ and $c_2$ of a composition $\oplus$, which are given firstly to $\oplus$, have to be assigned to the argument functions. There are two common possibilities. The result of applying a function composition has to be a value of the result type of each of the functions.
COMPOSE: the result $r$ is a composition $r = g(r_1, r_2)$ of the results $r_1$ and $r_2$ of both argument functions (where $g$ is an arbitrary expression on the arguments). The composition operator $g$ has to be explicitly specified.
LAST RESULT: often, only the result of the second argument function is taken (if the composition should implement a form of sequencing), i.e. $r = r_2$.
These variants should be preserved if an operation combinator is extended. This shall be referred to by the template name STANDARD. STANDARD is a higher-order template, just as operation combinators are higher-order functions. For these templates for operation combinators, behaviour preservation is guaranteed if the basic operations are extended with preservation of behaviour.
We have already seen the INJECT template, which would allow us to add a new argument or a new result domain to operators. If new arguments and/or results are added to basic operators, we do not need to stick to the given variants, since the new component is irrelevant for behaviour preservation.
Peter Mosses [1] introduces several operation combinators, called action combinators in Action Semantics. These combinators serve to reduce overloading of generally applied operators such as $\circ$ for function composition or the sequence operator ; for composition of program constructs such as commands or declarations.
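The two result variants can be sketched as higher-order functions. The sketch below works on integers for brevity; the way arguments are passed (the same argument to both operators for COMPOSE, the result of the first operator to the second for LAST RESULT) and the flag added by the hypothetical `inject` are assumptions, since the text fixes only how the results are combined.

```python
from typing import Callable, Tuple

Op = Callable[[int], int]                  # a basic operator
OpX = Callable[[int], Tuple[int, bool]]    # hypothetical extension: result paired with a flag


def compose(g: Callable[[int, int], int]) -> Callable[[Op, Op], Op]:
    """COMPOSE: r = g(r1, r2); both operators receive the same argument x (assumption)."""
    return lambda c1, c2: (lambda x: g(c1(x), c2(x)))


def last_result(c1: Op, c2: Op) -> Op:
    """LAST RESULT: r = r2; c2 is fed the result of c1, as in sequencing (assumption)."""
    return lambda x: c2(c1(x))


def inject(c: Op) -> OpX:
    """Lift a basic operator by adding a new result component."""
    return lambda x: (c(x), True)


def last_result_x(c1: OpX, c2: OpX) -> OpX:
    """The same variant over the extended domain: thread the first component only."""
    return lambda x: c2(c1(x)[0])


def double(x: int) -> int:
    return 2 * x


def inc(x: int) -> int:
    return x + 1


# The extended combinator applied to lifted operators agrees with the
# basic combinator on the first (original) component.
basic = last_result(double, inc)(3)
extended = last_result_x(inject(double), inject(inc))(3)
print(extended[0] == basic)
```

The final comparison is the instance of behaviour preservation claimed for combinator templates: if the basic operations are lifted with preservation of behaviour, the lifted combination preserves it as well.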

Revisiting David Schmidt's Book(s) on Denotational Semantics
We have mainly focused our interest on [2], but aspects of [3] are also considered. This attempt is described in [9].

The csh Case Study
In our second case study, we have investigated the csh-language in much more detail than described so far; see [10,11] for details. This investigation was carried out accompanying a student project on specification and language semantics held at the Danish Technical University in Lyngby during the author's stay in Denmark.
In several steps, concepts such as file systems, exit values, aliases, I/O-redirections, or variable and command substitution were added. We have also investigated parallel extensions instead of a sequence of extension steps. In sequential extension, features are added step by step to a basic language. In parallel extension, all features are added onto the basic language, resulting in a number of language extensions. Under certain circumstances (no interactions between the features), these extensions can be merged into one final extension.

The RAISE Concurrency Model
Another application of our approach can be found in [12]. The concurrency model of the specification language RAISE [13,14] is based on Hennessy's acceptance trees, adapted to the particular needs of RAISE; see [15] for details of this adaptation. We have used our approach of extension semantics to reformulate this adaptation in a rigorous way, thereby proving that essential properties (the behaviour and structural constraints) are preserved (the formal proof of property preservation is missing in the original description).
The basic model is a recursive space of processes $P_0$ defined by

$P_0 = ((\Sigma \to P_0) \times \mathcal{P}(\mathcal{P}(\Sigma)))_\bot$

where $\Sigma$ is a set of events. The first component of those pairs is a mapping from events to processes in $P_0$. The second component is an acceptance set. An acceptance set is a set of possible internal states which can be reached non-deterministically by executing a process. Each of these states is a set of actions that can be taken in that particular state. The domain is defined recursively, but a solution for this equation exists (see [15] Chapter 5). $\bot$ is the semantic correspondence to the chaos process.
We have partitioned the event space into two forms of events (in and out events denote the direction of communication via channels) in the first extension step, using a specific template to obtain the process space $P_1$:

PARTITION $\Sigma \to P_0$ INTO $(\Sigma_{in} \to P_1) \times (\Sigma_{out} \to P_1)$

PARTITION is a behaviour preserving template. In a second extension, values $V$ and states $S$ are introduced using injection. The final process space $P_2$ has the structure

$P_2 = (\mathcal{P}(S \times V) \times (\Sigma_{in} \to V \to P_2) \times (\Sigma_{out} \to V \to P_2) \times \mathcal{P}(\mathcal{P}(\Sigma)))_\bot$

obtained by using templates for indexing and injection.
This case study is a classical situation in which our approach can be used. It is a rigorous, modular development of one aspect of the semantics of a real specification language, here the crucial concurrency model of the RAISE specification language. In this case, the approach of extension semantics was used as a tool for analysis and design of the language semantics. We have gained a clearly structured description of the development of the RAISE concurrency model based on a widely accepted model for concurrency. Additionally, we proved the extension to be correct, i.e. property-preserving.
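The idea behind partitioning the event mapping can be shown with a small sketch. The encoding of events as (direction, channel) pairs and the use of strings as stand-ins for processes are assumptions for illustration; the actual process domain is recursive and is not modelled here.

```python
from typing import Dict, Tuple

Event = Tuple[str, str]   # hypothetical encoding: (direction, channel), direction is "in" or "out"
Process = str             # stand-in for a process; the real domain is recursive
Transitions = Dict[Event, Process]


def partition(m: Transitions) -> Tuple[Transitions, Transitions]:
    """Split one mapping on all events into a mapping on in events and one on out events."""
    ins = {e: p for e, p in m.items() if e[0] == "in"}
    outs = {e: p for e, p in m.items() if e[0] == "out"}
    return ins, outs


def merge(ins: Transitions, outs: Transitions) -> Transitions:
    """Abstraction back to the unpartitioned mapping."""
    return {**ins, **outs}


original = {("in", "a"): "P", ("out", "b"): "Q", ("in", "c"): "R"}
ins, outs = partition(original)
# No transition is lost or changed: abstracting the partitioned structure
# gives back the original mapping.
print(merge(ins, outs) == original)
```

The round trip through `partition` and `merge` is the sense in which such a restructuring can be called behaviour preserving in this sketch: every process reachable by an event before the partition is reachable by the same event afterwards.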

Related Work
The process of subsequent extensions is a refinement process. There is, for instance, a similarity between the notion of behaviour preservation and the retrieve function in VDM (see e.g. [7]). The question has been addressed already. A similar approach to ours, also pointing out the similarity to refinement, is presented in [16]. There, a refinement relation between denotationally specified languages is provided. This paper follows in its presentation Schmidt's book [2]. Riddle and Wallis see definitions of semantic functions as semantic equations and define a correctness preserving refinement relation based on these equations. Constructive support, e.g. in the form of a refinement calculus, is not provided. Other approaches with similar mathematical frameworks are [17] or [18]. Another possible area of application is e.g. [19], where 27 languages derived from one another are presented (in a slightly different denotational framework using metric spaces).
Abstract interpretation is a notion which we could use to describe the way we express behaviour preservation. Behaviour preservation is formalised by mapping an extended algebra to a more abstract one which neglects details, but focuses on those properties that have to be preserved from the basic language. The approach of abstract interpretation is well-known in language semantics [20,21], but it is mostly used for optimisation purposes.
A paper on language semantics cannot leave Category Theory unmentioned. One of the most popular approaches to modularity in language semantics is based on monads, see e.g. [22,23,24]. A number of common language features have been successfully modelled as separate units based on monads. Moggi calls these descriptions notions of computation. There is less experience with the extension of monads. [25] provides some basic definitions, such as monad morphism, but a suitable, well-founded notation for language extensions does not exist at the moment (see more recent work on the extension of monads [26,27]). Classical denotational semantics provides a well-understood framework on which an extension approach like ours can be based. A lot of existing semantical descriptions are only available in a classical denotational style, as we have tried to indicate with the RAISE example. Ideas realised in our framework of language extension, such as the provision of a notation or a library of templates, can also be applied to a monadic framework. This is currently under investigation based on monads and their morphisms.

Conclusions
We have presented a language description in the form of a stepwise development by extension. Based on a language core exhibiting the basic ideas of the language, language features are added onto that core step by step. The language features can be specified without referring directly to the core on which they should be added. This guarantees a high degree of modularity in language design and language presentation. Language features can also be investigated as self-contained constructs of their own.
We have presented in our framework of extension semantics a two-level approach using two different kinds of extension operators. The first adapts the semantics of the basic language to an extended domain construction specified by the language designer. The use of this operator was simplified and supported by a notion of extension templates. This support was given in order to allow the language designer to focus on the specification of the new features to be added. The technical support by semantics extensions is crucial for the creative part of specifying the new feature. This technical support adapts basic language constructs automatically, preserving their behaviour. This support is essential for the feasibility of the extension approach. The idea of abstraction and preservation of properties can also be found in abstract interpretations [20,21,28], but this approach is mostly used to abstract in order to solve problems in a simpler (more abstract) domain. Properties of semantics can be described in the form of an equational theory. We have presented mechanisms to extend equational theories according to new structural requirements such that behaviour is preserved.
Since in each step only a few concepts are explicitly added or redefined, normally a large amount of rewriting would be necessary. We have facilitated extensions by providing templates which can be applied for a number of standard cases. Applying these templates also allows properties to be preserved. The respective proof obligations are automatically discharged.
We would like to refer to the work of David Schmidt. Some of the ideas presented here have been developed based on his textbooks on denotational semantics [2] and [3]. Schmidt uses the notion of orthogonal language features to point out that features should be designed as self-contained units understandable without reference to other language features (and the core). Language features should preferably not conflict. This idea was realised in our approach by the construct of language extensions, i.e. operators with parameters. In particular, the semantics extensions guarantee that the behaviour of the basic language is preserved, which means that the new feature does not conflict with the basic language. More modular, or orthogonal, descriptions of languages are also aimed at by Action Semantics [1]. Facets are provided which contain constructs to describe the computation of different kinds of information, e.g. the feature description ABSTRACTION (figure 3) could be considered as a reduced facet for describing declarations.
The area of application of our approach is geared to those language manipulations that are expressible through language extensions. The combination of different paradigms, such as the combination of a state-based imperative language and a process-domain based concurrent language, is not intended. Merging two different languages based on two different paradigms would require different questions to be answered. Three case studies have been presented to indicate the variety of our approach, even though not all problems can be solved. Certainly, programming languages have to be addressed as well. Java, and in particular security aspects of Java, are currently under investigation. Java is particularly interesting since it is a young, still evolving language. In the same sense, Perl can be a target language.
The approach can be further improved if we consider parallel extensions. Instead of extending step by step sequentially (investigated in depth in [10]), we could extend a common core in parallel by adding different new features as long as there are no dependencies between extensions. Issues like the commutativity of extensions arise; questions such as under which conditions extensions can be merged have to be answered. An extension notation based on an operator calculus to combine extensions is certainly an improvement to the expressivity of the approach (investigated in [11]).

$\triangleq$ if $se = \varepsilon$ then $\varepsilon$ else $hd\ se \mathbin{\hat{}} \mathcal{S}[\![tl\ se]\!]$

$echo\ se$ is a phrase of the syntactic domain $Cmd$. The semantic function $\mathcal{C}$ maps from $Cmd$ to $Stream \to Stream$. An input stream $i$ is mapped to an output stream when $echo\ se$ is executed. Let us now consider an extension of the core language. Commands shall work on a configuration consisting of streams and file systems; the extension is expressed by an operator $T$:

$T : Stream \mapsto Stream \times FileSys$

The semantic function for echo has to be adapted, $\mathcal{C}[\![echo\ se]\!] \mapsto \Phi(\mathcal{C}[\![echo\ se]\!])$, such that $\Phi(\mathcal{C}[\![echo\ se]\!]) : T\,Stream \to T\,Stream$. A $Stream \times FileSys$-based algebra certainly preserves the behaviour of the echo command on $Stream$ if the following diagram commutes:

$Stream \times FileSys \xrightarrow{\ \Phi(\mathcal{C}[\![echo\ se]\!])\ } Stream \times FileSys$

$\phi_{Stream}(\mathcal{C}[\![echo\ se]\!]\ i) \equiv_E \Phi(\mathcal{C}[\![echo\ se]\!])\,(\phi_{Stream}\ i)$

or expressed diagrammatically, with equality taken modulo $E$:

$Stream \times FileSys \xrightarrow{\ \Phi(\mathcal{C}[\![echo\ se]\!])\ } Stream \times FileSys$

2nd Irish Workshop on Formal Methods, 1998

EXTENSION TEMPLATE INJECT $S$ INTO $S \times R$ =
$T : S \mapsto S \times R$
$\Phi : f_i \mapsto_c \Phi f_i$
$(s, r)\ E\ (s', r')$ iff $s = s'$
$E_S\,s = (s, r_0)\ E\ (s, r_0)$ for each $(s, r_0)$

Figure 3: Extension by environments

$\mathcal{C} : Cmd \to Stream_1 \to Stream_2$
do INJECT $Stream_2$ INTO $Stream_2 \times Exit$ with $t_0 = true$

$\Phi$ is a sorted set of mappings: $\Phi \triangleq \{\phi_i : A_i \to B_i \mid A_i \text{ is a domain in } A\}$. $\Phi$ preserves the operation behaviour with respect to the congruence $E$, e.g. for $op : A_i \to A_j$ and $x \in A_i$ it

(but not the other way round) with canonical definitions of $\equiv_A$ and $\equiv_B$. We can achieve bijectivity, i.e. $t_1 \equiv_A t_2 \Leftrightarrow t'_1 \equiv_B t'_2$, if we consider equivalence classes of terms also with respect to the second equivalence $E$ besides $\equiv$. This says that the equivalence classes are mapped 1-1 from $\equiv_A$ to $\equiv_E^B$.

One of Schmidt's examples is the extension of a simple imperative language by I/O commands working on input and output buffers. Schmidt proposes to structure the semantical side into separate algebras, each implementing a language feature. Let us consider a basic algebra implementing a store for a simple imperative language. Stores shall be injected into a product of stores, input buffers and output buffers such that I/O-specific commands like put can be realised. This extension can be expressed by using the injection template:

INJECT $Store$ INTO $State = Store \times Input \times Output$

such that behaviour is preserved. We have proven for some parts of Schmidt's examples that his informally described extensions are behaviour preserving.
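A minimal sketch of this injection follows, assuming a store modelled as a dictionary and buffers modelled as tuples; the operation names `update` and `put` and the field names of `State` are hypothetical and are not taken from Schmidt's text.

```python
from typing import Callable, Dict, NamedTuple, Tuple

Store = Dict[str, int]


class State(NamedTuple):
    """State = Store x Input x Output."""
    store: Store
    inp: Tuple[int, ...]
    out: Tuple[int, ...]


def update(name: str, value: int) -> Callable[[Store], Store]:
    """Operation of the basic store algebra."""
    return lambda st: {**st, name: value}


def inject(op: Callable[[Store], Store]) -> Callable[[State], State]:
    """Lift a store operation to states; the buffers are left untouched."""
    return lambda s: s._replace(store=op(s.store))


def put(name: str) -> Callable[[State], State]:
    """New I/O command: append the value of a variable to the output buffer."""
    return lambda s: s._replace(out=s.out + (s.store[name],))


s0 = State(store={}, inp=(), out=())
s1 = inject(update("x", 1))(s0)
# Behaviour preservation: projecting the lifted operation's result onto the
# store component gives the result of the original operation.
print(s1.store == update("x", 1)(s0.store))
print(put("x")(s1).out)
```

The first check is the commuting condition for lifted store operations; `put` is a genuinely new operation of the extended algebra and therefore carries no preservation obligation.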