Modeling Facilities for the Component-based Software Development Method

Component-based software development (CBSD) technology uses components as first-class objects and therefore requires a good understanding of the nature of components. Industrial approaches to CBSD based on interoperability standards (such as OMG CORBA) lack component semantics in their descriptive models. In this paper we present an overview of the SYNTHESIS method, which extends the CBSD approach by introducing semantic information to enrich and complement the industrial modeling facilities. The paper contributes to the development of modeling facilities for CBSD, focusing on the design of interoperable systems. A proper balance of formal and semi-formal modeling facilities is demonstrated to cope with the CBSD requirements.


INTRODUCTION
Component-based software development (CBSD) has become one of the hottest topics in the area of software engineering. CBSD is a promising solution intended to break up large monolithic software systems into interoperable components and thus to move us from producing handcrafted lines of code to system construction based on object-oriented software parts or components and automated processes. The latter use semantic knowledge to guide the assembly of those components into the desired target system.
In this paper we consider CBSD issues in the framework of the SYNTHESIS method, which attempts CBSD with reuse of pre-existing heterogeneous components [13,14,15]. SYNTHESIS emphasizes the megaprogramming metaphor, capturing the idea of scaling up from non-distributed object-oriented systems to large systems of heterogeneous, distributed software components. We consider interoperability to be the universal paradigm for compositional software development in the range of the systems mentioned. Basically, interoperability implies a composition of behaviors. Correct compositions of software components should be semantically interoperable in the context of a specific application.
Proceedings of the International Workshop on Advances in Databases and Information Systems, September, Moscow, 1996.
The overall goal of the SYNTHESIS project is to provide a uniform collection of modeling facilities suitable for different phases of the forward engineering activities as well as for the specification of reusable information resources in the reverse engineering phases. The method considered in the paper focuses on the semantic interoperation reasoning process [15] that should lead to the concretization of the specification of requirements by views over the pre-existing information resources.
The paper is structured as follows. After a short summary of the state of the art and the related R&D directions, we identify basic CBSD issues that remain open. Presenting the SYNTHESIS method, we discuss in more detail its original features intended for the steps of the semantic interoperation reasoning process. We then concentrate on the SYNTHESIS modeling facilities, which are intentionally separated into a semi-formal part used for plausible reasoning in the course of CBSD and a formal part used for strict justification of the design and specification solutions. A comprehensive example showing how the different modeling facilities interact is presented.

STATE-OF-THE-ART
The software factory idea [19] as well as research on software reuse have had an important impact on CBSD. The importance of the reuse topic has led to a series of international reuse conferences and to government-funded research projects like REBOOT [16]. Besides basic research, some companies like IBM or Hewlett Packard have set up programs introducing CBSD [11]. Industrial activities in the field of interoperability standards like OLE2 or CORBA [20,22] are also very important for CBSD.
When talking about CBSD today we cannot ignore the growing influence of Internet technology, especially the WWW. First of all, the Web is an excellent source of available components in the Componentware area, i.e., in the Microsoft OLE/COM environment (e.g., OCX components) as well as in the OpenDoc/CORBA area (e.g., OpenDoc parts). Second, the WWW and OMG CORBA complement each other, and we can watch them merge now. Sun's object-oriented Internet development language Java [6] has the potential to speed up this process and, maybe, to revolutionize the Web. Before the coming of Java, Web technologies gave users a very crude, i.e., static, way to access the power of the Internet. Building component-based, multi-user client/server applications was almost impossible with pre-Java Web technology. Protocols such as HTTP focus on interaction with the user rather than on application interaction (HTTP users can only really pull pages of text and graphics back and forth across the net) and so impose fundamental limitations on the nature of services accessible from a consumer application running on an Internet device or a home PC. Java increases the value of the Internet by bringing "live" applications into the picture, and CORBA 2.0 ORB implementations like IONA's Orbix or Post Modern Computing's BlackWidow claim to take this one step further, offering the ability to perform semantically rich client/server operations on the Web [7]. SunSoft is also working on a Java/CORBA connection, so that Java programs will be able to invoke remote methods in server-based objects over the net.
The combination of ORBs and Java takes CORBA beyond the enterprise and into the global sphere. ORBs and Java together enable much more than simple Internet applications: they provide a truly portable platform for building and deploying large-scale, distributed client/server applications across both public and private networks.
Besides being an object-oriented, multi-threaded, and secure language, Java offers two interesting features to CBSD. First, Java is a cross-platform language because Java programs are "architectural neutral bytecodes". Second, Java allows small programs or applets (mini-applications) to be embedded within an HTML document. When the user clicks on the appropriate part of the HTML page, the applet is downloaded into the client workstation or PC environment, where it begins executing.
CORBA distributed object technology empowers the Java applet with standards-based connectivity to the world of information and computing services. Introducing CORBA to the Java environment means that applets are no longer restricted to simple interaction with the user, but are instead capable of taking part in complex interactions with backend services. With CORBA, Java applets transcend the limitations of simple Web browser technology: CORBA-compliant Java objects become the basis for the provision of Internet and interactive multimedia services on a world scale. The Internet or enterprise-wide intranets form a kind of standardized socket into which application components can be plugged. According to IONA's vision [7], distributed applications could then be viewed as collections of "worldobjects": some may be downloadable to consumer devices, others may reside on backend corporate servers, and all should be capable of sharing information with one another. Thus, a combination of the Java programming language with the CORBA standard for application integration offers an ideal solution for downloadable application components capable of accessing multiple, shared backend services located across the Internet. CORBA 2.0 provides the crucial missing link between the Java application (applet) running on a consumer device and the required backend service. Both CORBA and Java essentially seek to abstract the underlying hardware technologies and architectures. For component-based software development this factor shortens learning curves and offers improvements in time-to-market as well as a reduction in maintenance cost.
As far as the SYNTHESIS environment is concerned, we plan to include the Internet and WWW facilities in the general architecture. OMG's CORBA 2.0 encapsulates the underlying information resources (components), and architectures like IRO-DB [8] make it possible to represent heterogeneous databases with resource specifications within the framework of the ODMG'93 standard. In this context we define three different kinds of Internet sites to support the SYNTHESIS design method: the information resource provider sites, the designer sites and the application domain provider sites. Furthermore, we distinguish between different design scenarios: a centralized scenario at the design site, a cooperative scenario that involves resource providers in the design, and an active resource scenario in which information resources actively participate in the design process, offering their own reuse possibilities.

OPEN ISSUES
Up to now we have no methods, guidelines or design heuristics on how to develop good frameworks. A possible step in this direction is the metapattern/hot spots approach by W. Pree [23]. Framework adaptation, i.e., the customization to the user's needs, is also a field that needs more research. Active cookbooks [17] are an approach to support the user in this problem area. The technique of Design Patterns [10] could be helpful to document (part of) a framework.
On the other hand, the ComponentWare approach provides no application skeleton but individual components that can build the application when assembled and interconnected by a software bus (like CORBA's ORB). The problem with components, however, is that they do not have sufficiently clean semantic specifications to rely on for their reuse. In this context the following issues are considered to be still open:
• Complete specifications (for machine and for human) of the available components and of the application requirements are a necessary prerequisite for the method
• Homogeneous ("canonical") equivalent specifications for pre-existing components should be provided
• One and the same set of description facilities should be used for different layers of development (requirement specification, design and reverse engineering)
• Sound foundations are necessary to support provable requirement concretization and coherent component composition
• The design methodology should support design based on reuse and interoperable composition of components
• Integration of the Componentware and framework approaches is desirable.
The SYNTHESIS method overviewed in the following sections addresses many of these open issues.

RELATED WORK
In the context of CBSD the use of formal methods and domain knowledge is quite new but of growing importance. One significant activity in this direction is the U.S. Advanced Technology Program (ATP) Component-Based Software, sponsored by the National Institute of Standards and Technology (NIST) [5].
Scalable Automated Semantic-Based Software Composition is another project funded by NIST, the state of California and a consortium of companies. The project focuses on semantic-based composition and component synthesis based on specifications.
Composable Software Systems is a research project led by three members of the School of Computer Science, Carnegie Mellon University, Pittsburgh, PA [25]. The project tries to develop a scientifically sound engineering foundation for designing, building, and analyzing composable systems, organized as collections of reusable components.

SYNTHESIS MODELING FACILITIES
A strategy for incorporating a formal design method. SYNTHESIS modeling facilities should provide for semantic interoperation and reuse of the pre-existing resources to cope with the open CBSD issues identified above. SYNTHESIS modeling facilities can be roughly subdivided into semi-formal and formal ones. The former (the SYNTHESIS language [13]) is intended as a mediator between the informal natural language specifications and the formal ones. We focus on model-based specifications for the latter [1].
To incorporate a formal specification method we exploit a transitional computer-assisted strategy [9]. Such strategies have the advantage that computer assistance is available for moving back and forth between semi-formal and formal specifications.
Semi-formal facilities of SYNTHESIS. Uniformity of the SYNTHESIS object model is based on the algebraic framework [18]. The fundamental concept of the SYNTHESIS object model [13] is an abstract value. Abstract values are instances of abstract data types (ADT) that resemble algebraic systems [18]. The SYNTHESIS object model is purely behavioral.
A type in the language is treated as a first-class value. Type variables have types as their values. The basic operations used in type expressions (mostly while implementing object calculus formulae) are the operations of type composition (type meet and join) and of type product. Type specifications are abstract and completely separated from their implementations.
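As a rough illustration of type meet and join (not the SYNTHESIS definition itself), the sketch below assumes that a purely behavioral type can be modeled as a set of operation signatures. The names `TypeSpec`, `meet`, `join` and `is_subtype`, as well as the sample operations, are hypothetical. Under this assumption the meet keeps the operations common to both types and so acts as a common supertype, while the join collects the operations of both types and so acts as a common subtype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeSpec:
    """A hypothetical behavioral type: a name plus (operation, signature) pairs."""
    name: str
    ops: tuple

    def as_dict(self):
        return dict(self.ops)


def meet(t1, t2):
    """Keep only the operations present in both types with identical signatures."""
    d1, d2 = t1.as_dict(), t2.as_dict()
    common = {op: sig for op, sig in d1.items() if d2.get(op) == sig}
    return TypeSpec(f"({t1.name} & {t2.name})", tuple(sorted(common.items())))


def join(t1, t2):
    """Collect the operations of both types; conflicting signatures are rejected."""
    d1, d2 = t1.as_dict(), t2.as_dict()
    for op in d1.keys() & d2.keys():
        if d1[op] != d2[op]:
            raise ValueError(f"conflicting signatures for {op!r}")
    merged = {**d1, **d2}
    return TypeSpec(f"({t1.name} | {t2.name})", tuple(sorted(merged.items())))


def is_subtype(sub, sup):
    """Structural check: sub offers every operation that sup offers."""
    return set(sup.ops) <= set(sub.ops)


# Sample types; the operation names and signatures are invented for the sketch.
proposal = TypeSpec("Proposal", (("area", "-> string"), ("budget", "-> real")))
submission = TypeSpec("Submission", (("area", "-> string"), ("status", "-> string")))
```

In this simplified reading, both `proposal` and `submission` are subtypes of `meet(proposal, submission)`, and `join(proposal, submission)` is a subtype of each of them.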
All operations over typed data in the SYNTHESIS language are represented by functions. Predicative specifications of functions are expressed by formulae of the SYNTHESIS object calculus.
To incorporate a sound foundation we focus on model-based specifications [4], chosen among other specification formalisms such as logic-based, functional and algebraic ones. The notion of execution of a model-based specification consists of the proof of the initial consistency of the model and of the preservation of the invariants by the operations.
The model-theoretic methods [3,24,1] are based on pure mathematical abstraction of the specification of requirements and on the application of provable stepwise refinement (including data and algorithmic refinement) in the process of their development. During the refinement process, "off-the-shelf" components can be taken into account for reuse. In SYNTHESIS we focus on the Abstract Machine Notation (AMN) [1], applying the transitional computer-assisted strategy.
To succeed with the strategy, we rely on a formal interpretation of the SYNTHESIS language features in AMN. We interpret each type of the SYNTHESIS specification by a separate abstract machine.

SYNTHESIS CBSD METHOD OVERVIEW
The method emphasizes integration, reuse, adaptation and reconstruction of the pre-existing components (the whole or the pieces of the existing components, legacy systems, databases, program packages, data files, multimedia data) for the new (or modified) system requirements. The SYNTHESIS method is not considered as one rigid approach, but as top-down and bottom-up iterative processes of analysis, design and development. The interdependence of the different phases of the SYNTHESIS method is shown in Fig. 1.
The conventional technique of OO analysis and design is used for the requirement planning and domain analysis phases. This technique is augmented with the ontological specifications needed to resolve contextual differences with the pre-existing resources, with the specification of the result in the common declarative, OO and logic-based model that is two-dimensionally uniform, and with the possibility of justifying the result using formal specification and proof facilities.
The information resource description technique is developed to complement the existing core interoperation technology (such as CORBA IDL) so that the resources can be reused in a semantically interoperable environment. The specification of a resource should be complete for the semantic interoperation reasoning.
The design technique is based on the interoperable reuse of the pre-existing resources. For that, the coherence of the contexts of the problem domains and of the resources is negotiated, the search for the relevant resource specifications is supported, and discrepancy reconciliation approaches and a concretization view construction technique are provided. The results of the design can be provably checked.
The SYNTHESIS method is neutral with respect to the methods that provide the object-oriented requirement planning and analysis models, as well as to the reverse engineering methods used. The output of such methods should be transformed into the SYNTHESIS canonical model, thus giving precise semantics to the diagrammatic notation.

ROLE OF FORMAL MODELING FACILITIES
Justification of the decisions taken during the Forward phase.
Formal checking of the result of the analysis phase is provided as follows. The type / class definitions of the SYNTHESIS model are presented as abstract machines in the B AMN [1] to verify the consistency of the resulting specification (in particular, to check that the methods defined preserve the invariants given by the assertions, ontological rules and other constraints of the model, and to check the consistency of type / subtype specifications: these specifications should have a model).
The design model is the refinement of the domain analysis model, adapting it to the actual heterogeneous interoperable information resource environment. The formal counterpart of this concretization specification is given in AMN: the concretization of each application type is verified by transforming it into the B AMN and treating the concretization as a refinement in B. Proof obligations corresponding to the refinement of an abstract machine are generated and proved.
Type specification mapping technique for the reverse engineering phase. For the heterogeneous world of information resources we provide a technique for mapping the pre-existing resource type specifications into the canonical specifications uniformly defined in the SYNTHESIS language. The commutative type model mapping is developed through the following basic steps:
• construct the mapping of the source type specifications into type specifications of the canonical type model (including state and behavior mapping);
• provide an interpretation of the source type model in the abstract machine notation;
• provide an interpretation in the abstract machine notation of the types resulting from the mapping of the source type model into the canonical types;
• justify the state-based and behavioral properties of the type mappings, proving that a source type is a refinement of its mapping to the canonical type.
In the reverse engineering phase the correctness of the SYNTHESIS-based specification of a resource is guaranteed by such a procedure, justified by the refinement relation of abstract machines. The commutative type model mapping is a specific technique providing for a uniform representation in the canonical model of the different type models determined by programming languages and DBMSs.

AN EXAMPLE DEMONSTRATING THE ROLES OF FORMAL AND SEMI-FORMAL MODELING FACILITIES

Semi-formal specifications of requirements
We imagine a centralized Agency managing funds dedicated to research and development projects. We assume that the following type specifications were produced as a result of the forward analysis and design based on some OA & D method and of the transformation of its resulting definitions into SYNTHESIS specifications. Thus we get a definition of the type Proposal and the class proposal. In the SYNTHESIS language type specifications are syntactically represented by frames and their attributes by slots of the frames. A simple concept definition rule is incorporated into the type Proposal by associating with the budget attribute a metaslot containing the definition of a rule: 'if proposal area is computer science then budget is annual and currency is roubles'. Here budget sem is a SYNTHESIS metaclass introducing metaattributes (budget kind, currency) that are used in the definition of rules characterizing the contextual semantics of the budget required by the application. budget sem is treated in the language as a class of attributes (an association metaclass).
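To make the quoted rule concrete, the following minimal sketch renders it as an executable check. This is not SYNTHESIS frame syntax: the `Proposal` dataclass, its field names and the function `budget_rule_holds` are assumptions introduced only for illustration, with the two metaattributes flattened into ordinary fields.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    # Field names are assumptions made for this sketch.
    area: str
    budget: float
    budget_kind: str  # stands in for the metaattribute "budget kind"
    currency: str     # stands in for the metaattribute "currency"


def budget_rule_holds(p: Proposal) -> bool:
    """'if proposal area is computer science then budget is annual and currency is roubles'"""
    if p.area == "computer science":
        return p.budget_kind == "annual" and p.currency == "roubles"
    return True  # the rule does not constrain proposals in other areas
```

The rule is an implication, so a proposal outside computer science satisfies it regardless of its budget kind and currency.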

Semi-formal specifications of the pre-existing resources
We assume that the specifications of the pre-existing resources equivalently represent their semantics. We assume that at the Industrial Labs an information system is used that contains the following specifications of types and classes. The class project is a subset of the class submission that includes only those submissions that have been accepted.
The example shows how to design the proposal class reusing preexisting resources at the Industrial Labs.
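The relationship between the two resource classes described above can be sketched as a set comprehension. The `Submission` dataclass and its fields `title` and `accepted` are hypothetical, since the resource type specifications themselves are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    title: str      # hypothetical attribute
    accepted: bool  # hypothetical attribute recording the acceptance decision


def project_class(submissions):
    """The class project: exactly those submissions that have been accepted."""
    return {s for s in submissions if s.accepted}
```

By construction the result is always a subset of the submission class, which is the only property of project that the text relies on.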

Concretization view for the proposal class
An example of the specification of the concretization view for the proposal class follows:
{virt proposal; in: class; metaslot {comment;
In the specification of a view class, a function computing the class as a set is defined in a metaslot associated with the attribute in: class. We do not care here whether we declare a virtual class that exists only during its evaluation or a materialized class. The specification of the function is given below by a formula of the object calculus of the SYNTHESIS language [13].
The Abstract Machine Notation [1] makes it possible to obtain an exact mathematical definition (specification) of the properties of the modeled entity (in particular, the entity may be a computer program). Such a definition may then be analysed formally. Thus an AMN program is used for proving the correctness of the initial entity definition. In particular, it may be used for correctness proofs and for symbolic transformation of the initial program (for example, composition, decomposition, refinement, etc.). While specifying an abstract machine we should define the machine states and the operations that make it possible to reach such states. To specify the state, two kinds of entities should be defined: variables defining the state components, and invariants. Invariants are laws that must be satisfied by the static states of a system.
A specification of an operation describes properties and relationships that must be satisfied during a change of a state.
In common model-theoretical languages such as VDM or Z such properties are specified using logical assertions that relate the values of the state variables before and after the execution of an operation. For these purposes AMN uses the so-called calculus of substitutions, which makes it possible to express properties of operations in terms of predicate transformers that associate with a postcondition of the operation its weakest precondition. Generalized substitutions (which are the operators of this calculus) may be considered as the commands of an abstract machine.
After an operation has been specified, one should prove that it preserves the invariants. To do this, every invariant is considered as a postcondition to which the operation (i.e., a predicate transformer) is applied. This application results in a new predicate, which must be proved as well.
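The obligation described above can be sketched by exhaustive enumeration over a small finite state space. This is only a stand-in for the proofs that a tool such as the B-Toolkit discharges symbolically. The counter machine, the names `LIMIT`, `STATES`, `OPERATIONS`, `preserves_invariant` and `check_machine` are all assumptions made for the illustration.

```python
LIMIT = 5
STATES = range(-2, LIMIT + 3)  # a small finite universe of candidate states


def invariant(n):
    """Invariant of the toy machine: the counter stays within 0..LIMIT."""
    return 0 <= n <= LIMIT


# Each operation is a (precondition, state update) pair.
OPERATIONS = {
    "inc": (lambda n: n < LIMIT, lambda n: n + 1),
    "dec": (lambda n: n > 0, lambda n: n - 1),
}


def preserves_invariant(pre, update):
    """Check that invariant(n) and pre(n) imply invariant(update(n)) on STATES."""
    return all(invariant(update(n)) for n in STATES if invariant(n) and pre(n))


def check_machine(initial_state, operations):
    """Return the names of violated obligations (empty if none is violated)."""
    failures = []
    if not invariant(initial_state):
        failures.append("initialization")
    for name, (pre, update) in operations.items():
        if not preserves_invariant(pre, update):
            failures.append(name)
    return failures
```

An operation whose precondition is too weak, for example an increment that is allowed in every state, is reported by `check_machine` because it can leave the range fixed by the invariant.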
Thus pre- and postconditions are integrated in a notation that looks like a simple programming language. The commands of the language are generalized substitutions, which generalize Dijkstra's guarded commands.
Every generalized substitution S defines a predicate transformer that associates with a postcondition R its weakest precondition [S]R, which guarantees that R holds after the execution of the operation. If this is so, one says that S implements R. The kinds of generalized substitutions are listed below.
• Multiple substitution: [x, ..., y := E, ..., F]R ⇔ R', where R' is R in which the free occurrences of x, ..., y are simultaneously replaced by E, ..., F, respectively
• [...] where z is not a free variable in R
So every generalized substitution defines a rewriting rule transforming the next predicate into a precondition. Preconditions describe the situations (states) that are admissible for the execution of the corresponding operation. Under such conditions the operation can be completed. Guarded substitution is implemented if predicate R1 is satisfied.
Bounded choice corresponds to a restricted form of nondeterminism: any choice is admissible, and the final decision is left to the operation developer during the concretization process. Any of the substitutions must preserve the postcondition R. That is why the conjunction sign is used in the predicate.
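The reading of generalized substitutions as predicate transformers can be sketched in a few lines. The sketch assumes that a state is a dictionary from variable names to values and that a predicate is a Boolean function on states. The function names `assign`, `pre`, `guard` and `choice` are ours. The rules coded for precondition, guard and bounded choice are the ones usually given in the B literature [1]; they are restated in the docstrings rather than taken from the list above.

```python
def assign(**updates):
    """[x, ..., y := E, ..., F]R: R evaluated after a simultaneous replacement."""
    def transformer(R):
        # All expressions are evaluated in the old state s before any update.
        return lambda s: R({**s, **{v: e(s) for v, e in updates.items()}})
    return transformer


def pre(P, S):
    """Precondition: [P | S]R holds iff P holds and [S]R holds."""
    def transformer(R):
        return lambda s: P(s) and S(R)(s)
    return transformer


def guard(P, S):
    """Guard: [P ==> S]R holds iff P implies [S]R."""
    def transformer(R):
        return lambda s: (not P(s)) or S(R)(s)
    return transformer


def choice(S, T):
    """Bounded choice: [S [] T]R holds iff [S]R and [T]R both hold."""
    def transformer(R):
        return lambda s: S(R)(s) and T(R)(s)
    return transformer
```

For instance, with the hypothetical invariant `inv = lambda s: 0 <= s["n"] <= 5` and the operation `inc = pre(lambda s: s["n"] < 5, assign(n=lambda s: s["n"] + 1))`, the predicate `inc(inv)` is the weakest precondition under which `inc` re-establishes `inv`. The conjunction in `choice` reflects the requirement that every admissible branch must establish the postcondition.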

Elements of the Abstract Machine Notation
We start with elementary notions [1]  ...

An abstract machine has a name and may have some formal parameters (either natural numbers or non-empty finite sets). An abstract machine has a number of variables that should obey a certain number of predicates forming together the invariant of the machine. The invariant makes it possible to type each variable set-theoretically. The sets defined in the set clause of a machine constitute the basis of its type system. An abstract machine also has an initialization that is a substitution. Finally, an abstract machine has a number of operations defined with the following syntax:
Variable ← Identifier[(Variable)] ≅ Substitution
Identifier[(Variable)] ≅ Substitution
Once an abstract machine has been written, one has to check that a certain number of conditions are met. Such conditions together form the proof obligations of the machine, which are shown below in a simplified form for the small machine skeleton that follows.
machine Operation name ≅ pre L then S end; end
Proof obligation templates:
The first proof obligation is just the existence proof obligation for the variables. The second one concerns the establishment of the invariant by the initialization. The last one concerns the preservation of the invariant by each operation.
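For orientation, the three obligations just described are commonly written as follows in the B literature [1]. This is a simplified sketch and not the paper's own template: the symbols $x$ (the machine variables), $I$ (the invariant) and $\mathit{Init}$ (the initialization) are our assumptions, while $L$ and $S$ are the precondition and the body of the operation in the skeleton above; constraints on parameters, sets and constants are omitted.

```latex
\begin{align*}
  & \exists x \cdot I
    && \text{(existence of the variables)} \\
  & [\mathit{Init}]\, I
    && \text{(the initialization establishes the invariant)} \\
  & I \land L \;\Rightarrow\; [S]\, I
    && \text{(each operation preserves the invariant)}
\end{align*}
```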

Basic modularization features of AMN
The includes and promotes clauses are introduced to make a machine that is based on the composition of other machines. The includes clause gives the names of the included machines with renaming (dotted identification, which is required in the case of multiple inclusion) and with actual parameters. Natural number parameters are instantiated by expressions denoting natural numbers, and set parameters are instantiated by expressions denoting simple sets.
A machine including other machines can have its own gluing invariants that should be preserved by the promoted and the new operations of the new machine. The new machine may have its own variables with the corresponding initialization.
The promotes clause contains the list of operation names of those operations of the included machines that are to become without any modifications genuine operations of the new machine.
The uses clause is different from includes. Operations of used machines cannot be mentioned in the using machine. Several machines can thus use the same machine. Parameters of the used machines are not instantiated. Sets, constants and variables of the used machines can be read in the using machine.
The sees clause, contrary to uses, does not allow elements of the seen machines to be mentioned in the invariants of the seeing machine. Thus a seen machine can be refined independently of the machine that sees it.

Refinement of Abstract Machines
The ultimate goal of the B-technology is to have abstract machines eventually implemented as software modules by means of some programming notation. So we have to transform abstract machines so that they can eventually be implemented by means of the programming notation. This is done by a step-by-step restriction of the constructs that may be used further. This activity is called refinement.
Algorithmic refinement consists in removing nondeterminism by being more and more precise about the way our operations are eventually to be made concrete. At the same time we should relax preconditions.
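The two aspects of algorithmic refinement mentioned above, less nondeterminism and weaker preconditions, can be sketched as a check over a finite set of states. The sketch assumes that an operation is a pair of a precondition and a function returning the set of allowed next states; the names `refines`, `ABSTRACT` and `CONCRETE` and the toy operations are hypothetical.

```python
def refines(abstract, concrete, states):
    """Check, on a finite set of states, that concrete refines abstract.

    Wherever the abstract precondition holds, the concrete precondition must
    hold too, and every concrete outcome must be allowed by the abstract one.
    """
    a_pre, a_next = abstract
    c_pre, c_next = concrete
    return all(
        c_pre(s) and len(c_next(s)) > 0 and c_next(s) <= a_next(s)
        for s in states
        if a_pre(s)
    )


# Abstract operation: for n > 0, move to any strictly smaller natural number.
ABSTRACT = (lambda n: n > 0, lambda n: set(range(n)))

# Concrete operation: always applicable, decrements and saturates at 0.
CONCRETE = (lambda n: True, lambda n: {max(n - 1, 0)})
```

Here `CONCRETE` is deterministic and applicable in more states than `ABSTRACT`, so the check succeeds in that direction and fails in the opposite one.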
Data refinement consists in removing completely all variables whose types are too complicated to be implemented as such and in replacing them by simpler variables whose types correspond to those found in programming notations: that is, essentially, natural numbers taken in certain intervals (scalar types) and functions from scalar types to themselves (array types).
To define data refinement, we suppose that we have two substitutions S and T working within two different machines (within two distinct variable spaces represented, say, by two variables x and y). We assume that these variables are members of two respective sets s and t, so that x ∈ s and y ∈ t are the respective invariants of these machines. We suppose that these variables are related by a certain binary relation v from s to t such that ran(v) is equal to t. The relation v is called the abstraction relation.
Now, the refinement of the abstract machines is defined as follows: a machine N is said to refine a machine M if a user can use N instead of M without noticing it.
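The informal definition above can be illustrated by a toy data refinement that is not taken from the paper. The abstract machine keeps the whole set of entered numbers, while the refinement keeps only the largest one. All class and function names are assumptions, and the bounded enumeration of call sequences only illustrates the idea; it does not replace the refinement proof.

```python
from itertools import product


class AbstractMachine:
    """Keeps the whole set y of entered natural numbers."""
    def __init__(self):
        self.y = set()

    def enter(self, n):
        self.y.add(n)

    def maximum(self):  # precondition: self.y is not empty
        return max(self.y)


class Refinement:
    """Keeps only z, the largest number entered so far."""
    def __init__(self):
        self.z = 0

    def enter(self, n):
        self.z = max(self.z, n)

    def maximum(self):
        return self.z


def abstraction_relation(y, z):
    """Relates the concrete variable z to the abstract variable y."""
    return z == max(y | {0})


def indistinguishable(values, max_len):
    """Run every sequence of enter calls up to max_len on both machines."""
    for length in range(1, max_len + 1):
        for seq in product(values, repeat=length):
            m, n = AbstractMachine(), Refinement()
            for v in seq:
                m.enter(v)
                n.enter(v)
                if not abstraction_relation(m.y, n.z):
                    return False
                if m.maximum() != n.maximum():
                    return False
    return True
```

A user who only calls `enter` and `maximum` observes the same answers from both machines, which is the sense in which the second machine can be used instead of the first without the user noticing.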
A syntactic construct, refinement, is introduced that resembles a machine. However, a refinement can refine either a machine or another refinement. The invariant clause of a refinement is just the abstraction relation defined above: it expresses the change of variables between the two constructs. The operations of the refinement only involve the variables of the refinement, not those of the construct being refined. At the other extreme, pure algorithmic refinement can take place: in this case the variables of the refinement and of the construct being refined are the same. In a refinement, new given sets may be introduced, as well as a more precise value for a given deferred set.
After the refinement is specified, it is necessary to prove that it indeed refines what it is claimed to refine. For this, a number of proof obligations are generated according to the following templates related to the following abstract machine and its refinement:
The proof obligation templates look as follows:
∃(x, y)
Here V' stands for the substitution V within which the variable z has been replaced by z'.
The first of the proof obligations is an existence proof obligation for the new variables of the refinement. The second deals with the correct refinement of the initialization, and the third deals with the correct refinement of the operations.

Abstract machine resulting from the Proposal type mapping
To save space we have omitted from the example any mention of the machines constituting the environment into which the specified machines are embedded. We also do not show all the operations of the machines. The machines above were proved by I. A. Chaban using the B-Toolkit environment [1]. The proof justifies the consistency of the semi-formal specifications and the correctness of the design with reuse of the pre-existing components.

CONCLUSION
The CBSD method complementing industrial OAD methods and interoperable environments is overviewed. The design technique proposed is based on the interoperable reuse of pre-existing components. For that, the coherence of the contexts of the problem domains and of the resources is negotiated, the search for the relevant resource specifications is supported, and discrepancy reconciliation approaches and a concretization view construction technique are provided.
The CBSD method is based on heuristic provisions for the search and composition of relevant components into views serving as refinements of the specification of requirements. Formal specification languages are far from being an ideal tool for exploring and discovering the problem structure during refinement. On the other hand, the object models widely used in OAD CASE tools are not sufficiently semantically rich to cope with the open CBSD issues identified.
To provide adequate semantic facilities suitable for heuristic methods augmented with formal proof and refinement facilities, the SYNTHESIS modeling tools are separated into semi-formal and formal ones. The uniform, purely behavioral object model was developed for the former. Model-theoretic facilities were used for the latter.
The paper contributes to understanding of semi-formal and formal modeling facilities interaction in forward and backward phases of the semantically interoperable information systems design.
Fig. 1. SYNTHESIS method structure

OpName ≅ pre Q then T end;
end

machine Identifier
refines AM Identifier
variables y
invariant R
initialization U
operations
z ← OpName ≅ pre L then V end;
end

Based on the analysis of the resource classes corresponding to the application class, the concretization axioms stating how to refine the application class states by the states of the relevant resource classes should be constructed. From the analysis of the submission/project specification the designer should deduce the following concretization axioms showing how to refine the proposal class state by the submission/project classes state. axioms is a predicate constraining admissible states of the instances of the virt proposal class.