Specification of Rewriting Strategies

User-definable strategies for the application of rewrite rules provide a means to construct transformation systems that apply rewrite rules in a controlled way. This paper describes a strategy language and its interpretation. The language is used to control the rewriting of terms using labeled rewrite rules. Rule labels are atomic strategies. Compound strategies are formed by means of sequential composition, nondeterministic choice, left choice, fixed point recursion, and two primitives for expressing term traversal. Several complex strategies such as bottom-up and top-down application and (parallel) innermost and (parallel) outermost reduction can be defined in terms of these primitives. The paper contains two case studies of the application of strategies.


Introduction
Term rewriting is an ideal technique for program transformation where the transformation of one construct into another is defined by means of rewrite rules. Usually, the rewrite engine contracts redexes according to some fixed selection scheme, and the possibilities the user has to control the order in which rules are tried are rather limited. For example, ASF+SDF (Van Deursen et al., 1996) implements a leftmost innermost redex selection scheme and the only way the user can explicitly control the order in which rules are tried is by means of 'default equations'.
Often it is desirable to have more control over the reduction process. For instance, some rewriting systems exhibit better termination behaviour under a leftmost innermost scheme, while others behave better under a parallel outermost scheme. Also, it often yields more efficient normalization if rules are tried according to some specific priority ordering.
The usual solution to get more control over the strategy used to apply transformation rules is to write an explicit transformation function that traverses an expression and performs transformations in a fixed order. This gives great overhead in the specification and often distracts from conceptually simple transformation rules.
In this paper we show how the definition of a transformation in terms of rewrite rules can be separated from a specification of the order in which these rules should be applied. We define a language of strategies inspired by the strategy mechanism of the rewriting language ELAN (Vittek, 1994; Borovanský et al., 1996) and demonstrate its use in ASF+SDF. This leads to a small set of generic modules that can easily be instantiated for any target language.
We illustrate this approach in two case studies. In the first example, we give an implementation of the proof of a (basic term) lemma in the setting of process algebra with conditionals. Strategies are necessary there, because one of the transformation rules is nonterminating. In the second example we discuss the normalization of box expressions in a typesetting language. The transformation rules form a weakly terminating rewrite system that shows infinite reductions under innermost rewriting. Using the strategy language a terminating strategy is specified.

Overview
In Section 2 we introduce the basic strategy operators: labels referring to axioms, the identity strategy, sequential composition of strategies, nondeterministic choice and left choice between strategies. In Section 3 we extend this basic language with a fixed point operator for the expression of recursive strategies. As an application an iteration operator is defined using this fixed point operator. In Section 4 two primitives are added that allow for a general definition of a range of term traversing strategies, including bottom-up application of a strategy, and innermost and outermost normalization according to a strategy. We show how these traversals can be interpreted by means of a generic 'push-down' function that applies a strategy to all direct proper subterms of a term. In Sections 5 and 6 the strategy language is applied in transformations of process expressions and box expressions. In Section 7 we compare our strategy language with that of ELAN and in Section 8 we give some indications for further research.

Basic Strategies
Labels, which refer to rewrite rules, are the basic building blocks of strategies. For example, the expression

[RAssoc] (x + y) + z = x + (y + z)

denotes a rewrite rule labeled RAssoc that transforms a left-associative application of + into a right-associative one.
A strategy defines a transformation function on the terms of a language. For instance, RAssoc defines the function [RAssoc]. According to the rule above, applying this function to (a + b) + c gives a + (b + c). Such an application does not have to succeed; for instance, [RAssoc] applied to a + (b + c) is not defined here.
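As a rough illustration outside ASF+SDF, the partial function denoted by the label RAssoc can be sketched in Python. The term representation (a constant is a string, an application is a tuple of a symbol followed by its arguments) and the use of None for an undefined application are assumptions made only for this sketch.

```python
# Assumed term representation: a constant is a string,
# an application is a tuple (symbol, arg1, ..., argn).

def rassoc(t):
    """[RAssoc] (x + y) + z = x + (y + z); returns None when the rule does not apply."""
    if isinstance(t, tuple) and len(t) == 3 and t[0] == "+":
        left, z = t[1], t[2]
        if isinstance(left, tuple) and len(left) == 3 and left[0] == "+":
            x, y = left[1], left[2]
            return ("+", x, ("+", y, z))
    return None  # the rule is not defined for this term


left_nested = ("+", ("+", "a", "b"), "c")    # (a + b) + c
right_nested = ("+", "a", ("+", "b", "c"))   # a + (b + c)
```

Applying `rassoc` to `left_nested` gives `right_nested`, while applying it to `right_nested` gives None, mirroring the success and failure cases described above.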
Module Term-SA below defines the syntax of the application of strategies to terms. The application of a strategy s to a term t, written as [s]t, results in a 'reduct'. This is either a term, denoting the successful application of the strategy, or t↑, denoting failure of the application of the strategy. Consequently, a reduct can be seen as a pair of a term and a Boolean value indicating success or failure. Given a reduct r the function t(r) gives the term part of this pair and the function b(r) gives the Boolean part.
This section and the next two will be concerned with the definition of operators for the composition of strategies and their interpretation as transformation functions.

Syntax
Labels (here defined as identifiers starting with an uppercase letter) are atomic strategies. Their interpretation is provided by the user by means of labeled rewrite rules. Apart from the labels, our basic strategy language contains a special identity strategy and operators for sequential composition, nondeterministic choice and left choice.

Interpretation
Given a set of labeled rewrite rules, the application of a strategy expression to a term is interpreted according to the rules below and the user-defined labeled rewrite rules. A label that is undefined, or undefined for some term, fails, i.e., if label l is undefined for t, then [l]t yields t↑. In the rules below, success of an application is tested in a condition [s]t = t′, where the right-hand side is a term t′ injected into Reduct. This test fails if the application results in failure, i.e., [s]t = t↑.

module Term-BS
imports Basic-Strategies (2.1) Term-SA (2)
equations

The identity strategy ε always succeeds and yields the term t itself.

[1] [ε]t = t

The sequential composition s₁ · s₂ succeeds if s₁ applied to t succeeds and yields a term t′ and s₂ applied to t′ succeeds and yields t″.

[2] [s₁]t = t′, [s₂]t′ = t″ ⟹ [s₁ · s₂]t = t″

The nondeterministic choice s₁ + s₂ succeeds if either s₁ or s₂ succeeds.

[3] [s₁]t = t′ ⟹ [s₁ + s₂]t = t′
[4] [s₂]t = t′ ⟹ [s₁ + s₂]t = t′

Theory and Practice of Algebraic Specifications ASF+SDF'97

The left choice s₁ <+ s₂ succeeds if either s₁ or s₂ succeeds, with a preference for s₁. That is, if s₁ succeeds, then it will be applied and s₂ is only tried when s₁ fails.

[5] [s₁]t = t′ ⟹ [s₁ <+ s₂]t = t′
[6] [s₁]t = t↑, [s₂]t = t′ ⟹ [s₁ <+ s₂]t = t′
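The interpretation of the basic operators can be mimicked with a few higher-order functions. The following Python sketch is only an approximation of the equations above: strategies are modelled as functions from a term to a term or None (failure), nondeterministic choice is resolved by trying the operands in random order, and the helper `rule` and the labels `A`, `B`, `C` are hypothetical names introduced for the example.

```python
import random


def rule(lhs, rhs):
    """Hypothetical helper: a label whose rule rewrites exactly `lhs` to `rhs`."""
    return lambda t: rhs if t == lhs else None


def identity(t):
    """The identity strategy: always succeeds and returns the term itself."""
    return t


def seq(s1, s2):
    """Sequential composition: s2 is applied to the result of s1; fails if either fails."""
    def run(t):
        t1 = s1(t)
        return None if t1 is None else s2(t1)
    return run


def choice(s1, s2):
    """Nondeterministic choice: succeeds if either operand succeeds (random order)."""
    def run(t):
        first, second = random.sample([s1, s2], 2)
        r = first(t)
        return r if r is not None else second(t)
    return run


def left_choice(s1, s2):
    """Left choice: s2 is only tried when s1 fails."""
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run


A = rule("a", "b")
B = rule("b", "c")
C = rule("a", "z")
```

With these definitions `seq(A, B)` maps "a" to "c", `left_choice(A, C)` always maps "a" to "b", and `choice(A, C)` may yield either "b" or "z".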

Generic Modules
The Term-* modules defined above and in the next sections are generic modules that define the application and interpretation of strategies to some language of terms. These modules are intended to be instantiated for each sort under consideration by renaming the sorts Term and Reduct. The tool of De Jonge (1997) can be used to automate this instantiation.

Recursive Strategies
In this section we provide a fixed point operator for the definition of recursive strategies. At the end of this section we show how it can be used to define iterative strategies.

Syntax
The fixed point operator μv.s denotes a recursive strategy with recursion point v. The variable v is bound by the fixed point operator.

Interpretation
A fixed point μv.s denotes the infinite strategy expression obtained by recursively substituting the entire expression for the free occurrences of the variable v in s, i.e., we have

μv.s = s[v := μv.s]

where s[v := s′] denotes the substitution of s′ for all free occurrences of v in s. Substitution is defined in Section A. Because this equation would lead to an innermost non-terminating rewrite system, we interpret the fixed point operator lazily in the following module.
module Term-RS
imports Term-BS (2.2) Recursive-Strategies (3.1) Strategy-Substitution (A)
equations

The application of a fixed point strategy to a term is interpreted by unfolding the fixed point one step. In this way unfoldings are performed by need.
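Unfolding by need can be imitated in Python with a closure that only unfolds when the fixed point is actually applied to a term. This is a sketch under assumptions: strategies are functions returning None on failure, binding is done by a Python lambda instead of explicit substitution, and the labels `is_zero` and `pred` (with rules 0 = 0 and s(x) = x) are hypothetical.

```python
def seq(s1, s2):
    def run(t):
        t1 = s1(t)
        return None if t1 is None else s2(t1)
    return run


def left_choice(s1, s2):
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run


def fix(body):
    """Fixed point: `body` maps the strategy bound to v to the strategy s."""
    def unfolded(t):
        # One unfolding step, performed only when a term is supplied.
        return body(unfolded)(t)
    return unfolded


def is_zero(t):
    """Hypothetical label: succeeds only on the constant 0."""
    return t if t == "0" else None


def pred(t):
    """Hypothetical label: s(x) = x."""
    return t[1] if isinstance(t, tuple) and t[0] == "s" else None


# Recursive strategy: stop at 0, otherwise strip one successor and recurse.
to_zero = fix(lambda v: left_choice(is_zero, seq(pred, v)))
```

An eager expansion of the same definition would never terminate; here each recursive occurrence is expanded only when `pred` has succeeded and the remaining term has to be processed.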

Example: Iteration of Strategies
As a first example application of recursive strategies we introduce the operators * and + to express the iteration of a strategy zero or more and one or more times, respectively. The strategy s* is defined by means of a recursive strategy that applies the strategy s as long as it succeeds and then terminates successfully. The function get-fresh is used to create a binding variable v that does not occur freely in the strategy s. Observe how the left choice operator <+ is used to enforce as many applications as possible. If s cannot be applied, then the sequential composition s · v fails, so ε is applied and succeeds. The unary operator + is defined in terms of *. A strategy s+ succeeds if s succeeds at least once.
module CS-Interpretation
imports Complex-Strategies (3.3) Strategy-Substitution (A)
equations
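The two iteration operators can be sketched in Python as follows. The sketch assumes that s* is the fixed point of "apply s and recurse, otherwise stop with the identity" and that s+ is s followed by s*, which is how the prose above reads; no counterpart of get-fresh is needed because the recursion variable is bound by a Python lambda. The function names are illustrative only.

```python
def identity(t):
    return t


def seq(s1, s2):
    def run(t):
        t1 = s1(t)
        return None if t1 is None else s2(t1)
    return run


def left_choice(s1, s2):
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run


def fix(body):
    def unfolded(t):
        return body(unfolded)(t)
    return unfolded


def star(s):
    """Zero or more applications: apply s and recurse, else fall back to identity."""
    return fix(lambda v: left_choice(seq(s, v), identity))


def plus(s):
    """One or more applications: s must succeed once, then iterate."""
    return seq(s, star(s))


def rassoc(t):
    """[RAssoc] (x + y) + z = x + (y + z) on binary + terms; None on failure."""
    if isinstance(t, tuple) and t[0] == "+" and isinstance(t[1], tuple) and t[1][0] == "+":
        (_, (_, x, y), z) = t
        return ("+", x, ("+", y, z))
    return None


nested = ("+", ("+", ("+", "a", "b"), "c"), "d")   # ((a + b) + c) + d
```

Here `star(rassoc)` applies the rule at the root until it no longer matches, whereas `plus(rassoc)` fails on a term such as a + b where the rule cannot be applied even once.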

Traversal Strategies
The strategies we have discussed so far are applications of a strategy at the root of a term. In order to allow for a general definition of a wide range of term traversing strategies, we introduce two unary operators. Let t = f(t₁, ..., tₙ); we will call the arguments t₁, ..., tₙ of the leading function symbol f of t the direct proper subterms of t.
1. The strategy □(s) (conjunctive push-down of s) applies s to all direct proper subterms of a term, provided that these applications all succeed.
2. The strategy ◇(s) (disjunctive push-down of s) applies s to all direct proper subterms of a term for which application is successful, provided that at least one of these applications succeeds.
In the sequel, we will show how both operators can be interpreted by means of one generic push-down function, and how general term traversing strategies can be defined in terms of these operators.

Interpretation
Clearly, □(s) and ◇(s) only differ with respect to their 'success behaviour'. Therefore, we interpret both operators by means of one generic primitive pd_⊕(s), parameterized with a boolean operator ⊕ that determines the 'success behaviour'.

module Push-Down
imports Booleans-Generalized (4.1) Basic-Strategies (2.1)
exports
context-free syntax
pd " " BoolOp "(" Strategy ")" → Strategy

A generalized boolean operator operates on a list of booleans. The application ⋀{b₁, ..., bₙ} denotes b₁ ∧ ... ∧ bₙ and ⋁{b₁, ..., bₙ} denotes b₁ ∨ ... ∨ bₙ. The operators are defined as a separate sort such that they can be used as 'higher-order' parameters of the function pd.

module Booleans-Generalized
imports Booleans
exports
sorts BoolOp
context-free syntax
⋁{b b*} = b ∨ ⋁{b*}

Generic Push-Down
The function pd is defined by a schema that should be instantiated for each function symbol f of arity n in the signature of the language under consideration. This schema can be instantiated for a given SDF definition using the specification generation techniques described in Van den Brand and Visser (1996).
module Term-TS
imports Term-RS (3.2) Push-Down (4.1)
equations

For each function f : s₁ × ... × sₙ → s₀ in the signature of a specification (in the schemata in this section we abstract from the grammatical (mix-fix function) aspect of SDF signatures) define a rule

[s]t₁ = r₁, ..., [s]tₙ = rₙ, ⊕{b(r₁), ..., b(rₙ)} = true ⟹ [pd_⊕(s)] f(t₁, ..., tₙ) = f(t(r₁), ..., t(rₙ))

The first n conditions apply the strategy s to the respective arguments tᵢ. These conditions always succeed because the rᵢ are variables of sort Reduct. The application of the boolean operator ⊕ then determines the success or failure of the strategy from the list of Boolean values denoting the success or failure of the argument applications. If the outcome is true, the resulting term is the application of f to the term components of the reducts. Note that if the push-down succeeds, but s fails on tᵢ, then t(rᵢ) = tᵢ.
In the case of a constant c (n = 0; an application without arguments) we thus have

⊕{} = true ⟹ [pd_⊕(s)]c = c

This entails that the success of a push-down on a constant depends on the default value of the boolean operator ⊕; if it is a conjunction it succeeds, and if it is a disjunction it fails.
Associative lists can be considered as functions with an arbitrary number of arguments. A push-down on such a list entails the application of the strategy to each of the elements of the list. This is expressed by means of the following equation schemata that should be instantiated for each list sort in the specification.
The empty list is treated as a constant.
For a non-empty list the strategy is applied to the head t of the list and the push-down to the tail t* of the list.
Conjunctive and Disjunctive Push-Down
We can now define □(s) and ◇(s). The strategies □(s) and ◇(s) are the conjunctive and disjunctive instantiations of push-down. This entails that □(s) succeeds if s succeeds on all direct proper subterms and that ◇(s) succeeds if s succeeds for at least one direct proper subterm.

[1] □(s) = pd_⋀(s)
[2] ◇(s) = pd_⋁(s)
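The generic push-down and its two instantiations can be approximated in Python, where the built-in functions `all` and `any` play the role of the generalized conjunction and disjunction (including their values on the empty list). This is a sketch under assumptions: terms are strings (constants) or tuples of a symbol followed by its arguments, failure is None, and the helper `rule` and the name `a_to_b` are hypothetical.

```python
def pd(boolop, s):
    """Generic push-down: apply s to all direct proper subterms; `boolop` decides success."""
    def run(t):
        kids = t[1:] if isinstance(t, tuple) else ()
        reducts = [s(k) for k in kids]               # every application is attempted
        if not boolop(r is not None for r in reducts):
            return None                              # push-down fails as a whole
        if not kids:
            return t                                 # constant: nothing to rebuild
        # Where s failed, the original subterm is kept.
        new_kids = tuple(k if r is None else r for k, r in zip(kids, reducts))
        return (t[0],) + new_kids
    return run


def box(s):
    """Conjunctive push-down: s must succeed on all direct proper subterms."""
    return pd(all, s)


def diamond(s):
    """Disjunctive push-down: s must succeed on at least one direct proper subterm."""
    return pd(any, s)


def rule(lhs, rhs):
    """Hypothetical helper: a label rewriting exactly `lhs` to `rhs`."""
    return lambda t: rhs if t == lhs else None


a_to_b = rule("a", "b")
```

Because `all` of an empty sequence is true and `any` of an empty sequence is false, `box(a_to_b)` succeeds on a constant while `diamond(a_to_b)` fails on it, matching the behaviour described for constants.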

Defining Traversal Strategies
Given the push-down operators we can define several term traversal strategies.

The strategies bu(s) and td(s) apply s bottom-up and top-down to a term. In the case of bu(s), □(v) is used to recursively apply the strategy to all proper subterms of the term, after which the strategy s is applied to the result. Top-down is the dual of bottom-up obtained by reversing the order of the sequential composition.

[1] get-fresh(x, s) = v ⟹ bu(s) = μv.(□(v) · s)
[2] get-fresh(x, s) = v ⟹ td(s) = μv.(s · □(v))

The strategy once(s) applies s once at each position in a term where application is possible.
[3] once(s) = bu(s <+ ε)

The strategy innermost(s) works by means of two loops. The inner loop makes sure that s is pushed down in the term until it finds the innermost redexes. Note that ◇(v) fails only on terms whose direct proper subterms are in normal form with respect to s. So s is applied to all innermost redexes. This is iterated until the term is in normal form with respect to s.

[4] get-fresh(x, s) = v ⟹ innermost(s) = (μv.(◇(v) <+ s))*

The strategy outermost(s) is defined as the dual of innermost(s); it exhibits a preference for applying s at the root over pushing s down in the term. Thus, repeatedly all outermost redexes are contracted.

[5] get-fresh(x, s) = v ⟹ outermost(s) = (μv.(s <+ ◇(v)))*

The strategy innermost-eff is a more efficient implementation of parallel innermost reduction, because it makes use of information about the locations in the term of the previous applications of s. The strategy consists of a single bottom-up traversal that applies s to each node; if such an application succeeds, the result is reduced innermost, using s.

[6] get-fresh(x, s) = v ⟹ innermost-eff(s) = μv.once(s · v)
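The traversal strategies can be tried out with a small Python model. It is a sketch, not the ASF+SDF specification: the definitions of `innermost` and `outermost` follow one reading of the prose above (an inner fixed point that prefers, respectively, pushing down or the root, wrapped in an iteration) and should be treated as an assumption, as should the term representation and all function names.

```python
def identity(t):
    return t

def seq(s1, s2):
    return lambda t: (lambda t1: None if t1 is None else s2(t1))(s1(t))

def left_choice(s1, s2):
    return lambda t: (lambda r: r if r is not None else s2(t))(s1(t))

def fix(body):
    def unfolded(t):
        return body(unfolded)(t)      # unfold by need
    return unfolded

def star(s):
    return fix(lambda v: left_choice(seq(s, v), identity))

def pd(boolop, s):
    def run(t):
        kids = t[1:] if isinstance(t, tuple) else ()
        reducts = [s(k) for k in kids]
        if not boolop(r is not None for r in reducts):
            return None
        if not kids:
            return t
        return (t[0],) + tuple(k if r is None else r for k, r in zip(kids, reducts))
    return run

def box(s):
    return pd(all, s)

def diamond(s):
    return pd(any, s)

def bu(s):          # bottom-up: subterms first, then the root
    return fix(lambda v: seq(box(v), s))

def td(s):          # top-down: the root first, then the subterms
    return fix(lambda v: seq(s, box(v)))

def once(s):        # try s once at every position, bottom-up
    return bu(left_choice(s, identity))

def innermost(s):   # assumed reading: prefer pushing down, iterate to normal form
    return star(fix(lambda v: left_choice(diamond(v), s)))

def outermost(s):   # assumed reading: prefer the root, iterate to normal form
    return star(fix(lambda v: left_choice(s, diamond(v))))

def rassoc(t):
    """[RAssoc] (x + y) + z = x + (y + z); None on failure."""
    if isinstance(t, tuple) and t[0] == "+" and isinstance(t[1], tuple) and t[1][0] == "+":
        (_, (_, x, y), z) = t
        return ("+", x, ("+", y, z))
    return None

nested = ("+", ("+", ("+", "a", "b"), "c"), "d")   # ((a + b) + c) + d
```

On `nested`, both `innermost(rassoc)` and `outermost(rassoc)` reach the fully right-nested term, while `once(rassoc)` performs a single bottom-up pass and stops earlier.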

Process Expressions
In process algebra (see Baeten and Weijland, 1990) it is common to prove a lemma stating that every closed term is equal to a closed term with a simple inductive structure, a so-called basic term. Usually, such a lemma is proved by defining a terminating rewriting system consisting of rules that are sound with respect to the process algebraic theory, such that the set of basic terms coincides with the set of normal forms of this system.
A transformation of arbitrary process terms to basic terms is part of a tool that takes μCRL process specifications to linear format (we refer to Groote and Ponse (1994b) for μCRL and to Bosscher and Ponse (1995) for an informal description of the tool). In this particular setting it is a natural approach to first preprocess terms with a rule that is nonterminating, and then reduce the result in another rewriting system. Below we discuss an implementation of this using the strategy language just defined.

Transformation Rules
The set of basic terms that we are interested in is inductively defined as follows: 1. , and b are basic terms ( 2 A; b 2 B); 2. if p 1 and p 2 are basic terms, then so are p 1 , p 1 b , and p 1 + p 2 ( 2 A; b 2 B).
The following rules are all derivable in the proof theory for μCRL (see Groote and Ponse, 1994a); in fact, the rules [R2], [R3] and [R4] respectively correspond to axioms A4, A5 and A7 of process algebra.

R1] x b y
It follows by means of the method of lexicographic path orderings (see Bergstra and Klop, 1985) that the rewriting system consisting of [R2], ..., [R8] is terminating. Process terms not containing conditionals of the form p₁ ◁ b ▷ p₂ with p₂ not equal to δ are basic terms iff they are in normal form with respect to these rules.
The rule [R1] can be used to remove conditionals with a rightmost argument not equal to δ, but it is clearly nonterminating. Notice, however, that it is enough to apply [R1] only once at every subterm to arrive at a term that does not contain conditionals with a rightmost argument not equal to δ.

Normalization Strategy
We obtain the modules NcrlTerm-SA, NcrlTerm-BS and NcrlTerm-RS by applying the tool of De Jonge (1997) to Term-SA, Term-BS and Term-RS with renaming [Term ⇒ P, Reduct ⇒ PReduct]. The module NcrlTerm-TS can be generated according to the scheme of Section 4.1.

Box Expressions
Box expressions are used in typesetting languages to indicate the layout structure of a piece of text. Horizontal boxes are used for horizontal composition, vertical boxes for vertical composition, etc. The language of box expressions Box (Van den Brand and Visser, 1994, 1995, 1996) is a target independent intermediate language for pretty-printing and typesetting programs. Typically, a pretty-printer for some programming language translates the abstract syntax tree of a program to a box expression, which is then translated to the input format for the desired display device. A further discussion of this application can be found in Van den Brand and Visser (1996).
One of the target languages of the Box language is the typesetting language TeX. In order to express box expressions in TeX, box expressions are flattened by means of a (large) number of transformation rules (Van den Brand and Visser, 1994, 1995). One of the problems that we encountered was the following. The combination of the rules for repositioning comment boxes and the rules for flattening box expressions leads to a weakly terminating rewrite system that causes an infinite reduction under innermost rewriting. The solution chosen in the implementation was to first apply the rewrite rules for comments and then apply the flattening rules. This could only be achieved outside ASF+SDF by consecutively applying the rewrite rules in two different modules. In this section we discuss the fragment of the language and the transformation rules that contain the problem and show how it is solved inside ASF+SDF using strategies.

Syntax
Boxes are either strings or expressions composed by means of one of the operators H, V, HV, and VPAR. (In fact, there are more operators in the full language, but these are not of interest for the current paper.)

to get back our original box expression. In fact, an innermost strategy will do exactly this, giving rise to an infinite reduction.

Normalization Strategy
To overcome the termination problem sketched above we define a strategy that applies the comment rules and the flattening rules in sequence. In fact, for the rules discussed above it suffices to apply the comment rules in a bottom-up fashion to each operator in a box-expression. The normal form of a box list is obtained by first applying comment rules in a bottom-up traversal and then applying the flattening rules with an innermost strategy.

Comparison with ELAN
Our strategy language closely resembles that of ELAN. Both languages contain operators for sequential composition, nondeterministic choice, and left choice and allow for the definition of strategies using rewrite rules.
ELAN does not have a recursion operator. Recursive strategies must also be expressed by means of rewrite rules in combination with an extra built-in strategy that ensures the lazy evaluation of such rules. The problem with this built-in strategy is that, unlike our fixed-point recursion, it is not formally defined.
Furthermore, ELAN has a different way of expressing that a strategy must be applied at a proper subterm of a term, instead of at the root. ELAN does not contain general push-down operators, such as our □ and ◇. Instead, the notion of a congruence strategy is used. Below, we will discuss the relation between these two approaches.

Congruence Strategies
In ELAN, for each n-ary function f in the signature a 'congruence strategy' f(s₁, ..., sₙ) is defined according to the following schema:

[s₁]t₁ = t₁′, ..., [sₙ]tₙ = tₙ′ ⟹ [f(s₁, ..., sₙ)] f(t₁, ..., tₙ) = f(t₁′, ..., tₙ′)

For a given signature our conjunctive push-down can be defined as a sum of congruences, i.e., if f₁, ..., fₘ are the functions of the signature, then □(s) is defined by

□(s) = f₁(s, ..., s) + ... + fₘ(s, ..., s)

Conversely, the congruence strategies of ELAN can be expressed in our strategy language via the following construction. For each function f in the signature define a strategy label f that succeeds if it is applied to a term with f as root symbol.

[f] f(x₁, ..., xₙ) = f(x₁, ..., xₙ)

This rule does not transform the term; it only tests its leading function symbol. Now the congruence strategy f(s, ..., s) can be expressed as f · □(s), i.e., f is a guard for the application of □(s).
To express congruence strategies f(s₁, ..., sₙ) in which the substrategies s₁, ..., sₙ are different, we need a family of push-down operators. Define for each arity n an operator pdⁿ_⊕ by means of the following schema:

[s₁]t₁ = r₁, ..., [sₙ]tₙ = rₙ, ⊕{b(r₁), ..., b(rₙ)} = true ⟹ [pdⁿ_⊕(s₁, ..., sₙ)] f(t₁, ..., tₙ) = f(t(r₁), ..., t(rₙ))

Clearly, the schema for pdⁿ_⊕ can be instantiated in the same way as that for pd_⊕. As in the previous section, we obtain for every arity n the conjunctive and disjunctive push-down operators □ⁿ and ◇ⁿ as instantiations of this generic operator pdⁿ_⊕. Now, ELAN's strategy f(s₁, ..., sₙ) is expressed in our language by f · □ⁿ(s₁, ..., sₙ). Also, by means of such constructions we can define strategies such as map(s) that applies s to all elements of a cons/nil list.

[nil] nil = nil
[cons] cons(x, l) = cons(x, l)
map(s) = μx.(nil + cons · □²(s, x))
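A congruence strategy and the map construction can be sketched in Python. The sketch makes several assumptions: every term, including a constant, is a tuple of a symbol followed by its arguments; failure is None; the guard on the root symbol and the conjunctive n-ary push-down are folded into one function `congruence`; and left choice is used in place of + because the nil and cons cases exclude each other. All names are illustrative.

```python
def left_choice(s1, s2):
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run


def fix(body):
    def unfolded(t):
        return body(unfolded)(t)
    return unfolded


def congruence(f, *strategies):
    """f(s1, ..., sn): succeed on f(t1, ..., tn) if every si succeeds on ti."""
    def run(t):
        if t[0] != f or len(t) - 1 != len(strategies):
            return None                     # guard: wrong root symbol or arity
        new_args = []
        for s, arg in zip(strategies, t[1:]):
            r = s(arg)
            if r is None:
                return None                 # conjunctive: one failure fails all
            new_args.append(r)
        return (f, *new_args)
    return run


def map_strategy(s):
    """map(s): apply s to every element of a cons/nil list."""
    return fix(lambda x: left_choice(congruence("nil"), congruence("cons", s, x)))


def rule(lhs, rhs):
    """Hypothetical helper: a label rewriting exactly `lhs` to `rhs`."""
    return lambda t: rhs if t == lhs else None


a, b, c, nil = ("a",), ("b",), ("c",), ("nil",)
a_to_b = rule(a, b)
```

Applied to cons(a, cons(a, nil)), `map_strategy(a_to_b)` yields cons(b, cons(b, nil)); it fails as soon as one element cannot be rewritten, since the push-down used here is the conjunctive one.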

Concluding Remarks
We have described the setup of a language of term rewriting strategies and its interpretation in ASF+SDF. This approach gives us the possibility to control the transformation of expressions, given a set of labeled rewrite rules. The main result of this paper is the definition of the push-down operators as primitives to define term traversals in a general way.
The work in this paper opens up a range of further research issues.

Generic Specification
In this paper we have made use of generic specifications to express the semantics of strategies. This genericity is of two kinds. The first kind requires the instantiation of a generic module, achieved by renaming its sorts. A tool for this purpose is discussed by De Jonge (1997). The second kind requires the generation of equations for the push-down operator for each constructor in the signature of the language under consideration. This can be achieved by a variant of the pretty-printer generation techniques described by Van den Brand and Visser (1996).

Strategy Operators
We have defined parallel innermost and parallel outermost in terms of the push-down operators □ and ◇. These operators could be called greedy; they apply to as many arguments as possible. Using nongreedy variants of these operators, other schemes may be defined. For instance, iterating (*) a recursive strategy that chooses nondeterministically (+) between applying s at the root and descending through a nongreedy push-down defines arbitrary reduction; the resulting operator 'reduce' repeatedly selects redexes nondeterministically.

Parameterized Strategies
By parameterizing labels with extra information we can define still more powerful transformation systems. Consider for example the substitution of terms for variables. The substitution of a term t for variable x in a term t′ consists of a traversal of the term t′ replacing all occurrences of x by t. Using strategies this is concisely specified by defining the label strategy x := t as [x := t] x = t to express the replacement of a variable. Substitution is then expressed as a traversal in the term t′ applying the replacement everywhere, i.e., [once(x := t)]t′. Here we directly reuse the generated traversal. There are many similar applications of strategies parameterized with data. While it is straightforward to distribute information over a term, accumulation is not and is an issue to be addressed in future research.
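Substitution as a parameterized strategy can be illustrated with a short Python sketch. It assumes the same modelling as before (constants and variables are strings, applications are tuples, failure is None) and defines only the conjunctive push-down that `once` needs; the names `assign` and `substitute` are hypothetical.

```python
def identity(t):
    return t

def seq(s1, s2):
    def run(t):
        t1 = s1(t)
        return None if t1 is None else s2(t1)
    return run

def left_choice(s1, s2):
    def run(t):
        r = s1(t)
        return r if r is not None else s2(t)
    return run

def fix(body):
    def unfolded(t):
        return body(unfolded)(t)
    return unfolded

def box(s):
    """Conjunctive push-down: s must succeed on every direct proper subterm."""
    def run(t):
        if not isinstance(t, tuple):
            return t                        # constants and variables have no subterms
        kids = [s(k) for k in t[1:]]
        return None if any(k is None for k in kids) else (t[0], *kids)
    return run

def bu(s):
    return fix(lambda v: seq(box(v), s))

def once(s):
    return bu(left_choice(s, identity))

def assign(x, t):
    """The parameterized label x := t, with rule [x := t] x = t."""
    return lambda u: t if u == x else None

def substitute(x, t, body):
    """Apply the replacement at every position of `body` in one traversal."""
    return once(assign(x, t))(body)
```

For example, `substitute("x", ("f", "y"), ("+", "x", ("g", "x", "z")))` replaces both occurrences of x. Because the traversal is bottom-up and visits each position once, a replacement term that itself contains x is not traversed again.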

Optimization
In this paper we have discussed the interpretation of strategies in ASF+SDF using the built-in innermost rewrite engine. We have focussed on the definition of the strategy operators that we consider fundamental; we tried to keep this definition as simple as possible. However, if efficiency is at stake, one could for instance consider a lazy interpretation of the sequential choice (this would require an auxiliary operator).
Once more experience with controlled rewriting has been gained, it might be interesting to consider the integration of the strategy language in the rewrite engine of ASF+SDF itself to get a more efficient interpretation.
Another approach to the optimization of strategies is to consider transformation of strategy expressions to more efficient expressions. For example, nested loops could be merged. Such optimizations require an algebra of strategies.

Correspondence with Trace Semantics
At first glance it seems that there is a natural correspondence between trace semantics and the way strategy expressions are interpreted in our system. A strategy expression could be viewed as a specification of a set of paths in the reduction graph of a term. However, while the law

(x + y) · z = x · z + y · z    (1)

is sound for trace semantics, it does not hold for the interpreter defined here. Suppose that [x]t = t′, [y]t = t″, [z]t′ = t‴, but [z]t″ = t″↑. Then, on the one hand, [x · z + y · z]t will always yield t‴. On the other hand, [(x + y) · z]t has two possible outcomes: t‴ and t↑. That is, backtracking is only performed over choice and not over sequential composition.
The easiest way to resolve this is to literally add the above law as an equation over strategy expressions. Given the fact that ASF+SDF implements a leftmost innermost reduction strategy, this will yield the desired result. However, a drawback of this solution is that interpretation is only defined for strategies in normal form with respect to (1).
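The failure of the distribution law can be made concrete with a small Python model that computes the set of possible outcomes of a strategy, with None standing for failure. The model is an assumption made for illustration: a choice may deliver any successful result of either operand and fails only if neither can succeed, while a sequential composition commits to whatever its first operand delivered. The labels x, y and z and the terms t, t', t'' and t''' follow the example above.

```python
def label(table):
    """A label given by a finite table of rewrite steps; unlisted terms fail."""
    return lambda t: {table.get(t)}          # {None} denotes failure


def seq(s1, s2):
    """Sequential composition: no backtracking into s1 once it has delivered a result."""
    def run(t):
        outcomes = set()
        for r in s1(t):
            outcomes |= {None} if r is None else s2(r)
        return outcomes
    return run


def choice(s1, s2):
    """Choice: any success of either operand; failure only if both fail."""
    def run(t):
        successes = {r for r in s1(t) | s2(t) if r is not None}
        return successes or {None}
    return run


x = label({"t": "t'"})
y = label({"t": "t''"})
z = label({"t'": "t'''"})        # z is undefined on t''

distributed = choice(seq(x, z), seq(y, z))   # x . z + y . z
factored = seq(choice(x, y), z)              # (x + y) . z
```

In this model `distributed("t")` can only produce t''', whereas `factored("t")` can produce t''' or fail, depending on which operand of the choice was selected.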

Correspondence with Modal Logic
Another correspondence that comes to mind is the relation to modal logic. The operators · and + correspond to conjunction and disjunction of success; the push-down operators □ and ◇ are so chosen because of their similarity to the modalities in modal logic. (Blackburn et al. (1993) discuss a modal logic of trees that contains modalities to traverse a tree.) However, the influence of the outcome of x on the outcome of y in x · y is not expressible by a logical conjunction only.
In the strategy language as defined in this paper there is an interesting interaction between operational behaviour (the transformation of a term) and the success or failure of the application of a strategy to a term. Further research is needed to find algebraic and logical laws that formalize this interaction.