Program fusion with paramorphisms

The design of programs as the composition of smaller ones is a widespread approach to programming. In functional programming, this approach requires creating a good number of intermediate data structures whose only purpose is to pass data from one function to another. Using program fusion techniques, it is possible to eliminate many of those intermediate data structures by an appropriate combination of the code of the functions involved. In the standard case, no mention of the eliminated data structure remains in the code obtained from fusion. However, there are situations in which parts of that data structure become an internal value manipulated by the fused program. This happens, for example, when primitive recursive functions (so-called paramorphisms) are involved. We show, for example, that fusing a primitive recursive function p with another function f may yield a function that contains calls to f. Moreover, we show that in some cases the result of fusion may be less efficient than the original composition. We also investigate a general recursive version of paramorphism. This study is strongly motivated by the development of a fusion tool for Haskell programs called HFUSION.


INTRODUCTION
In functional programming, function composition is a fundamental tool for combining smaller programs to build new ones. Between two composed functions there is always an intermediate data structure which carries data from one function to the other. The overhead of handling such intermediate data structures can be avoided in many cases by replacing the composition by an equivalent function which does not construct the data structure.
Intermediate data structures can be eliminated by a program transformation technique known as deforestation [16,9]. In this paper, we follow an approach to deforestation based on recursion program schemes associated with recursive types [14,13]. Program schemes have associated algebraic laws, which are useful for reasoning about programs as well as for program transformation purposes. In connection with deforestation, there is a particularly relevant set of algebraic laws, so-called fusion laws, which involve the elimination of intermediate data structures.
In the standard case, no mention of the eliminated data structure remains in the code obtained from fusion. However, there are situations in which parts of that data structure become an internal value manipulated by the fused program. This may happen, for example, when paramorphisms [11] are involved. Paramorphism is a program scheme that captures functions defined by primitive recursion. These are functions that use both the arguments and the values of the recursive calls to compute their result. A classical example of paramorphism is the factorial function.
The aim of this paper is to analyze fusion laws for paramorphisms. We show, for example, that fusing the composition of a paramorphism p with another function f may yield a function that contains calls to f inside, and therefore includes the generation of values produced by f. Moreover, we show the existence of cases where the fusion with a paramorphism may lead to a function that is less efficient than the original composition. To see a simple example of this situation, consider the composition of tails and map:

    tm f = tails . map f

    tails []     = []
    tails (a:as) = as : tails as

    map f []     = []
    map f (a:as) = f a : map f as

Function tails is a paramorphism which uses the successive tails of the list as argument to the recursive calls and as values to construct the output list. The fusion of these two functions gives as result the following recursive definition of tm:

    tm f []     = []
    tm f (a:as) = map f as : tm f as

which retains a call to map, now applied to the successive tails of the list separately. This definition of tm has O(n²) time complexity, while the original one was the composition of two O(n) functions. The origin of the problem resides in tails and it is caused by the successive sharing of the tails between the recursive calls and the cons operation that builds the output list.
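The agreement between the composition and its fused form can be checked on small inputs with the following sketch. The names tailsP, tmComposed and tmFused are ours, and tailsP is an assumed definition of tails (all proper suffixes, excluding the list itself), chosen because it is the one consistent with the fused program tm given in the text.

```haskell
-- Assumed definition of tails: the proper suffixes of a list.
tailsP :: [a] -> [[a]]
tailsP []     = []
tailsP (_:as) = as : tailsP as

-- The original composition: two linear traversals.
tmComposed :: (a -> b) -> [a] -> [[b]]
tmComposed f = tailsP . map f

-- The fused definition: map is applied again to every suffix,
-- so the mapped suffixes are no longer shared.
tmFused :: (a -> b) -> [a] -> [[b]]
tmFused _ []     = []
tmFused f (_:as) = map f as : tmFused f as
```

In tmComposed each suffix of the mapped list is shared with the next one, whereas tmFused recomputes map f on every suffix, which is where the quadratic cost comes from.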
In addition to studying fusion laws for paramorphism, we introduce a new program scheme, called generalized paramorphism, which is a general recursion version of paramorphism. The new scheme generalizes paramorphism in the same way hylomorphism (another general recursion scheme) generalizes fold: it is obtained by replacing in the definition of paramorphism the coalgebra of the destructors (of the input datatype) by an arbitrary coalgebra. The expressive power of generalized paramorphism is similar to that of hylomorphism, but in addition it incorporates fusion cases that are not achievable with the fusion laws of hylomorphism.
Our interest in studying this generalization of paramorphism has been motivated by the development of a fusion tool for Haskell programs, called HFUSION. This tool was originally based on hylomorphisms, but now it uses generalized paramorphisms as the internal representation of recursive functions. We are particularly interested in presenting so-called acid rain laws [14] for the different program schemes, as they correspond to a mechanizable subset of fusion laws and are the kind of laws being used in the implementation of HFUSION. It is worth mentioning that all the examples to be shown in the paper have been tested in the tool.
The rest of the paper is organized as follows. Section 2 presents background material about recursion program schemes and sets up the notation to be used during the paper. Section 3 reviews the definition of paramorphism and its standard laws, and introduces acid rain laws for this program scheme. Section 4 presents generalized paramorphism and its laws. In these two sections we will show both positive and negative examples of fusion with (generalized) paramorphisms. Section 5 explains our practical motivations for introducing generalized paramorphism. We also describe the fusion cases we gain by the introduction of this program scheme. Section 6 presents final remarks and conclusions.

RECURSIVE TYPES AND PROGRAM SCHEMES
Recursive program schemes encapsulate common patterns of computation of recursive functions and have a strong connection with datatypes. A generic definition of them can be formulated using the categorical approach to recursive types, in which types constitute objects of a category C and programs are modelled by arrows of the category. This section summarizes the relevant concepts of this approach to recursive types and the generic definition and standard laws of three well-known schemes: fold, unfold and hylomorphism (see e.g. [12,4,5,14,6]).
Throughout the paper the working category will be Cpo, the category of pointed cpos (complete partial orders with a least element ⊥) and continuous functions. The choice of this category allows us to work with arbitrary recursive functions. As usual, a function f is said to be strict if it preserves the least element, i.e. f ⊥ = ⊥. The final object of Cpo is given by the singleton set {⊥} and will be written as 1. This object will correspond to our interpretation of the unit type (), whose unique element is also written as (). Product is defined as cartesian product, with projections π1 :: a × b → a and π2 :: a × b → b. The pairing (or split) of two functions f :: c → a and g :: c → b is written ⟨f, g⟩ :: c → a × b. Sum a + b is defined as separated sum, with sum inclusions inl :: a → a + b and inr :: b → a + b. Given two continuous functions f :: a → c and g :: b → c, case analysis is defined as the strict function f ▽ g :: a + b → c such that (f ▽ g) ∘ inl = f and (f ▽ g) ∘ inr = g. Product and sum can be generalized to n components. In the generalization of the sum we will write in_i to denote the i-th injection.
In the categorical modelling of types, a datatype τ is understood as a solution to an equation x ≅ F x, for an appropriate endofunctor F that captures the shape (or signature) of the type. Given a locally continuous and strictness preserving functor F on Cpo, a recursive domain equation x ≅ F x has a unique solution specified by a pointed cpo µF together with an isomorphism provided by a pair of strict functions in_F :: F µF → µF and out_F :: µF → F µF, each other's inverse. The cpo µF contains partial, finite, as well as infinite values. Function in_F encodes the constructors of the datatype, while out_F encodes the destructors.

Fold
Given an endofunctor F : Cpo → Cpo, a function φ :: F a → a is called an F-algebra. In particular, observe that in_F is an algebra. A homomorphism between two algebras φ :: F a → a and φ' :: F b → b is a function f :: a → b such that f ∘ φ = φ' ∘ F f. The least homomorphism between in_F and any other algebra φ :: F a → a gives rise to a recursion scheme, denoted by (|φ|)_F :: µF → a and usually called fold [2] or catamorphism [12], which captures definitions by structural recursion. That is, fold is defined as the least function f that satisfies the equation

    f ∘ in_F = φ ∘ F f

For lists, fold corresponds to the standard foldr operator used in functional programming. A fold (|φ|)_F is strict iff its algebra φ is strict. Fold satisfies the following laws:

    Fold identity:     (|in_F|)_F = id
    Fold fusion:       f strict ∧ f ∘ φ = φ' ∘ F f  ⇒  f ∘ (|φ|)_F = (|φ'|)_F
    Fold-fold fusion:  for strict φ, τ :: ∀a.(F a → a) → (G a → a)  ⇒  (|φ|)_F ∘ (|τ(in_F)|)_G = (|τ(φ)|)_G

The last law is usually referred to as acid rain [14]. The goal of acid rain is to combine a function that produces a data structure with another that consumes it. A polymorphic function τ :: ∀a.(F a → a) → (G a → a) that converts F-algebras into G-algebras is said to be a transformer [5]. Every function τ of this type satisfies the following property, which can be inferred as a free theorem [15]: for every f :: a → b, φ :: F a → a and φ' :: F b → b,

    f ∘ φ = φ' ∘ F f  ⇒  f ∘ τ(φ) = τ(φ') ∘ G f

That is, every homomorphism between two F-algebras is also a homomorphism between the respective G-algebras.
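As an illustration of the fold scheme and of fold fusion, the following is a minimal Haskell sketch. It ignores the Cpo-specific strictness side conditions except where noted, and the names Fix, ListF, fromList, sumAlg and doubleAlg are ours, introduced only for this example.

```haskell
-- Fixed point of a functor; In and out play the roles of in_F and out_F.
newtype Fix f = In { out :: f (Fix f) }

-- fold phi is the function f satisfying f . In = phi . fmap f.
fold :: Functor f => (f a -> a) -> Fix f -> a
fold phi = phi . fmap (fold phi) . out

-- Base functor of lists with elements of type e.
data ListF e x = Nil | Cons e x

instance Functor (ListF e) where
  fmap _ Nil        = Nil
  fmap g (Cons e x) = Cons e (g x)

-- Helper: embed an ordinary Haskell list into Fix (ListF e).
fromList :: [e] -> Fix (ListF e)
fromList = foldr (\e r -> In (Cons e r)) (In Nil)

-- An algebra that sums the elements.
sumAlg :: ListF Int Int -> Int
sumAlg Nil        = 0
sumAlg (Cons e r) = e + r

-- Fold fusion with f = (* 2): since (* 2) is strict and
-- (* 2) . sumAlg = doubleAlg . fmap (* 2), we get
-- (* 2) . fold sumAlg = fold doubleAlg.
doubleAlg :: ListF Int Int -> Int
doubleAlg Nil        = 0
doubleAlg (Cons e r) = 2 * e + r

doubleSumComposed, doubleSumFused :: [Int] -> Int
doubleSumComposed = (* 2) . fold sumAlg . fromList
doubleSumFused    = fold doubleAlg . fromList
```

The fused version performs a single traversal and never builds the intermediate sum before doubling it.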

Unfold
Given a functor F, a function ψ :: a → F a is called an F-coalgebra. In particular, out_F :: µF → F µF is a coalgebra. A homomorphism between two coalgebras ψ :: a → F a and ψ' :: b → F b is a function f :: a → b such that ψ' ∘ f = F f ∘ ψ. F-coalgebras and their homomorphisms form a category. The coalgebra out_F :: µF → F µF turns out to be final in this category. This means that there exists a unique homomorphism from any coalgebra ψ :: a → F a to out_F, which is denoted by [(ψ)]_F :: a → µF. It gives rise to a recursion scheme, called unfold [8] or anamorphism [12], which satisfies the equation:

    out_F ∘ [(ψ)]_F = F [(ψ)]_F ∘ ψ

Unfold captures definitions by structural corecursion. It satisfies the following laws:

    Unfold identity:       [(out_F)]_F = id
    Unfold fusion:         ψ' ∘ f = F f ∘ ψ  ⇒  [(ψ')]_F ∘ f = [(ψ)]_F
    Unfold-unfold fusion:  σ :: ∀a.(a → F a) → (a → G a)  ⇒  [(σ(out_F))]_G ∘ [(ψ)]_F = [(σ(ψ))]_G

Unfold-unfold fusion is another case of acid rain [14]. A polymorphic function σ :: ∀a.(a → F a) → (a → G a) is now a transformer from F-coalgebras to G-coalgebras. In this case the free theorem states that, for every f :: a → b, and coalgebras ψ :: a → F a and ψ' :: b → F b,

    ψ' ∘ f = F f ∘ ψ  ⇒  σ(ψ') ∘ f = G f ∘ σ(ψ)
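A minimal Haskell sketch of the unfold scheme follows. The names Fix, ListF, toList and downCoalg are ours; in particular, the countdown coalgebra is only an assumed example of a list-producing unfold, not a definition taken from the text.

```haskell
-- Fixed point of a functor; In and out play the roles of in_F and out_F.
newtype Fix f = In { out :: f (Fix f) }

-- Base functor of lists with elements of type e.
data ListF e x = Nil | Cons e x

instance Functor (ListF e) where
  fmap _ Nil        = Nil
  fmap g (Cons e x) = Cons e (g x)

-- unfold psi is the function g satisfying out . g = fmap g . psi.
unfold :: Functor f => (a -> f a) -> a -> Fix f
unfold psi = In . fmap (unfold psi) . psi

-- Helper: read an element of Fix (ListF e) back as a Haskell list.
toList :: Fix (ListF e) -> [e]
toList (In Nil)        = []
toList (In (Cons e r)) = e : toList r

-- Assumed example coalgebra: count down from n to 1.
downCoalg :: Int -> ListF Int Int
downCoalg n
  | n <= 0    = Nil
  | otherwise = Cons n (n - 1)

countdown :: Int -> [Int]
countdown = toList . unfold downCoalg
```

The coalgebra decides, from the current seed, whether to stop or to emit one element together with the next seed.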

Hylomorphism
Fold and unfold express standard ways of consuming and generating data structures, respectively. Now we look at functions given by the composition of a fold with an unfold. They correspond to general recursive functions whose structure is dictated by a virtual intermediate data structure.
Given an algebra φ :: F b → b and a coalgebra ψ :: a → F a, the hylomorphism [12,4,5] ⟦φ, ψ⟧_F :: a → b is the function defined as:

    ⟦φ, ψ⟧_F = (|φ|)_F ∘ [(ψ)]_F        (9)

Alternatively, hylomorphism can be defined as the least function f that satisfies the equation

    f = φ ∘ F f ∘ ψ

This shows that we can always transform the composition of a fold with an unfold into a single function that avoids the construction of the intermediate data structure. From this definition, we obtain the equation which expresses the shape of recursion that comes with each datatype. Two well-known examples of hylomorphisms are quicksort and merge sort (see e.g. [1,6]). The expressiveness of hylomorphisms is very rich. In practice, almost every interesting recursive function can be expressed as a hylomorphism.
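As a hedged illustration of the recursive characterization of hylomorphism, the following sketch writes quicksort as a hylomorphism over a binary-tree base functor. The names TreeF, splitPivot and joinSorted are ours; the virtual intermediate structure is the tree of pivots, which is never actually built.

```haskell
-- hylo phi psi is the function f satisfying f = phi . fmap f . psi.
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
hylo phi psi = phi . fmap (hylo phi psi) . psi

-- Base functor of binary trees with labels of type e.
data TreeF e x = LeafF | NodeF x e x

instance Functor (TreeF e) where
  fmap _ LeafF         = LeafF
  fmap g (NodeF l e r) = NodeF (g l) e (g r)

-- Coalgebra: pick the head as pivot and partition the rest.
splitPivot :: Ord e => [e] -> TreeF e [e]
splitPivot []     = LeafF
splitPivot (p:xs) = NodeF (filter (< p) xs) p (filter (>= p) xs)

-- Algebra: concatenate the sorted parts around the pivot.
joinSorted :: TreeF e [e] -> [e]
joinSorted LeafF         = []
joinSorted (NodeF l p r) = l ++ [p] ++ r

quicksort :: Ord e => [e] -> [e]
quicksort = hylo joinSorted splitPivot
```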
Applying the identity laws corresponding to fold and unfold, it is immediate to see that fold and unfold are themselves hylomorphisms:

    (|φ|)_F = ⟦φ, out_F⟧_F        [(ψ)]_F = ⟦in_F, ψ⟧_F

The following fusion laws are a direct consequence of (9) and the fusion laws for fold and unfold.

PARAMORPHISM
Paramorphisms [11] correspond to primitive recursive functions. Therefore, like folds, they capture functions that are defined by structural recursion. In this section we review the definition of paramorphism (presenting it in the context of Cpo) and some of its standard laws. We also introduce new acid rain laws that relate paramorphisms with folds.
Given a function φ :: F (a × µF) → a, the paramorphism ⟨|φ|⟩_F :: µF → a is the least function f such that

    f ∘ in_F = φ ∘ F ⟨f, id⟩

where in_F :: F µF → µF and F ⟨f, id⟩ :: F µF → F (a × µF). The difference between paramorphisms and folds is in the amount of information available in each recursive step. In addition to the values returned by the recursive calls (as in fold), function φ also has their arguments available. As we will see later on in this section, this subtle difference with folds makes paramorphisms inappropriate for fusion in some cases.
The following equations express the well-known relationship between paramorphisms and folds.

    ⟨|φ|⟩_F = π1 ∘ (|⟨φ, in_F ∘ F π2⟩|)_F        (15)
    (|φ|)_F = ⟨|φ ∘ F π1|⟩_F                      (16)

Equation (15) is usually taken as the definition of paramorphism. It states that a paramorphism can be implemented as a fold that produces a pair, whose second component contains a (recursively generated) copy of the input. Equation (16) shows that a fold is a paramorphism that ignores the copy of the arguments to the recursive calls.
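The two relationships described above can be sketched for Haskell lists as follows. This is only an illustration specialized to lists; the names paraList, paraViaFold, foldViaPara and suffixes are ours.

```haskell
-- Paramorphism on lists: the step function receives the head, the
-- result of the recursive call, and the tail that call was made on.
paraList :: (a -> (b, [a]) -> b) -> b -> [a] -> b
paraList _ z []     = z
paraList g z (a:as) = g a (paraList g z as, as)

-- A paramorphism as a fold that returns a pair whose second
-- component is a rebuilt copy of the input.
paraViaFold :: (a -> (b, [a]) -> b) -> b -> [a] -> b
paraViaFold g z = fst . foldr step (z, [])
  where step a (r, as) = (g a (r, as), a : as)

-- A fold is a paramorphism that ignores the copies of the tails.
foldViaPara :: (a -> b -> b) -> b -> [a] -> b
foldViaPara g = paraList (\a (r, _) -> g a r)

-- The proper suffixes of a list, written as a paramorphism: the step
-- uses both the recursive result r and the tail as.
suffixes :: [a] -> [[a]]
suffixes = paraList (\_ (r, as) -> as : r) []
```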
The following is the fusion law for paramorphism [11].
The rest of this section is devoted to the analysis of acid rain laws for paramorphisms. We are not aware that they have been presented before. These laws will serve as the basis for designing the acid rain laws for generalized paramorphisms in Section 4.
The first law we consider refers to the composition of a fold with a paramorphism.

Proposition 3.2 (fold-para fusion)
For strict φ,

Proof Since (|φ|)_F is a homomorphism between the algebras in_F and φ, by the free theorem associated with the polymorphic type of τ it follows that Therefore, by applying (17) we obtain the desired result. The strictness condition required of (|φ|)_F in (17) follows from the assumption that φ is strict.
The next fusion law refers to the composition of a paramorphism with a fold. It is particularly interesting and important as it exhibits a case in which the paramorphism internalizes the generation of values of the intermediate data structure that we want to eliminate. The following lemma will be used in the proof of the law.

Lemma 3.3
For F-algebras φ :: F a → a and ψ ::

Proof Let us call p the paramorphism and c the fold. By definition of paramorphism and fold we have that The two functions are defined simultaneously in an asymmetric way. That is, p depends on c while c does not depend on p. Definitions following this pattern are called zygomorphisms [10]. From the definition of p and c, it can be derived that [5]: In the context of Cpo this equation is proved by fixed point induction. If we call pc the split ⟨p, c⟩, then the statement of the lemma can be rewritten as: which can then be proved by fixed point induction.

Proposition 3.4 (para-fold fusion)
For strict φ,

Proof From the definition of paramorphism we can derive that ⟨⟨|φ|⟩_F, id⟩ is a homomorphism between the F-algebras in_F and ⟨φ, in_F ∘ F π2⟩. Then, by the free theorem associated with the polymorphic type of τ it follows that ⟨⟨|φ|⟩_F, id⟩ is also a homomorphism between the G-algebras τ(in_F) and τ(⟨φ, in_F ∘ F π2⟩). Finally, by Lemma 3.3 the desired result follows. Strictness of ⟨|φ|⟩_F, necessary for the application of Lemma 3.3, is a consequence of the assumption that φ is strict.

Example 3.5
This example shows a simple case in which the fold is copied into the body of the resulting paramorphism, producing multiple generations of data structures. The algebra of filter can be expressed as τ(in), where τ is a polymorphic function given by: Therefore, if we apply para-fold fusion we obtain the following:

Inlining,

    tf p []     = []
    tf p (a:as) = if p a then filter p as : tf p as
                         else tf p as

We applied fusion with the aim of eliminating the intermediate list that was generated by filter, but as a result we obtained a function that filters the successive tails of the input list separately. This means that fusion transformed the composition of two functions with linear time behaviour into a function which is quadratic! In other words, in this case the effect of the medicine was worse than the illness itself.
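The loss of efficiency can be made concrete by counting applications of the predicate. The sketch below is ours: tailsP is an assumed definition of tails (proper suffixes), and callsComposed and callsFused are hypothetical helpers that count how many times p is applied when the whole result is evaluated.

```haskell
-- Assumed definition of tails: the proper suffixes of a list.
tailsP :: [a] -> [[a]]
tailsP []     = []
tailsP (_:as) = as : tailsP as

-- The original composition.
tfComposed :: (a -> Bool) -> [a] -> [[a]]
tfComposed p = tailsP . filter p

-- The fused definition, as given in the text.
tfFused :: (a -> Bool) -> [a] -> [[a]]
tfFused _ [] = []
tfFused p (a:as)
  | p a       = filter p as : tfFused p as
  | otherwise = tfFused p as

-- Predicate applications in the composition: one per input element.
callsComposed :: [a] -> Int
callsComposed = length

-- Predicate applications in the fused program when the result is
-- fully evaluated: one for the head, plus a whole filter over the
-- tail each time the head is kept.
callsFused :: (a -> Bool) -> [a] -> Int
callsFused _ [] = 0
callsFused p (a:as)
  | p a       = 1 + length as + callsFused p as
  | otherwise = 1 + callsFused p as
```

For a predicate that accepts every element of a list of length n, callsFused grows as n(n+1)/2 while callsComposed stays at n.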

Example 3.6
This example shows another case of the situation presented in the previous example (we simply show the result of applying fusion and skip the details). Consider the function that counts the number of words of a text after having filtered it with a predicate p. Function wc is a paramorphism. It is inspired by one of the word counting algorithms described in [7]. This function adds one each time the end of a word is detected, and for this it uses the current character c and the next one d (except at the end). By para-fold fusion we obtain as result a paramorphism with the following recursive definition: In the original definition of wcf, the inspection of the tail was performed on a text that was already filtered. Now, on the contrary, an on-line filtering of the tail is necessary each time before inspection. In this case the time behaviour of the resulting program is linear, like that of the original one. However, it may happen that the predicate p is applied twice to some of the elements of the input string: once in the context of filter and once more in the condition of the if-then-else. Also, note that the list nodes originally produced by filter are still produced when evaluating the case on filter p cs. So, in spite of our efforts, we could not eliminate the intermediate list.
There are, of course, applications of para-fold fusion that yield satisfactory results. This is illustrated by the following example. Function replace (Example 3.7) is a paramorphism because it returns the tail of the input list as part of the result when the sought value is met.

    replace x y = ⟨|φ1 ▽ φ2|⟩
      where φ1 = λ().[]
            φ2 = λ(a,zs,as). if a==x then y : as else a : zs

Suppose that we want to replace an element in a list after filtering.

    repf x y p = replace x y . filter p

We are again in a situation where we can apply para-fold fusion, obtaining a paramorphism with the following recursive definition:

    repf x y p []     = []
    repf x y p (a:as) = if p a
                          then if a==x then y : filter p as
                                       else a : repf x y p as
                          else repf x y p as
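The fused definition can be compared against the original composition with a small sketch. The names replaceFirst, repfComposed and repfFused are ours; the clauses mirror the definitions of replace and repf given in the text.

```haskell
-- Replace the first occurrence of x by y.
replaceFirst :: Eq a => a -> a -> [a] -> [a]
replaceFirst _ _ []     = []
replaceFirst x y (a:as) = if a == x then y : as else a : replaceFirst x y as

-- The original composition.
repfComposed :: Eq a => a -> a -> (a -> Bool) -> [a] -> [a]
repfComposed x y p = replaceFirst x y . filter p

-- The fused definition: filter survives only on the part of the
-- list that follows the replaced element, so no work is duplicated.
repfFused :: Eq a => a -> a -> (a -> Bool) -> [a] -> [a]
repfFused _ _ _ [] = []
repfFused x y p (a:as)
  | p a       = if a == x then y : filter p as else a : repfFused x y p as
  | otherwise = repfFused x y p as
```

Each element is tested by p exactly once in both versions: before the replacement by repfFused itself, and after it by the remaining call to filter.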
In this case filter needs to be applied to the sublist that remains after the replaced element (in case that element was found), as that sublist is returned as part of the result. The application of para-fold fusion yields a satisfactory result in this case.

Note 3.9 The previous examples have shown the existence of some cases where para-fold fusion may worsen performance. These are fusions of the form ⟨|φ|⟩_F ∘ f in which occurrences of f in the result produce the generation of duplicated computations. This means that, in the presence of paramorphisms, fusion cannot be applied without restrictions. It is thus necessary to include some code analysis that helps us to avoid the application of fusion in those cases where we know performance will decrease. At the moment HFUSION does not perform this kind of analysis, but we plan to do so in the near future.
We give an intuitive characterization of the different cases of ⟨|φ|⟩_F ∘ f in terms of the notion of "computation". The analysis focuses on function φ of the paramorphism:
• If during the computation of φ both the values returned by the recursive calls and their arguments are necessary, then fusion should be avoided. This is the case of tails . filter p and wc . filter p.
• If the values returned by the recursive calls or their arguments (but not both) appear during the computation of φ, then fusion can be safely performed. This is the case of replace x y . filter p and insert x . mapT f. For instance, in the case of insert x = ⟨|φ|⟩, φ is given by:

    φ1 () = Node x Empty Empty
    φ2 (a,(t1,r1),(t2,r2)) = if x < a then Node a r1 t2
                                      else Node a t1 r2

If a computation uses t1, then it does not use r1, and vice-versa. The same holds for t2 and r2. This is the reason that makes fusion in Example 3.8 adequate.
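The good case insert x . mapT f can be sketched as follows. The definitions of Tree, insert and mapT follow Example 3.8; insmapFused is our own hand-derived fused form, given under the assumption that fusion proceeds as described above, and is not the output of HFUSION.

```haskell
data Tree a = Empty | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

-- Insertion in a binary search tree.
insert :: Ord a => a -> Tree a -> Tree a
insert x Empty = Node x Empty Empty
insert x (Node a t1 t2) =
  if x < a then Node a (insert x t1) t2
           else Node a t1 (insert x t2)

mapT :: (a -> b) -> Tree a -> Tree b
mapT _ Empty          = Empty
mapT f (Node a t1 t2) = Node (f a) (mapT f t1) (mapT f t2)

-- The original composition.
insmapComposed :: Ord b => b -> (a -> b) -> Tree a -> Tree b
insmapComposed x f = insert x . mapT f

-- Hand-derived fused form: on each node one subtree is handled by
-- the recursive call and the other by mapT, never both on the same
-- subtree, so no computation is duplicated.
insmapFused :: Ord b => b -> (a -> b) -> Tree a -> Tree b
insmapFused x _ Empty = Node x Empty Empty
insmapFused x f (Node a t1 t2)
  | x < b     = Node b (insmapFused x f t1) (mapT f t2)
  | otherwise = Node b (mapT f t1) (insmapFused x f t2)
  where b = f a
```

Only the path followed by the insertion is deforested; the remaining subtrees are still produced by explicit calls to mapT.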

GENERALIZED PARAMORPHISMS
This section presents a new program scheme that generalizes paramorphisms in the same sense hylomorphisms generalize folds. This generalization of paramorphism will permit us to capture a wider class of recursive functions that use the arguments of the recursive calls to compute the final result. We will state fusion laws associated with generalized paramorphisms, but now in combination with folds, unfolds and hylomorphisms.
To see how this generalization is obtained, let us recall the equation that a paramorphism satisfies, writing out_F instead of in_F:

    ⟨|φ|⟩_F = φ ∘ F ⟨⟨|φ|⟩_F, id⟩ ∘ out_F

The arguments to the recursive calls are obtained by applying the coalgebra corresponding to the destructors of the data type. The generalization we introduce is obtained by considering an arbitrary coalgebra instead.
Given φ :: F (b × a) → b and a coalgebra ψ :: a → F a, the generalized paramorphism {|φ, ψ|}_F :: a → b is the least function f that satisfies the equation

    f = φ ∘ F ⟨f, id⟩ ∘ ψ

The notion of generalized paramorphism is in some sense related to that of parametrically recursive coalgebra [3].
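The definition can be transcribed almost literally into Haskell. The sketch below is ours: ListF, outList, suffixes, down and tailsDown are illustrative names, and down is an assumed countdown function standing in for a list-producing unfold.

```haskell
-- Generalized paramorphism: f = phi . fmap <f, id> . psi.
gpara :: Functor f => (f (b, a) -> b) -> (a -> f a) -> a -> b
gpara phi psi = phi . fmap (\a -> (gpara phi psi a, a)) . psi

-- Base functor of lists with elements of type e.
data ListF e x = Nil | Cons e x

instance Functor (ListF e) where
  fmap _ Nil        = Nil
  fmap g (Cons e x) = Cons e (g x)

-- The coalgebra of the list destructors.
outList :: [e] -> ListF e [e]
outList []     = Nil
outList (e:es) = Cons e es

-- With the destructor coalgebra we recover an ordinary paramorphism.
suffixes :: [e] -> [[e]]
suffixes = gpara phi outList
  where phi Nil              = []
        phi (Cons _ (r, as)) = as : r

-- With an arbitrary coalgebra the seed need not be a list.
downCoalg :: Int -> ListF Int Int
downCoalg n
  | n <= 0    = Nil
  | otherwise = Cons n (n - 1)

-- Assumed countdown function.
down :: Int -> [Int]
down n
  | n <= 0    = []
  | otherwise = n : down (n - 1)

-- The suffixes of a countdown, computed directly from the seed;
-- the call to down stays inside the body.
tailsDown :: Int -> [[Int]]
tailsDown = gpara phi downCoalg
  where phi Nil             = []
        phi (Cons _ (r, n)) = down n : r
```

In tailsDown the second component of each pair is a seed rather than a sublist, so the step function has to regenerate the list from it on every recursive step.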
Example 4.1 Consider the functor L_a that captures the signature of lists. For φ1 :: () → b and

The following equation expresses the fact that paramorphisms are a particular instance of generalized paramorphisms:

    ⟨|φ|⟩_F = {|φ, out_F|}_F

Generalized paramorphisms are as expressive as hylomorphisms. The following equation shows that every hylomorphism can be written as a generalized paramorphism:

    ⟦φ, ψ⟧_F = {|φ ∘ F π1, ψ|}_F

It states a relationship similar to the one between folds and paramorphisms (equation 16).
The relationship in the other direction is the following. For each ψ :: a → F a, let us define the functor G x = F (x × a). Then,

    {|φ, ψ|}_F = ⟦φ, F ∆ ∘ ψ⟧_G

where ∆ = ⟨id, id⟩. Thus, for g = {|φ, ψ|}_F, The following two fusion laws resemble laws for hylomorphism. Observe that in (22) the coalgebra homomorphism is internalized as part of the code of the resulting generalized paramorphism.

Proposition 4.2 (gpara fusion)
Proof Both laws can be proved by fixed point induction. We show the proof of (22) as it illustrates how f becomes part of the result. Let us define γ(g

Taking into account the close similarity between generalized paramorphisms and hylomorphisms, one may think of the existence of a factorization property similar to that of hylomorphism, stating that every generalized paramorphism can be split up into the composition of a paramorphism with an unfold, i.e. {|φ, ψ|}_F = ⟨|φ|⟩_F ∘ [(ψ)]_F. However, this law does not hold. The failure originates in the fact that paramorphisms, in contrast to folds, use the arguments to the recursive calls to compute their results. The following law shows that the result of fusing the composition of a paramorphism with an unfold is a generalized paramorphism which internalizes the computation of the unfold as part of its code. Function tails is a paramorphism while down is an unfold. By applying para-unfold fusion we obtain:

Proposition 4.3 (para-unfold fusion)
This is again a situation in which the composition of two linear time functions gives a quadratic function as result. This is due to tails.
The following law is a direct consequence of para-fold fusion (Proposition 3.4).

Proposition 4.5 (para-hylo fusion)
For strict φ,

The two previous fusion laws showed compositions that yield generalized paramorphisms as result. The laws that follow are acid rain laws with generalized paramorphism as argument. The generalization of paramorphism opens the possibility of an acid rain law with unfold.
Proposition 4.7 (gpara-unfold fusion)

Proof Same proof as for fold-para fusion (Prop. 3.2), but using (22) and the free theorem for σ.

    []        -> in1 ()
    [a]       -> in2 a
    (a:a':as) -> in3 (a,a',as)

The coalgebra ψ does not correspond to out_{L_a}, for L_a the base functor of lists, because it contains nested patterns. It can, however, be written as ψ = σ(out_{L_a}), where σ is given by: On the other hand, map, which is usually presented as a fold over lists, can be expressed as an unfold as well: map f = [(ψ)], where

    dWt p []     = []
    dWt p (a:as) = if p as then dWt p as else as : tails as

Note 4.12 The characterization of good and bad cases of fusion that we can add with the introduction of generalized paramorphism is much the same as the one presented in Note 3.9. Now we must analyze function φ in compositions of the form {|φ, ψ|}_F ∘ f and ⟨|φ|⟩_F ∘ f in order to conclude whether fusion is desirable or not. Performing such an analysis we can conclude, for instance, that tails . down is a bad case while drop2While p . map f and dropWhile p . tails are good ones.

FUSION IN PRACTICE
Our interest in studying generalized paramorphisms has arisen in the context of the development of HFUSION, a fusion tool for Haskell programs that is a reimplementation and an extension of the HYLO system [13]. The tool essentially translates recursive function definitions written in Haskell into hylomorphisms and then applies acid rain laws of hylomorphism to function compositions indicated by the user. This explains our special attention to the acid rain laws for the different recursion schemes.
The reason for translating all recursive functions into hylomorphisms is twofold. One is the expressive power of hylomorphism, in the sense that all other recursive program schemes can be written in terms of it. The other reason is simplicity: by translating all recursive functions into hylomorphisms, the internal engine needs to manipulate only one form of recursion and thus implement only a few fusion laws and restructuring algorithms.
During the implementation of the kernel of the tool we started experimenting with some examples that were fusable by our implementation (modulo some simple modifications to the internal representation of hylomorphisms), but were impossible to fuse with the original representation and laws. We then wanted to give an explanation of these modifications at the abstract level, and it was during that process that the notion of generalized paramorphism came up as the appropriate abstraction that reflects the class of special cases we were playing with. With these modifications, the tool essentially interprets every recursive function as a generalized paramorphism. This explains our definition of generalized paramorphism.
The equivalence in expressive power between hylomorphism and generalized paramorphism (witnessed by equations (19) and (20)) permits us to assure that we are not losing fusion cases with the introduction of generalized paramorphism. On the contrary, we gain new cases captured by para-hylo fusion. We illustrate this by means of an example. Consider again the function composition presented in Example 3.7: repf x y p = replace x y . filter p. This composition corresponds to a successful case of fusion. The main reason for the success is the fact of having viewed replace as a paramorphism. If, on the contrary, we view this function as a hylomorphism, then fusion fails. Let us see the reason. The definition of replace as a hylomorphism is: The coalgebra ψ is not exactly out_{L_a}, but it contains it. Therefore, replace is a hylomorphism of the form ⟦φ, ψ⟧_{G_a}, for ψ ≠ out_{L_a}. On the other hand, filter is a fold (|τ(in)|) whose algebra can be expressed in terms of a polymorphic function τ. If we try to fuse these two functions as hylomorphisms, then the only law we could apply is fold-hylo fusion, but this is impossible because replace is not a fold.

CONCLUSIONS AND FINAL REMARKS
In this paper we introduced a generalized version of paramorphism which has an expressive power equivalent to hylomorphism. We showed acid rain laws for both the generalized and the standard version of paramorphism. With the introduction of generalized paramorphism we gained new fusion cases that cannot be captured with the laws of hylomorphism.
However, there are also some negative aspects. In particular, we saw the existence of compositions involving paramorphisms that may lead to programs with worse performance by the application of fusion. Therefore, in the presence of paramorphisms one should first perform some analysis on the code in order to determine whether to apply fusion or not.
There are some other cases, like insert x . mapT f, where fusion deforests just a single path from the root to the leaves. This is due to the fact that a paramorphism not only traverses its input, but also keeps it for computing the outcome. So in this case only a small amount of the intermediate data structure was eliminated. Nonetheless, fusion with paramorphisms may be good for bringing other functions together. For example, after fusing map g . replace x y . filter q we will have map g . filter q in the body of the resulting function.
Concerning HFUSION, we still owe a benchmark that tests the effectiveness of our approach based on generalized paramorphisms on real programs. We also need to implement a code analysis to avoid bad fusion cases.

Example 3.7
Consider the function that replaces the first occurrence of a value in a list by a given value.

    replace :: Eq a => a -> a -> [a] -> [a]
    replace x y []     = []
    replace x y (a:as) = if (a==x) then y : as
                                   else a : replace x y as

Example 3.8
Consider the composition of the function that inserts a value in a binary search tree with the map function for binary trees.

    data Tree a = Empty | Node a (Tree a) (Tree a)

    insmap x f = insert x . mapT f

    insert x Empty          = Node x Empty Empty
    insert x (Node a t1 t2) = if x < a then Node a (insert x t1) t2
                                       else Node a t1 (insert x t2)

    mapT f Empty          = Empty
    mapT f (Node a t1 t2) = Node (f a) (mapT f t1) (mapT f t2)

Example 4.8
Consider the following composition:

    dm p f = drop2While p . map f

    drop2While :: (a -> Bool) -> [a] -> [a]
    drop2While p []        = []
    drop2While p [a]       = if p a then [] else [a]
    drop2While p (a:a':as) = if p a then drop2While p as
                                    else a:a':as

Function drop2While can be defined as a generalized paramorphism with functor H a b = 1 + a + a × a × b.
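A fused form of this composition can be sketched by hand as follows. The name dmFused and its clauses are our own derivation, obtained by pushing map f through the three cases of drop2While; it is an illustration under that assumption, not the output of HFUSION.

```haskell
-- Inspect the first element of each pair; drop the pair while the
-- predicate holds on it.
drop2While :: (a -> Bool) -> [a] -> [a]
drop2While _ []  = []
drop2While p [a] = if p a then [] else [a]
drop2While p (a:a':as) =
  if p a then drop2While p as else a : a' : as

-- The original composition.
dmComposed :: (b -> Bool) -> (a -> b) -> [a] -> [b]
dmComposed p f = drop2While p . map f

-- Hand-derived fused form: f is applied on the fly while dropping,
-- and map f survives only on the suffix that is returned unchanged.
dmFused :: (b -> Bool) -> (a -> b) -> [a] -> [b]
dmFused _ _ []  = []
dmFused p f [a] = let b = f a in if p b then [] else [b]
dmFused p f (a:a':as) =
  let b = f a
  in if p b then dmFused p f as else b : f a' : map f as
```

Each step uses either the recursive result or the remaining input, never both, which matches the characterization of good fusion cases.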