# [m-dev.] optimizing matrix/vector expressions

Fergus Henderson fjh at cs.mu.OZ.AU
Fri Aug 15 18:17:57 AEST 2003

```The module included below demonstrates a flaw in the way that the
compiler's deforestation/partial deduction optimizations works which
causes the compiler to miss an important optimization opportunity.

This module is a micro-benchmark for evaluating the
vector expression A = -B + 2 * C.  The idea is that
this should eventually get compiled down to a single loop
which performs all of the calculations, rather than
creating a bunch of different intermediate vectors.
To ensure this, we code it using an expression tree,
writing the expression as

A = eval(-vec(B) + 2.0 * vec(C))

where the operators -, +, * and vec/1 are all just constructors for an
expression tree type (`vec_expr(T)'), and the actual evaluation is done
by the eval/1 function.  The eval/1 function first examines the
expression tree to figure out the length of the vector, constructs
an array of the appropriate length, and then loops over the elements
of the vector, filling in the entries for this array by evaluating
the expression tree.

:- module vector_expressions.
:- interface.

:- func benchmark(vec(float), vec(float)) = vec(float).

:- type vec_expr(T)
--->	vec_expr(T) + vec_expr(T)
;	T * vec_expr(T)
;	-vec_expr(T)
;	vec(vec(T)).

:- type vec(T) == array(T).

:- implementation.
:- import_module array, float, int.

benchmark(B, C) = A :-
A = eval(-vec(B) + 2.0 * vec(C)).

:- func eval(vec_expr(float)) = vec(float).
eval(E) = build_array(bounds(E), eval_elem(E)).

:- func bounds(vec_expr(T)) = int.
bounds(M1 + _M2) = bounds(M1).
bounds(_C * M) = bounds(M).
bounds(-M) = bounds(M).
bounds(vec(A)) = array__max(A).

:- func eval_elem(vec_expr(float), int) = float.
eval_elem(X + Y, I) = eval_elem(X, I) + eval_elem(Y, I).
eval_elem(C * X, I) = C * eval_elem(X, I).
eval_elem(-X, I) = -eval_elem(X, I).
eval_elem(vec(A), I) = A^elem(I).

:- func build_array(int, func(int) = T) = array(T).
build_array(Size, Func) = A :-
make_uninit_array(Size, A0),
A = fill_in_array(0, Size, Func, A0).

:- func fill_in_array(int, int, func(int) = T, array(T)) = array(T).
:- mode fill_in_array(in, in, in, array_di) = array_uo.
fill_in_array(I, Size, F, A0) = A :-
( I >= Size ->
A = A0
;
array__set(A0, I, F(I), A1),
A = fill_in_array(I + 1, Size, F, A1)
).

:- pred make_uninit_array(int, array(T)).
:- mode make_uninit_array(in, array_uo) is det.
:- external(make_uninit_array/2).

Since the expression tree in benchmark/2 is known statically, the
compiler ought to be able to optimize it away entirely, by inlining
eval/1, and then partially evaluating both uses of the expression
tree within eval/1.  And indeed, the compiler does partially evaluate
away the first traversal of the expression tree when it is used in bounds/1
to figure out vector length.  However, the compiler does _not_ manage to
partially evaluate the call to eval_elem/2.

The compiler starts off reasonably well: it inlines eval/1 and build_array/2,
and the compiler's higher-order specialization is able to successfully
specialize fill_in_array/4, thus converting the call F(I) in
fill_in_array/4 to instead call eval_elem(E, I).  So we're left
with benchmark/2 calling bounds/1 and fill_in_array/4,
and with fill_in_array/4 calling eval_elem/2 and itself (recursively).
We successfully partially evaluate the call to bounds/1, but
the compiler's partial deduction thinks that there is no useful
information available for specializing the call to fill_in_array/4
from benchmark/2, and so it is not able to specialize the call to
eval_elem from fill_in_array/4.

Any suggestions for how we might be able to get the compiler to
successfully optimize this?

--
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au