[m-users.] Enable LCMC tail recursion optimization by default?

Paul Bone paul at bone.id.au
Sat Apr 23 22:34:41 AEST 2016

On Sat, Apr 23, 2016 at 07:34:40PM +1000, Zoltan Somogyi wrote:
> I have long thought that the ideal way to handle lcmc and similar
> optimizations is for a profiler to generate a list of procedures whose runtime
> is nontrivial, and to give one or more such lists to the compiler.
> The compiler should then optimize the procedures that appear on at least
> one such list for speed, and every other procedure for size. That is from
> the point-of-view of speed. From the point-of-view of robustness, we
> would also need a list of procedures whose inputs can cause them to
> exhaust the stack. Unfortunately, I don't see any easy way to generate
> that list automatically. The converse, a list of procedures whose inputs
> are *bounded* somehow to a level that is guaranteed *not* to stress
> the stack may be easi*er* to obtain, but that task is still substantial.
> And any case, I expect *most* predicates would then fall into the category
> of "NOT known to NOT to be able to exhaust the stack". Compiling them
> with lcmc, just in case, would yield a result that would be scarcely
> distinguishable from compiling *everything* with lcmc.

This is an ideal use of the feedback system.  The feedback system was also
designed to be extensible, allowing new "reports" to be added.  This kind of
report is also pretty straight-forward.  It should not take more than a

The report can contain:

    A list of procedures executed frequently (high "call" port count):
    The threshold for inclusion should be a percentage, eg: "the top 10% of
    procedures by call count".

    A list of procedures never executed.

Choosing a default threshold for inclusion in the first list could be done
by manually looking at a histogram of call counts and picking a percentage.
For example if 90% of the program's time is spent in 10% of the procedures
then that seems reasonable.

Note that call port counts are going to be slightly different from self call
sequence counts.  For this I think that port call counts will be best so as
not to weight results by procedure size.

I've also considered using this kind of information, coverage information in
particular, to guide inlining.  If a procedure is executed frequently but
its error branches are seldom if ever called, then it doesn't make sense to
inline code on those branches, increasing the procedure size and negatively
affecting locality for the executions that _do not_ execute those branches.
Inlining could also be made more aggressive along the frequently executed

Paul Bone

More information about the users mailing list