[m-rev.] For review: --inline-linear-tail-rec-sccs

Zoltan Somogyi zoltan.somogyi at runbox.com
Thu Mar 23 12:51:23 AEDT 2017



On Thu, 23 Mar 2017 12:13:33 +1100, Paul Bone <paul at bone.id.au> wrote:
> It's good to have this special (linear)
> case handled, but I guess that it would be best to handle all the other
> cases with a general "should we inline things in this SCC?" decision.

Yes. However, providing answers to such question requires inlining
to *know* about SCCs. Before this diff, it didn't know; now it does.

> However such a general solution would need to be revised (maybe turned off)
> once the MLDS can do mutual tail recursion since the MLDS tail recursion
> won't increase the code size.

Since most procedures are *not* in a mutually-tail-recursive SCC, the overall
code size increase should be minimal. And converting mutual-tail-recursion
into self-tail-recursion by inlining guarantees that the MLDS code generator
will be able to generate tail recursive code; the MLDS code generator cannot
make that same guarantee for those same predicates in their mutual-tail-recursion
form.

> A general inlining solution could also consider each call site individually.

Maybe, but if you are talking about truly *individual* consideration, then
that is a totally different issue. (This diff can treat different call sites differently,
but only based on the tail call vs not tail call distinction, which is not *fully*
individual.)

> It might not eliminate mutual recursion, but it might reduce its impact and
> make it easier for the MLDS backend to deal with.

I don't see what you want you want to say with this.

> One other special case I can think of is when you have linear recursion, but
> not all the recursive calls are tail calls.
> 
> p :-
>     ...,
>     q.   % Tail call
> 
> q :-
>     ...,
>     r,   % Non-tail call
>     ....
> 
> r :-
>     ...,
>     p.   % Tail call.
> 
> This can be inlined in the same way as you have done.  Except that r should
> be inlined into q, rather than being made the self-tail call.  If necessary
> (if r is the entry into the SCC), then it could be duplicated, one copy made
> to enter the SCC, and other inlined within q.

That would be a different though related optimization from the one that my diff
implements.
 
> > +    % Should the dependency graph include an edge from p to q
> > +    % only if p calls q in a tail call (only_tail_calls calls for this),
> > +    % or if p calls q in any call, and if p references q in a unification
> > +    % (all_calls_and_unifies).
> > +    %
> > +    % Note that only_tail_calls requires recursive calls to be marked by
> > +    % mark_tail_calls.m, and mark_tail_calls.m requires a previously built
> > +    % dependency graph, which therefore must have been built with
> > +    % all_calls_and_unifies.
> 
> So I want to check that I understand.
> 
>     You build the dependency info with all_calls_and_unifies,
>     Then mark tail calls,
>     Then build a new dependency info with only_tail_calls,
>     Then do inlining, which uses these markers to perform the new
>     optimisation.
>     Then mark tail calls again for the LLDS potentially generating warnings.

Yes, but the new dependency info with only the tail calls is only for
the SCC being processed.

> The problem I want to avoid, and it looks like you have, is that warnings
> for mutualy recursive calls that are not tail calls (generated by
> mark_tail_calls) could be incorrect if generated before inlining, or on dead
> versions of the SCCs that have not yet been eliminated.

My change leaves the process of generating warnings completely untouched;
none of the new invocations of mark_tail_calls.m will generate any warnings.

> > -    module_info_ensure_dependency_info(!ModuleInfo, DepInfo),
> > -    PredProcs = dependency_info_get_condensed_bottom_up_sccs(DepInfo),
> > -    set.init(InlinedProcs0),
> > -    do_inlining(PredProcs, NeededMap, Params, InlinedProcs0, !ModuleInfo),
> > +    % Build the call graph and extract the list of SCC. We process
> > +    % SCCs bottom up, so that if a caller wants to inline a callee
> > +    % in a lower SCC, it gets the *already optimized* version of the callee.
> > +    % If LinearTailRec = no, we don't try to do anything special about
> > +    % calls where the callee is in the *same* SCC as the caller.
> 
> List of SCCs

What correction are you proposing? *Everything* is in the list of SCCs.

> > +    ( if
> > +        % If every procedure in the SCC is reachable from every other
> > +        % via tail recursive calls, and ...
> > +        TSCCsEntries = [_],
> > +        % ... the number of tail recursive calls in the SCC is equal
> > +        % to the number of procedures in the SCC, ...
> > +        TSCCArcs = dependency_info_get_arcs(TSCCDepInfo),
> > +        list.length(TSCCArcs, NumTSCCArcs),
> > +        list.length(SCCProcs, NumSCCProcs),
> > +        NumTSCCArcs = NumSCCProcs
> 
> What happens if you have:
> 
>     p :-
>         q.
>     q :-
>         (
>             r
>         ;
>             r
>         ).
>     r :-
>         p,  % non-tail call.
>         ....
> 
> Now there are three procedures, and three recursive tail calls.  But we
> probably dont want to inline r twice.  Particularly if r is or becomes large
> due to other inlining.

For the code above, r will be in a separate TSCC from p and q, so
TSSCEntries won't equal [_], so the new transformation won't be applied.

The transformation is also designed never to inline the same procedure
twice. (The test is the number of tail calls being equal to the number of
procedures in the SCC.) It is this constraint that I was asking about
relaxing. Each extra tail call we allow in the SCC may (though typically
it won't) double the size of the resulting code.

> I also wonder how useful it would be to relax this.  If only some of the
> tail calls are mutual calls.
> 
>     p :-
>         q.
>     q :-
>         (
>             r
>         ;
>             r,  % non-tail call.
>             ...
>         ).
>     r :-
>         p.
> 
> Inlining r into the tail recursive call site in q.  and q into p will result
> in constant stack space _if_ the first branch is executed.  If the first
> branch _is_ executed more often then this could be a good optimisation, if
> it isn't I don't think it would be a sagnificant pesimisation.  I guess the
> real question is would it be an optimisation enough of the time ti make
> writing the optimisation worth-while.

If you mean to imply that all the calls above are tail recursive with the
one marked exception, then code like that *will* be transformed,
because each procedure in the SCC has exactly one *tail* call
to another member of the SCC.

> You should update the copyright information on each of the files you touch.
> Copyright (C) 2017 The Mercury Team

With CVS, we had a pre commit hook to do this automatically.
Can someone who knows git better than me create something similar?

> > diff --git a/doc/user_guide.texi b/doc/user_guide.texi
> > index bb94558..e990a8d 100644
> > --- a/doc/user_guide.texi
> > +++ b/doc/user_guide.texi
> > @@ -8770,6 +8770,14 @@ containing more than @var{threshold} variables. Procedures
> >  containing large numbers of variables can cause
> >  slow compilation.
> >  
> > + at item --inline-linear-tail-rec-sccs @var{threshold}
> > + at findex --inline-linear-tail-rec-sccs
> > +Given a set of mutually recursive procedures (an SCC, or strongly
> > +connected component, of the call graph) in which each procedure
> > +contains exactly one call to a procedure in the SCC and
> > +that call is a tail call, inline the callee at every one of those
> > +call sites.
> > +
> 
> In this and maybe the --help text.  It'd be good to tell users what this
> means to them.  Eg: it reduces mutual recursion to self tail recursion,
> which the MLDS can optimize directly into a loop.
> 
> 
> There's only a couple of things I'd like changed before you commit it.

I committed it on tuesday, after warning on monday that I was going to do so.

If you have a suggestion for the extra text, post it as a diff for review.

>  But
> I'd also like to know that the "there are the same number of tail recursive
> calls as procedures" check is sufficient, or there's other checks elsewhere
> that I missed.

Yes, it looks like you missed a check; it is pointed out above.

> Thanks, once you've committed I'll merge/re-write my change based on this
> (your review was first).

Yes. I will review your diff after you post a post-merge-conflict-resolution
version.

Zoltan.




More information about the reviews mailing list