[m-rev.] for review: optimizing mutually recursive tail calls in MLDS
Julien Fischer
jfischer at opturion.com
Sat Sep 2 23:46:26 AEST 2017
Hi Zoltan,
On Tue, 29 Aug 2017, Zoltan Somogyi wrote:
> The logical reviewer for this diff would be Paul, but I don't think
> I can ask him, so it is for review by anyone.
>
> The diff passes bootcheck on my laptop, but I would like to ask Julien,
> whether or not he reviews the change, to test it out in the Java and C#
> grades.
I have done so and not encountered any problems with this diff.
...
> Implement mutual tail call optimization for the MLDS.
>
> It does not (yet) optimize all the calls that the LLDS backend optimizes,
> with the most significant limitation being that it handles only calls to
> procedures that return all their arguments by reference. This rules out
> det functions (thought it includes *semidet* functions), as well as the
> backends that use copy-out. Lifting those limitations is future work.
...
> diff --git a/compiler/ml_args_util.m b/compiler/ml_args_util.m
> index eeb8600..7645dd2 100644
> --- a/compiler/ml_args_util.m
> +++ b/compiler/ml_args_util.m
> @@ -32,6 +32,7 @@
>
> :- import_module bool.
> :- import_module list.
> +:- import_module map.
>
> %---------------------------------------------------------------------------%
> %
> @@ -84,6 +85,72 @@
> list(mer_mode)::in, pred_or_func::in, code_model::in,
> mlds_func_params::out, ml_gen_info::in, ml_gen_info::out) is det.
>
> + % ml_gen_tscc_arg_decls(ModuleInfo, Vars, Types, Modes, VarSet, Context,
> + % ProcIdInTscc, NextInArgNum0, NextOutArgNum0, NumOutputs,
> + % !OutArgNames, TsccInArgs, TsccInLocalVarDefns, TsccArgs,
> + % OwnLocalVarDefns, CopyTsccToOwnStmts) :-
s/:-/:/ at the end there.
> + % Given a procedure in a TSCC and information about its argument list
> + % in the form of the arguments' variable names (Vars), types (Types),
> + % and modes (Modes), return the information the code generator needs
> + %
> + % - to generate code for tail calls to this procedure, from other
> + % procedures in the TSCC as well from itself, and
> + % - to generate parameter passing code in the procedure's prologue.
> + %
> + % The former is needed before we can start generating code for *any*
> + % of the procedures in the TSCC.
> + %
> + % Parameter passing between the procedures in a TSCC is done by
> + % giving each procedure its own list of lvn_tscc_proc_input_vars
> + % (representing the formal input parameters) and having each tail call
> + % assign the actual input parameters to these. Results are returned
> + % through lvn_tscc_output_vars, each of which contains the address
> + % of the memory slot where the result should be stored.
> + % Both lvn_tscc_proc_input_vars and lvn_tscc_output_vars include
> + % an argument sequence number, with separate sequences for input and
> + % output arguments. These sequences start at 1, so all top level calls
> + % to ml_gen_tscc_arg_decls should pass 1 as both NextInArgNum0 and
> + % NextOutArgNum0. ml_gen_tscc_arg_decls returns the total number of
> + % output arguments, to allow our caller to do a sanity check (requiring
> + % all procedures in a TSCC to have the same number of output arguments).
> + %
> + % ml_gen_tscc_arg_decls will record the names of the output arguments
> + % in !OutArgNames, if the initial value of !.OutArgNames contains no
> + % such record.
> + %
> + % The value returned in TsccInArgs will be a list of the
> + % lvn_tscc_proc_input_vars of the input arguments only. This is used
> + % during the generation of code for tailcalls.
> + %
> + % The value returned in TsccInLocalVarDefns will be a local var definition
> + % of each variable in TsccInArgs. This is used to define those variables
> + % in copies of the TSCC code that are *not* for this procedure.
> + %
> + % The value returned in TsccArgs a list of all the arguments of the
*is* a list of
> + % procedure, both input and output, in declaration order. This will be
> + % the argument list of the copy of the TSCC for this procedure,
> + % and thus it will define the lvn_tscc_proc_input_vars in that copy.
> + % (Thus the lvn_tscc_proc_input_vars will be defined by exactly one of
> + % TsccInLocalVarDefns and TsccInArgs in each copy.)
> + %
> + % Each procedure's part in the TSCC defines local MLDS variables for each
> + % argument, both input and output (returned in OwnLocalVarDefns), and
> + % its code starts by copying the values of each lvn_tscc_proc_input_var
> + % *and* lvn_tscc_output_var to the local var (from OwnLocalVarDefns)
> + % corresponding to them. For input argument, this copies the value;
> + % for output arguments, this copies the address where the argument's value
> + % should be put. These assignments are returned in CopyTsccToOwnStmts.
> + %
...
> diff --git a/compiler/ml_gen_info.m b/compiler/ml_gen_info.m
> index 1fbf7dc..a20c7cb 100644
> --- a/compiler/ml_gen_info.m
> +++ b/compiler/ml_gen_info.m
...
> %---------------------------------------------------------------------------%
> %
> -% Initialize the ml_gen_info.
> +% Initialize the ml_gen_info, which contains the state of the code generator
> +% while it generates MLDS code for a HLDS procedure.
> +%
> +% When the HLDS procedure is part of a TSCC of several mutually-tail-recursive
> +% procedures, we bundle we generate for each procedure into a single piece of
... we bundle *the code* we generate ...
> +% code for *each* entry procedure of the TSCC. To help prevent accidental name
> +% collisions between the compiler-generated local variables of the different
> +% procedures, we get those procedures to use non-overlapping sets of sequence
> +% numbers in the names of those compiler-generated variables, by
> %
> +% - creating an initial set of counters (the sources of those sequence numbers)
> +% before starting to generate code for a TSCC, and
> +% - threading the values of those counters from one procedure to the next,
> +% by calling ml_gen_info_final to pick up the values of those counters
> +% after code generation is finished for one procedure, and passing them
> +% to ml_gen_info_init when starting to generate code for the next procedure.
> +%
> +% The ml_gen_tscc_info structure contains not just these counters, but also
> +% information about tail recursion. By definition, the procedures in a TSCC
> +% contain tail recursive calls to the same set of procedures (the procedures
> +% of the TSCC), but TSCCs are computed from calls in the HLDS. A call that is
> +% a tail call in the HLDS is sometimes not a tail call in the MLDS, which can
> +% happen if making the call a tail call would leave a dangling reference.
> +% We therefore need to record whether each procedure in the TSCC has an
> +% *MLDS* tail call generated to it from any of the *other* procedures
> +% in the TSCC, because if it doesn't, then its MLDS code isn't actually
> +% mutually-tail-recursive, and it should be handled as such. (This is because
> +% the code we generate for only *self*-tail-recursive procedures has lower
> +% overhead than the code we generate for *mutually*-tail-recursive procedures.)
...
> diff --git a/compiler/ml_proc_gen.m b/compiler/ml_proc_gen.m
> index 950913c..609b7b6 100644
> --- a/compiler/ml_proc_gen.m
> +++ b/compiler/ml_proc_gen.m
...
> :- func project_pred_proc_id_info_id(pred_proc_id_info) = pred_proc_id.
>
> -project_pred_proc_id_info_id(pred_proc_id_info(PredProcId, _, _)) = PredProcId.
> +project_pred_proc_id_info_id(PredProcIdInfo) = PredProcId :-
> + PredProcIdInfo = pred_proc_id_info(PredProcId, _, _, _).
>
> + % Partition the procedures in an SCC into following four categories.
into *the* following four categories.
> + %
> + % - Those that don't contain any tail calls (NoneIdInfos).
> + % - Those that in which we can only optimize self recursive tail calls,
> + % either because they don't contain any mutually tail recursive calls,
> + % or because we can't (yet) optimize the mutually tail recursive calls
> + % they do contain (SelfIdInfos).
> + % - Those model_det procedures in which we can optimize mutually tail
> + % recursive calls (MutualDetIdInfos).
> + % - Those model_semi procedures in which we can optimize mutually tail
> + % recursive calls (MutualSemiIdInfos).
> + %
> :- pred partition_scc_procs(module_info::in, list(pred_proc_id)::in,
> list(pred_proc_id_info)::out, list(pred_proc_id_info)::out,
> - list(pred_proc_id_info)::out) is det.
> + list(pred_proc_id_info)::out, list(pred_proc_id_info)::out) is det.
...
> @@ -550,26 +600,832 @@ construct_func_defn(ModuleInfo, PredProcIdInfo, FuncParams, FuncBody,
...
> +ml_gen_tscc(ModuleInfo, Target, ConstStructMap, _SCCEntryPredProcIds,
> + TsccCodeModel, TSCCE, !FuncDefns, !GlobalData, !Specs) :-
> + TSCCE = scc_with_entry_points(PredProcIds, _CalledFromHigherTSCCs,
> + _ExportedTSCCPredProcIds),
> + PredProcIdList = set.to_sorted_list(PredProcIds),
> + (
> + PredProcIdList = [],
> + unexpected($pred, "empty TSCC")
> + ;
> + PredProcIdList = [SinglePredProcId],
> + % For a TSCC containing just one procedure, we neither need nor want
> + % the extra overhead required for managing *mutual* tail recursion.
> + ml_gen_proc_lookup(ModuleInfo, Target, ConstStructMap, self_tail_rec,
> + SinglePredProcId, !FuncDefns, !GlobalData, !Specs)
> + ;
> + PredProcIdList = [_, _ | _],
> + % Try to compile each procedure in the TSCC into the MLDS code
> + % fragment that will go under "top_of_proc_i" for each i.
> + %
> + % Caveat 1:
> + % By definition, every procedure in the TSCC has at least one
> + % mutually-tail-recursive call to it from another procedure
> + % in the TSCC. However, the TSCC is computed from the HLDS.
> + % Some calls that look like tail calls in the HLDS turn out
> + % *not* to be tail calls in the MLDS, because they pass as input
> + % arguments the addresses of local variables in the caller's stack
> + % frame, addresses that would become dangling references if the
> + % call were made a tail call. If it turns out that *none* of the
> + % mutually-tail-recursive calls in the HLDS to a procedure
> + % turn out to be mutually-tail-recursive calls in the MLDS,
> + % then including that procedure in the scheme we use above
> + % would be a waste; it would just incur overhead for no gain
> + % in either stack usage or speed. ml_gen_tscc_trial will return
> + % the identities of such procedures in NoMutualPredProcIds.
> + % All the procedures in MutualPredProcIds will have mutually-tail-
> + % recursive calls to them in the *M*LDS. (Therefore MutualPredProcIds
> + % may have 0 elements, or 2, or 3, or any other number except 1.)
> + %
> + % Caveat 2:
> + % If the MLDS code we generated for any of MutualPredProcIds contained
> + % any function definitions nested inside of them, then we have problem.
*a* problem
> + % The scheme shown above would create M copies of each such nested
> + % function definition, with M > 1. In the usual case of us generating
> + % code non-nested code, ml_elim_nested.m would hoist out such nested
Delete the first "code" there.
> + % definitions M times, leading to the M copies of the hoisted-out
> + % definitions. Therefore, until we teach ml_elim_nested.m not to do
> + % that, we abandon the optimization of mutually recursive tail calls
> + % in the presence of such nested function definitions.
Julien.
More information about the reviews
mailing list