[m-rev.] for review: direct_arg_in_out

Peter Wang novalazy at gmail.com
Mon Jan 4 18:17:10 AEDT 2021


On Wed, 30 Dec 2020 22:50:46 +1100 "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> For review by Peter, since he implemented the direct_arg optimization.
> 
> Note that I have already committed the change to compiler/builtin_ops.m,
> since some of the changes in this diff require it to be present in the
> installed compiler.
> 
> Zoltan.

> Fix filling in partial terms that use direct_arg tags.
> 
> This fix uses the approach discussed on m-dev 2020 nov 16/17 for fixing
> github issue #72, whose core problem is a need for information flow
> back to a the caller from a callee when the callee fills in the
> argument of a function symbol whose representation is a direct_arg tag.
> In most cases when the callee fills in the value of an argument,
> the caller can see it because the argument is in a word on the heap,
> but when the function symbol uses a direct_arg tag, that is not the case.
> 
> compiler/direct_arg_in_out.m:
>     A new module that implements the transformation proposed on m-dev.
>     It creates a fresh clone variable every time an argument of a direct_arg
>     tag function symbol is (or may be) updated. This can happen several
>     times if a type has more than one function symbol with a direct_arg tag.
>     Since the affect variable can be bound to only one function symbol

affected

> diff --git a/compiler/direct_arg_in_out.m b/compiler/direct_arg_in_out.m
> index e69de29bb..2fff65aa0 100644
> --- a/compiler/direct_arg_in_out.m
> +++ b/compiler/direct_arg_in_out.m
> @@ -0,0 +1,2016 @@
> +%---------------------------------------------------------------------------%
> +% vim: ft=mercury ts=4 sw=4 et
> +%---------------------------------------------------------------------------%
> +% Copyright (C) 2020 The Mercury team.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%---------------------------------------------------------------------------%
> +%
> +% File: direct_arg_in_out.m.
> +% Main author: zs.
> +%
> +% This module addresses a problem that can arise when a procedure fills in
> +% one or more fields in an argument that was originally passed to it
> +% in a partially instantated form.
> +%
> +% In the vast majority of cases, such arguments need no special handling.
> +% The caller passes a tagged pointer to a partially-filled-in heap cell,
> +% and the callee simply fills in the parts of the heap cell corresponding
> +% to the field or fields that it instantiates. When the callee returns,
> +% the caller will find those fields filled in.
> +%
> +% The problem arises when the function symbol whose field is being filled in
> +% has the direct_arg representation. This representation is applicable
> +% only to function symbols with a single argument, and only when the argument
> +% type's representation guarantees that the argument's value will contain
> +% all zeroes in its primary tag bits. In such cases, the compiler
> +% represents the function with a direct_arg_tag(N) cons tag, which means that
> +% the representation of this function symbol, applied to its single argument,
> +% will be the value of the argument, with the guaranteed-to-be-zero bits
> +% in the argument value replaced by N. (This primary tag value may, or may not,
> +% be needed to distinguish this function symbol from any other function symbols
> +% in the whole term's type.)
> +%
> +% The problem that this module handles is that when a callee fills in
> +% the argument value of such a term, this update affects only the callee's
> +% own local variables. It does *not* affect any heap cells, nor anything
> +% else that the caller can see. Without compensation for this effect,
> +% the translated program will contain a bug. (See test cases gh72[ab...].m
> +% in tests/hard_coded.) This module is the needed compenstation.

compensation

> +% The third part of the solution is to consistently modify all procedure
> +% bodies to implement that idea. When we find a call to a daio procedure,
> +% we create clones of all its daio variables, update the argument vector
> +% to pass the clones of the daio arguments as well, redirect the call
> +% to the daio clone of the callee procedure, we ensure that we never again
> +% refer to the original, pre-clone version of each such daio variable.
> +% In straight-line code that follows such calls, we can achieve this
> +% by simple substitution, but we also have to handle situations in which
> +% a branched control structure (if-then-else, disjunction, switch or
> +% atomic goal) may need to clone different sets of daio variables in its

(we should consider another name for 'atomic' goals to avoid such a
sentence, assuming someone was to work on TM again)

> +%
> +% This optimization is one reason why mercury_compile_middle_passes.m invokes
> +% this module after the lambda expansion transformation. Another reason is that
> +% it allows us to transform higher order calls the same way as we do plain
> +% calls, provided we transform every reference to a daio procedure in
> +% unifications that create closures to the clone of that daio procedure.
> +% (The arguments that we put inside closures cannot be daio arguments,
> +% since such arguments must be ground.)
> +%

I think there should be a short comment explaining why the alternative of
changing the calling convention was not pursued, if only to make it clear
it was considered.

> +
> +:- pred is_direct_arg_in_out_posn(module_info::in, vartypes::in,
> +    prog_var::in, mer_mode::in, is_mode_daio::out) is det.
> +
> +is_direct_arg_in_out_posn(ModuleInfo, VarTypes, Var, Mode, IsDAIO) :-
> +    module_info_get_type_table(ModuleInfo, TypeTable),
> +    lookup_var_type(VarTypes, Var, Type),
> +    ( if
> +        type_to_ctor(Type, TypeCtor),
> +        search_type_ctor_defn(TypeTable, TypeCtor, TypeDefn)
> +    then
> +        get_type_defn_body(TypeDefn, TypeBody),
> +        (
> +            TypeBody = hlds_du_type(_, _, MaybeRepn, _),
> +            (
> +                MaybeRepn = no,
> +                unexpected($pred, "MaybeRepn = no")
> +            ;
> +                MaybeRepn = yes(Repn)
> +            ),
> +            CtorRepns = Repn ^ dur_ctor_repns,
> +            gather_direct_arg_functors(CtorRepns, [], DirectArgFunctors),
> +            (
> +                DirectArgFunctors = [],
> +                IsDAIO = mode_is_not_daio
> +            ;
> +                DirectArgFunctors = [_ | _],
> +                FromToInsts = mode_to_from_to_insts(ModuleInfo, Mode),
> +                FromToInsts = from_to_insts(FromInst0, ToInst0),
> +                inst_expand_and_remove_constrained_inst_vars(ModuleInfo,
> +                    FromInst0, FromInst),
> +                inst_expand_and_remove_constrained_inst_vars(ModuleInfo,
> +                    ToInst0, ToInst),
> +                IsDAIO = mode_needs_direct_arg_in_out(ModuleInfo,
> +                    DirectArgFunctors, FromInst, ToInst)
> +            )
> +        ;
> +            ( TypeBody = hlds_eqv_type(_)
> +            ; TypeBody = hlds_foreign_type(_)
> +            ; TypeBody = hlds_solver_type(_)
> +            ; TypeBody = hlds_abstract_type(_)
> +            ),
> +            IsDAIO = mode_is_not_daio

I'll assume hlds_eqv_type and hlds_abstract_type are correct here.

> +%---------------------------------------------------------------------------%
> +
> +do_direct_arg_in_out_transform_in_module(DirectArgProcMap,
> +        !ModuleInfo, !:Specs) :-
> +    % Phase one: for every daio procedure, create a clone procedure
> +    % that includes clones every daio argument variable

full stop

> +    % Then delete the original procedure, to ensure that later passes
> +    % detect any references to them that were accidentally left by phase two.

> +:- pred daio_mode_to_mode_pair(module_info::in, mer_mode::in,
> +    mer_mode::out, mer_mode::out) is det.
> +
> +daio_mode_to_mode_pair(ModuleInfo, Mode, ClobberedMode, CloneMode) :-
> +    FromToInsts = mode_to_from_to_insts(ModuleInfo, Mode),
> +    FromToInsts = from_to_insts(FromInst, ToInst),
> +    ClobberedFromInst = clobber_daio_inst(ModuleInfo, FromInst),
> +    ClobberedFromToInsts = from_to_insts(FromInst, ClobberedFromInst),
> +    ClobberedMode = from_to_insts_to_mode(ClobberedFromToInsts),
> +    CloneMode = from_to_mode(free, ToInst).

It seems unnecessary to go through from_to_insts, but it's up to you.

> +
> +%---------------------%
> +
> +:- pred merge_disj_goals_varmaps(direct_arg_var_map::in, list(prog_var)::in,
> +    assoc_list(prog_var, prog_var)::out,
> +    list(arm_varmap(G))::in, list(arm_varmap(G))::out,
> +    daio_info::in, daio_info::out) is det <= goal_like(G).
> +
> +merge_disj_goals_varmaps(_, [], [], !GoalsVarMaps, !Info).
> +merge_disj_goals_varmaps(EntryVarMap, [OrigVar | OrigVars],
> +        [OrigVar - MergeVar | OrigMergeVars], !GoalsVarMaps, !Info) :-
> +    ( if
> +        bimap.search(EntryVarMap, OrigVar, EntryVar),
> +        entry_var_is_ever_changed(OrigVar, EntryVar, !.GoalsVarMaps) = no
> +    then
> +        MergeVar = EntryVar
> +    else
> +        make_new_clone_var(OrigVar, MergeVar, !Info),
> +        % Note that the assignment we add here would be accepted
> +        % by mode analysis if it appeared in the source code *only*
> +        % if OrigVar is ground. If it could still be only partially
> +        % instantiated, as happens in e.g. tests/hard_coded/gh72.m,
> +        % mode analysis would reject the code we create here,
> +        % because neither of the two variables being unified here
> +        % would be ground. However, this pass is done *after* mode analysis,
> +        % so this is not an issue, *unless* XXX some later pass repeats
> +        % a full mode analysis. (The recomputation of instmap deltas,
> +        % which we invoke once the transformation of the procedure body
> +        % is complete, does not care about such details.)

Delete XXX?

> +        %
> +        % The assignment we do here is semantically ok because it does not
> +        % actually create a free-free alias between the two variables,
> +        % since this unification is both the last appearance of (the current
> +        % version of) OrigVar and the first appearance of MergeVar.
> +        % Free-free alias is a problem *only* between two variables
> +        % that can be alive at the same time. This pass ensures the
> +        % disjointness of the two variables' lifefimes by construction,
> +        % but (in the absence of alias tracking) mode analysis cannot
> +        % check this.
> +        add_assign_of_merge_var(OrigVar, MergeVar, !GoalsVarMaps)
> +    ),
> +    merge_disj_goals_varmaps(EntryVarMap, OrigVars, OrigMergeVars,
> +        !GoalsVarMaps, !Info).

> diff --git a/compiler/hlds_module.m b/compiler/hlds_module.m
> index 14f2338b6..651455266 100644
> --- a/compiler/hlds_module.m
> +++ b/compiler/hlds_module.m
> @@ -136,6 +137,28 @@
>                                          pragma_info_type_spec)
>              ).
>  
> +    % Once filled in by simplify_proc.m (for all non-lambda procedures)
> +    % and by lambda.m (for procedures created to implement lambda expressions),
> +    % this map should have an entry for every procedure that is of interest to
> +    % direct_arg_in_out.m. A procedure may be of interest to that module
> +    % because it has one or more arguments that it needs to clone,
> +    % or because it has one or more arguments whose modes to not specify

do not

> +    % whether they need to be cloned, and for which therefore it should
> +    % generate a "sorry, not implemented" message. In both cases, the
> +    % one_or_more(int) specify the argument positions involved; in the latter
> +    % case, we also record the list of arguments for which we *could* tell
> +    % they need to be cloned.
> +    %

> diff --git a/compiler/hlds_out_goal.m b/compiler/hlds_out_goal.m
> index 34da9d436..3f1eb0253 100644
> --- a/compiler/hlds_out_goal.m
> +++ b/compiler/hlds_out_goal.m
> @@ -936,6 +941,13 @@ write_goal_unify(Info, Stream, ModuleInfo, VarSet, TypeQual, VarNamePrint,
>          then
>              true
>          else
> +            % XXX While Follow strings should never contain newlines,
> +            % some callers do pass them.
> +            ( if string.contains_char(Follow, '\n') then
> +                true
> +            else
> +                io.nl(Stream, !IO)
> +            ),
>              write_unification(Info, Stream, ModuleInfo, VarSet, InstVarSet,
>                  VarNamePrint, Indent, Unification, !IO)
>          )

Or string.suffix(Follow, "\n"), but not important.

> diff --git a/compiler/inst_util.m b/compiler/inst_util.m
> index 2bd94c83d..4da1f24df 100644
> --- a/compiler/inst_util.m
> +++ b/compiler/inst_util.m
> @@ -186,17 +186,23 @@ inst_expand_and_remove_constrained_inst_vars(ModuleInfo, !Inst) :-
>  %---------------------------------------------------------------------------%
>  
>  abstractly_unify_inst(Live, InstA, InstB, Real, Inst, Detism, !ModuleInfo) :-
> -    % Check whether this pair of insts is already in the unify_insts table.
> -    module_info_get_inst_table(!.ModuleInfo, InstTable0),
> -    inst_table_get_unify_insts(InstTable0, UnifyInstTable0),
> +    % We avoid infinite loops by checking whether the InstA-InstB
> +    % pair of insts is already in the unify_insts table.
> +    %
>      % XXX For code that uses large facts, the deeply nested insts we unify
> -    % here means that searching UnifyInsts0 here, and updating it (twice)
> -    % in the else case below are *extremely* expensive. In one version of
> -    % Doug Auclair's training_cars example, the map search, insert and update
> -    % account for 116 out the 120 clock ticks spent in this predicate,
> +    % here means that searching the unify_insts table here, and updating it
> +    % (twice) in the else case below are *extremely* expensive. In one version
> +    % of Doug Auclair's training_cars example, the map search, insert and
> +    % update account for 116 out the 120 clock ticks spent in this predicate,
>      % i.e. they account for almost 97% of its runtime.
>      %
> -    % We now combine the lookup with one of the updates.
> +    % We reduce the expense of these operations in two ways.
> +    %
> +    % First, we make each instance of the operation cheaper, by combing
> +    % the lookup with one of the updates. We do this by having

combining

> diff --git a/compiler/term_constr_initial.m b/compiler/term_constr_initial.m
> index 19ff91b1f..a629d433e 100644
> --- a/compiler/term_constr_initial.m
> +++ b/compiler/term_constr_initial.m
> @@ -531,30 +530,151 @@ set_builtin_terminates([ProcId | ProcIds], PredId, PredInfo, ModuleInfo,
>      = constraints.
>  
>  process_no_type_info_builtin(PredName, HeadVars, SizeVarMap) = Constraints :-
> -    ( if
> -        HeadVars = [HVar1, HVar2],
> +    % This predicate should handle every predicate listed by
> +    % no_type_info_builtin in mdbcomp/program_representation.m.
> +    %
> +    % NOTE We assume that no PredName occurs in more than one builtin module,
> +    % which is a fragile assumption.
> +    (
...
> +    ;
> +        HeadVars = [_, _, _],
> +        ( if
> +            ( PredName = "compare_local_uint_words"
> +            ; PredName = "semidet_call_3"
> +            ; PredName = "superclass_from_typeclass_info"
> +            ; PredName = "table_lookup_insert_typeclassinfo"
> +            ; PredName = "table_lookup_insert_typeinfo"
> +            ; PredName = "table_restore_any_answer"
> +            ; PredName = "type_info_from_typeclass_info"
> +            ; PredName = "unconstrained_type_info_from_typeclass_info"
>              )

no_type_info_builtin also lists
     PredName = "instance_constraint_from_typeclass_info"

> diff --git a/tests/hard_coded/gh72.m b/tests/hard_coded/gh72.m
> index e69de29bb..8a8317b45 100644
> --- a/tests/hard_coded/gh72.m
> +++ b/tests/hard_coded/gh72.m
> @@ -0,0 +1,294 @@
> +%-----------------------------------------------------------------------------%
> +% vim: ft=mercury ts=4 sw=4 et
> +%-----------------------------------------------------------------------------%
> +%
> +% This test exhibits many instances of the problem demonstrated by gh72[ab].
> +% The instances are designed to stress-test the compilation transformation
> +% that implements the fix (compiler/direct_arg_in_out.m).
> +%

Do we have a test where the merged variable is used after a branching goal?
It might be worth adding a test involving if-then-else.

The rest seems fine.

Peter


More information about the reviews mailing list