[m-rev.] for review: [CTGC] direct structure reuse analysis (2/2)

Julien Fischer juliensf at cs.mu.OZ.AU
Tue May 2 15:25:35 AEST 2006


On Thu, 27 Apr 2006, Nancy Mazur wrote:

> +:- pred unification_verify_reuse(module_info::in, proc_info::in,
> +    hlds_goal_info::in, unification::in, program_point::in,
> +    sharing_as::in, dead_cell_table::in, dead_cell_table::out) is det.
> +
> +unification_verify_reuse(ModuleInfo, ProcInfo, GoalInfo, Unification,
> +    PP, Sharing, !DeadCellTable):-
> +    (
> +        Unification = deconstruct(Var, ConsId, _, _, _, _),
> +        LFU = goal_info_get_lfu(GoalInfo),
> +        LBU = goal_info_get_lbu(GoalInfo),
> +        (
> +            % Reuse is only relevant for real constructors, with
> +            % arity different from 0.
> +            ConsId = cons(_, Arity),
> +            Arity \= 0,
> +
> +            % Check if the top cel datastructure of Var is not live.

s/cel/cell/

> +            % If Sharing is top, then everything should automatically
> +            % be considered live, hence no reuse possible.
> +            \+ sharing_as_is_top(Sharing),
> +
> +            % Check the live set of data structures at this program point.
> +            set.union(LFU, LBU, LU),
> +            LU_data = set.to_sorted_list(set.map(datastruct_init, LU)),
> +            LiveData = list.condense(
> +                list.map(
> +                    extend_datastruct(ModuleInfo, ProcInfo, Sharing),
> +                    LU_data)),
> +            \+ var_is_live(Var, LiveData)
> +        ->
> +            % If all the above conditions are met, then the top
> +            % cell data structure based on Var is dead right after
> +            % this deconstruction, hence, may be involved with
> +            % structure reuse.
> +            NewCondition = reuse_condition_init(ModuleInfo,
> +                ProcInfo, Var, LFU, LBU, Sharing),
> +            dead_cell_table_set(PP, NewCondition, !DeadCellTable)
> +        ;
> +            true
> +        )
> +    ;
> +        Unification = construct(_, _, _, _, _, _, _)
> +    ;
> +        Unification = assign(_, _)
> +    ;
> +        Unification = simple_test(_, _)
> +    ;
> +        Unification = complicated_unify(_, _, _)

If a complicated unification is encountered at this stage of compilation it is
an error, so just call unexpected in that case.

...

> Index: compiler/structure_reuse.direct.m
> ===================================================================
> RCS file: compiler/structure_reuse.direct.m
> diff -N compiler/structure_reuse.direct.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ compiler/structure_reuse.direct.m	27 Apr 2006 09:30:41 -0000
> @@ -0,0 +1,328 @@
> +%-----------------------------------------------------------------------------%
> +% vim: ft=mercury ts=4 sw=4 et
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2006 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% File: structure_reuse.direct.m
> +% Main authors: nancy
> +%
> +% Procedures and data related to the detection of so called direct reuses

s/data/types/

> +% within the CTGC system. A "direct reuse" is a combination of the location of
> +% a deconstruction unification (where a datastructure may become garbage under
> +% certain conditions) and a set of locations of construction unifications where
> +% the garbage datastructure can be reused locally.
> +%
> +% Direct reuse analysis requires two steps:
> +% - detecting where datastructures may become garbage;
> +% - and finding where these garbage datastructures can be reused.
> +%
> +%-----------------------------------------------------------------------------%
> +
> +:- module transform_hlds.ctgc.structure_reuse.direct.
> +
> +:- interface.
> +
> +:- import_module hlds.hlds_module.
> +:- import_module transform_hlds.ctgc.structure_reuse.domain.
> +:- import_module transform_hlds.ctgc.structure_sharing.domain.
> +
> +:- import_module io.
> +
> +:- pred direct_reuse_pass(sharing_as_table::in, module_info::in,
> +    module_info::out, reuse_as_table::in, reuse_as_table::out,
> +    io::di, io::uo) is det.
> +

...

> +    % The strategy for determining the reuse possibilities, i.e., either
> +    % reuse is only allowed between terms that have exactly the same cons_id,
> +    % or reuse is also allowed between terms that have different cons_id, yet
> +    % where the difference in arity is not bigger than a given threshold.
> +    %
> +:- type strategy
> +    --->    same_cons_id
> +    ;       within_n_cells_difference(int).
> +

I suggest calling that type reuse_strategy rather than strategy.

> +    % Determine the strategy that was set by the user.
> +    %
> +:- pred get_strategy(strategy::out, module_info::in, module_info::out,
> +    io::di, io::uo) is det.
> +
> +get_strategy(Strategy, !ModuleInfo, !IO):-
> +    io_lookup_string_option(structure_reuse_constraint, ConstraintStr, !IO),
> +    (
> +        ConstraintStr = "same_cons_id"
> +    ->
> +        Strategy = same_cons_id
> +    ;
> +        ConstraintStr = "within_n_cells_difference"
> +    ->
> +        io_lookup_int_option(structure_reuse_constraint_arg, NCells, !IO),
> +        Strategy = within_n_cells_difference(NCells)
> +    ;
> +        Strategy = same_cons_id,
> +        Pieces = [words("error: Invalid argument to "),
> +            words("`--structure-reuse-constraint.'")],
> +        write_error_pieces_plain(Pieces, !IO),
> +        module_info_incr_errors(!ModuleInfo)
> +    ).
> +

...

> +    % Process one individual procedure.
> +    %
> +:- pred direct_reuse_process_procedure(strategy::in, sharing_as_table::in,
> +    pred_id::in, proc_id::in, module_info::in, proc_info::in, proc_info::out,
> +    reuse_as::out, io::di, io::uo) is det.
> +
> +direct_reuse_process_procedure(Strategy, SharingTable, PredId, ProcId,
> +    ModuleInfo, !ProcInfo, ReuseAs, !IO):-

The above line needs to be indented by one more level.

> +    io_lookup_bool_option(very_verbose, VeryVerbose, !IO),
> +
> +    write_proc_progress_message("% Direct reuse analysis of ",
> +        PredId, ProcId, ModuleInfo, !IO),
> +
> +    proc_info_get_goal(!.ProcInfo, Goal0),
> +
> +    % Determine the deconstructions in which data may potentially become
> +    % garbage.
> +    determine_dead_deconstructions(ModuleInfo, !.ProcInfo, SharingTable,
> +        Goal0, DeadCellTable),
> +    dead_cell_table_maybe_dump(VeryVerbose, DeadCellTable, !IO),
> +
> +    % Determine how the detected dead datastructures can be reused.
> +    % This annotates the goal with potential reuses.
> +    determine_reuse(Strategy, ModuleInfo, !.ProcInfo, DeadCellTable,
> +        Goal0, Goal, ReuseAs, !IO),
> +
> +    proc_info_set_goal(Goal, !ProcInfo),
> +    maybe_write_string(VeryVerbose, "% reuse analysis done.\n", !IO).
> +
> +

...

> +    % To record the place at which a data structure possible becomes garbage,
> +    % we use the notion of a program point. A program point is unique using
> +    % the combination of the context of the goal, and its goal_path.
> +    %
> +:- type program_point
> +    --->    pp(
> +                pp_context  ::  term.context,
> +                pp_path     ::  goal_path
> +            ).

It's probably worth adding a comment that the context is only required
for debugging traces.

...

> Index: compiler/structure_reuse.domain.m
> ===================================================================
> RCS file: compiler/structure_reuse.domain.m
> diff -N compiler/structure_reuse.domain.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ compiler/structure_reuse.domain.m	27 Apr 2006 09:30:42 -0000
> @@ -0,0 +1,438 @@

...

> +:- module transform_hlds.ctgc.structure_reuse.domain.
> +
> +:- interface.
> +
> +:- import_module hlds.hlds_module.
> +:- import_module hlds.hlds_pred.
> +:- import_module parse_tree.prog_data.
> +:- import_module transform_hlds.ctgc.structure_sharing.domain.
> +
> +:- import_module map.
> +:- import_module set.
> +:- import_module list.
> +
> +:- interface.

There's a redundant interface declaration here.

...

> +:- type reuse_condition
> +    --->    always
> +    ;       condition(
> +                reuseable_nodes     ::  list(datastruct),
> +                    % Description of the datastructures pointing to the
> +                    % memory that can be reused within a procedure.
> +                local_use_headvars  ::  list(datastruct),
> +                    % Set of (headvar-related) datastructures that are
> +                    % inherently live at the place where the reuse is
> +                    % decided.
> +                local_sharing_headvars  ::  sharing_as
> +                    % Description of the (headvar-related) structure sharing
> +                    % that exists at the place where the reuse is decided.
> +            ).

Document what `always' means here.

...

> +reuse_condition_init(ModuleInfo, ProcInfo, DeadVar, LFU, LBU,
> +        Sharing) = Condition :-
> +    proc_info_get_headvars(ProcInfo, HeadVars),
> +
> +    % First determine the nodes to which the reuse is related.
> +    % There are two cases:
> +    % 1. Var is a headvar, then it is sufficient to keep the topcel of that Var

s/topcel/top cell/? (and below)

> +    %   as only node. HeadVar-datastructures shared with that node will still
> +    %   be retraceable at the moment of verifying the condition
> +    % 2. Var is a local var, then we must compute all the headvar-
> +    %   datastructures sharing the same memory representation as the topcel of
> +    %   this var (note that the datastructures that share with some
> +    %   substructure of this var are not relevant for the nodes). All the
> +    %   obtained datastructures are kept as the nodes for our condition.
> +    TopCel = ctgc.datastruct.datastruct_init(DeadVar),
> +    (
> +        list__member(DeadVar, HeadVars)
> +    ->
> +        Nodes = [TopCel]
> +    ;
> +        SharedDatastructs = extend_datastruct(ModuleInfo, ProcInfo,
> +            Sharing, TopCel),
> +        Nodes = datastructs_project(HeadVars, SharedDatastructs)
> +    ),
> +
> +    % It is possible that the obtained set of nodes is empty, in that
> +    % case the condition is always satisfied, independent of any calling
> +    % environment.
> +    (
> +        Nodes = []
> +    ->
> +        Condition = always
> +    ;
> +        set__union(LFU, LBU, LU),
> +            % XXX the old implementation did not bother about extending at
> +            % this place, which was contrary to the theory. Check the effect
> +            % of this change!
> +        SharedLU = list.condense(
> +            list.map(extend_datastruct(ModuleInfo, ProcInfo, Sharing),
> +                list.map(datastruct_init, set.to_sorted_list(LU)))),
> +        HeadVarSharedLU = datastructs_project(HeadVars, SharedLU),
> +
> +        structure_sharing.domain.sharing_as_project(HeadVars, Sharing,
> +            HeadVarSharing),
> +        Condition = condition(Nodes, HeadVarSharedLU, HeadVarSharing)
> +    ).
> +

...

> +:- pred backward_use_in_goal(module_info::in, proc_info::in,
> +    hlds_goal::in, hlds_goal::out, set(prog_var)::in, set(prog_var)::out)
> +    is det.
> +
> +backward_use_in_goal(ModuleInfo, ProcInfo, !TopGoal, !LBU) :-
> +    !.TopGoal = Expr0 - Info0,
> +
> +    % Add resume_vars to the LBU-set.
> +    set.union(get_backtrack_vars(Info0), !LBU),
> +
> +    % Handle each goal type separately:
> +    (
> +        Expr0 = unify(_, _, _, _, _),
> +        Expr = Expr0
> +    ;
> +        Expr0 = call(_,_, _, _, _, _),
> +        goal_info_get_determinism(Info0, Det),
> +        (
> +            determinism_is_nondet(Det)
> +        ->
> +            % Implementation of Instantiation 2 from Nancy's Phd.
> +            % In this instantation, a non deterministic procedure
> +            % call only adds its LFU-variables to the current set
> +            % of lbu-variables. Cf. Phd Nancy Mazur.
> +
> +            goal_info_get_pre_births(Info0, PreBirths),
> +            goal_info_get_post_births(Info0, PostBirths),
> +            !:LBU = set.union_list([goal_info_get_lfu(Info0),
> +                 PreBirths, PostBirths, !.LBU])
> +        ;
> +            true
> +        ),
> +        Expr = Expr0
> +    ;
> +        Expr0 = generic_call(_, _, _, _),
> +        Expr = Expr0
> +    ;
> +        % to !LBU.
> +        Expr0 = foreign_proc(_, _, _, _, _, _),
> +        Expr = Expr0
> +    ;
> +        Expr0 = conj(ConjType, Goals0),
> +        backward_use_in_conj(ModuleInfo, ProcInfo,
> +                Goals0, Goals, !LBU),
> +        Expr = conj(ConjType, Goals)
> +    ;
> +        Expr0 = disj(Goals0),
> +        backward_use_in_disj(ModuleInfo, ProcInfo, Goals0, Goals, !LBU),
> +        Expr = disj(Goals)
> +    ;
> +        Expr0 = switch(A, B, Cases0),
> +        backward_use_in_cases(ModuleInfo, ProcInfo, Cases0, Cases, !LBU),
> +        Expr = switch(A, B, Cases)
> +    ;
> +        Expr0 = not(Goal0),
> +        % handled as: if(Goal0) then fail else true
> +        LBU0 = !.LBU,
> +        backward_use_in_goal(ModuleInfo, ProcInfo, Goal0, Goal, !.LBU, _),
> +        % A not does not introduce any choice-points! Hence the
> +        % not itself is deterministic, and no new variables in LBU
> +        % are introduced into the resulting LBU-set.
> +        !:LBU = LBU0,
> +        Expr = not(Goal)
> +    ;
> +        Expr0 = scope(Reason, SomeGoal0),
> +        backward_use_in_goal(ModuleInfo, ProcInfo, SomeGoal0, SomeGoal, !LBU),
> +        Expr = scope(Reason, SomeGoal)
> +    ;
> +        % XXX The implementation for if-then-else is different from the theory
> +        % in the thesis.  We can obtain more precision when the Condition-goal
> +        % is deterministic, which means that the Then-goal can be analysed with
> +        % the initial LBU-set (instead of the set obtained after the analysis
> +        % of the Condition-goal).
> +        Expr0 = if_then_else(Vars, Cond0, Then0, Else0),
> +        LBU0 = !.LBU,
> +
> +            % Annotate Cond-goal.
> +        backward_use_in_goal(ModuleInfo, ProcInfo, Cond0, Cond, LBU0, _),
> +
> +            % Annotate Then-goal.
> +            % When annotating the then-part, the lbu used for it should not
> +            % contain the resume-vars due to the else part.
> +            % trick: to calculate inital LBU for the Then-goal, we set the
> +            % resume-point of the condition to no_resume_point.
> +        Cond0 = CondGoal0 - CondInfo0,
> +        goal_info_set_resume_point(no_resume_point, CondInfo0, InfoTmp),
> +        CondTmp = CondGoal0 - InfoTmp,
> +        backward_use_in_goal(ModuleInfo, ProcInfo, CondTmp, _, LBU0, LBU0T),
> +        backward_use_in_goal(ModuleInfo, ProcInfo, Then0, Then, LBU0T, LBUT),
> +
> +            % Annotate Else-goal.
> +        backward_use_in_goal(ModuleInfo, ProcInfo, Else0, Else,
> +                LBU0, LBUE),
> +        set.union(LBUT, LBUE, !:LBU),
> +        Expr = if_then_else(Vars, Cond, Then, Else)
> +    ;
> +        Expr0 = shorthand(_),
> +        unexpected(this_file, "backward_use_in_goal: shorthand goal.")
> +    ),

It would be clearer if this switch were turned into an auxiliary
procedure, e.g.  backward_use_in_goal_2.

...

> +:- pred determinism_is_nondet(prog_data__determinism::in) is semidet.
> +determinism_is_nondet(nondet).
> +determinism_is_nondet(multidet).
> +determinism_is_nondet(cc_nondet).
> +determinism_is_nondet(cc_multidet).

The name of this predicate is potentially misleading - what you are really
checking is that the determinism means there are multiple solutions, so
something like detism_allows_multiple_solns would be more appropriate.

To be continued ...



--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list