[m-rev.] [reuse] for review: graph selection strategy
Peter Ross
peter.ross at miscrit.be
Thu Jul 5 23:52:51 AEST 2001
On Thu, Jul 05, 2001 at 11:50:56AM +0200, Nancy Mazur wrote:
>
> Peter,
>
> if you have some time I'd be interested in some comments on this
> implementation or else I'll just commit it as is...
>
> Nancy
>
> ===================================================================
>
>
> Estimated hours taken: 25
> Branches: reuse
>
> Add a new selection strategy for direct reuse. Previously only the options
> lifo and random were available. Now the option `graph' is provided. This
> selection strategy builds up the different possible dead deconstruction-
> construction combinations, and using weights, decides which combination will be
> more interesting than the other ones.
> This is roughly inspired from Debray's paper "On Copy Avoidance in
> Single Assignment Languages", with the restriction that we still do not
> allow chopping up a dead cell and reusing the pieces for different
> constructions.
>
> sr_choice_graphing.m:
> (new file) Implementation of the graph selection strategy.
>
> options.m:
> Change the default of structure-reuse-selection into `graph'.
>
> sr_choice.m:
> Add graph as a new selection strategy.
>
> sr_direct.m:
> Call the correct direct-reuse-selection strategy.
>
>
> Index: sr_choice_graphing.m
> ===================================================================
> RCS file: sr_choice_graphing.m
> diff -N sr_choice_graphing.m
> --- /dev/null Wed Nov 15 09:24:47 2000
> +++ sr_choice_graphing.m Thu Jul 5 19:39:36 2001
> @@ -0,0 +1,1114 @@
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2001 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% Module: sr_choice_graphing
> +% Main authors: nancy
> +%
> +% Just as in module sr_choice, given a goal annotated with preliminary
> +% possible reuse information, this pass computes the concrete assignements
> +% of which constructions can profit of dead cells coming from deconstructions.
> +% This module is presented as an alternative to sr_choice. The difference
> +% lies in the way the assignements are computed: instead of using lifo/random
> +% the assignment problem is translated into a mapping problem.
> +%
> +% When assigning constructions to dead deconstructions, a table is first
> +% computed. For each dead cell, a value is computed which reflects the gain
> +% a reuse might bring, and the list of constructions involved with reusing it.
> +% The cell with highest value is selected first, the according constructions
> +% are annotated, and the table is recomputed. This process is repeated until
> +% no reusable dead deconstructions are left.
> +%
> +% The value of a dead cell (a specific deconstruction) is computed taking
> +% into account the call graph which simplified only takes into account
> +% construction-unifications, conjunctions, and disjunctions.
> +% The source of the graph is the deconstruction, the leaves are
> +% either constructions, or empty. The branches are either conjunctions
> +% or disjunctions.
> +% The value of the dead cell is then computed as follows:
> +% - value of a conjunction = maximum of the values of each of the
> +% conjunct branches.
> +% Intuitively: if a dead deconstruction is followed by
> +% two constructions which might reuse the dead cell: pick
> +% the one which allows the most potential gain.
> +% - value of a disjunction = average of the value of each of the
> +% disjunct branches.
> +% Intuitively: if a dead deconstruction is followed by
> +% a disjunction with 2 disjuncts. If reuse is only possible
> +% in one of the branches, allowing this reuse means that
> +% a priori reuse will occur in only 50% of the cases.
> +% The value of the disjunct should take this into account.
> +% Without precise notion of which branches are executed
> +% more often, taking the simple average of the values is
> +% a good approximation.
> +% - value of a construction = a value that takes into account
> +% the cost of constructing a new cell and compares it
> +% to the cost of updating a dead cell. If the arities
> +% between the dead and new cell differ, a penalty cost
> +% is added (approximated as the gain one would have had if
> +% the unusable words would have been reused too).
> +% Weights are used to estimate all of these costs and are
> +% hard-coded. I don't think there is any need in making
> +% these an option.
> +%
> +% Once the table is computed, usually the cell with highest value is selected.
> +% To cut the decision between different dead cells with the same
> +% value, we select the dead cell which has the least number of
> +% opportunities to be reused.
> +% E.g.
> +% X can be reused by 5 different constructions,
> +% but reaches its highest value for a construction C1
> +% (value 10).
> +% Y can be reused by only one construction, also C1 (value 10).
> +%
> +% First selecting X (and reusing it with construction C1) would
> +% jeopardize the reuse of Y and leaves us with only one cell reused.
> +% If, on the contrary, one would select Y first, chances are that
> +% after recomputing the table, X can still be reused by other
> +% constructions, hence possibly 2 cells reused.
> +% Even if Y would be of smaller value, selecting Y first would still
> +% be more interesting. Hence, instead of selecting the cell
> +% with highest value, we select the cell with highest
> +% value/degree ratio, degree being the number of constructions at which
> +% the cell could potentially be reused.
> +%
> +% Obtained results:
> +% - as the matchings are now decided based on a little more information, the
> +% obtained results wrt reuse are better, especially when allowing
> +% differing arities.
> +% - yet, sometimes the results might be a little worse than lifo. Both
> +% strategies might decide different reuse-matchings, hence induce different
> +% reuse-constraints (e.g. while one strategy decides to reuse the first
> +% headvariable of a procedure, the other reuses the second headvariable),
> +% which might make that in one case, the constraints can be satisfied,
> +% but the other conditions can not.
> +%
> +% XXX Current shortcoming compared to sr_choice: cells being deconstructed
> +% in the different branches of a disjunction will not be reused after the
> +% the disjunction. In sr_choice, this is made possible.
> +% e.g.:
> +% (
> +% ..., X => f(... ), ... % X dies
> +% ;
> +% ..., X => g(... ), ... % X dies
> +% ),
> +% Y <= f(... ), ... % Y can reuse X
> +% While sr_choice allows Y to reuse X (which is perfectly legal as
> +% it dies in each of the branches of the disjunction), this will not
> +% be discovered at this moment.
> +%
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- module sr_choice_graphing.
> +:- interface.
> +
> +:- import_module io, std_util, list.
> +:- import_module hlds_goal, hlds_module, hlds_pred.
> +:- import_module sr_data.
> +
> +:- import_module sr_choice.
> +
> +:- type background_info.
> +:- pred set_background_info(sr_choice__constraint::in, module_info::in,
> + vartypes::in, background_info::out) is det.
> +
> + % Annotate each deconstruction/construction unification with
> + % whether they die, can potentially reuse a dead cell or are
> + % not possible for reuse.
> +:- pred sr_choice_graphing__process_goal(
> + background_info::in,
> + hlds_goal::in, hlds_goal::out,
> + maybe(list(reuse_condition))::out,
> + io__state::di, io__state::uo) is det.
> +
> +%-----------------------------------------------------------------------------%
> +:- implementation.
> +
> +:- import_module term, map, float, bool, set, require, int.
> +:- import_module type_util, globals, options.
> +:- import_module prog_data.
> +:- import_module hlds_data.
> +
> +%-----------------------------------------------------------------------------%
> +:- type background_info
> + ---> background(
> + constraint :: sr_choice__constraint,
> + module_info :: module_info,
> + vartypes :: vartypes
> + ).
> +
> +set_background_info(Constraint, ModuleInfo, VarTypes, BG):-
> + BG = background(Constraint, ModuleInfo, VarTypes).
> +
> +%-----------------------------------------------------------------------------%
> +
> + % Type meant to keep a table of the dead cells and their
> + % possible reuses.
> +:- type values == map(contextvar, value).
> +
> + % Given a construction for which reuse of a dead cell would
> + % be possible, record the cons-ids the dead cell which it is
> + % reusing might have, as well as the fields which don't need
> + % explicit update.
> + % XXX the cons-ids will always be at most one cons-id. There can
> + % only be more in the case a cell dies in each of the branches
> + % of a disjunction. This is not yet supported.
> +:- type context_reuse_var
> + ---> context_reuse_var(
> + var :: contextvar,
> + cons_ids :: maybe(list(cons_id)),
> + reuse_fields :: maybe(list(bool)),
> + reuse_type :: reuse_type
> + ).
> +
> + % Track extra documentation of the reuse.
> +:- type reuse_type
> + ---> no_reuse
> + ; reuse_type(
> + same_cons :: bool,
> + arity_diff :: int,
> + arity_old :: int, % arity of dead cell
> + arity_new :: int % arity of new cell
> + % arity_diff = arity_old - arity_new
> + ).
> +
I would get rid of the arity_diff field.
:- func arity_diff(reuse_type) = int.
arity_diff(T) = T ^ arity_old - T ^ arity_new.
You can still use this function with the field accessor function, and
there is less chance of you forgetting to update the field.
[snip]
> +%-----------------------------------------------------------------------------%
> +% almost copy/pasted from sr_choice
> +%-----------------------------------------------------------------------------%
Could you factor out the code which is common and place it into a new module
sr_choice_util.
> +
> + % adapted from sr_choice__no_tag_type
> +:- pred no_tag_type(module_info::in, vartypes::in, prog_var::in) is semidet.
> +no_tag_type(ModuleInfo, Vartypes, Var):-
> + map__lookup(Vartypes, Var, VarType),
> + type_is_no_tag_type(ModuleInfo, VarType, _, _).
> +
> +
> + %
> + % already_correct_fields(HasSecTagC, VarsC, HasSecTagR, VarsR)
> + % takes a list of variables, VarsC, which are the arguments for
> + % the cell to be constructed and the list of variables, VarsR,
> + % which are the arguments for the cell to be reused and returns
> + % a list of bool where each yes indicates that argument already
> + % has the correct value stored in it. To do this correctly we
> + % need to know whether each cell has a secondary tag field.
> + %
> +:- func already_correct_fields(bool, prog_vars,
> + bool, prog_vars) = list(bool).
> +
> +already_correct_fields(SecTagC, CurrentCellVars, SecTagR, ReuseCellVars)
> + = Bools ++ list__duplicate(LengthC - LengthB, no) :-
> + Bools = already_correct_fields_2(SecTagC, CurrentCellVars,
> + SecTagR, ReuseCellVars),
> + LengthC = list__length(CurrentCellVars),
> + LengthB = list__length(Bools).
> +
> +:- func already_correct_fields_2(bool, prog_vars, bool, prog_vars) = list(bool).
> +
> +already_correct_fields_2(yes, CurrentCellVars, yes, ReuseCellVars)
> + = equals(CurrentCellVars, ReuseCellVars).
> +already_correct_fields_2(yes, CurrentCellVars, no, ReuseCellVars)
> + = [no | equals(CurrentCellVars, drop_one(ReuseCellVars))].
> +already_correct_fields_2(no, CurrentCellVars, yes, ReuseCellVars)
> + = [no | equals(drop_one(CurrentCellVars), ReuseCellVars)].
> +already_correct_fields_2(no, CurrentCellVars, no, ReuseCellVars)
> + = equals(CurrentCellVars, ReuseCellVars).
> +
> + %
> + % equals(ListA, ListB) produces a list of bools which indicates
> + % whether the corresponding elements from ListA and ListB were
> + % equal. If ListA and ListB are of different lengths, the
> + % resulting list is the length of the shorter of the two.
> + %
> +:- func equals(list(T), list(T)) = list(bool).
> +
> +equals([], []) = [].
> +equals([], [_|_]) = [].
> +equals([_|_], []) = [].
> +equals([X | Xs], [Y | Ys]) = [Bool | equals(Xs, Ys)] :-
> + ( X = Y ->
> + Bool = yes
> + ;
> + Bool = no
> + ).
> +
> +:- func drop_one(list(T)) = list(T).
> +
> +drop_one([]) = [].
> +drop_one([_ | Xs]) = Xs.
> +
> +
> + %
> + % has_secondary_tag(Var, ConsId, HasSecTag) is true iff the
> + % variable, Var, with cons_id, ConsId, requires a remote
> + % secondary tag to distinguish between its various functors.
> + %
> +:- pred has_secondary_tag(module_info::in, vartypes::in,
> + prog_var::in, cons_id::in, bool::out) is det.
> +
> +has_secondary_tag(ModuleInfo, VarTypes, Var, ConsId, SecondaryTag) :-
> + (
> + map__lookup(VarTypes, Var, Type),
> + type_to_type_id(Type, TypeId, _Args)
> + ->
> + module_info_types(ModuleInfo, Types),
> + ( map__search(Types, TypeId, TypeDefn) ->
> + hlds_data__get_type_defn_body(TypeDefn, TypeBody),
> + ( TypeBody = du_type(_, ConsTagValues, _, _) ->
> + (
> + % The search can fail
> + % for such types as
> + % private_builtin:type_info,
> + % if the search fails we
> + % will not have a
> + % secondary tag.
> + map__search(ConsTagValues, ConsId,
> + ConsTag),
> + ConsTag = shared_remote_tag(_, _)
> + ->
> + SecondaryTag = yes
> + ;
> + SecondaryTag = no
> + )
> + ;
> + error("has_secondary_tag: not du type.")
> + )
> + ;
> + % Must be a basic type.
> + SecondaryTag = no
> + )
> + ;
> + error("has_secondary_tag: type_to_type_id failed.")
> +
> + ).
> +
> +
> +%-----------------------------------------------------------------------------%
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list