[m-rev.] [reuse] for review: graph selection strategy

Peter Ross peter.ross at miscrit.be
Thu Jul 5 23:52:51 AEST 2001


On Thu, Jul 05, 2001 at 11:50:56AM +0200, Nancy Mazur wrote:
> 
> Peter, 
> 
> if you have some time I'd be interested in some comments on this
> implementation or else I'll just commit it as is... 
> 
> Nancy
> 
> ===================================================================
> 
> 
> Estimated hours taken: 25
> Branches: reuse
> 
> Add a new selection strategy for direct reuse. Previously only the options
> lifo and random were available. Now the option `graph' is provided. This
> selection strategy builds up the different possible dead deconstruction-
> construction combinations, and using weights, decides which combination will be
> more interesting than the other ones. 
> This is roughly inspired from Debray's paper "On Copy Avoidance in
> Single Assignment Languages", with the restriction that we still do not
> allow chopping up a dead cell and reusing the pieces for different
> constructions. 
> 
> sr_choice_graphing.m:
> 	(new file) Implementation of the graph selection strategy. 
> 
> options.m:
> 	Change the default of structure-reuse-selection into `graph'. 
> 
> sr_choice.m:
> 	Add graph as a new selection strategy.
> 
> sr_direct.m:
> 	Call the correct direct-reuse-selection strategy. 
> 
> 

> Index: sr_choice_graphing.m
> ===================================================================
> RCS file: sr_choice_graphing.m
> diff -N sr_choice_graphing.m
> --- /dev/null	Wed Nov 15 09:24:47 2000
> +++ sr_choice_graphing.m	Thu Jul  5 19:39:36 2001
> @@ -0,0 +1,1114 @@
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2001 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% Module:	sr_choice_graphing
> +% Main authors: nancy
> +% 
> +% Just as in module sr_choice, given a goal annotated with preliminary
> +% possible reuse information, this pass computes the concrete assignements
> +% of which constructions can profit of dead cells coming from deconstructions. 
> +% This module is presented as an alternative to sr_choice. The difference
> +% lies in the way the assignements are computed: instead of using lifo/random
> +% the assignment problem is translated into a mapping problem.
> +%
> +% When assigning constructions to dead deconstructions, a table is first
> +% computed. For each dead cell, a value is computed which reflects the gain
> +% a reuse might bring, and the list of constructions involved with reusing it.
> +% The cell with highest value is selected first, the according constructions
> +% are annotated, and the table is recomputed. This process is repeated until
> +% no reusable dead deconstructions are left. 
> +%
> +% The value of a dead cell (a specific deconstruction) is computed taking 
> +% into account the call graph which simplified only takes into account
> +% construction-unifications, conjunctions, and disjunctions. 
> +% The source of the graph is the deconstruction, the leaves are
> +% either constructions, or empty. The branches are either conjunctions
> +% or disjunctions. 
> +% The value of the dead cell is then computed as follows: 
> +% 	- value of a conjunction = maximum of the values of each of the 
> +%		conjunct branches. 
> +% 		Intuitively: if a dead deconstruction is followed by
> +%		two constructions which might reuse the dead cell: pick
> +%		the one which allows the most potential gain. 
> +%	- value of a disjunction = average of the value of each of the
> +%		disjunct branches. 
> +%		Intuitively: if a dead deconstruction is followed by
> +% 		a disjunction with 2 disjuncts. If reuse is only possible
> +% 		in one of the branches, allowing this reuse means that 
> +% 		a priori reuse will occur in only 50% of the cases. 
> +%		The value of the disjunct should take this into account. 
> +%		Without precise notion of which branches are executed
> +%		more often, taking the simple average of the values is 
> +%		a good approximation. 
> +%	- value of a construction = a value that takes into account
> +%	 	the cost of constructing a new cell and compares it
> +%		to the cost of updating a dead cell. If the arities
> +%		between the dead and new cell differ, a penalty cost
> +%		is added (approximated as the gain one would have had if
> +%		the unusable words would have been reused too). 
> +%		Weights are used to estimate all of these costs and are
> +%		hard-coded. I don't think there is any need in making
> +%		these an option. 
> +%
> +% Once the table is computed, usually the cell with highest value is selected.
> +% To cut the decision between different dead cells with the same
> +% value, we select the dead cell which has the least number of
> +% opportunities to be reused. 
> +% E.g. 
> +%	X can be reused by 5 different constructions, 
> +%		but reaches its highest value for a construction C1
> +%		(value 10).
> +%	Y can be reused by only one construction, also C1 (value 10). 
> +%
> +% First selecting X (and reusing it with construction C1) would 
> +% jeopardize the reuse of Y and leaves us with only one cell reused. 
> +% If, on the contrary, one would select Y first, chances are that
> +% after recomputing the table, X can still be reused by other
> +% constructions, hence possibly 2 cells reused. 
> +% Even if Y would be of smaller value, selecting Y first would still 
> +% be more interesting. Hence, instead of selecting the cell 
> +% with highest value, we select the cell with highest
> +% value/degree ratio, degree being the number of constructions at which
> +% the cell could potentially be reused. 
> +%	
> +% Obtained results: 
> +% - as the matchings are now decided based on a little more information, the
> +% obtained results wrt reuse are better, especially when allowing 
> +% differing arities.
> +% - yet, sometimes the results might be a little worse than lifo. Both
> +% strategies might decide different reuse-matchings, hence induce different
> +% reuse-constraints (e.g. while one strategy decides to reuse the first
> +% headvariable of a procedure, the other reuses the second headvariable),
> +% which might make that in one case, the constraints can be satisfied, 
> +% but the other conditions can not. 
> +%
> +% XXX Current shortcoming compared to sr_choice: cells being deconstructed
> +% in the different branches of a disjunction will not be reused after the
> +% the disjunction. In sr_choice, this is made possible. 
> +% e.g.:
> +%	( 
> +%		..., X => f(... ), ... 		% X dies
> +%	; 
> +%		..., X => g(... ), ... 		% X dies
> +%	), 
> +%	Y <= f(... ), ... 			% Y can reuse X
> +% While sr_choice allows Y to reuse X (which is perfectly legal as
> +% it dies in each of the branches of the disjunction), this will not
> +% be discovered at this moment. 
> +%
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- module sr_choice_graphing.
> +:- interface. 
> +
> +:- import_module io, std_util, list. 
> +:- import_module hlds_goal, hlds_module, hlds_pred.
> +:- import_module sr_data.
> +
> +:- import_module sr_choice.
> +
> +:- type background_info. 
> +:- pred set_background_info(sr_choice__constraint::in, module_info::in, 
> +		vartypes::in, background_info::out) is det.
> +
> +	% Annotate each deconstruction/construction unification with
> +	% whether they die, can potentially reuse a dead cell or are
> +	% not possible for reuse. 
> +:- pred sr_choice_graphing__process_goal(
> +		background_info::in, 
> +		hlds_goal::in, hlds_goal::out,
> +		maybe(list(reuse_condition))::out, 
> +		io__state::di, io__state::uo) is det.
> +
> +%-----------------------------------------------------------------------------%
> +:- implementation. 
> +
> +:- import_module term, map, float, bool, set, require, int. 
> +:- import_module type_util, globals, options.
> +:- import_module prog_data.
> +:- import_module hlds_data.
> +
> +%-----------------------------------------------------------------------------%
> +:- type background_info 
> +	---> 	background(
> +			constraint	:: sr_choice__constraint,
> +			module_info	:: module_info, 
> +			vartypes	:: vartypes
> +		).
> +
> +set_background_info(Constraint, ModuleInfo, VarTypes, BG):-
> +	BG = background(Constraint, ModuleInfo, VarTypes). 
> +
> +%-----------------------------------------------------------------------------%
> +
> +	% Type meant to keep a table of the dead cells and their
> +	% possible reuses. 
> +:- type values  == map(contextvar, value). 
> +	
> +	% Given a construction for which reuse of a dead cell would
> +	% be possible, record the cons-ids the dead cell which it is 
> +	% reusing might have, as well as the fields which don't need 
> +	% explicit update. 
> +	% XXX the cons-ids will always be at most one cons-id. There can 
> +	% only be more in the case a cell dies in each of the branches
> +	% of a disjunction. This is not yet supported. 
> +:- type context_reuse_var
> +	---> 	context_reuse_var(
> +			var		:: contextvar,
> +			cons_ids	:: maybe(list(cons_id)),
> +			reuse_fields	:: maybe(list(bool)),
> +			reuse_type	:: reuse_type
> +		).
> +
> +	% Track extra documentation of the reuse. 
> +:- type reuse_type 
> +	---> 	no_reuse
> +	; 	reuse_type(
> +			same_cons	:: bool, 
> +			arity_diff	:: int, 	
> +			arity_old	:: int, 	% arity of dead cell
> +			arity_new	:: int		% arity of new cell
> +				% arity_diff = arity_old - arity_new
> +		). 
> +	
I would get rid of the arity_diff field.

:- func arity_diff(reuse_type) = int.

arity_diff(T) = T ^ arity_old - T ^ arity_new.

You can still use this function with the field accessor function, and
there is less chance of you forgetting to update the field.

[snip]

> +%-----------------------------------------------------------------------------%
> +% almost copy/pasted from sr_choice
> +%-----------------------------------------------------------------------------%
Could you factor out the code which is common and place it into a new module
sr_choice_util.

> +
> +	% adapted from sr_choice__no_tag_type
> +:- pred no_tag_type(module_info::in, vartypes::in, prog_var::in) is semidet.
> +no_tag_type(ModuleInfo, Vartypes, Var):- 
> +	map__lookup(Vartypes, Var, VarType), 
> +	type_is_no_tag_type(ModuleInfo, VarType, _, _).
> +
> +
> +	%
> +	% already_correct_fields(HasSecTagC, VarsC, HasSecTagR, VarsR)
> +	% takes a list of variables, VarsC, which are the arguments for
> +	% the cell to be constructed and the list of variables, VarsR,
> +	% which are the arguments for the cell to be reused and returns
> +	% a list of bool where each yes indicates that argument already
> +	% has the correct value stored in it.  To do this correctly we
> +	% need to know whether each cell has a secondary tag field.
> +	%
> +:- func already_correct_fields(bool, prog_vars,
> +		bool, prog_vars) = list(bool).
> +
> +already_correct_fields(SecTagC, CurrentCellVars, SecTagR, ReuseCellVars)
> +	= Bools ++ list__duplicate(LengthC - LengthB, no) :-
> +		Bools = already_correct_fields_2(SecTagC, CurrentCellVars,
> +				SecTagR, ReuseCellVars),
> +		LengthC = list__length(CurrentCellVars),
> +		LengthB = list__length(Bools).
> +
> +:- func already_correct_fields_2(bool, prog_vars, bool, prog_vars) = list(bool).
> +
> +already_correct_fields_2(yes, CurrentCellVars, yes, ReuseCellVars)
> +	= equals(CurrentCellVars, ReuseCellVars).
> +already_correct_fields_2(yes, CurrentCellVars, no, ReuseCellVars)
> +	= [no | equals(CurrentCellVars, drop_one(ReuseCellVars))].
> +already_correct_fields_2(no, CurrentCellVars, yes, ReuseCellVars) 
> +	= [no | equals(drop_one(CurrentCellVars), ReuseCellVars)].
> +already_correct_fields_2(no, CurrentCellVars, no, ReuseCellVars) 
> +	= equals(CurrentCellVars, ReuseCellVars).
> +
> +	%
> +	% equals(ListA, ListB) produces a list of bools which indicates
> +	% whether the corresponding elements from ListA and ListB were
> +	% equal.  If ListA and ListB are of different lengths, the
> +	% resulting list is the length of the shorter of the two.
> +	%
> +:- func equals(list(T), list(T)) = list(bool).
> +
> +equals([], []) = [].
> +equals([], [_|_]) = [].
> +equals([_|_], []) = [].
> +equals([X | Xs], [Y | Ys]) = [Bool | equals(Xs, Ys)] :-
> +	( X = Y ->
> +		Bool = yes
> +	;
> +		Bool = no
> +	).
> +
> +:- func drop_one(list(T)) = list(T).
> +
> +drop_one([]) = [].
> +drop_one([_ | Xs]) = Xs.
> +
> +	
> +	%
> +	% has_secondary_tag(Var, ConsId, HasSecTag) is true iff the
> +	% variable, Var, with cons_id, ConsId, requires a remote
> +	% secondary tag to distinguish between its various functors.
> +	%
> +:- pred has_secondary_tag(module_info::in, vartypes::in, 
> +		prog_var::in, cons_id::in, bool::out) is det.
> +
> +has_secondary_tag(ModuleInfo, VarTypes, Var, ConsId, SecondaryTag) :- 
> +	(
> +		map__lookup(VarTypes, Var, Type),
> +		type_to_type_id(Type, TypeId, _Args)
> +	->
> +		module_info_types(ModuleInfo, Types),
> +		( map__search(Types, TypeId, TypeDefn) ->
> +			hlds_data__get_type_defn_body(TypeDefn, TypeBody),
> +			( TypeBody = du_type(_, ConsTagValues, _, _) ->
> +				(
> +						% The search can fail
> +						% for such types as
> +						% private_builtin:type_info,
> +						% if the search fails we
> +						% will not have a
> +						% secondary tag.
> +					map__search(ConsTagValues, ConsId,
> +							ConsTag),
> +					ConsTag = shared_remote_tag(_, _)
> +				->
> +					SecondaryTag = yes
> +				;
> +					SecondaryTag = no
> +				)
> +			;
> +				error("has_secondary_tag: not du type.")
> +			)
> +		;
> +				% Must be a basic type.
> +			SecondaryTag = no
> +		)
> +	;
> +		error("has_secondary_tag: type_to_type_id failed.")
> +		
> +	).
> +
> +
> +%-----------------------------------------------------------------------------%

--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list