[m-rev.] for review: tupling transformation (part 3)

Julien Fischer juliensf at cs.mu.OZ.AU
Thu Mar 3 18:58:58 AEDT 2005


On Wed, 2 Mar 2005, Julien Fischer wrote:

> Index: compiler/tupling.m
> ===================================================================
> RCS file: compiler/tupling.m
> diff -N compiler/tupling.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ compiler/tupling.m	2 Mar 2005 00:15:49 -0000
> @@ -0,0 +1,1990 @@
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2005 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% File: tupling.m.
> +%
> +% Author: wangp.
> +%
> +% This module takes the HLDS and performs a tupling transformation on the
> +% locally-defined procedures.  That is, instead of passing all of the
> +% procedure's arguments separately, it will try to bundle some of them up and
> +% pass them together as a tuple.
> +%
> +% The idea is that some arguments passed to a procedure may not be needed
> +% immediately: between the start of the procedure and the first use of a
> +% given argument there may be a flush point, such as a call to another
> +% procedure.  At these points all values residing in registers that will be
> +% needed later in the procedure will need to be flushed to the stack, to be
> +% restored later.  In some cases, it may be beneficial to refer to some
> +% arguments indirectly through a cell variable.  Flushing the (address of
> +% the) cell variable to the stack is enough to save all the field variables
> +% of the cell.  The downside is that accessing a field variable requires
> +% going through a cell variable (the cost of which may be amortised if
> +% multiple field variables are needed in the same interval).
> +%
> +% Another potentially good reason to pass arguments in a tuple is if many
> +% procedures will be passing the same arguments to each other, e.g. as is
> +% often the case in mutually-recursive procedures.
> +%
> +% This implementation works as follows:
> +%
> +% 1. The module is divided up into its constituent SCCs.  We work our way
> +% through each SCC, starting from the bottommost SCC in the call graph.
> +%
> +% 2. For each SCC, we take guesses at a good tupling scheme for the
> +% procedures in the SCC and count the average number of loads and stores
> +% between the registers and the stack for each given scheme.
> +%
> +% 3. If the best tupling scheme gives us an average number of loads/stores
> +% that compares favourably against the original (untupled) scheme, we go
> +% ahead and make the transformed versions of the procedures in the SCC and
> +% add them to the HLDS.
> +%
> +% 4. After all the SCCs have been processed, we update all calls to the
> +% original procedures to call their transformed versions instead.
> +%
> +% Step 2 in more detail:
> +%
> +% This implementation uses the names of input formal parameters to guess
> +% which values are common between the procedures in an SCC (for SCCs with
> +% more than one procedure).  i.e. if a variable name occurs as an input
> +% argument to more than one procedure in the SCC, those variables
> +% corresponding that name are candidates for tupling.  In the interests of
s/interests/interest/

> +% speeding up compilation times, the implementation only tries to tuple
> +% contiguous runs of the candidate variables.
> +% e.g. if the candidates are [A,B,C,D] these combinations would be tested in
> +% turn, assuming a minimum run length of 3: {A,B,C,D}, {A,B,C}, {B,C,D}
> +%
> +% To count the average number of loads and stores in a procedure we traverse
> +% the procedure's goal, remembering which values are available in registers
s/procedures's goal/procedure's body/


> +% and the stack.  When we reach a branch point, we use the relative
> +% frequencies that each branch was taken in a sample run to weight the costs
> +% incurred in each branch.  The relative frequency data is gathered from the
> +% trace count summary file that must be provided by the user.
> +%
> +% Ideas for further work:
> +%
> +% - Smarter / more aggressive choosing of arguments to tuple
> +% - Inter-SCC analysis
> +% - Inter-module optimisation
> +% - Proper weighting of calls to procedures from within and without the SCC
> +%
> +% This transformation is similar in spirit to the transformation in
> +% stack_opt.m.  It also shares much code with it.
> +%
Add an XXX comment here mentioning that we need to check that mprof
can demangle the names of the transformed procedures correctly; unless
you've already looked at that yourself.

...

> +:- pred tuple_arguments_2(module_info::in, module_info::out, trace_counts::in,
> +	io::di, io::uo) is det.
> +
> +tuple_arguments_2(!ModuleInfo, TraceCounts0, !IO) :-
> +	module_info_globals(!.ModuleInfo, Globals),
> +	% We use the same cost options as for the stack optimisation.

You should add a comment at the point in options.m where
these options are defined mentioning that any changes to them may
require changes here as well.

> +	globals__lookup_int_option(Globals,
> +		optimize_saved_vars_cell_cv_load_cost, CellVarLoadCost),
> +	globals__lookup_int_option(Globals,
> +		optimize_saved_vars_cell_cv_store_cost, CellVarStoreCost),
> +	globals__lookup_int_option(Globals,
> +		optimize_saved_vars_cell_fv_load_cost, FieldVarLoadCost),
> +	globals__lookup_int_option(Globals,
> +		optimize_saved_vars_cell_fv_store_cost, FieldVarStoreCost),
> +	globals__lookup_int_option(Globals,
> +		tuple_costs_ratio, CostsRatio),
> +	globals__lookup_int_option(Globals,
> +		tuple_min_args, MinArgsToTuple),
> +	% These are the costs for untupled variables.  We just assume it is
> +	% the lesser of the cell and field variable costs (usually the field
> +	% variable costs should be smaller).
> +	NormalVarStoreCost = min(CellVarStoreCost, FieldVarStoreCost),
> +	NormalVarLoadCost = min(CellVarLoadCost, FieldVarLoadCost),
> +	TuningParams = tuning_params(
> +		NormalVarLoadCost, NormalVarStoreCost,
> +		CellVarLoadCost, CellVarStoreCost,
> +		FieldVarLoadCost, FieldVarStoreCost,
> +		CostsRatio, MinArgsToTuple),
> +
> +	module_info_name(!.ModuleInfo, ModuleName),
> +	restrict_trace_counts_to_module(ModuleName, TraceCounts0, TraceCounts),
> +
> +	module_info_ensure_dependency_info(!ModuleInfo),
> +	module_info_dependency_info(!.ModuleInfo, DepInfo),
> +	hlds_dependency_info_get_dependency_graph(DepInfo, DepGraph),
> +	hlds_dependency_info_get_dependency_ordering(DepInfo, SCCs),
> +
> +	% Add transformed versions of procedures that we think would be
> +	% beneficial.
> +	list__foldl4(maybe_tuple_scc(TraceCounts, TuningParams, DepGraph),
> +		SCCs, !ModuleInfo, counter__init(0), _,
> +		map__init, TransformMap, !IO),
> +
> +	% Update the callers of the original procedures to call their
> +	% transformed versions instead.  Do the same for the transformed
> +	% procedures themselves.
> +	list__foldl(fix_calls_in_procs(TransformMap), SCCs, !ModuleInfo),
> +	fix_calls_in_transformed_procs(TransformMap, !ModuleInfo).
> +
> +:- pred warn_trace_counts_error(string::in, string::in, io::di, io::uo) is det.
> +
> +warn_trace_counts_error(TraceCountsFile, Reason, !IO) :-
> +	string__format(
> +		"Warning: unable to read trace count summary from %s (%s)\n",
> +		[s(TraceCountsFile), s(Reason)], Message),
> +	report_warning(Message, !IO).
> +
> +%-----------------------------------------------------------------------------%
> +
> +	% This predicate can be used in place of maybe_tuple_scc to evaluate
> +	% and transform each procedure of an SCC individually.  This is to
> +	% mimic the behaviour from an earlier version of this file.
> +	%
Is there really any value in keeping it?  If yes, then add a comment
saying that this procedure is currently unused and is intended for
debugging purposes.

> +:- pred maybe_tuple_scc_individual_procs(trace_counts::in, tuning_params::in,
> +	dependency_graph::in, list(pred_proc_id)::in,
> +	module_info::in, module_info::out, counter::in, counter::out,
> +	transform_map::in, transform_map::out, io::di, io::uo) is det.
> +
> +maybe_tuple_scc_individual_procs(_TraceCounts, _TuningParams, _DepGraph,
> +		[], !ModuleInfo, !Counter, !TransformMap, !IO).
> +maybe_tuple_scc_individual_procs(TraceCounts, TuningParams, DepGraph,
> +		[Proc | Procs], !ModuleInfo, !Counter, !TransformMap, !IO) :-
> +	maybe_tuple_scc(TraceCounts, TuningParams, DepGraph,
> +		[Proc], !ModuleInfo, !Counter, !TransformMap, !IO),
> +	maybe_tuple_scc_individual_procs(TraceCounts, TuningParams, DepGraph,
> +		Procs, !ModuleInfo, !Counter, !TransformMap, !IO).
> +
> +:- pred maybe_tuple_scc(trace_counts::in, tuning_params::in,
> +	dependency_graph::in, list(pred_proc_id)::in(bound([ground | ground])),
> +	module_info::in, module_info::out, counter::in, counter::out,
> +	transform_map::in, transform_map::out, io::di, io::uo) is det.
> +
> +maybe_tuple_scc(TraceCounts, TuningParams, DepGraph, SCC,
> +		!ModuleInfo, !Counter, !TransformMap, !IO) :-
> +	module_info_globals(!.ModuleInfo, Globals),
> +	globals__lookup_bool_option(Globals, very_verbose, VeryVerbose),
> +	(
> +		VeryVerbose = yes,
> +		io.write_string("% Considering tupling in ", !IO),
> +		list__foldl((pred(PredProcId::in, IO0::di, IO::uo) is det :-
> +			PredProcId = proc(PredId, ProcId),
> +			hlds_out__write_pred_proc_id(!.ModuleInfo,
> +				PredId, ProcId, IO0, IO)),
> +			SCC, !IO),
> +		io.write_string("\n", !IO)
> +	;
> +		VeryVerbose = no
> +	),
> +	( scc_has_local_callers(SCC, DepGraph) ->
> +		( SCC = [SingleProc] ->
> +			candidate_headvars_of_proc(!.ModuleInfo, SingleProc,
> +				CandidateHeadVars)
> +		;
> +			common_candidate_headvars_of_procs(!.ModuleInfo, SCC,
> +				CandidateHeadVars)
> +		),
> +		MinArgsToTuple = TuningParams ^ min_args_to_tuple,
> +		( list__length(CandidateHeadVars) < MinArgsToTuple ->
> +			(
> +				VeryVerbose = yes,
> +				io__write_string(
> +					"% Too few candidate headvars\n", !IO)
> +			;
> +				VeryVerbose = no
> +			)
> +		;
> +			maybe_tuple_scc_2(TraceCounts, TuningParams,
> +				SCC, CandidateHeadVars, !ModuleInfo,
> +				!Counter, !TransformMap, !IO, VeryVerbose)
> +		)
> +	;
> +		% No need to work on this SCC if there are no callers to it
> +		% within this module.
There should be an XXX comment there saying that if part of the SCC is
exported by the module then way may eventually want to look it, i.e.
if we ever support intermodule tupling.

...

> +add_transformed_proc(_, no_tupling, !ModuleInfo, !TransformMap, !Counter).
> +
> +:- pred nth_member_lookup(list(T)::in, T::in, int::out) is det.
> +
> +nth_member_lookup(List, Elem, Position) :-
> +	( list__nth_member_search(List, Elem, PositionPrime) ->
> +		Position = PositionPrime
> +	;
> +		error("nth_member_lookup/3: element not found in list")
> +	).
> +
You might as well just add that to the standard library.

...

> +	% In some cases some of the fields variables need to be available at
s/fields/field's/

> +	% the very beginning of the procedure.	The required deconstructions
> +	% for those variables won't show up in the insert map.  To handle this
> +	% we just to insert a deconstruction unification at the start of the
> +	% procedure and let a simplification pass remove it later if not
> +	% required.
> +	%
> +	% We could make build_insert_map add such required unifications
> +	% to the insert map, but record_decisions_in_goal would need to be
> +	% modified as well.
> +	%
> +	deconstruct_tuple(CellVar, FieldVarsList, ProcStartDeconstruct),
> +	ProcStartInsert = insert_spec(ProcStartDeconstruct,
> +		set__from_list(FieldVarsList)),
> +	insert_proc_start_deconstruction(Goal1, Goal2,
> +		VarSet1, VarSet, VarTypes1, VarTypes,
> +		RenameMapB, ProcStartInsert),
> +	rename_vars_in_goal(Goal2, RenameMapB, Goal3),
> +
> +	map__merge(RenameMapA, RenameMapB, RenameMap),
> +	apply_headvar_correction(set__from_list(HeadVars), RenameMap,
> +		Goal3, Goal),
> +	proc_info_set_goal(Goal, !ProcInfo),
> +	proc_info_set_varset(VarSet, !ProcInfo),
> +	proc_info_set_vartypes(VarTypes, !ProcInfo),
> +	requantify_proc(!ProcInfo).
> +
> +:- pred insert_proc_start_deconstruction(hlds_goal::in, hlds_goal::out,
> +	prog_varset::in, prog_varset::out, vartypes::in, vartypes::out,
> +	rename_map::out, insert_spec::in) is det.
> +

...

> +%-----------------------------------------------------------------------------%
> +
> +% The opt_stack_alloc structure is constructed by live_vars.m.  As far as I can
> +% tell we don't need such a thing for this module so we just define some stubs.
> +
> +:- type opt_stack_alloc ---> opt_stack_alloc.
> +
Why give it this name?  It would be more helpful to make the name unique
to this module.

> +:- instance stack_alloc_info(opt_stack_alloc) where [
> +	pred(at_call_site/4) is opt_at_call_site,
> +	pred(at_resume_site/4) is opt_at_resume_site,
> +	pred(at_par_conj/4) is opt_at_par_conj
> +].
> +
> +:- pred opt_at_call_site(need_across_call::in, hlds_goal_info::in,
> +	opt_stack_alloc::in, opt_stack_alloc::out) is det.
> +
> +opt_at_call_site(_NeedAtCall, _GoalInfo, StackAlloc, StackAlloc).
> +
> +:- pred opt_at_resume_site(need_in_resume::in, hlds_goal_info::in,
> +	opt_stack_alloc::in, opt_stack_alloc::out) is det.
> +
> +opt_at_resume_site(_NeedAtResume, _GoalInfo, StackAlloc, StackAlloc).
> +
> +:- pred opt_at_par_conj(need_in_par_conj::in, hlds_goal_info::in,
> +	opt_stack_alloc::in, opt_stack_alloc::out) is det.
> +
> +opt_at_par_conj(_NeedParConj, _GoalInfo, StackAlloc, StackAlloc).
> +
...

> +		% XXX: There is a problem somewhere causing CALL and EXIT
> +		% events not to show up for some procedures in trace count
> +		% files.  The weighting of the procedure's costs is disabled.
> +		% However, if working, it still wouldn't be ideal as we don't
> +		% know how many of the calls to the procedure came from within
> +		% or without the SCC.
Wasn't this occuring because the procedures had been inlined?

> +		/*get_proc_calls(ProcCounts, Weight),*/
> +		Weight = 1,
> +		!:Loads = !.Loads + float(Weight) * ProcLoads,
> +		!:Stores = !.Stores + float(Weight) * ProcStores
> +	;
> +		true
> +	).
> +
...

> +		Unification = construct(CellVar, _ConsId, ArgVars, _ArgModes,
> +			_HowToConstruct, _, _),
> +		cls_require_in_regs(CountInfo, ArgVars, !CountState),
> +		cls_put_in_regs([CellVar], !CountState)
> +	;
> +		Unification = deconstruct(CellVar, _ConsId, ArgVars,
> +			_ArgModes, _, _),
> +		cls_put_in_regs_via_deconstruct(CountInfo, CellVar, ArgVars,
> +			!CountState)
> +	;
> +		Unification = assign(ToVar, FromVar),
> +		cls_require_in_reg(CountInfo, FromVar, !CountState),
> +		cls_put_in_regs([ToVar], !CountState)
> +	;
> +		Unification = simple_test(Var1, Var2),
> +		cls_require_in_regs(CountInfo, [Var1, Var2], !CountState)
> +	;
> +		Unification = complicated_unify(_, _, _),
> +		error("count_load_stores_in_goal: complicated_unify")
Call unexpected/2 instead (and elsewhere in this module).

...

> +			DeconstructFieldVars, !State)
> +	;
> +		TuplingProposal = tupling(_, TupleFieldVars, _),
> +		VarsToLoad = set__difference(
> +			set__from_list(DeconstructFieldVars),
> +			set__from_list(TupleFieldVars)),
> +		( set__non_empty(VarsToLoad) ->
> +			cls_require_var_in_reg_with_cost(CvLoadCost,
> +				DeconstructCellVar, !State),
> +			set__fold(cls_require_var_in_reg_with_cost(FvLoadCost),
> +				VarsToLoad, !State)
> +		;
> +			% All the variables generated by this deconstruction
> +			% can be gotten from the proposed tupling, so it can be
s/gotten/obtained/

Also I would reword that as follows:

	... so the deconstruction can be ignored.

As written it's not entirely clear whether the proposed tupling or the
deconstruction should be ignored.

> +			% ignored.  The costs of loading those variables from
> +			% the tuple will be counted as they come.
> +			true
> +		)
> +	).
> +

...

> +	% This is needed only to satisfy the interface of interval.m
> +	%
> +:- type nothing ---> nothing.
> +
You could use the unit type from std_util to do that.

> +:- instance build_interval_info_acc(nothing) where [
> +	pred(use_cell/8) is tupling__use_cell
> +].
> +
> +:- pred use_cell(prog_var::in, list(prog_var)::in, cons_id::in, hlds_goal::in,
> +	interval_info::in, interval_info::out, nothing::in,
> +	nothing::out) is det.
> +
> +use_cell(_CellVar, _FieldVarList, _ConsId, _Goal, !IntervalInfo, !Nothing).
> +

...

> +%-----------------------------------------------------------------------------%
> +%
> +% Fixing calls to transformed procedures.
> +%
> +
> +:- type transform_map == map(pred_proc_id, transformed_proc).
> +
> +:- type transformed_proc --->
> +	transformed_proc(
> +		transformed_pred_proc_id	:: pred_proc_id,
> +		tuple_cons_type			:: (type),
> +		args_to_tuple			:: list(int),
> +		call_template			:: hlds_goal
> +	).
> +
Please document these types a little more.


> +:- pred fix_calls_in_procs(transform_map::in, list(pred_proc_id)::in,
> +	module_info::in, module_info::out) is det.
> +
> +fix_calls_in_procs(TransformMap, PredProcIds, !ModuleInfo) :-
> +	list__foldl(fix_calls_in_proc(TransformMap),
> +		PredProcIds, !ModuleInfo).
> +
...

> +fix_calls_in_proc(TransformMap, proc(PredId, ProcId), !ModuleInfo) :-
> +	some [!ProcInfo] (
> +		module_info_pred_proc_info(!.ModuleInfo, PredId, ProcId,
> +			PredInfo, !:ProcInfo),
> +		% XXX: Don't modify predicates that were created by type
> +		% specialisation.  This is a last-minute workaround for some
> +		% linking problems that occurred when such predicates in the
> +		% library were made to call tupled procedures.
Do you have a test case for this problem?

> +		pred_info_get_origin(PredInfo, Origin),
> +		( Origin = transformed(type_specialization(_), _, _) ->
> +			true
> +		;
> +			proc_info_goal(!.ProcInfo, Goal0),
> +			proc_info_vartypes(!.ProcInfo, VarTypes0),
> +			proc_info_varset(!.ProcInfo, VarSet0),
> +			fix_calls_in_goal(Goal0, Goal, VarSet0, VarSet,
> +				VarTypes0, VarTypes, TransformMap),
> +			proc_info_set_goal(Goal, !ProcInfo),
> +			proc_info_set_varset(VarSet, !ProcInfo),
> +			proc_info_set_vartypes(VarTypes, !ProcInfo),
> +			requantify_proc(!ProcInfo),
> +			recompute_instmap_delta_proc(yes, !ProcInfo,
> +				!ModuleInfo),
> +			module_info_set_pred_proc_info(PredId, ProcId,
> +				PredInfo, !.ProcInfo, !ModuleInfo)
> +		)
> +	).
> +

...

> +fix_calls_in_goal(par_conj(Goals0) - GoalInfo, par_conj(Goals) - GoalInfo,
> +		!VarSet, !VarTypes, TransformMap) :-
> +	% I am not sure whether parallel conjunctions should be treated
> +	% with fix_calls_in_goal or fix_calls_in_goal_list.  At any rate,
> +	% this is untested.
There should be an XXX in front of that comment.

> +	fix_calls_in_goal_list(Goals0, Goals, !VarSet, !VarTypes,
> +		TransformMap).
> +
> +fix_calls_in_goal(disj(Goals0) - GoalInfo, disj(Goals) - GoalInfo,
> +		!VarSet, !VarTypes, TransformMap) :-
> +	fix_calls_in_goal_list(Goals0, Goals, !VarSet, !VarTypes,
> +		TransformMap).
> +
> +fix_calls_in_goal(switch(Var, CanFail, Cases0) - GoalInfo,
> +		switch(Var, CanFail, Cases) - GoalInfo,
> +		!VarSet, !VarTypes, TransformMap) :-
> +	fix_calls_in_cases(Cases0, Cases, !VarSet, !VarTypes, TransformMap).
> +
> +fix_calls_in_goal(if_then_else(Vars, Cond0, Then0, Else0) - GoalInfo,
> +		if_then_else(Vars, Cond, Then, Else) - GoalInfo,
> +		!VarSet, !VarTypes, TransformMap) :-
> +	fix_calls_in_goal(Cond0, Cond, !VarSet, !VarTypes, TransformMap),
> +	fix_calls_in_goal(Then0, Then, !VarSet, !VarTypes, TransformMap),
> +	fix_calls_in_goal(Else0, Else, !VarSet, !VarTypes, TransformMap).
> +
> +fix_calls_in_goal(shorthand(_) - _, _, !VarSet, !VarTypes, _TransformMap) :-
> +	error("fix_calls_in_goal: unexpected shorthand").
> +
Call unexpected/2.

The changes to the other modules look fine, although I would like the
interval module to include a description of what an interval is at the
head of the module - it's the obvious place to look for it.

Since semester has started again (and you're presumably busy with other
things) I think you should commit this (assuming that it bootchecks)
after addressing the above review comments.

Julien.




--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list