[m-rev.] for review: tupling transformation (part 3)
Peter Wang
wangp at students.cs.mu.OZ.AU
Thu Mar 3 23:32:49 AEDT 2005
Julien Fischer wrote:
>On Wed, 2 Mar 2005, Julien Fischer wrote:
>
>>+ % XXX: There is a problem somewhere causing CALL and EXIT
>>+ % events not to show up for some procedures in trace count
>>+ % files. The weighting of the procedure's costs is disabled.
>>+ % However, if working, it still wouldn't be ideal as we don't
>>+ % know how many of the calls to the procedure came from within
>>+ % or without the SCC.
>>
>>
>Wasn't this occuring because the procedures had been inlined?
>
>
It happens with -O-1, so I doubt it.
>>+fix_calls_in_proc(TransformMap, proc(PredId, ProcId), !ModuleInfo) :-
>>+ some [!ProcInfo] (
>>+ module_info_pred_proc_info(!.ModuleInfo, PredId, ProcId,
>>+ PredInfo, !:ProcInfo),
>>+ % XXX: Don't modify predicates that were created by type
>>+ % specialisation. This is a last-minute workaround for some
>>+ % linking problems that occurred when such predicates in the
>>+ % library were made to call tupled procedures.
>>
>>
>Do you have a test case for this problem?
>
>
Not at the moment, but I'll see.
>The changes to the other modules look fine, although I would like the
>interval module to include a description of what an interval is at the
>head of the module - it's the obvious place to look for it.
>
>
I'll just quote from the stack opt paper.
Thanks for the comments. Interdiff coming up.
diff -u compiler/interval.m compiler/interval.m
--- compiler/interval.m 2 Mar 2005 00:15:37 -0000
+++ compiler/interval.m 3 Mar 2005 12:22:08 -0000
@@ -15,6 +15,17 @@
% to make after particular anchors. More detailed information is in
% stack_opt.m, from where this code was extracted.
%
+% A description of intervals is in the paper "Using the heap to eliminate
+% stack accesses" by Zoltan Somogyi and Peter Stuckey:
+%
+% Definition 3: An interval is a sequence of atomic goals delimited by a
+% left-right pair of anchors, satisfying the property that if forward
+% execution starts at the left anchor and continues without encountering
+% failure (which would initiate backtracking, i.e. backward execution),
+% the next anchor it reaches is the right anchor of the pair. We
+% consider a call to be part of the atomic goals of the interval only if
+% the call site is the right anchor of the interval, not the left
anchor.
+%
%-----------------------------------------------------------------------------%
:- module backend_libs__interval.
diff -u compiler/options.m compiler/options.m
--- compiler/options.m 2 Mar 2005 00:15:45 -0000
+++ compiler/options.m 3 Mar 2005 12:22:19 -0000
@@ -1734,6 +1734,8 @@
long_option("osv-full-path", optimize_saved_vars_cell_full_path).
long_option("osv-on-stack", optimize_saved_vars_cell_on_stack).
long_option("osv-cand-head",
optimize_saved_vars_cell_candidate_headvars).
+ % The next four options are used by tupling.m as well; changes to
+ % them may require changes there as well.
long_option("osv-cvstore-cost",
optimize_saved_vars_cell_cv_store_cost).
long_option("osv-cvload-cost",
optimize_saved_vars_cell_cv_load_cost).
long_option("osv-fvstore-cost",
optimize_saved_vars_cell_fv_store_cost).
diff -u compiler/tupling.m compiler/tupling.m
--- compiler/tupling.m 2 Mar 2005 00:15:49 -0000
+++ compiler/tupling.m 3 Mar 2005 12:22:22 -0000
@@ -52,14 +52,14 @@
% which values are common between the procedures in an SCC (for SCCs with
% more than one procedure). i.e. if a variable name occurs as an input
% argument to more than one procedure in the SCC, those variables
-% corresponding that name are candidates for tupling. In the interests of
+% corresponding that name are candidates for tupling. In the interest of
% speeding up compilation times, the implementation only tries to tuple
% contiguous runs of the candidate variables.
% e.g. if the candidates are [A,B,C,D] these combinations would be
tested in
% turn, assuming a minimum run length of 3: {A,B,C,D}, {A,B,C}, {B,C,D}
%
% To count the average number of loads and stores in a procedure we
traverse
-% the procedure's goal, remembering which values are available in registers
+% the procedure's body, remembering which values are available in registers
% and the stack. When we reach a branch point, we use the relative
% frequencies that each branch was taken in a sample run to weight the
costs
% incurred in each branch. The relative frequency data is gathered
from the
@@ -75,6 +75,9 @@
% This transformation is similar in spirit to the transformation in
% stack_opt.m. It also shares much code with it.
%
+% XXX: we need to check that mprof can demangle the names of the
transformed
+% procedures correctly
+%
%-----------------------------------------------------------------------------%
:- module transform_hlds__tupling.
@@ -215,6 +218,7 @@
% This predicate can be used in place of maybe_tuple_scc to evaluate
% and transform each procedure of an SCC individually. This is to
% mimic the behaviour from an earlier version of this file.
+ % It's currently unused but might be useful for debugging.
%
:- pred maybe_tuple_scc_individual_procs(trace_counts::in,
tuning_params::in,
dependency_graph::in, list(pred_proc_id)::in,
@@ -276,6 +280,9 @@
;
% No need to work on this SCC if there are no callers to it
% within this module.
+ %
+ % XXX: if part of the SCC is exported then we might want
+ % to look at it, for intermodule tupling.
(
VeryVerbose = yes,
io__write_string("% SCC has no local callers\n", !IO)
@@ -649,15 +656,6 @@
add_transformed_proc(_, no_tupling, !ModuleInfo, !TransformMap, !Counter).
-:- pred nth_member_lookup(list(T)::in, T::in, int::out) is det.
-
-nth_member_lookup(List, Elem, Position) :-
- ( list__nth_member_search(List, Elem, PositionPrime) ->
- Position = PositionPrime
- ;
- error("nth_member_lookup/3: element not found in list")
- ).
-
%-----------------------------------------------------------------------------%
:- pred make_transformed_proc(prog_var::in, prog_vars::in, insert_map::in,
@@ -685,7 +683,7 @@
VarTypes0, VarTypes1, map__init, RenameMapA, InsertMap,
MaybeGoalFeature),
- % In some cases some of the fields variables need to be available at
+ % In some cases some of the field variables need to be available at
% the very beginning of the procedure. The required deconstructions
% for those variables won't show up in the insert map. To handle this
% we just to insert a deconstruction unification at the start of the
@@ -883,11 +881,11 @@
TypeInfoLiveness, OptNoReturnCalls),
goal_path__fill_slots(!.ModuleInfo, !ProcInfo),
proc_info_goal(!.ProcInfo, Goal0),
- OptStackAlloc0 = opt_stack_alloc,
+ OptTupleAlloc0 = opt_tuple_alloc,
set__init(FailVars),
set__init(NondetLiveness0),
build_live_sets_in_goal(Goal0, Goal, FailVars,
- AllocData, OptStackAlloc0, _OptStackAlloc,
+ AllocData, OptTupleAlloc0, _OptTupleAlloc,
Liveness0, _Liveness,
NondetLiveness0, _NondetLiveness),
proc_info_set_goal(Goal, !ProcInfo),
@@ -898,29 +896,29 @@
%-----------------------------------------------------------------------------%
-% The opt_stack_alloc structure is constructed by live_vars.m. As far
as I can
+% The opt_tuple_alloc structure is constructed by live_vars.m. As far
as I can
% tell we don't need such a thing for this module so we just define
some stubs.
-:- type opt_stack_alloc ---> opt_stack_alloc.
+:- type opt_tuple_alloc ---> opt_tuple_alloc.
-:- instance stack_alloc_info(opt_stack_alloc) where [
+:- instance stack_alloc_info(opt_tuple_alloc) where [
pred(at_call_site/4) is opt_at_call_site,
pred(at_resume_site/4) is opt_at_resume_site,
pred(at_par_conj/4) is opt_at_par_conj
].
:- pred opt_at_call_site(need_across_call::in, hlds_goal_info::in,
- opt_stack_alloc::in, opt_stack_alloc::out) is det.
+ opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
opt_at_call_site(_NeedAtCall, _GoalInfo, StackAlloc, StackAlloc).
:- pred opt_at_resume_site(need_in_resume::in, hlds_goal_info::in,
- opt_stack_alloc::in, opt_stack_alloc::out) is det.
+ opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
opt_at_resume_site(_NeedAtResume, _GoalInfo, StackAlloc, StackAlloc).
:- pred opt_at_par_conj(need_in_par_conj::in, hlds_goal_info::in,
- opt_stack_alloc::in, opt_stack_alloc::out) is det.
+ opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
opt_at_par_conj(_NeedParConj, _GoalInfo, StackAlloc, StackAlloc).
@@ -1084,7 +1082,8 @@
cls_require_in_regs(CountInfo, [Var1, Var2], !CountState)
;
Unification = complicated_unify(_, _, _),
- error("count_load_stores_in_goal: complicated_unify")
+ unexpected(this_file,
+ "count_load_stores_in_goal: complicated_unify")
).
count_load_stores_in_goal(some(_Vars, _CanRemove, Goal) - _GoalInfo,
CountInfo,
@@ -1111,7 +1110,8 @@
cls_require_flushed(CountInfo, LiveVars, !CountState)
;
ResumePoint = no_resume_point,
- error("count_load_stores_in_goal: no_resume_point for not")
+ unexpected(this_file,
+ "count_load_stores_in_goal: no_resume_point for not")
),
count_load_stores_in_goal(Goal, CountInfo, !CountState).
@@ -1250,7 +1250,8 @@
cls_clobber_regs(Outputs, !CountState)
;
MaybeNeedAcrossCall = no,
- error("handle_call: no need across call")
+ unexpected(this_file,
+ "count_load_stores_for_call: no need across call")
).
%-----------------------------------------------------------------------------%
@@ -1414,9 +1415,10 @@
VarsToLoad, !State)
;
% All the variables generated by this deconstruction
- % can be gotten from the proposed tupling, so it can be
- % ignored. The costs of loading those variables from
- % the tuple will be counted as they come.
+ % can be obtained from the proposed tupling, so the
+ % deconstruction can be ignored. The costs of loading
+ % those variables from the tuple will be counted as
+ % they come.
true
)
).
@@ -1524,21 +1526,18 @@
map__init, set__init, StartMap, EndMap,
SuccMap, VarsMap, map__init),
build_interval_info_in_goal(Goal, IntervalInfo0, IntervalInfo,
- nothing, _).
+ unit, _).
% This is needed only to satisfy the interface of interval.m
%
-:- type nothing ---> nothing.
-
-:- instance build_interval_info_acc(nothing) where [
+:- instance build_interval_info_acc(unit) where [
pred(use_cell/8) is tupling__use_cell
].
:- pred use_cell(prog_var::in, list(prog_var)::in, cons_id::in,
hlds_goal::in,
- interval_info::in, interval_info::out, nothing::in,
- nothing::out) is det.
+ interval_info::in, interval_info::out, unit::in, unit::out) is det.
-use_cell(_CellVar, _FieldVarList, _ConsId, _Goal, !IntervalInfo, !Nothing).
+use_cell(_CellVar, _FieldVarList, _ConsId, _Goal, !IntervalInfo, !Unit).
%-----------------------------------------------------------------------------%
@@ -1603,14 +1602,27 @@
% Fixing calls to transformed procedures.
%
+ % The transform_map structure records which procedures were
+ % transformed into what procedures.
+ %
:- type transform_map == map(pred_proc_id, transformed_proc).
:- type transformed_proc --->
transformed_proc(
transformed_pred_proc_id :: pred_proc_id,
+ % The pred_proc_id of the transformed version of
+ % the procedure.
tuple_cons_type :: (type),
+ % The type of the cell variable created by the
+ % transformation. This will be a tuple type.
args_to_tuple :: list(int),
+ % The argument positions of the original procedure
+ % which were tupled.
call_template :: hlds_goal
+ % A template for a call goal that is used to update
+ % calls of the original procedure to the transformed
+ % procedure instead. The arguments of the template
+ % need to be replaced by the actual arguments.
).
:- pred fix_calls_in_procs(transform_map::in, list(pred_proc_id)::in,
@@ -1729,7 +1741,7 @@
fix_calls_in_goal(par_conj(Goals0) - GoalInfo, par_conj(Goals) - GoalInfo,
!VarSet, !VarTypes, TransformMap) :-
- % I am not sure whether parallel conjunctions should be treated
+ % XXX: I am not sure whether parallel conjunctions should be treated
% with fix_calls_in_goal or fix_calls_in_goal_list. At any rate,
% this is untested.
fix_calls_in_goal_list(Goals0, Goals, !VarSet, !VarTypes,
@@ -1753,7 +1765,7 @@
fix_calls_in_goal(Else0, Else, !VarSet, !VarTypes, TransformMap).
fix_calls_in_goal(shorthand(_) - _, _, !VarSet, !VarTypes,
_TransformMap) :-
- error("fix_calls_in_goal: unexpected shorthand").
+ unexpected(this_file, "fix_calls_in_goal: unexpected shorthand").
%-----------------------------------------------------------------------------%
only in patch2:
unchanged:
--- doc/user_guide.texi 24 Feb 2005 06:07:11 -0000 1.423
+++ doc/user_guide.texi 3 Mar 2005 12:22:36 -0000
@@ -6926,6 +6926,41 @@
@c is a tuple or a type with exactly one functor.
@c Note: this is almost always a pessimization.
+ at c @sp 1
+ at c @item --tuple
+ at c @findex --tuple
+ at c Try to find opportunities for procedures to pass some
+ at c arguments to each other as a tuple rather than as
+ at c individual arguments. It requires the option
+ at c @samp{--tuple-trace-counts-file}, and can be tuned with
+ at c the options @samp{--osv-cvload-cost}, @samp{--osv-cvstore-cost},
+ at c @samp{--osv-fvload-cost}, @samp{--osv-fvstore-cost},
+ at c @samp{--tuple-costs-ratio} and @samp{--tuple-min-args}.
+ at c Note: so far this has mostly a detrimental effect.
+
+ at c @sp 1
+ at c @item --tuple-trace-counts-file
+ at c @findex --tuple-trace-counts-file
+ at c Supply a trace counts summary file for the tupling transformation.
+ at c The summary should be made from a sample run of the program you are
+ at c compiling, compiled without optimizations.
+
+ at c @sp 1
+ at c @item --tuple-costs-ratio
+ at c @findex --tuple-costs-ratio
+ at c A value of 110 for this parameter means the tupling transformation
+ at c will transform a procedure if it thinks that procedure would be
+ at c 10% worse, on average, than whatever transformed version of the
+ at c procedure it has in mind.
+
+ at c @sp 1
+ at c @item --tuple-min-args
+ at c @findex --tuple-min-args
+ at c The minimum number of input arguments that the tupling transformation
+ at c will consider passing together as a tuple. This is mostly to speed
+ at c up the compilation process by not pursuing (presumably) unfruitful
+ at c searches.
+
@end table
@node MLDS backend (MLDS -> MLDS) optimization options
only in patch2:
unchanged:
--- library/list.m 10 Feb 2005 04:10:29 -0000 1.131
+++ library/list.m 3 Mar 2005 12:22:36 -0000
@@ -314,6 +314,11 @@
%
:- pred list__nth_member_search(list(T)::in, T::in, int::out) is semidet.
+ % A deterministic version of list__nth_member_search, which aborts
+ % instead of failing if the element is not found in the list.
+ %
+:- pred list__nth_member_lookup(list(T)::in, T::in, int::out) is det.
+
% list__index*(List, Position, Elem):
% These predicates select an element in a list from it's
% position. The `index0' preds consider the first element
@@ -1025,6 +1030,13 @@
N = P
;
list__nth_member_search_2(Xs, Y, P + 1, N)
+ ).
+
+nth_member_lookup(List, Elem, Position) :-
+ ( list__nth_member_search(List, Elem, PositionPrime) ->
+ Position = PositionPrime
+ ;
+ error("list__nth_member_lookup/3: element not found in list")
).
%-----------------------------------------------------------------------------%
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list