[m-rev.] for review: tupling transformation (part 3)

Peter Wang wangp at students.cs.mu.OZ.AU
Thu Mar 3 23:32:49 AEDT 2005


Julien Fischer wrote:

>On Wed, 2 Mar 2005, Julien Fischer wrote:
>
>>+		% XXX: There is a problem somewhere causing CALL and EXIT
>>+		% events not to show up for some procedures in trace count
>>+		% files.  The weighting of the procedure's costs is disabled.
>>+		% However, if working, it still wouldn't be ideal as we don't
>>+		% know how many of the calls to the procedure came from within
>>+		% or without the SCC.
>>    
>>
>Wasn't this occuring because the procedures had been inlined?
>  
>

It happens with -O-1, so I doubt it.

>>+fix_calls_in_proc(TransformMap, proc(PredId, ProcId), !ModuleInfo) :-
>>+	some [!ProcInfo] (
>>+		module_info_pred_proc_info(!.ModuleInfo, PredId, ProcId,
>>+			PredInfo, !:ProcInfo),
>>+		% XXX: Don't modify predicates that were created by type
>>+		% specialisation.  This is a last-minute workaround for some
>>+		% linking problems that occurred when such predicates in the
>>+		% library were made to call tupled procedures.
>>    
>>
>Do you have a test case for this problem?
>  
>

Not at the moment, but I'll see.

>The changes to the other modules look fine, although I would like the
>interval module to include a description of what an interval is at the
>head of the module - it's the obvious place to look for it.
>  
>

I'll just quote from the stack opt paper.

Thanks for the comments.  Interdiff coming up.


diff -u compiler/interval.m compiler/interval.m
--- compiler/interval.m    2 Mar 2005 00:15:37 -0000
+++ compiler/interval.m    3 Mar 2005 12:22:08 -0000
@@ -15,6 +15,17 @@
 % to make after particular anchors.  More detailed information is in
 % stack_opt.m, from where this code was extracted.
 %
+% A description of intervals is in the paper "Using the heap to eliminate
+% stack accesses" by Zoltan Somogyi and Peter Stuckey:
+%
+%    Definition 3: An interval is a sequence of atomic goals delimited by a
+%    left-right pair of anchors, satisfying the property that if forward
+%    execution starts at the left anchor and continues without encountering
+%    failure (which would initiate backtracking, i.e. backward execution),
+%    the next anchor it reaches is the right anchor of the pair.  We
+%    consider a call to be part of the atomic goals of the interval only if
+%    the call site is the right anchor of the interval, not the left 
anchor.
+%
 %-----------------------------------------------------------------------------%
 
 :- module backend_libs__interval.
diff -u compiler/options.m compiler/options.m
--- compiler/options.m    2 Mar 2005 00:15:45 -0000
+++ compiler/options.m    3 Mar 2005 12:22:19 -0000
@@ -1734,6 +1734,8 @@
 long_option("osv-full-path",        optimize_saved_vars_cell_full_path).
 long_option("osv-on-stack",        optimize_saved_vars_cell_on_stack).
 long_option("osv-cand-head",        
optimize_saved_vars_cell_candidate_headvars).
+    % The next four options are used by tupling.m as well; changes to
+    % them may require changes there as well.
 long_option("osv-cvstore-cost",        
optimize_saved_vars_cell_cv_store_cost).
 long_option("osv-cvload-cost",        
optimize_saved_vars_cell_cv_load_cost).
 long_option("osv-fvstore-cost",        
optimize_saved_vars_cell_fv_store_cost).
diff -u compiler/tupling.m compiler/tupling.m
--- compiler/tupling.m    2 Mar 2005 00:15:49 -0000
+++ compiler/tupling.m    3 Mar 2005 12:22:22 -0000
@@ -52,14 +52,14 @@
 % which values are common between the procedures in an SCC (for SCCs with
 % more than one procedure).  i.e. if a variable name occurs as an input
 % argument to more than one procedure in the SCC, those variables
-% corresponding that name are candidates for tupling.  In the interests of
+% corresponding that name are candidates for tupling.  In the interest of
 % speeding up compilation times, the implementation only tries to tuple
 % contiguous runs of the candidate variables.
 % e.g. if the candidates are [A,B,C,D] these combinations would be 
tested in
 % turn, assuming a minimum run length of 3: {A,B,C,D}, {A,B,C}, {B,C,D}
 %
 % To count the average number of loads and stores in a procedure we 
traverse
-% the procedure's goal, remembering which values are available in registers
+% the procedure's body, remembering which values are available in registers
 % and the stack.  When we reach a branch point, we use the relative
 % frequencies that each branch was taken in a sample run to weight the 
costs
 % incurred in each branch.  The relative frequency data is gathered 
from the
@@ -75,6 +75,9 @@
 % This transformation is similar in spirit to the transformation in
 % stack_opt.m.  It also shares much code with it.
 %
+% XXX: we need to check that mprof can demangle the names of the 
transformed
+% procedures correctly
+%
 %-----------------------------------------------------------------------------%
 
 :- module transform_hlds__tupling.
@@ -215,6 +218,7 @@
     % This predicate can be used in place of maybe_tuple_scc to evaluate
     % and transform each procedure of an SCC individually.  This is to
     % mimic the behaviour from an earlier version of this file.
+    % It's currently unused but might be useful for debugging.
     %
 :- pred maybe_tuple_scc_individual_procs(trace_counts::in, 
tuning_params::in,
     dependency_graph::in, list(pred_proc_id)::in,
@@ -276,6 +280,9 @@
     ;
         % No need to work on this SCC if there are no callers to it
         % within this module.
+        %
+        % XXX: if part of the SCC is exported then we might want
+        % to look at it, for intermodule tupling.
         (
             VeryVerbose = yes,
             io__write_string("% SCC has no local callers\n", !IO)
@@ -649,15 +656,6 @@
 
 add_transformed_proc(_, no_tupling, !ModuleInfo, !TransformMap, !Counter).
 
-:- pred nth_member_lookup(list(T)::in, T::in, int::out) is det.
-
-nth_member_lookup(List, Elem, Position) :-
-    ( list__nth_member_search(List, Elem, PositionPrime) ->
-        Position = PositionPrime
-    ;
-        error("nth_member_lookup/3: element not found in list")
-    ).
-
 %-----------------------------------------------------------------------------%
 
 :- pred make_transformed_proc(prog_var::in, prog_vars::in, insert_map::in,
@@ -685,7 +683,7 @@
         VarTypes0, VarTypes1, map__init, RenameMapA, InsertMap,
         MaybeGoalFeature),
 
-    % In some cases some of the fields variables need to be available at
+    % In some cases some of the field variables need to be available at
     % the very beginning of the procedure.    The required deconstructions
     % for those variables won't show up in the insert map.  To handle this
     % we just to insert a deconstruction unification at the start of the
@@ -883,11 +881,11 @@
             TypeInfoLiveness, OptNoReturnCalls),
         goal_path__fill_slots(!.ModuleInfo, !ProcInfo),
         proc_info_goal(!.ProcInfo, Goal0),
-        OptStackAlloc0 = opt_stack_alloc,
+        OptTupleAlloc0 = opt_tuple_alloc,
         set__init(FailVars),
         set__init(NondetLiveness0),
         build_live_sets_in_goal(Goal0, Goal, FailVars,
-            AllocData, OptStackAlloc0, _OptStackAlloc,
+            AllocData, OptTupleAlloc0, _OptTupleAlloc,
             Liveness0, _Liveness,
             NondetLiveness0, _NondetLiveness),
         proc_info_set_goal(Goal, !ProcInfo),
@@ -898,29 +896,29 @@
 
 %-----------------------------------------------------------------------------%
 
-% The opt_stack_alloc structure is constructed by live_vars.m.  As far 
as I can
+% The opt_tuple_alloc structure is constructed by live_vars.m.  As far 
as I can
 % tell we don't need such a thing for this module so we just define 
some stubs.
 
-:- type opt_stack_alloc ---> opt_stack_alloc.
+:- type opt_tuple_alloc ---> opt_tuple_alloc.
 
-:- instance stack_alloc_info(opt_stack_alloc) where [
+:- instance stack_alloc_info(opt_tuple_alloc) where [
     pred(at_call_site/4) is opt_at_call_site,
     pred(at_resume_site/4) is opt_at_resume_site,
     pred(at_par_conj/4) is opt_at_par_conj
 ].
 
 :- pred opt_at_call_site(need_across_call::in, hlds_goal_info::in,
-    opt_stack_alloc::in, opt_stack_alloc::out) is det.
+    opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
 
 opt_at_call_site(_NeedAtCall, _GoalInfo, StackAlloc, StackAlloc).
 
 :- pred opt_at_resume_site(need_in_resume::in, hlds_goal_info::in,
-    opt_stack_alloc::in, opt_stack_alloc::out) is det.
+    opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
 
 opt_at_resume_site(_NeedAtResume, _GoalInfo, StackAlloc, StackAlloc).
 
 :- pred opt_at_par_conj(need_in_par_conj::in, hlds_goal_info::in,
-    opt_stack_alloc::in, opt_stack_alloc::out) is det.
+    opt_tuple_alloc::in, opt_tuple_alloc::out) is det.
 
 opt_at_par_conj(_NeedParConj, _GoalInfo, StackAlloc, StackAlloc).
 
@@ -1084,7 +1082,8 @@
         cls_require_in_regs(CountInfo, [Var1, Var2], !CountState)
     ;
         Unification = complicated_unify(_, _, _),
-        error("count_load_stores_in_goal: complicated_unify")
+        unexpected(this_file,
+            "count_load_stores_in_goal: complicated_unify")
     ).
 
 count_load_stores_in_goal(some(_Vars, _CanRemove, Goal) - _GoalInfo, 
CountInfo,
@@ -1111,7 +1110,8 @@
         cls_require_flushed(CountInfo, LiveVars, !CountState)
     ;
         ResumePoint = no_resume_point,
-        error("count_load_stores_in_goal: no_resume_point for not")
+        unexpected(this_file,
+            "count_load_stores_in_goal: no_resume_point for not")
     ),
     count_load_stores_in_goal(Goal, CountInfo, !CountState).
 
@@ -1250,7 +1250,8 @@
         cls_clobber_regs(Outputs, !CountState)
     ;
         MaybeNeedAcrossCall = no,
-        error("handle_call: no need across call")
+        unexpected(this_file,
+            "count_load_stores_for_call: no need across call")
     ).
 
 %-----------------------------------------------------------------------------%
@@ -1414,9 +1415,10 @@
                 VarsToLoad, !State)
         ;
             % All the variables generated by this deconstruction
-            % can be gotten from the proposed tupling, so it can be
-            % ignored.  The costs of loading those variables from
-            % the tuple will be counted as they come.
+            % can be obtained from the proposed tupling, so the
+            % deconstruction can be ignored.  The costs of loading
+            % those variables from the tuple will be counted as
+            % they come.
             true
         )
     ).
@@ -1524,21 +1526,18 @@
         map__init, set__init, StartMap, EndMap,
         SuccMap, VarsMap, map__init),
     build_interval_info_in_goal(Goal, IntervalInfo0, IntervalInfo,
-        nothing, _).
+        unit, _).
 
     % This is needed only to satisfy the interface of interval.m
     %
-:- type nothing ---> nothing.
-
-:- instance build_interval_info_acc(nothing) where [
+:- instance build_interval_info_acc(unit) where [
     pred(use_cell/8) is tupling__use_cell
 ].
 
 :- pred use_cell(prog_var::in, list(prog_var)::in, cons_id::in, 
hlds_goal::in,
-    interval_info::in, interval_info::out, nothing::in,
-    nothing::out) is det.
+    interval_info::in, interval_info::out, unit::in, unit::out) is det.
 
-use_cell(_CellVar, _FieldVarList, _ConsId, _Goal, !IntervalInfo, !Nothing).
+use_cell(_CellVar, _FieldVarList, _ConsId, _Goal, !IntervalInfo, !Unit).
 
 %-----------------------------------------------------------------------------%
 
@@ -1603,14 +1602,27 @@
 % Fixing calls to transformed procedures.
 %
 
+    % The transform_map structure records which procedures were
+    % transformed into what procedures.
+    %
 :- type transform_map == map(pred_proc_id, transformed_proc).
 
 :- type transformed_proc --->
     transformed_proc(
         transformed_pred_proc_id    :: pred_proc_id,
+            % The pred_proc_id of the transformed version of
+            % the procedure.
         tuple_cons_type            :: (type),
+            % The type of the cell variable created by the
+            % transformation.  This will be a tuple type.
         args_to_tuple            :: list(int),
+            % The argument positions of the original procedure
+            % which were tupled.
         call_template            :: hlds_goal
+            % A template for a call goal that is used to update
+            % calls of the original procedure to the transformed
+            % procedure instead.  The arguments of the template
+            % need to be replaced by the actual arguments.
     ).
 
 :- pred fix_calls_in_procs(transform_map::in, list(pred_proc_id)::in,
@@ -1729,7 +1741,7 @@
 
 fix_calls_in_goal(par_conj(Goals0) - GoalInfo, par_conj(Goals) - GoalInfo,
         !VarSet, !VarTypes, TransformMap) :-
-    % I am not sure whether parallel conjunctions should be treated
+    % XXX: I am not sure whether parallel conjunctions should be treated
     % with fix_calls_in_goal or fix_calls_in_goal_list.  At any rate,
     % this is untested.
     fix_calls_in_goal_list(Goals0, Goals, !VarSet, !VarTypes,
@@ -1753,7 +1765,7 @@
     fix_calls_in_goal(Else0, Else, !VarSet, !VarTypes, TransformMap).
 
 fix_calls_in_goal(shorthand(_) - _, _, !VarSet, !VarTypes, 
_TransformMap) :-
-    error("fix_calls_in_goal: unexpected shorthand").
+    unexpected(this_file, "fix_calls_in_goal: unexpected shorthand").
 
 %-----------------------------------------------------------------------------%
 
only in patch2:
unchanged:
--- doc/user_guide.texi    24 Feb 2005 06:07:11 -0000    1.423
+++ doc/user_guide.texi    3 Mar 2005 12:22:36 -0000
@@ -6926,6 +6926,41 @@
 @c is a tuple or a type with exactly one functor.
 @c Note: this is almost always a pessimization.
 
+ at c @sp 1
+ at c @item --tuple
+ at c @findex --tuple
+ at c Try to find opportunities for procedures to pass some
+ at c arguments to each other as a tuple rather than as
+ at c individual arguments.  It requires the option
+ at c @samp{--tuple-trace-counts-file}, and can be tuned with
+ at c the options @samp{--osv-cvload-cost}, @samp{--osv-cvstore-cost},
+ at c @samp{--osv-fvload-cost}, @samp{--osv-fvstore-cost},
+ at c @samp{--tuple-costs-ratio} and @samp{--tuple-min-args}.
+ at c Note: so far this has mostly a detrimental effect.
+
+ at c @sp 1
+ at c @item --tuple-trace-counts-file
+ at c @findex --tuple-trace-counts-file
+ at c Supply a trace counts summary file for the tupling transformation.
+ at c The summary should be made from a sample run of the program you are
+ at c compiling, compiled without optimizations.
+
+ at c @sp 1
+ at c @item --tuple-costs-ratio
+ at c @findex --tuple-costs-ratio
+ at c A value of 110 for this parameter means the tupling transformation
+ at c will transform a procedure if it thinks that procedure would be
+ at c 10% worse, on average, than whatever transformed version of the
+ at c procedure it has in mind.
+
+ at c @sp 1
+ at c @item --tuple-min-args
+ at c @findex --tuple-min-args
+ at c The minimum number of input arguments that the tupling transformation
+ at c will consider passing together as a tuple.  This is mostly to speed
+ at c up the compilation process by not pursuing (presumably) unfruitful
+ at c searches.
+
 @end table
 
 @node MLDS backend (MLDS -> MLDS) optimization options
only in patch2:
unchanged:
--- library/list.m    10 Feb 2005 04:10:29 -0000    1.131
+++ library/list.m    3 Mar 2005 12:22:36 -0000
@@ -314,6 +314,11 @@
     %
 :- pred list__nth_member_search(list(T)::in, T::in, int::out) is semidet.
 
+    % A deterministic version of list__nth_member_search, which aborts
+    % instead of failing if the element is not found in the list.
+    %
+:- pred list__nth_member_lookup(list(T)::in, T::in, int::out) is det.
+
     % list__index*(List, Position, Elem):
     %    These predicates select an element in a list from it's
     %    position.  The `index0' preds consider the first element
@@ -1025,6 +1030,13 @@
         N = P
     ;
         list__nth_member_search_2(Xs, Y, P + 1, N)
+    ).
+
+nth_member_lookup(List, Elem, Position) :-
+    ( list__nth_member_search(List, Elem, PositionPrime) ->
+        Position = PositionPrime
+    ;
+        error("list__nth_member_lookup/3: element not found in list")
     ).
 
 %-----------------------------------------------------------------------------%

--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list