[m-rev.] diff: Fix costs of recursive calls in seldom-used branches.

Paul Bone pbone at csse.unimelb.edu.au
Fri Jan 21 21:43:25 AEDT 2011


Zoltan and I discussed this change, there's no need for a review.

---

Fix costs of recursive calls in seldom-used branches.

When a recursive call is in a branching goal such that it is executed in some
but not all of the recursive cases we calculate it's per-call cost from the
recursion depth of MaxDepth / Calls.  This can make it a candidate for
parallelisation when only some iterations of a loop (in linear recursion) are
expensive and on average a single iteration is below the threshold for
automatic parallelism.

This is important when considering whether to push goals (such as recursive
calls) into branches to that they're next to goals that are candidates for
parallelisation - when, on average the branching goal is not a candidate for
parallelisation.  When pushing a goal and deciding whether a goal should be
pushed we must also do this calculation.

deep_profiler/analysis_utils.m:
    The recursive call cost for different calls now begins from different
    depths of recursion.  The depth of recursion used is MaxDepth / Calls,
    Which means that a recursive call that is called in all cases still has a
    depth of 1.0

deep_profiler/mdprof_fb.automatic_parallelism.m:
    When pushing a recursive call goal and detecting if it is worth-while to
    push a such a goal, test its per call cost after adjusting it for according
    to the formula above for the new number of calls it will have in this
    context. 

    Fix a module qualification for a call to fold, fold is in set.m not list.m

deep_profiler/measurements.m:
    Add goal_cost_get_total.

Index: deep_profiler/analysis_utils.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/analysis_utils.m,v
retrieving revision 1.9
diff -u -p -b -r1.9 analysis_utils.m
--- deep_profiler/analysis_utils.m	20 Jan 2011 05:35:11 -0000	1.9
+++ deep_profiler/analysis_utils.m	21 Jan 2011 10:33:24 -0000
@@ -261,21 +261,21 @@ build_recursive_call_site_cost_map(Deep,
         RecursionType = rt_not_recursive,
         MaybeRecursiveCallSiteCostMap = ok(map.init)
     ;
-        RecursionType = rt_single(_, _, _AvgMaxDepth, _AvgRecCost, CostFn),
+        RecursionType = rt_single(_, _, MaxDepth, _AvgRecCost, CostFn),
         (
             MaybeDepth = yes(Depth0),
             ( recursion_depth_is_base_case(Depth0) ->
                 MaybeRecursiveCallSiteCostMap = ok(map.init)
             ;
                 % Descend once to move to the depth of the recursive callees.
-                recursion_depth_descend(Depth0, Depth),
-                DepthI = recursion_depth_to_int(Depth),
                 get_recursive_calls_and_counts(Deep, CliquePtr, PDPtr,
                     CallCountsMap),
                 RecursiveCallSiteCostMap = map_values_only(
                     (func(Count) =
                         build_cs_cost_csq_percall(float(Count),
-                            CostFn(DepthI))),
+                                CostFn(DepthI)) :-
+                        DepthI = round_to_int(MaxDepth / float(Count))
+                    ),
                     CallCountsMap),
                 MaybeRecursiveCallSiteCostMap = ok(RecursiveCallSiteCostMap),
 
Index: deep_profiler/mdprof_fb.automatic_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/mdprof_fb.automatic_parallelism.m,v
retrieving revision 1.35
diff -u -p -b -r1.35 mdprof_fb.automatic_parallelism.m
--- deep_profiler/mdprof_fb.automatic_parallelism.m	21 Jan 2011 08:38:45 -0000	1.35
+++ deep_profiler/mdprof_fb.automatic_parallelism.m	21 Jan 2011 10:36:24 -0000
@@ -1017,6 +1017,18 @@ conj_get_conjunctions_worth_parallelisin
         SinglesSoFar1 = Singles
     ;
         Costly = is_not_costly_goal,
+        % This goal might be costly if it is pushed into the cotexted
+        % of one of SinglesSoFar.  This is common for recursive goals.
+        filter(single_context_makes_goal_costly(Info, Conj), SinglesSoFar0,
+            SinglesSoFarMakeConjCostly),
+        (
+            SinglesSoFarMakeConjCostly = []
+        ;
+            SinglesSoFarMakeConjCostly = [_ | _],
+            !:RevSingleCands = [ConjNum - SinglesSoFarMakeConjCostly |
+                !.RevSingleCands]
+        ),
+
         (
             SinglesSoFar0 = [],
             Singles = [],
@@ -1041,6 +1053,16 @@ conj_get_conjunctions_worth_parallelisin
         Conjs0, Conjs, ConjNum + 1, SinglesSoFar1, SinglesSoFar,
         !RevSingleCands, !CandidatesBelow, !Pushes, !MessagesBelow).
 
+:- pred single_context_makes_goal_costly(implicit_parallelism_info::in,
+    pard_goal_detail::in, pard_goal_detail::in) is semidet.
+
+single_context_makes_goal_costly(Info, Goal, Single) :-
+    SingleCost = Single ^ goal_annotation ^ pgd_cost,
+    SingleCount = goal_cost_get_calls(SingleCost),
+    fix_goal_counts(Info, SingleCount, Goal, ConjNewCounts),
+    identify_costly_goal(ConjNewCounts ^ goal_annotation,
+        is_costly_goal).
+
     % Given a conjunction with two or more costly goals (identified by
     % CostlyGoalsIndexes), check whether executing the conjunction in parallel
     % can yield a speedup.
@@ -1202,13 +1224,18 @@ merge_same_level_pushes(MainCandidate, [
     pard_goal_detail::in, list(pard_goal_detail)::in,
     list(goal_path_step)::out, list(pard_goal_detail)::out) is det.
 
-push_goals_create_candidate(_Info, RevCurPathSteps,
-        [], GoalToPushInto, GoalsToPush,
+push_goals_create_candidate(Info, RevCurPathSteps,
+        [], GoalToPushInto, GoalsToPush0,
         RevCandidateGoalPathSteps, CandidateConjs) :-
     RevCandidateGoalPathSteps = RevCurPathSteps,
+    % The pushed goals will have different costs in this context, in particular
+    % the number of times they're called varies, This affects the per-call
+    % costs of recursive calls.
+    Calls = goal_cost_get_calls(GoalToPushInto ^ goal_annotation ^ pgd_cost),
+    map(fix_goal_counts(Info, Calls), GoalsToPush0, GoalsToPush),
     CandidateConjs = [GoalToPushInto | GoalsToPush].
 push_goals_create_candidate(Info, RevCurPathSteps,
-        [HeadRelStep | TailRelSteps], GoalToPushInto, GoalsToPush,
+        [HeadRelStep | TailRelSteps], GoalToPushInto, GoalsToPush0,
         RevCandidateGoalPathSteps, CandidateConjs) :-
     GoalToPushInto = goal_rep(GoalExpr, _, _),
     (
@@ -1219,6 +1246,12 @@ push_goals_create_candidate(Info, RevCur
                 % Conjoin GoalsToPush not with just the expensive goal,
                 % but with the whole conjunction containing it.
                 RevCandidateGoalPathSteps = RevCurPathSteps,
+                % The pushed goals will have different costs in this context,
+                % in particular the number of times they're called varies, This
+                % affects the per-call costs of recursive calls.
+                Cost = GoalToPushInto ^ goal_annotation ^ pgd_cost,
+                Calls = goal_cost_get_calls(Cost),
+                map(fix_goal_counts(Info, Calls), GoalsToPush0, GoalsToPush),
                 CandidateConjs = Goals ++ GoalsToPush
             ;
                 TailRelSteps = [_ | _],
@@ -1226,7 +1259,7 @@ push_goals_create_candidate(Info, RevCur
                 ( Tail = [SubGoal] ->
                     push_goals_create_candidate(Info,
                         [HeadRelStep | RevCurPathSteps],
-                        TailRelSteps, SubGoal, GoalsToPush,
+                        TailRelSteps, SubGoal, GoalsToPush0,
                         RevCandidateGoalPathSteps, CandidateConjs)
                 ;
                     unexpected($module, $pred, "push into non-last conjunct")
@@ -1240,7 +1273,7 @@ push_goals_create_candidate(Info, RevCur
         ( GoalExpr = disj_rep(Goals) ->
             list.index1_det(Goals, N, SubGoal),
             push_goals_create_candidate(Info, [HeadRelStep | RevCurPathSteps],
-                TailRelSteps, SubGoal, GoalsToPush,
+                TailRelSteps, SubGoal, GoalsToPush0,
                 RevCandidateGoalPathSteps, CandidateConjs)
         ;
             unexpected($module, $pred, "not disj")
@@ -1251,7 +1284,7 @@ push_goals_create_candidate(Info, RevCur
             list.index1_det(Cases, N, Case),
             Case = case_rep(_, _, SubGoal),
             push_goals_create_candidate(Info, [HeadRelStep | RevCurPathSteps],
-                TailRelSteps, SubGoal, GoalsToPush,
+                TailRelSteps, SubGoal, GoalsToPush0,
                 RevCandidateGoalPathSteps, CandidateConjs)
         ;
             unexpected($module, $pred, "not switch")
@@ -1260,7 +1293,7 @@ push_goals_create_candidate(Info, RevCur
         HeadRelStep = step_ite_then,
         ( GoalExpr = ite_rep(_, Then, _) ->
             push_goals_create_candidate(Info, [HeadRelStep | RevCurPathSteps],
-                TailRelSteps, Then, GoalsToPush,
+                TailRelSteps, Then, GoalsToPush0,
                 RevCandidateGoalPathSteps, CandidateConjs)
         ;
             unexpected($module, $pred, "not ite_then")
@@ -1269,7 +1302,7 @@ push_goals_create_candidate(Info, RevCur
         HeadRelStep = step_ite_else,
         ( GoalExpr = ite_rep(_, _, Else) ->
             push_goals_create_candidate(Info, [HeadRelStep | RevCurPathSteps],
-                TailRelSteps, Else, GoalsToPush,
+                TailRelSteps, Else, GoalsToPush0,
                 RevCandidateGoalPathSteps, CandidateConjs)
         ;
             unexpected($module, $pred, "not ite_else")
@@ -1286,7 +1319,7 @@ push_goals_create_candidate(Info, RevCur
         HeadRelStep = step_scope(_),
         ( GoalExpr = scope_rep(SubGoal, _) ->
             push_goals_create_candidate(Info, [HeadRelStep | RevCurPathSteps],
-                TailRelSteps, SubGoal, GoalsToPush,
+                TailRelSteps, SubGoal, GoalsToPush0,
                 RevCandidateGoalPathSteps, CandidateConjs)
         ;
             unexpected($module, $pred, "not scope")
@@ -1309,6 +1342,36 @@ push_goals_create_candidate(Info, RevCur
         unexpected($module, $pred, "atomic_orelse")
     ).
 
+:- pred fix_goal_counts(implicit_parallelism_info::in, int::in,
+    pard_goal_detail::in, pard_goal_detail::out) is det.
+
+fix_goal_counts(Info, Count, !Goal) :-
+    Annotation0 = !.Goal ^ goal_annotation,
+    Cost0 = Annotation0 ^ pgd_cost,
+    (
+        GoalType = Annotation0 ^ pgd_pg_type,
+        GoalType = pgt_call(_, CostAndCallees),
+        % XXX This doesn't work if this is a non-atomic goal containing a
+        % recursive call.
+        member(Callee, CostAndCallees ^ cac_callees),
+        % The call is recursive if it calls into the current clique.
+        Info ^ ipi_clique = Callee ^ c_clique
+    ->
+        % for recursive calls.
+        CostTotal = goal_cost_get_total(Cost0),
+        PercallCost = CostTotal / float(Count)
+    ;
+        PercallCost = goal_cost_get_percall(Cost0)
+    ),
+    Cost = call_goal_cost(Count, PercallCost),
+    !Goal ^ goal_annotation ^ pgd_cost := Cost,
+    ( goal_cost_above_par_threshold(Info, Cost) ->
+        AboveThreshold = cost_above_par_threshold
+    ;
+        AboveThreshold = cost_not_above_par_threshold
+    ),
+    !Goal ^ goal_annotation ^ pgd_cost_above_threshold := AboveThreshold.
+
 :- pred pardgoals_build_candidate_conjunction(implicit_parallelism_info::in,
     program_location::in, list(goal_path_step)::in,
     maybe(push_goal)::in, list(pard_goal_detail)::in,
@@ -2535,7 +2598,7 @@ build_dependency_graph([PG | PGs], ConjN
     InsertForConjNum = ( pred(InstVar::in, MapI0::in, MapI::out) is det :-
         svmap.det_insert(InstVar, ConjNum, MapI0, MapI)
     ),
-    list.fold(InsertForConjNum, InstVars, !VarDepMap),
+    set.fold(InsertForConjNum, InstVars, !VarDepMap),
 
     build_dependency_graph(PGs, ConjNum + 1, !VarDepMap, !Graph).
 
Index: deep_profiler/measurements.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/measurements.m,v
retrieving revision 1.26
diff -u -p -b -r1.26 measurements.m
--- deep_profiler/measurements.m	13 Jan 2011 04:44:28 -0000	1.26
+++ deep_profiler/measurements.m	21 Jan 2011 10:33:24 -0000
@@ -185,6 +185,8 @@
 
 :- func goal_cost_get_percall(goal_cost_csq) = float.
 
+:- func goal_cost_get_total(goal_cost_csq) = float.
+
 :- func goal_cost_get_calls(goal_cost_csq) = int.
 
 %-----------------------------------------------------------------------------%
@@ -837,6 +839,11 @@ goal_cost_get_percall(non_trivial_goal(C
         cost_get_percall(float(Calls), Cost)
     ).
 
+goal_cost_get_total(dead_goal) = 0.0.
+goal_cost_get_total(trivial_goal(_)) = 0.0.
+goal_cost_get_total(non_trivial_goal(Cost, Calls)) =
+    cost_get_total(float(Calls), Cost).
+
 goal_cost_get_calls(dead_goal) = 0.
 goal_cost_get_calls(trivial_goal(Calls)) = Calls.
 goal_cost_get_calls(non_trivial_goal(_, Calls)) = Calls.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 489 bytes
Desc: Digital signature
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20110121/bf81647f/attachment.sig>


More information about the reviews mailing list