[m-rev.] diff in progress: branch and bound

Zoltan Somogyi zs at csse.unimelb.edu.au
Fri Sep 16 18:33:36 AEST 2011
Previous message: [m-rev.] diff: add --output-c-compiler-type
Next message: [m-rev.] diff in progress: branch and bound
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Simplify the branch-and-bound search.

This diff temporarily disables the code that adjusts the edges of the middle
region, which is why it won't be committed by itself. It will be committed
together with later changes that will simplify and (I expect) speed up
the calculation of costs by making the calculation incremental where possible.

Zoltan.

deep_profiler/autopar_types.m:
	Do not include the structure containing all the static information
	about the current autoparallelisation task in incomplete
	parallelisations. All the places where we need the static information
	either already have it or can be trivially given it.

	Comment out a field of a data structure that looks to be vital,
	but is not currently used.

	Put functions that operate on a data structure next to that data
	structure.

deep_profiler/autopar_calc_overlap.m:
	Conform to the changes in autopar_types.m.

deep_profiler/autopar_find_best_par.m:
	Implement the branch-and-bound search directly, exploring alternative
	branches in two conjuncts, not two disjuncts. Since passing information
	gathered from the exploration of one conjunct to the next conjunct
	is trivial, while the same is far from true for disjuncts, this greatly
	simplifies the code.

	Change the statistics we gather: stop collecting information (open vs
	closed branches) that does not tell us anything meaningful, and start
	collection information (tests on incomplete branches that led or did
	not lead to pruning) that IS useful.

	Rename some data types to better reflect their meaning.
	Avoid unnecessary conversions from incomplete to complete
	parallelisations.

	Give some predicates more meaningful names.

	Put related predicates in this module next to each other.

deep_profiler/autopar_search_goals.m:
	Conform to the changes in autopar_find_best_par.m.

deep_profiler/branch_and_bound.m:
	This module is left unchanged, but it is not used or needed anymore.

cvs diff: Diffing .
Index: autopar_calc_overlap.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/deep_profiler/autopar_calc_overlap.m,v
retrieving revision 1.3
diff -u -b -r1.3 autopar_calc_overlap.m
--- autopar_calc_overlap.m	3 May 2011 04:35:00 -0000	1.3
+++ autopar_calc_overlap.m	16 Sep 2011 06:15:53 -0000
@@ -19,15 +19,16 @@
 
 :- import_module mdprof_fb.automatic_parallelism.autopar_types.
 
-    % calculate_parallel_cost(Info, !Parallelisation).
+    % calculate_parallel_cost(Info, !Parallelisation, CostData).
     %
     % Analyse the parallel conjuncts and determine their likely performance.
     %
     % This is the new parallel execution overlap algorithm, it is general and
     % therefore we also use it for independent conjunctions.
     %
-:- pred calculate_parallel_cost(parallelisation_cost_data::out,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is det.
+:- pred calculate_parallel_cost(implicit_parallelism_info::in,
+    incomplete_parallelisation::in, incomplete_parallelisation::out,
+    parallelisation_cost_data::out) is det.
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
@@ -57,7 +58,7 @@
 
 %----------------------------------------------------------------------------%
 
-calculate_parallel_cost(CostData, !Parallelisation) :-
+calculate_parallel_cost(Info, !Parallelisation, CostData) :-
     ParConj = ip_get_par_conjs(!.Parallelisation),
     NumCalls = !.Parallelisation ^ ip_num_calls,
 
@@ -70,7 +71,6 @@
         (func(P0, MaybeCost) = P0 ^ ip_maybe_goals_after_cost := MaybeCost),
         ip_get_goals_after, CostAfterPercall, NumCalls, !Parallelisation),
 
-    Info = !.Parallelisation ^ ip_info,
     Opts = Info ^ ipi_opts,
     SparkCost = Opts ^ cpcp_sparking_cost,
     SparkDelay = Opts ^ cpcp_sparking_delay,
@@ -185,7 +185,7 @@
             calculate_dependent_parallel_cost_2(Info, !.ProductionsMap),
             ConsumptionsAndProductionsList, 0.0, LastSeqConsumeTime,
             StartTime, LastParConsumeTime, StartTime, LastResumeTime,
-            [], RevExecution0, map.init, ConsumptionsMap),
+            [], RevExecution0, map.init, _ConsumptionsMap),
 
         % Calculate the point at which this conjunct finishes execution
         % and complete the RevExecutions structure..
@@ -206,7 +206,7 @@
 
         CostBPar = CostB + SparkCost,
         Execution = [StartTime - (StartTime + CostB)],
-        ConsumptionsMap = init,
+        % ConsumptionsMap = init,
         CostSignals = 0.0,
         CostWaits = 0.0,
         DeadTime = 0.0
@@ -226,9 +226,10 @@
     list.foldl3(get_productions_map(RightProducedVars), Conjunct, StartTime, _,
         Execution, _, !ProductionsMap),
 
-    DepConjExec = dependent_conjunct_execution(Execution,
-        !.ProductionsMap, ConsumptionsMap),
-    !:Overlap = peo_conjunction(!.Overlap, DepConjExec, Vars),
+    % XXX Currently unused.
+    % DepConjExec = dependent_conjunct_execution(Execution,
+    %     !.ProductionsMap, ConsumptionsMap),
+    !:Overlap = peo_conjunction(!.Overlap, /* DepConjExec, */ Vars),
     !:ConjNum = !.ConjNum + 1.
 
     % calculate_dependent_parallel_cost_2(Info, ProductionsMap,
Index: autopar_find_best_par.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/deep_profiler/autopar_find_best_par.m,v
retrieving revision 1.5
diff -u -b -r1.5 autopar_find_best_par.m
--- autopar_find_best_par.m	6 May 2011 05:03:27 -0000	1.5
+++ autopar_find_best_par.m	16 Sep 2011 06:15:05 -0000
@@ -27,18 +27,18 @@
 :- import_module list.
 :- import_module maybe.
 
-:- type best_parallelisation
-    --->    bp_parallel_execution(
-                bp_goals_before         :: list(pard_goal_detail),
-                bp_par_conjs            :: list(seq_conj(pard_goal_detail)),
-                bp_goals_after          :: list(pard_goal_detail),
-                bp_is_dependent         :: conjuncts_are_dependent,
-                bp_par_exec_metrics     :: parallel_exec_metrics
+:- type full_parallelisation
+    --->    fp_parallel_execution(
+                fp_goals_before         :: list(pard_goal_detail),
+                fp_par_conjs            :: list(seq_conj(pard_goal_detail)),
+                fp_goals_after          :: list(pard_goal_detail),
+                fp_is_dependent         :: conjuncts_are_dependent,
+                fp_par_exec_metrics     :: parallel_exec_metrics
             ).
 
 :- pred find_best_parallelisation(implicit_parallelism_info::in,
     program_location::in, list(pard_goal_detail)::in,
-    maybe(best_parallelisation)::out,
+    maybe(full_parallelisation)::out,
     cord(message)::in, cord(message)::out) is det.
 
 %-----------------------------------------------------------------------------%
@@ -46,18 +46,20 @@
 
 :- implementation.
 
-:- import_module branch_and_bound.
+% XXX :- import_module branch_and_bound.
 :- import_module mdbcomp.program_representation.
 :- import_module mdprof_fb.automatic_parallelism.autopar_calc_overlap.
 :- import_module mdprof_fb.automatic_parallelism.autopar_search_goals. % XXX
 :- import_module measurements.
 
 :- import_module array.
+:- import_module benchmarking.
 :- import_module digraph.
 :- import_module float.
 :- import_module io.
 :- import_module int.
 :- import_module map.
+:- import_module pair.
 :- import_module require.
 :- import_module set.
 :- import_module string.
@@ -208,7 +210,7 @@
     identify_costly_goals(Goals, 0, CostlyGoalsIndexes),
     (
         CostlyGoalsIndexes = [FirstCostlyGoalIndexPrime | OtherIndexes],
-        last(OtherIndexes, LastCostlyGoalIndexPrime)
+        list.last(OtherIndexes, LastCostlyGoalIndexPrime)
     ->
         FirstCostlyGoalIndex = FirstCostlyGoalIndexPrime,
         LastCostlyGoalIndex = LastCostlyGoalIndexPrime
@@ -224,7 +226,7 @@
         ; Detism = cc_multidet_rep
         ),
 
-        % Phase 3: Process the middle section into groups.
+        % Phase 4: Process the middle section into groups.
         foldl_sub_array(preprocess_conjunction_into_groups, GoalsArray,
             FirstCostlyGoalIndex, LastCostlyGoalIndex, [], RevGoalGroups),
         list.reverse(RevGoalGroups, GoalGroups),
@@ -269,10 +271,10 @@
     InstMapInfo = PG ^ goal_annotation ^ pgd_inst_map_info,
 
     % For each variable consumed by a goal we find out which goals instantiate
-    % that variable and add them as it's dependencies along with their
+    % that variable and add them as its dependencies along with their
     % dependencies.  NOTE: We only consider variables that are read
-    % and not those that are set.  This is safe because we only bother
-    % analysing single assignment code.
+    % and not those that are set.  This is safe because we only analyse
+    % single assignment code.
     RefedVars = InstMapInfo ^ im_consumed_vars,
     digraph.add_vertex(ConjNum, ThisConjKey, !Graph),
     MaybeAddEdge = ( pred(RefedVar::in, GraphI0::in, GraphI::out) is det :-
@@ -286,9 +288,9 @@
     ),
     list.foldl(MaybeAddEdge, set.to_sorted_list(RefedVars), !Graph),
 
-    % For each variable instantiated by this goal add it to the VarDepMap with
-    % this goal as it's instantiator.  That is a maping from the variable to
-    % the conj num.
+    % For each variable instantiated by this goal, add it to the VarDepMap
+    % with this goal as its instantiator. That is a mapping from the variable
+    % to the conj num.
     InstVars = InstMapInfo ^ im_bound_vars,
     InsertForConjNum = ( pred(InstVar::in, MapI0::in, MapI::out) is det :-
         map.det_insert(InstVar, ConjNum, MapI0, MapI)
@@ -381,7 +383,7 @@
     %
 :- pred find_best_parallelisation_complete_bnb(implicit_parallelism_info::in,
     program_location::in, best_par_algorithm_simple::in,
-    goals_for_parallelisation::in, maybe(best_parallelisation)::out) is det.
+    goals_for_parallelisation::in, maybe(full_parallelisation)::out) is det.
 
 find_best_parallelisation_complete_bnb(Info, Location, Algorithm,
         PreprocessedGoals, MaybeBestParallelisation) :-
@@ -389,33 +391,40 @@
         io.format("D: Find best parallelisation for:\n%s\n",
             [s(LocationString)], !IO),
         location_to_string(1, Location, LocationCord),
-        string.append_list(list(LocationCord), LocationString),
+        string.append_list(cord.list(LocationCord), LocationString),
         NumGoals = size(PreprocessedGoals ^ gfp_goals),
         NumGoalsBefore = PreprocessedGoals ^ gfp_first_costly_goal,
-        NumGoalsAfter = NumGoals - PreprocessedGoals ^ gfp_last_costly_goal - 1,
+        NumGoalsAfter = NumGoals -
+            PreprocessedGoals ^ gfp_last_costly_goal - 1,
         NumGoalsMiddle = NumGoals - NumGoalsBefore - NumGoalsAfter,
         io.format("D: Problem size (before, middle, after): %d,%d,%d\n",
             [i(NumGoalsBefore), i(NumGoalsMiddle), i(NumGoalsAfter)], !IO),
         io.flush_output(!IO)
     ),
 
-    branch_and_bound(
-        generate_parallelisations(Info, Algorithm, PreprocessedGoals),
-        parallelisation_get_objective_value,
-        Solutions, Profile),
+    promise_equivalent_solutions [GenParTime, EqualBestSolns, Profile] (
+        benchmark_det(
+            generate_parallelisations(Info, Algorithm),
+            PreprocessedGoals, EqualBestSolns - Profile, 1, GenParTime)
+    ),
 
     trace [compile_time(flag("debug_branch_and_bound")), io(!IO)] (
         io.format("D: Solutions: %d\n",
-            [i(set.count(Solutions))], !IO),
-        io.format("D: Branch and bound profile: %s\n\n",
+            [i(list.length(EqualBestSolns))], !IO),
+        io.format("D: Branch and bound profile: %s\n",
             [s(string(Profile))], !IO),
+        io.format("D: Time: %d ms\n\n",
+            [i(GenParTime)], !IO),
         io.flush_output(!IO)
     ),
 
-    ( set.remove_least(BestParallelisation, Solutions, _) ->
+    (
+        EqualBestSolns = [BestIncompleteParallelisation | _],
+        finalise_parallelisation(BestIncompleteParallelisation,
+            BestParallelisation),
         MaybeBestParallelisation = yes(BestParallelisation)
     ;
-        % Solutions is empty.
+        EqualBestSolns = [],
         ParalleliseDepConjs = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs,
         (
             ParalleliseDepConjs = parallelise_dep_conjs(_),
@@ -424,307 +433,339 @@
             ParalleliseDepConjs = do_not_parallelise_dep_conjs,
             % Try again to get the best dependent parallelisation.
             % This is used for guided parallelisation.
-            TempInfo = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs :=
+            UpdatedInfo = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs :=
                 parallelise_dep_conjs(estimate_speedup_by_overlap),
-            find_best_parallelisation_complete_bnb(TempInfo, Location,
+            find_best_parallelisation_complete_bnb(UpdatedInfo, Location,
                 Algorithm, PreprocessedGoals, MaybeBestParallelisation)
         )
     ).
 
-    % The objective function for the branch and bound search.
-    % This is ParTime + ParOverheads * 2.  That is we are willing to pay
-    % 1 unit of parallel overheads to get a 2 unit improvement
-    % of parallel execution time.
+    % Profiling information for an execution of the solver.
     %
-:- func parallelisation_get_objective_value(best_parallelisation) = float.
-
-parallelisation_get_objective_value(Parallelisation) = Value :-
-    Metrics = Parallelisation ^ bp_par_exec_metrics,
-    Value = Metrics ^ pem_par_time +
-        parallel_exec_metrics_get_overheads(Metrics) * 2.0.
+:- type bnb_profile
+    --->    bnb_profile(
+                bnbp_incomplete_good_enough         :: int,
+                bnbp_incomplete_not_good_enough     :: int,
+                bnbp_complete_best_solution         :: int,
+                bnbp_complete_equal_solution        :: int,
+                bnbp_complete_worse_solution        :: int,
+                bnbp_complete_non_solution          :: int
+            ).
+
+    % The equal best solutions found so far (if we have found some solutions),
+    % and the value of the objective function for these solutions.
+    % The objective function represents a cost, so we look for solutions
+    % with the smallest possible value of the objective function.
+    %
+:- type best_solutions(T)
+    --->    no_best_solutions
+    ;       best_solutions(
+                bs_solutions            :: list(T),
+                bs_objective_value      :: float
+            ).
 
-:- impure pred generate_parallelisations(implicit_parallelism_info::in,
+:- pred generate_parallelisations(implicit_parallelism_info::in,
     best_par_algorithm_simple::in, goals_for_parallelisation::in,
-    bnb_state(best_parallelisation)::in, best_parallelisation::out) is nondet.
+    pair(list(incomplete_parallelisation), bnb_profile)::out) is det.
 
 generate_parallelisations(Info, Algorithm, GoalsForParallelisation,
-        BNBState, BestParallelisation) :-
-    some [!Parallelisation, !GoalGroups] (
-        start_building_parallelisation(Info, GoalsForParallelisation,
-            !:Parallelisation),
+        EqualBestSolns - FinalProfile) :-
+    some [!GoalGroups, !MaybeBestSolns, !Profile] (
+        start_building_parallelisation(GoalsForParallelisation,
+            IncompleteParallelisation0),
+
+        % Set the last scheduled goal to the goal at the end of the first
+        % group, popping the first group off the list. This initialises the
+        % parallelisation with the first goal group occurring first in the
+        % first parallel conjunction.
+        %
+        % We do this outside of the loop below because the first goal group
+        % will always be added to the first (initially empty) parallel
+        % conjunct; it does not make sense to have it start a new parallel
+        % conjunct.
 
         !:GoalGroups = GoalsForParallelisation ^ gfp_groups,
-        start_first_par_conjunct(!GoalGroups, !Parallelisation),
-        impure generate_parallelisations_body(Info, BNBState, Algorithm,
-            !.GoalGroups, !Parallelisation),
-
-        ( semipure should_expand_search(BNBState, Algorithm) ->
-            % Try to push goals into the first and last parallel conjuncts
-            % from outside the parallel conjunction.
-            semipure add_goals_into_first_par_conj(BNBState, !Parallelisation),
-            semipure add_goals_into_last_par_conj(BNBState, !Parallelisation)
+        (
+            !.GoalGroups = [],
+            unexpected($module, $pred, "no goal groups")
         ;
-            true
-        ),
-
-        finalise_parallelisation(!.Parallelisation, BestParallelisation)
+            !.GoalGroups = [Group | !:GoalGroups],
+            gg_get_details(Group, Index, Num, _),
+            LastScheduledGoal = Index + Num - 1,
+            IncompleteParallelisation1 =
+                IncompleteParallelisation0 ^ ip_last_scheduled_goal
+                    := LastScheduledGoal
     ),
-    semipure test_incomplete_solution(BNBState, BestParallelisation).
-
-:- pred start_building_parallelisation(implicit_parallelism_info::in,
-    goals_for_parallelisation::in,
-    incomplete_parallelisation::out) is det.
 
-start_building_parallelisation(Info, PreprocessedGoals, Parallelisation) :-
-    GoalsArray = PreprocessedGoals ^ gfp_goals,
-    FirstParGoal = PreprocessedGoals ^ gfp_first_costly_goal,
-    LastParGoal = PreprocessedGoals ^ gfp_last_costly_goal,
-    NumCalls = PreprocessedGoals ^ gfp_num_calls,
-    DependencyGraphs = PreprocessedGoals ^ gfp_dependency_graphs,
-    Parallelisation = incomplete_parallelisation(Info, GoalsArray,
-        FirstParGoal, LastParGoal, FirstParGoal, [], NumCalls,
-        DependencyGraphs, no, no, no).
+        !:MaybeBestSolns = no_best_solutions,
+        !:Profile = bnb_profile(0, 0, 0, 0, 0, 0),
 
-    % Finalise the parallelisation.
-    %
-:- pred finalise_parallelisation(incomplete_parallelisation::in,
-    best_parallelisation::out) is det.
+        generate_parallelisations_loop(Info, Algorithm, !.GoalGroups,
+            IncompleteParallelisation1, !MaybeBestSolns, !Profile),
 
-finalise_parallelisation(Incomplete, Best) :-
-    GoalsBefore = ip_get_goals_before(Incomplete),
-    GoalsAfter = ip_get_goals_after(Incomplete),
-
-    MaybeCostData = Incomplete ^ ip_maybe_par_cost_data,
     (
-        MaybeCostData = yes(CostData)
+            !.MaybeBestSolns = no_best_solutions,
+            EqualBestSolns = []
     ;
-        MaybeCostData = no,
-        unexpected($module, $pred, "parallelisation has no cost data")
+            !.MaybeBestSolns = best_solutions(EqualBestSolns, _)
     ),
-    CostData = parallelisation_cost_data(_, Overlap, Metrics0, _),
 
-    Metrics = finalise_parallel_exec_metrics(Metrics0),
-    par_conj_overlap_is_dependent(Overlap, IsDependent),
-    ParConjs = ip_get_par_conjs(Incomplete),
-    Best = bp_parallel_execution(GoalsBefore, ParConjs,
-        GoalsAfter, IsDependent, Metrics).
-
-%----------------------------------------------------------------------------%
+%       ( semipure should_expand_search(BNBState, Algorithm) ->
+%           % Try to push goals into the first and last parallel conjuncts
+%           % from outside the parallel conjunction.
+%           semipure add_goals_into_first_par_conj(BNBState, !Parallelisation),
+%           semipure add_goals_into_last_par_conj(BNBState, !Parallelisation)
+%       ;
+%           true
+%       ),
 
-:- semipure pred add_goals_into_first_par_conj(
-    bnb_state(best_parallelisation)::in,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is multi.
-
-add_goals_into_first_par_conj(BNBState, !Parallelisation) :-
-    FirstGoal0 = !.Parallelisation ^ ip_first_par_goal,
-    (
-        FirstGoal0 > 0,
-        Goals = !.Parallelisation ^ ip_goals,
-        Goal = lookup(Goals, FirstGoal0 - 1),
-        can_parallelise_goal(Goal),
-
-        % There are goals before the parallel conjunction that can be included
-        % in the parallel conjunction.
-        add_one_goal_into_first_par_conj(!Parallelisation),
-        semipure test_parallelisation(BNBState, !Parallelisation),
-        semipure add_goals_into_first_par_conj(BNBState, !Parallelisation)
-    ;
-        true
+        FinalProfile = !.Profile
     ).
 
-:- semipure pred add_goals_into_last_par_conj(
-    bnb_state(best_parallelisation)::in,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is multi.
-
-add_goals_into_last_par_conj(BNBState, !Parallelisation) :-
-    NumGoals = ip_get_num_goals(!.Parallelisation),
-    LastParGoal = !.Parallelisation ^ ip_last_par_goal,
-    (
-        LastParGoal < NumGoals - 1,
-        Goals = !.Parallelisation ^ ip_goals,
-        Goal = lookup(Goals, LastParGoal + 1),
-        can_parallelise_goal(Goal),
-
-        % Try to move a goal from after the parallelisation into the
-        % parallelisation.
-        add_one_goal_into_last_par_conj(!Parallelisation),
-        semipure test_parallelisation(BNBState, !Parallelisation),
-        semipure add_goals_into_last_par_conj(BNBState, !Parallelisation)
-    ;
-        true
-    ).
+:- pred generate_parallelisations_loop(implicit_parallelism_info::in,
+    best_par_algorithm_simple::in, list(goal_group(goal_classification))::in,
+    incomplete_parallelisation::in,
+    best_solutions(incomplete_parallelisation)::in,
+    best_solutions(incomplete_parallelisation)::out,
+    bnb_profile::in, bnb_profile::out) is det.
 
-    % Set the last scheduled goal to the goal at the end of the first group,
-    % popping the first group off the list. This initialises the
-    % parallelisation with the first goal group occurring first in the first
-    % parallel conjunction.
-    %
-    % This is done outside of the loop below since the first goal group will
-    % always be added to the first (initially empty) parallel conjunction.
-    %
-:- pred start_first_par_conjunct(
-    list(goal_group(T))::in, list(goal_group(T))::out,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is det.
-
-start_first_par_conjunct(!GoalGroups, !Parallelisation) :-
-    (
-        !.GoalGroups = [],
-        unexpected($module, $pred, "no goal groups")
+generate_parallelisations_loop(_, _, [],
+        !.IncompleteParallelisation, !MaybeBestSolns, !Profile) :-
+    % Verify that we have generated at least two parallel conjuncts.
+    ( ip_get_num_parallel_conjuncts(!.IncompleteParallelisation) >= 2 ->
+        maybe_update_best_complete_parallelisation(!.IncompleteParallelisation,
+            !MaybeBestSolns, !Profile)
     ;
-        !.GoalGroups = [Group | !:GoalGroups],
-        gg_get_details(Group, Index, Num, _),
-        LastScheduledGoal = Index + Num - 1,
-        !Parallelisation ^ ip_last_scheduled_goal := LastScheduledGoal
+        % This is not a solution, so do not try to update !MaybeBestSolns.
+        !Profile ^ bnbp_complete_non_solution :=
+            !.Profile ^ bnbp_complete_non_solution + 1
     ).
-
-:- impure pred generate_parallelisations_body(implicit_parallelism_info::in,
-    bnb_state(best_parallelisation)::in, best_par_algorithm_simple::in,
-    list(goal_group(goal_classification))::in,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is nondet.
-
-generate_parallelisations_body(_, _, _, [], !Parallelisation) :-
-    % Verify that we've generated at least two parallel conjuncts.
-    ip_get_num_parallel_conjuncts(!.Parallelisation) >= 2.
-generate_parallelisations_body(Info, BNBState, Algorithm,
-        [GoalGroup | GoalGroups], !Parallelisation) :-
-    LastScheduledGoal0 = !.Parallelisation ^ ip_last_scheduled_goal,
+generate_parallelisations_loop(Info, Algorithm, [GoalGroup | GoalGroups],
+        !.IncompleteParallelisation, !MaybeBestSolns, !Profile) :-
+    LastScheduledGoal0 = !.IncompleteParallelisation ^ ip_last_scheduled_goal,
     gg_get_details(GoalGroup, _Index, Num, _Classification),
 
     LastScheduledGoal = LastScheduledGoal0 + Num,
     some [!AddToLastParallelisation, !AddToNewParallelisation] (
-        !:AddToLastParallelisation = !.Parallelisation,
-        !:AddToNewParallelisation = !.Parallelisation,
+        !:AddToLastParallelisation = !.IncompleteParallelisation,
+        !:AddToNewParallelisation = !.IncompleteParallelisation,
 
         % Consider adding this goal to the last parallel conjunct.
         !AddToLastParallelisation ^ ip_last_scheduled_goal
             := LastScheduledGoal,
-        score_parallelisation(BNBState, MaybeAddToLastScore,
-            !AddToLastParallelisation),
+        update_incomplete_parallelisation_cost(Info, !AddToLastParallelisation,
+            MaybeAddToLastCost), 
 
         % Consider putting this goal into a new parallel conjunct.
-        ParConjsRevLastGoal0 = !.Parallelisation ^ ip_par_conjs_rev_last_goal,
+        ParConjsRevLastGoal0 =
+            !.IncompleteParallelisation ^ ip_par_conjs_rev_last_goal,
         ParConjsRevLastGoal = [LastScheduledGoal0 | ParConjsRevLastGoal0],
         !AddToNewParallelisation ^ ip_par_conjs_rev_last_goal :=
             ParConjsRevLastGoal,
         !AddToNewParallelisation ^ ip_last_scheduled_goal := LastScheduledGoal,
-        score_parallelisation(BNBState, MaybeAddToNewScore,
-            !AddToNewParallelisation),
+        update_incomplete_parallelisation_cost(Info, !AddToNewParallelisation,
+            MaybeAddToNewCost), 
 
         (
-            MaybeAddToLastScore = yes(AddToLastScore),
-            (
-                MaybeAddToNewScore = yes(AddToNewScore),
+            MaybeAddToLastCost = yes(AddToLastCost),
                 (
-                    % Smaller scores are better.
-                    AddToNewScore > AddToLastScore
-                ->
+                MaybeAddToNewCost = yes(AddToNewCost),
+                ( AddToNewCost > AddToLastCost ->
                     % Adding to the last parallel conjunct is better.
-                    BestOption = !.AddToLastParallelisation,
-                    MaybeSndBestOption = yes(!.AddToNewParallelisation)
+                    Best0 = !.AddToLastParallelisation,
+                    MaybeNextBest0 = yes(!.AddToNewParallelisation)
                 ;
                     % Adding to a new parallel conjunct is better.
-                    BestOption = !.AddToNewParallelisation,
-                    MaybeSndBestOption = yes(!.AddToLastParallelisation)
+                    Best0 = !.AddToNewParallelisation,
+                    MaybeNextBest0 = yes(!.AddToLastParallelisation)
                 )
             ;
-                MaybeAddToNewScore = no,
+                MaybeAddToNewCost = no,
                 % Adding to the last parallel conjunct is the only option.
-                BestOption = !.AddToLastParallelisation,
-                MaybeSndBestOption = no
+                Best0 = !.AddToLastParallelisation,
+                MaybeNextBest0 = no
             )
         ;
-            MaybeAddToLastScore = no,
+            MaybeAddToLastCost = no,
             % Adding to a new parallel conjunct is the only option.
-            BestOption = !.AddToNewParallelisation,
-            MaybeSndBestOption = no
+            Best0 = !.AddToNewParallelisation,
+            MaybeNextBest0 = no
         )
     ),
 
     (
-        MaybeSndBestOption = no,
-        !:Parallelisation = BestOption
-    ;
-        MaybeSndBestOption = yes(SndBestOption),
-        (
-            % Should an alternative branch be created here?
-            semipure should_expand_search(BNBState, Algorithm)
+        % Can we create an alternative branch here?
+        MaybeNextBest0 = yes(NextBest0),
+        % Should we create an alternative branch here?
+        should_expand_search(Algorithm, !.Profile)
         ->
             % Create a branch.
-            impure add_alternative(BNBState),
-            % This tries the leftmost disjunct first, so try the best option
-            % there.
+        incomplete_parallelisation_is_good_enough(Info, !.MaybeBestSolns,
+            Best0, Best, !Profile, BestGoodEnough),
+        (
+            BestGoodEnough = is_good_enough,
+            generate_parallelisations_loop(Info, Algorithm,
+                GoalGroups, Best, !MaybeBestSolns, !Profile)
+        ;
+            BestGoodEnough = is_not_good_enough
+        ),
+
+        incomplete_parallelisation_is_good_enough(Info, !.MaybeBestSolns,
+            NextBest0, NextBest, !Profile, NextBestGoodEnough),
             (
-                !:Parallelisation = BestOption
+            NextBestGoodEnough = is_good_enough,
+            generate_parallelisations_loop(Info, Algorithm,
+                GoalGroups, NextBest, !MaybeBestSolns, !Profile)
             ;
-                impure close_alternative(BNBState),
-                !:Parallelisation = SndBestOption
+            NextBestGoodEnough = is_not_good_enough
             )
         ;
-            !:Parallelisation = BestOption
+        incomplete_parallelisation_is_good_enough(Info, !.MaybeBestSolns,
+            Best0, Best, !Profile, BestGoodEnough),
+        (
+            BestGoodEnough = is_good_enough,
+            generate_parallelisations_loop(Info, Algorithm,
+                GoalGroups, Best, !MaybeBestSolns, !Profile)
+        ;
+            BestGoodEnough = is_not_good_enough
         )
-    ),
+    ).
 
-    semipure test_parallelisation(BNBState, !Parallelisation),
+:- pred start_building_parallelisation(goals_for_parallelisation::in,
+    incomplete_parallelisation::out) is det.
 
-    impure generate_parallelisations_body(Info, BNBState, Algorithm,
-        GoalGroups, !Parallelisation).
+start_building_parallelisation(PreprocessedGoals, Parallelisation) :-
+    GoalsArray = PreprocessedGoals ^ gfp_goals,
+    FirstParGoal = PreprocessedGoals ^ gfp_first_costly_goal,
+    LastParGoal = PreprocessedGoals ^ gfp_last_costly_goal,
+    NumCalls = PreprocessedGoals ^ gfp_num_calls,
+    DependencyGraphs = PreprocessedGoals ^ gfp_dependency_graphs,
+    Parallelisation = incomplete_parallelisation(GoalsArray,
+        FirstParGoal, LastParGoal, FirstParGoal, [], NumCalls,
+        DependencyGraphs, no, no, no).
+
+    % Finalise the parallelisation.
+    %
+:- pred finalise_parallelisation(incomplete_parallelisation::in,
+    full_parallelisation::out) is det.
+
+finalise_parallelisation(Incomplete, Best) :-
+    GoalsBefore = ip_get_goals_before(Incomplete),
+    GoalsAfter = ip_get_goals_after(Incomplete),
+
+    MaybeCostData = Incomplete ^ ip_maybe_par_cost_data,
+    (
+        MaybeCostData = yes(CostData)
+    ;
+        MaybeCostData = no,
+        unexpected($module, $pred, "parallelisation has no cost data")
+    ),
+    CostData = parallelisation_cost_data(_, Overlap, Metrics0, _),
+
+    Metrics = finalise_parallel_exec_metrics(Metrics0),
+    par_conj_overlap_is_dependent(Overlap, IsDependent),
+    ParConjs = ip_get_par_conjs(Incomplete),
+    Best = fp_parallel_execution(GoalsBefore, ParConjs, GoalsAfter,
+        IsDependent, Metrics).
 
     % True if we should expand the search for parallelisation alternatives by
     % creating a choice point.
     %
-:- semipure pred should_expand_search(bnb_state(T)::in,
-    best_par_algorithm_simple::in) is semidet.
+:- pred should_expand_search(best_par_algorithm_simple::in, bnb_profile::in)
+    is semidet.
 
-should_expand_search(BNBState, Algorithm) :-
+should_expand_search(Algorithm, Profile) :-
     Algorithm = bpas_complete(MaybeLimit),
     (
         MaybeLimit = yes(Limit),
-        semipure num_alternatives(BNBState, Open, Closed),
-        Open + Closed < Limit
+        NumIcompleteTests =
+            Profile ^ bnbp_incomplete_not_good_enough +
+            Profile ^ bnbp_incomplete_good_enough,
+        NumIcompleteTests < Limit
     ;
         MaybeLimit = no
     ).
 
+:- pred maybe_update_best_complete_parallelisation(
+    incomplete_parallelisation::in,
+    best_solutions(incomplete_parallelisation)::in,
+    best_solutions(incomplete_parallelisation)::out,
+    bnb_profile::in, bnb_profile::out) is det.
+
+maybe_update_best_complete_parallelisation(CurSoln, !MaybeBestSolns,
+        !Profile) :-
+    CurSolnCost = incomplete_parallelisation_cost(CurSoln),
+    (
+        !.MaybeBestSolns = no_best_solutions,
+        !Profile ^ bnbp_complete_best_solution :=
+            !.Profile ^ bnbp_complete_best_solution + 1
+    ;
+        !.MaybeBestSolns = best_solutions(BestSolns0, BestCost0),
+        ( CurSolnCost < BestCost0 ->
+            !:MaybeBestSolns = best_solutions([CurSoln], CurSolnCost),
+            !Profile ^ bnbp_complete_best_solution :=
+                !.Profile ^ bnbp_complete_best_solution + 1
+        ; CurSolnCost = BestCost0 ->
+            BestSolns = [CurSoln | BestSolns0],
+            !:MaybeBestSolns = best_solutions(BestSolns, BestCost0),
+            !Profile ^ bnbp_complete_equal_solution :=
+                !.Profile ^ bnbp_complete_equal_solution + 1
+        ;
+            % We do not update !MaybeBestSolns.
+            !Profile ^ bnbp_complete_worse_solution :=
+                !.Profile ^ bnbp_complete_worse_solution + 1
+        )
+    ).
+
+:- type is_good_enough
+    --->    is_not_good_enough
+    ;       is_good_enough.
+
     % Test the parallelisation against the best one known to the branch and
     % bound solver.
     %
-:- semipure pred test_parallelisation(bnb_state(best_parallelisation)::in,
-    incomplete_parallelisation::in, incomplete_parallelisation::out)
-    is semidet.
-
-test_parallelisation(BNBState, !Parallelisation) :-
-    calculate_parallel_cost(CostData, !Parallelisation),
-    Info = !.Parallelisation ^ ip_info,
-    test_dependance(Info, CostData),
-    % XXX: We shouldn't need to finalize the parallelisation before testing it.
-    % This is a limitation of the branch and bound module.
-    finalise_parallelisation(!.Parallelisation, TestParallelisation),
-    semipure test_incomplete_solution(BNBState, TestParallelisation).
-
-    % Score the parallelisation.
-    %
-:- pred score_parallelisation(bnb_state(best_parallelisation)::in,
-    maybe(float)::out,
-    incomplete_parallelisation::in, incomplete_parallelisation::out) is det.
-
-score_parallelisation(BNBState, MaybeScore, !Parallelisation) :-
-    calculate_parallel_cost(CostData, !Parallelisation),
-    Info = !.Parallelisation ^ ip_info,
-    ( test_dependance(Info, CostData) ->
-        finalise_parallelisation(!.Parallelisation, TestParallelisation),
-        score_solution(BNBState, TestParallelisation, Score),
-        MaybeScore = yes(Score)
+:- pred incomplete_parallelisation_is_good_enough(
+    implicit_parallelism_info::in,
+    best_solutions(incomplete_parallelisation)::in,
+    incomplete_parallelisation::in, incomplete_parallelisation::out,
+    bnb_profile::in, bnb_profile::out, is_good_enough::out) is det.
+
+incomplete_parallelisation_is_good_enough(Info, MaybeBestSolns,
+        !IncompleteParallelisation, !Profile, GoodEnough) :-
+    calculate_parallel_cost(Info, !IncompleteParallelisation, CostData),
+    ( test_dependence(Info, CostData) ->
+        (
+            MaybeBestSolns = no_best_solutions,
+            !Profile ^ bnbp_incomplete_good_enough :=
+                !.Profile ^ bnbp_incomplete_good_enough + 1,
+            GoodEnough = is_good_enough
+        ;
+            MaybeBestSolns = best_solutions(_, BestSolnCost),
+            CurIncompleteCost =
+                incomplete_parallelisation_cost(!.IncompleteParallelisation),
+            ( CurIncompleteCost > BestSolnCost ->
+                !Profile ^ bnbp_incomplete_not_good_enough :=
+                    !.Profile ^ bnbp_incomplete_not_good_enough + 1,
+                GoodEnough = is_not_good_enough
+            ;
+                !Profile ^ bnbp_incomplete_good_enough :=
+                    !.Profile ^ bnbp_incomplete_good_enough + 1,
+                GoodEnough = is_good_enough
+            )
+        )
     ;
-        MaybeScore = no
+        !Profile ^ bnbp_incomplete_not_good_enough :=
+            !.Profile ^ bnbp_incomplete_not_good_enough + 1,
+        GoodEnough = is_not_good_enough
     ).
 
-    % Test that the parallelisation includes dependant parallelism
+    % Test that the parallelisation includes dependent parallelism
     % only if permitted by the user.
     %
-:- pred test_dependance(implicit_parallelism_info::in,
+:- pred test_dependence(implicit_parallelism_info::in,
     parallelisation_cost_data::in) is semidet.
 
-test_dependance(Info, CostData) :-
+test_dependence(Info, CostData) :-
     Overlap = CostData ^ pcd_par_exec_overlap,
     ParalleliseDepConjs = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs,
     par_conj_overlap_is_dependent(Overlap, IsDependent),
@@ -738,11 +779,11 @@
     conjuncts_are_dependent::out) is det.
 
 par_conj_overlap_is_dependent(peo_empty_conjunct, conjuncts_are_independent).
-par_conj_overlap_is_dependent(peo_conjunction(Left, _, VarSet0), IsDependent) :-
+par_conj_overlap_is_dependent(peo_conjunction(Left, VarSet0), IsDependent) :-
     par_conj_overlap_is_dependent(Left, IsDependent0),
     (
         IsDependent0 = conjuncts_are_dependent(VarSetLeft),
-        VarSet = union(VarSet0, VarSetLeft),
+        VarSet = set.union(VarSet0, VarSetLeft),
         IsDependent = conjuncts_are_dependent(VarSet)
     ;
         IsDependent0 = conjuncts_are_independent,
@@ -753,12 +794,106 @@
         )
     ).
 
+    % Compute the cost of the parallelisation.
+    %
+:- pred update_incomplete_parallelisation_cost(implicit_parallelism_info::in,
+    incomplete_parallelisation::in, incomplete_parallelisation::out,
+    maybe(float)::out) is det.
+
+update_incomplete_parallelisation_cost(Info, !IncompleteParallelisation,
+        MaybeCost) :-
+    calculate_parallel_cost(Info, !IncompleteParallelisation, CostData),
+    ( test_dependence(Info, CostData) ->
+        Cost = incomplete_parallelisation_cost(!.IncompleteParallelisation),
+        MaybeCost = yes(Cost)
+    ;
+        MaybeCost = no
+    ).
+
+:- func incomplete_parallelisation_cost(incomplete_parallelisation) = float.
+
+incomplete_parallelisation_cost(IncompleteParallelisation) = Cost :-
+    MaybeCostData = IncompleteParallelisation ^ ip_maybe_par_cost_data,
+    (
+        MaybeCostData = yes(CostData)
+    ;
+        MaybeCostData = no,
+        unexpected($module, $pred,
+            "incomplete parallelisation has no cost data")
+    ),
+    IncompleteMetrics = CostData ^ pcd_par_exec_metrics,
+    FullMetrics = finalise_parallel_exec_metrics(IncompleteMetrics),
+    Cost = full_parallelisation_metrics_cost(FullMetrics).
+
+    % The objective function for the branch and bound search.
+    % This is ParTime + ParOverheads * 2. That is we are willing to pay
+    % 1 unit of parallel overheads to get a 2 unit improvement
+    % of parallel execution time.
+    %
+    % XXX This looks wrong, for two reasons. First, it would be simpler
+    % and faster to just multiply the costs of all the overheads by 2.
+    % Second, the fudge factor should be configurable.
+    %
+:- func full_parallelisation_metrics_cost(parallel_exec_metrics) = float.
+
+full_parallelisation_metrics_cost(FullMetrics) = Cost :-
+    Cost = FullMetrics ^ pem_par_time +
+        parallel_exec_metrics_get_overheads(FullMetrics) * 2.0.
+
+:- func full_parallelisation_cost(full_parallelisation) = float.
+
+full_parallelisation_cost(FullParallelisation) = Cost :-
+    FullMetrics = FullParallelisation ^ fp_par_exec_metrics,
+    Cost = full_parallelisation_metrics_cost(FullMetrics).
+
 %----------------------------------------------------------------------------%
 
-:- pred add_one_goal_into_first_par_conj(incomplete_parallelisation::in,
-    incomplete_parallelisation::out) is det.
+% :- semipure pred add_goals_into_first_par_conj(
+%     bnb_state(full_parallelisation)::in,
+%     incomplete_parallelisation::in, incomplete_parallelisation::out) is multi.
+% 
+% add_goals_into_first_par_conj(BNBState, !Parallelisation) :-
+%     FirstGoal0 = !.Parallelisation ^ ip_first_par_goal,
+%     (
+%         FirstGoal0 > 0,
+%         Goals = !.Parallelisation ^ ip_goals,
+%         Goal = lookup(Goals, FirstGoal0 - 1),
+%         can_parallelise_goal(Goal),
+% 
+%         % There are goals before the parallel conjunction that can be included
+%         % in the parallel conjunction.
+%         add_one_goal_into_first_par_conj(!Parallelisation),
+%         semipure test_parallelisation(BNBState, !Parallelisation),
+%         semipure add_goals_into_first_par_conj(BNBState, !Parallelisation)
+%     ;
+%         true
+%     ).
+% 
+% :- semipure pred add_goals_into_last_par_conj(
+%     bnb_state(full_parallelisation)::in,
+%     incomplete_parallelisation::in, incomplete_parallelisation::out) is multi.
+% 
+% add_goals_into_last_par_conj(BNBState, !Parallelisation) :-
+%     NumGoals = ip_get_num_goals(!.Parallelisation),
+%     LastParGoal = !.Parallelisation ^ ip_last_par_goal,
+%     (
+%         LastParGoal < NumGoals - 1,
+%         Goals = !.Parallelisation ^ ip_goals,
+%         Goal = lookup(Goals, LastParGoal + 1),
+%         can_parallelise_goal(Goal),
+% 
+%         % Try to move a goal from after the parallelisation into the
+%         % parallelisation.
+%         add_one_goal_into_last_par_conj(!Parallelisation),
+%         semipure test_parallelisation(BNBState, !Parallelisation),
+%         semipure add_goals_into_last_par_conj(BNBState, !Parallelisation)
+%     ;
+%         true
+%     ).
 
-:- pred add_one_goal_into_last_par_conj(incomplete_parallelisation::in,
+%----------------------------------------------------------------------------%
+
+:- pred add_one_goal_into_first_par_conj(incomplete_parallelisation::in,
     incomplete_parallelisation::out) is det.
 
 add_one_goal_into_first_par_conj(!Parallelisation) :-
@@ -768,6 +903,9 @@
     !Parallelisation ^ ip_maybe_goals_before_cost := no,
     !Parallelisation ^ ip_maybe_par_cost_data := no.
 
+:- pred add_one_goal_into_last_par_conj(incomplete_parallelisation::in,
+    incomplete_parallelisation::out) is det.
+
 add_one_goal_into_last_par_conj(!Parallelisation) :-
     LastGoal0 = !.Parallelisation ^ ip_last_par_goal,
     LastGoal = LastGoal0 + 1,
@@ -781,11 +919,11 @@
 :- mode foldl_sub_array(pred(in, in, in, out) is det,
     in, in, in, in, out) is det.
 
-foldl_sub_array(Pred, Array, Index, Last, !A) :-
+foldl_sub_array(Pred, Array, Index, Last, !Acc) :-
     ( Index =< Last ->
-        X = lookup(Array, Index),
-        Pred(Index, X, !A),
-        foldl_sub_array(Pred, Array, Index + 1, Last, !A)
+        array.lookup(Array, Index, X),
+        Pred(Index, X, !Acc),
+        foldl_sub_array(Pred, Array, Index + 1, Last, !Acc)
     ;
         true
     ).
Index: autopar_search_goals.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/deep_profiler/autopar_search_goals.m,v
retrieving revision 1.4
diff -u -b -r1.4 autopar_search_goals.m
--- autopar_search_goals.m	10 May 2011 04:12:27 -0000	1.4
+++ autopar_search_goals.m	15 Sep 2011 08:50:43 -0000
@@ -719,7 +719,7 @@
         FirstConjNum = 1,
         ParalleliseDepConjs = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs,
         SpeedupThreshold = Info ^ ipi_opts ^ cpcp_speedup_threshold,
-        BestParallelisation = bp_parallel_execution(GoalsBefore, ParConjs,
+        BestParallelisation = fp_parallel_execution(GoalsBefore, ParConjs,
             GoalsAfter, IsDependent, Metrics),
         Speedup = parallel_exec_metrics_get_speedup(Metrics),
         Calls = Metrics ^ pem_num_calls,
Index: autopar_types.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/deep_profiler/autopar_types.m,v
retrieving revision 1.2
diff -u -b -r1.2 autopar_types.m
--- autopar_types.m	27 Jan 2011 08:03:53 -0000	1.2
+++ autopar_types.m	16 Sep 2011 06:37:27 -0000
@@ -30,7 +30,6 @@
 :- import_module var_use_analysis.
 
 :- import_module array.
-:- import_module assoc_list.
 :- import_module digraph.
 :- import_module lazy.
 :- import_module list.
@@ -100,23 +99,6 @@
     ;       pgt_other_atomic_goal
     ;       pgt_non_atomic_goal.
 
-:- func ip_get_goals_before(incomplete_parallelisation) =
-    list(pard_goal_detail).
-
-:- func ip_get_goals_after(incomplete_parallelisation) =
-    list(pard_goal_detail).
-
-:- func ip_get_par_conjs(incomplete_parallelisation) =
-    list(seq_conj(pard_goal_detail)).
-
-:- func ip_get_num_goals(incomplete_parallelisation) = int.
-
-:- func ip_get_num_parallel_conjuncts(incomplete_parallelisation) = int.
-
-:- func ip_get_num_goals_middle(incomplete_parallelisation) = int.
-
-:- func ip_calc_sharedvars_set(incomplete_parallelisation) = set(var_rep).
-
     % Build sets of produced and consumed vars for a conjunct in a conjunction.
     % Use with foldl to build these sets up for the whole conjunction.  At the
     % end of a conjunction there may be variables in the intersection of the
@@ -181,8 +163,6 @@
 
 :- type incomplete_parallelisation
     --->    incomplete_parallelisation(
-                ip_info                     :: implicit_parallelism_info,
-
                 ip_goals                    :: array(pard_goal_detail),
 
                 % The index of the first goal in the parallelised goals,
@@ -199,11 +179,11 @@
                 ip_last_scheduled_goal      :: int,
 
                 % The index of the last goal in each of the parallel conjuncts.
-                % the very last parallel conjunct is donated by
+                % The very last parallel conjunct is denoted by
                 % ip_last_par_goal.
                 ip_par_conjs_rev_last_goal  :: list(int),
 
-                % The number of calls into this conjunction.
+                % The number of calls in this conjunction.
                 ip_num_calls                :: int,
 
                 % Dependency relationships between goals.
@@ -225,7 +205,7 @@
                 pcd_productions_map     :: map(var_rep, float)
             ).
 
-    % This datastructure represents the execution of dependent conjuncts,
+    % This data structure represents the execution of dependent conjuncts,
     % it tracks which variables are produced and consumed.
     %
     % TODO: Implement a pretty printer for this data.
@@ -234,27 +214,46 @@
     --->    peo_empty_conjunct
     ;       peo_conjunction(
                 poec_left_conjunct      :: parallel_execution_overlap,
-                poec_right_conjunct     :: dependent_conjunct_execution,
+
+                % XXX currently unused.
+                % poec_right_conjunct   :: dependent_conjunct_execution,
 
                 % The variables produced by the left conjunct and
                 % consumed by the right conjunct.
                 poec_dependent_vars     :: set(var_rep)
             ).
 
-:- type dependent_conjunct_execution
-    --->    dependent_conjunct_execution(
-                % Pairs of start and stop times of the execution.
-                % Assume that the list is not sorted.
-                dce_execution           :: assoc_list(float, float),
-
-                % The variable productions. This may be a superset of the
-                % dependent variables.
-                dce_productions         :: map(var_rep, float),
-
-                % The variable consumptions. This will contain only
-                % references for those variables that will become futures.
-                dce_consumptions        :: map(var_rep, float)
-            ).
+% :- type dependent_conjunct_execution
+%     --->    dependent_conjunct_execution(
+%                 % Pairs of start and stop times of the execution.
+%                 % Assume that the list is not sorted.
+%                 dce_execution           :: assoc_list(float, float),
+% 
+%                 % The variable productions. This may be a superset of the
+%                 % dependent variables.
+%                 dce_productions         :: map(var_rep, float),
+% 
+%                 % The variable consumptions. This will contain only
+%                 % references for those variables that will become futures.
+%                 dce_consumptions        :: map(var_rep, float)
+%             ).
+
+:- func ip_get_goals_before(incomplete_parallelisation) =
+    list(pard_goal_detail).
+
+:- func ip_get_goals_after(incomplete_parallelisation) =
+    list(pard_goal_detail).
+
+:- func ip_get_par_conjs(incomplete_parallelisation) =
+    list(seq_conj(pard_goal_detail)).
+
+:- func ip_get_num_goals(incomplete_parallelisation) = int.
+
+:- func ip_get_num_parallel_conjuncts(incomplete_parallelisation) = int.
+
+:- func ip_get_num_goals_middle(incomplete_parallelisation) = int.
+
+:- func ip_calc_sharedvars_set(incomplete_parallelisation) = set(var_rep).
 
 %-----------------------------------------------------------------------------%
 
@@ -262,16 +261,47 @@
 
 :- import_module int.
 
+%-----------------------------------------------------------------------------%
+
+identify_costly_goal(Annotation, Costly) :-
+    CostAboveThreshold = Annotation ^ pgd_cost_above_threshold,
+    (
+        CostAboveThreshold = cost_above_par_threshold,
+        % TODO: distinguish between compound goals with one costly branch,
+        % and compound goals where all branches are costly.
+        % TODO: Provide information about how many costly goals are within
+        % the goal so that we can try to parallelise each of those against
+        % an outer costly goal.
+        Costly = is_costly_goal
+    ;
+        CostAboveThreshold = cost_not_above_par_threshold,
+        Costly = is_not_costly_goal
+    ).
+
+identify_costly_goals([], _, []).
+identify_costly_goals([Goal | Goals], Index, Indexes) :-
+    identify_costly_goals(Goals, Index + 1, Indexes0),
+    identify_costly_goal(Goal ^ goal_annotation, Costly),
+    (
+        Costly = is_costly_goal,
+        Indexes = [Index | Indexes0]
+    ;
+        Costly = is_not_costly_goal,
+        Indexes = Indexes0
+    ).
+
+%-----------------------------------------------------------------------------%
+
 ip_get_goals_before(Parallelisation) = GoalsBefore :-
     Goals = Parallelisation ^ ip_goals,
     FirstParGoalIndex = Parallelisation ^ ip_first_par_goal,
-    fetch_items(Goals, 0, FirstParGoalIndex - 1, GoalsBefore).
+    array.fetch_items(Goals, 0, FirstParGoalIndex - 1, GoalsBefore).
 
 ip_get_goals_after(Parallelisation) = GoalsAfter :-
     Goals = Parallelisation ^ ip_goals,
     LastParGoalIndex = Parallelisation ^ ip_last_par_goal,
-    NumGoals = size(Goals),
-    fetch_items(Goals, LastParGoalIndex + 1, NumGoals - 1, GoalsAfter).
+    NumGoals = array.size(Goals),
+    array.fetch_items(Goals, LastParGoalIndex + 1, NumGoals - 1, GoalsAfter).
 
 ip_get_par_conjs(Incomplete) = ParConjs :-
     Goals = Incomplete ^ ip_goals,
@@ -291,10 +321,10 @@
     fetch_items(Array, First, Last, Goals),
     Conj = seq_conj(Goals).
 
-ip_get_num_goals(Incomplete) = size(Incomplete ^ ip_goals).
+ip_get_num_goals(Incomplete) = array.size(Incomplete ^ ip_goals).
 
 ip_get_num_parallel_conjuncts(Incomplete) =
-    length(Incomplete ^ ip_par_conjs_rev_last_goal) + 1.
+    list.length(Incomplete ^ ip_par_conjs_rev_last_goal) + 1.
 
 ip_get_num_goals_middle(Incomplete) = LastParGoal - FirstParGoal + 1 :-
     FirstParGoal = Incomplete ^ ip_first_par_goal,
@@ -312,7 +342,7 @@
 build_sharedvars_set(seq_conj(Conjs), !BoundVars, !SharedVars) :-
     list.foldl2(conj_produced_and_consumed_vars, Conjs,
         set.init, ProducedVars, set.init, ConsumedVars),
-    % The new shared vars are previously bound variables that are cosumed in
+    % The new shared vars are previously bound variables that are consumed in
     % this conjunct.  This must be calculated before !BoundVars is updated.
     SharedVars = set.intersect(!.BoundVars, ConsumedVars),
     !:SharedVars = set.union(!.SharedVars, SharedVars),
@@ -324,32 +354,3 @@
     !:Consumed = set.union(!.Consumed, InstMapInfo ^ im_consumed_vars).
 
 %-----------------------------------------------------------------------------%
-
-identify_costly_goal(Annotation, Costly) :-
-    CostAboveThreshold = Annotation ^ pgd_cost_above_threshold,
-    (
-        CostAboveThreshold = cost_above_par_threshold,
-        % TODO: distinguish between compound goals with one costly branch,
-        % and compound goals where all branches are costly.
-        % TODO: Provide information about how many costly goals are within
-        % the goal so that we can try to parallelise each of those against
-        % an outer costly goal.
-        Costly = is_costly_goal
-    ;
-        CostAboveThreshold = cost_not_above_par_threshold,
-        Costly = is_not_costly_goal
-    ).
-
-identify_costly_goals([], _, []).
-identify_costly_goals([Goal | Goals], Index, Indexes) :-
-    identify_costly_goals(Goals, Index + 1, Indexes0),
-    identify_costly_goal(Goal ^ goal_annotation, Costly),
-    (
-        Costly = is_costly_goal,
-        Indexes = [Index | Indexes0]
-    ;
-        Costly = is_not_costly_goal,
-        Indexes = Indexes0
-    ).
-
-%-----------------------------------------------------------------------------%
Index: create_report.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/deep_profiler/create_report.m,v
retrieving revision 1.35
diff -u -b -r1.35 create_report.m
cvs diff: Diffing notes
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
Previous message: [m-rev.] diff: add --output-c-compiler-type
Next message: [m-rev.] diff in progress: branch and bound
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list