[m-rev.] For post-commit review: Automatically parallelise non-atomic goals.

Paul Bone pbone at csse.unimelb.edu.au
Thu Oct 14 15:04:51 AEDT 2010
For post-commit review by Zoltan.

---

Automatically parallelise non-atomic goals.

This patch allows the feedback tool to recommend the parallelisation of
non-atomic goals.

mdbcomp/feedback.automatic_parallelism.m:
    Remove the concept of 'partitions' from the candidate parallel conjunction
    type.  We no longer divide conjunctions into partitions before
    parallelising them.

mdbcomp/feedback.m:
    Increment the feedback format version number.

compiler/implicit_parallelism.m:
    Conform to changes in mdbcomp/feedback.automatic_parallelism.m.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Allow non-atomic goals to be parallelised against one another.

    Modify the goal annotations used internally; many annotations that were
    used only for calls are now used for any goal type.

    Variable use information is now stored in a map from variable name to lazy
    use data for every goal, not just for the arguments of calls.

    Do not partition conjunctions before attempting to parallelise them.

    Make adjust_time_for_waits more tolerant of floating point errors.

    Format costs with commas and, in most cases, two decimal places.

deep_profiler/var_use_analysis.m:
    Export a new predicate var_first_use that computes the first use of a
    variable within a goal.  This predicate uses a new typeclass to retrieve
    coverage data from any goal that can implement the typeclass.

deep_profiler/measurements.m:
    Add a new abstract type for measuring the cost of a goal, goal_cost_csq.
    This is like cs_cost_csq except that it can represent trivial goals (which
    don't have a call count).
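
As a rough illustration of why a separate cost type helps, the sketch below
models a goal cost that may have no call count.  The names goal_cost_sketch,
trivial_goal, non_trivial_goal and percall are hypothetical and exist only
for this example; the real goal_cost_csq is abstract and its representation
is not shown in this message.

```mercury
% File: goal_cost_sketch.m (hypothetical module, for illustration only).
:- module goal_cost_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module float.
:- import_module int.

    % A trivial goal carries no call count; any other goal carries a
    % call count and a total cost in call sequence counts.
    % This representation is an assumption, not the one in measurements.m.
:- type goal_cost_sketch
    --->    trivial_goal
    ;       non_trivial_goal(int, int).     % Calls, total cost.

:- func percall(goal_cost_sketch) = float.

percall(Cost) = PerCall :-
    (
        Cost = trivial_goal,
        PerCall = 0.0
    ;
        Cost = non_trivial_goal(Calls, Total),
        ( Calls = 0 ->
            % Guard against division by zero for goals never executed.
            PerCall = 0.0
        ;
            PerCall = float(Total) / float(Calls)
        )
    ).

main(!IO) :-
    io.write_float(percall(trivial_goal), !IO),
    io.nl(!IO),
    io.write_float(percall(non_trivial_goal(4, 10)), !IO),
    io.nl(!IO).
```

Treating a trivial goal as having zero per-call cost is a choice made for
this sketch only.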

deep_profiler/coverage.m:
    Add deterministic versions of the get_coverage_* predicates.
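
The pattern behind these wrappers can be sketched as follows: a semidet
lookup is wrapped in an if-then-else whose else branch aborts, which makes
the wrapper det.  The type coverage_sketch and the predicates get_before
and get_before_det are simplified, hypothetical stand-ins for coverage_info
and the get_coverage_* predicates; only error/1 from the require module is
a real library call.

```mercury
% File: det_wrapper_sketch.m (hypothetical module, for illustration only).
:- module det_wrapper_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module require.

    % A cut-down stand-in for coverage_info.
:- type coverage_sketch
    --->    coverage_unknown
    ;       coverage_before(int)
    ;       coverage_before_and_after(int, int).

    % Fails when the count before the goal is not known.
:- pred get_before(coverage_sketch::in, int::out) is semidet.

get_before(coverage_before(Before), Before).
get_before(coverage_before_and_after(Before, _), Before).

    % Deterministic version: aborts instead of failing.
:- pred get_before_det(coverage_sketch::in, int::out) is det.

get_before_det(Coverage, Before) :-
    ( get_before(Coverage, BeforePrime) ->
        Before = BeforePrime
    ;
        error("det_wrapper_sketch: expected complete coverage information")
    ).

main(!IO) :-
    get_before_det(coverage_before_and_after(10, 7), Before),
    io.write_int(Before, !IO),      % Writes 10.
    io.nl(!IO).
```

Callers that already know the coverage data is complete can then stay det
without repeating the failure handling at every call site.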

deep_profiler/program_representation_utils.m:
    Make initial_inst_map more generic in its type signature.

    Add a new predicate, atomic_goal_is_call/2, which can be used instead of a
    large switch on an atomic_goal_rep value.

deep_profiler/message.m:
    Rename a message type to make it more general.  This is required now that
    we compute variable use information for arbitrary goals, not just calls.

library/list.m:
    Add map3_foldl.
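
For readers unfamiliar with the map/fold combination, here is a minimal
sketch of a predicate with the same shape as map3_foldl: it maps each
element to three outputs while threading one accumulator through the list.
sketch_map3_foldl and number_word are hypothetical names written for this
example, and only the det mode is shown; this is not the code added to
list.m.

```mercury
% File: map3_foldl_sketch.m (hypothetical module, for illustration only).
:- module map3_foldl_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

    % sketch_map3_foldl(Pred, Xs, As, Bs, Cs, !Acc):
    % Apply Pred to each X in Xs, collecting three output lists and
    % threading an accumulator from left to right.
:- pred sketch_map3_foldl(pred(X, A, B, C, Acc, Acc), list(X),
    list(A), list(B), list(C), Acc, Acc).
:- mode sketch_map3_foldl(pred(in, out, out, out, in, out) is det, in,
    out, out, out, in, out) is det.

sketch_map3_foldl(_, [], [], [], [], !Acc).
sketch_map3_foldl(Pred, [X | Xs], [A | As], [B | Bs], [C | Cs], !Acc) :-
    Pred(X, A, B, C, !Acc),
    sketch_map3_foldl(Pred, Xs, As, Bs, Cs, !Acc).

    % Number each word, using the accumulator as a running index.
:- pred number_word(string::in, int::out, string::out, int::out,
    int::in, int::out) is det.

number_word(Word, Num, Upper, Length, !NextNum) :-
    Num = !.NextNum,
    Upper = string.to_upper(Word),
    Length = string.length(Word),
    !:NextNum = !.NextNum + 1.

main(!IO) :-
    sketch_map3_foldl(number_word, ["par", "conj", "goal"],
        Nums, Uppers, Lengths, 1, NextNum),
    io.print(Nums, !IO), io.nl(!IO),
    io.print(Uppers, !IO), io.nl(!IO),
    io.print(Lengths, !IO), io.nl(!IO),
    io.write_int(NextNum, !IO), io.nl(!IO).
```

Using the accumulator as a running index mirrors how this patch numbers
conjuncts, disjuncts and switch cases, starting from 1.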

NEWS:
    Announce the change to list.m.

Index: NEWS
===================================================================
RCS file: /home/mercury1/repository/mercury/NEWS,v
retrieving revision 1.537
diff -u -p -b -r1.537 NEWS
--- NEWS	7 Oct 2010 05:03:12 -0000	1.537
+++ NEWS	14 Oct 2010 03:58:50 -0000
@@ -20,9 +20,14 @@ Changes to the Mercury standard library:
   into a new standard library module called `lazy'.  It has also been made
   backend-agnostic.
 
-* We have added a new predicate to the list module of the standard library.
-  list.member_index0/3.  It is like list.member/2 except that it also takes a
-  parameter representing the zero-based index of the element within the list.
+* We have made changes to the list module of the standard library:
+
+  + We added a new predicate list.member_index0/3.  It is like list.member/2
+    except that it also takes a parameter representing the zero-based index of
+    the element within the list.
+
+  + We added a new predicate list.map3_foldl/7 which maps over a list producing
+    three lists and one folded value.
 
 Changes to the Mercury compiler:
 
Index: compiler/implicit_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/implicit_parallelism.m,v
retrieving revision 1.23
diff -u -p -b -r1.23 implicit_parallelism.m
--- compiler/implicit_parallelism.m	11 Oct 2010 00:49:21 -0000	1.23
+++ compiler/implicit_parallelism.m	14 Oct 2010 03:58:50 -0000
@@ -456,26 +456,25 @@ maybe_parallelise_conj(ProgRepInfo, VarT
     Goal0 = hlds_goal(GoalExpr0, _GoalInfo0),
     % We've reached the point indicated by the goal path, Find the
     % conjuncts that we wish to parallelise.
-    PartitionNum = CPC ^ cpc_partition_number,
+    cpc_get_first_goal(CPC, FirstGoalRep),
     (
         GoalExpr0 = conj(plain_conj, Conjs0),
         flatten_conj(Conjs0, Conjs1),
-        find_partition(PartitionNum, Conjs1, yes(PartitionInConj))
+        find_first_goal(FirstGoalRep, Conjs1, ProgRepInfo, VarTable, Instmap0, 
+            found_first_goal(GoalsBefore, FirstGoal, OtherGoals))
     ->
-        GoalsBeforePartition = PartitionInConj ^ pic_goals_before,
         GoalsBeforeInstDeltas = map(
             (func(G) = goal_info_get_instmap_delta(G ^ hlds_goal_info)),
-            GoalsBeforePartition),
+            GoalsBefore),
         foldl(apply_instmap_delta_sv, GoalsBeforeInstDeltas,
             Instmap0, Instmap),
-        Partition0 = PartitionInConj ^ pic_partition,
-        build_par_conjunction(ProgRepInfo, VarTable, Instmap, Partition0, CPC,
-            MaybeParConjunction),
-        (
-            MaybeParConjunction = ok(ParConjunction),
-            GoalsAfterPartition = PartitionInConj ^ pic_goals_after,
-            Conjs = GoalsBeforePartition ++ ParConjunction ++
-                GoalsAfterPartition, 
+        build_par_conjunction(ProgRepInfo, VarTable, Instmap, 
+            [FirstGoal | OtherGoals], CPC, MaybeParConjunction),
+        (
+            MaybeParConjunction = ok(
+                par_conjunction_and_remaining_goals(ParConjunction,
+                RemainingGoals)),
+            Conjs = GoalsBefore ++ ParConjunction ++ RemainingGoals, 
             GoalExpr = conj(plain_conj, Conjs),
             MaybeGoal = ok(hlds_goal(GoalExpr, Goal0 ^ hlds_goal_info))
         ;
@@ -487,65 +486,69 @@ maybe_parallelise_conj(ProgRepInfo, VarT
                 ++ "perhaps the program has changed")
     ).
 
-:- type partition_in_conj
-    --->    partition_in_conj(
-                pic_goals_before        :: hlds_goals,
-                pic_partition           :: hlds_goals,
-                pic_goals_after         :: hlds_goals
-            ).
-
-:- pred find_partition(int::in, list(hlds_goal)::in,
-    maybe(partition_in_conj)::out) is det.
-
-find_partition(_, [], no).
-find_partition(PartitionNum0, [Goal | Goals], MaybePartition) :-
-    ( PartitionNum0 = 1 ->
-        % We've found the correct partition.
-        find_end_of_partition(Goals, Partition, GoalsAfter),
-        MaybePartition = yes(
-            partition_in_conj([], [ Goal | Partition ], GoalsAfter))
-    ;
-        goal_is_atomic(Goal, GoalIsAtomic),
+:- pred cpc_get_first_goal(candidate_par_conjunction::in, pard_goal::out) 
+    is det.
+
+cpc_get_first_goal(CPC, FirstGoal) :-
+    GoalsBefore = CPC ^ cpc_goals_before,
         (
-            GoalIsAtomic = goal_is_atomic,
-            PartitionNum = PartitionNum0
+        GoalsBefore = [FirstGoal | _]
         ;
-            GoalIsAtomic = goal_is_nonatomic,
-            PartitionNum = PartitionNum0 - 1
-        ),
-        find_partition(PartitionNum, Goals, MaybePartition0),
+        GoalsBefore = [],
+        ParConj = CPC ^ cpc_conjs,
         (
-            MaybePartition0 = yes(PartitionInConj0),
-            PartitionInConj0 = 
-                partition_in_conj(GoalsBefore0, Partition, GoalsAfter),
-            PartitionInConj = 
-                partition_in_conj([Goal | GoalsBefore0], Partition, GoalsAfter),
-            MaybePartition = yes(PartitionInConj)
+            ParConj = [FirstParConj | _],
+            FirstParConj = seq_conj([FirstGoalPrime | _])
+        ->
+            FirstGoal = FirstGoalPrime
         ;
-            MaybePartition0 = no,
-            MaybePartition = MaybePartition0
+            error(this_file ++ "Candidate parallel conjunction is empty")
         )
     ).
 
-:- pred find_end_of_partition(list(hlds_goal)::in, list(hlds_goal)::out, 
-    list(hlds_goal)::out) is det.
+:- type find_first_goal_result
+    --->    did_not_find_first_goal
+    ;       found_first_goal(
+                ffg_goals_before        :: hlds_goals,
+                ffg_goal                :: hlds_goal,
+                ffg_goals_after         :: hlds_goals
+            ).
+
+:- pred find_first_goal(pard_goal::in, list(hlds_goal)::in,
+    prog_rep_info::in, var_table::in, instmap::in,
+    find_first_goal_result::out) is det.
 
-find_end_of_partition([], [], []).
-find_end_of_partition([ Goal | Goals ], Partition, GoalsAfter) :-
-    goal_is_atomic(Goal, GoalIsAtomic),
+find_first_goal(_, [], _, _, _, did_not_find_first_goal).
+find_first_goal(GoalRep, [Goal | Goals], ProcRepInfo, VarTable, !.Instmap,
+        Result) :-
     (
-        GoalIsAtomic = goal_is_atomic,
-        find_end_of_partition(Goals, Partition0, GoalsAfter),
-        Partition = [ Goal | Partition0 ]
+        pard_goal_match_hlds_goal(ProcRepInfo, VarTable, !.Instmap, GoalRep, 
+            Goal) 
+    ->
+        Result = found_first_goal([], Goal, Goals)
     ;
-        GoalIsAtomic = goal_is_nonatomic,
-        Partition = [],
-        GoalsAfter = [Goal | Goals]
+        InstmapDelta = goal_info_get_instmap_delta(Goal ^ hlds_goal_info),
+        apply_instmap_delta_sv(InstmapDelta, !Instmap),
+        find_first_goal(GoalRep, Goals, ProcRepInfo, VarTable, !.Instmap,
+            Result0),
+        (
+            Result0 = did_not_find_first_goal,
+            Result = did_not_find_first_goal
+        ;
+            Result0 = found_first_goal(GoalsBefore0, _, _),
+            Result = Result0 ^ ffg_goals_before := [Goal | GoalsBefore0]
+        )
+    ).
+
+:- type par_conjunction_and_remaining_goals
+    --->    par_conjunction_and_remaining_goals(
+                pcrg_par_conjunction            :: hlds_goals,
+                pcrg_remaining_goals            :: hlds_goals
     ).
 
 :- pred build_par_conjunction(prog_rep_info::in, var_table::in, instmap::in,
     hlds_goals::in, candidate_par_conjunction::in, 
-    maybe_error(hlds_goals)::out) is det.
+    maybe_error(par_conjunction_and_remaining_goals)::out) is det.
 
 build_par_conjunction(ProcRepInfo, VarTable, Instmap0, !.Goals, CPC,
         MaybeParConjunction) :-
@@ -568,16 +571,12 @@ build_par_conjunction(ProcRepInfo, VarTa
             MaybeParConjuncts = yes(ParConjuncts),
             (
                 MaybeGoalsAfter = yes(GoalsAfter),
-                ( !.Goals = [] ->
-
                     create_conj_from_list(ParConjuncts, parallel_conj,
-                        ParConjunction),
-                    MaybeParConjunction = ok(GoalsBefore ++ 
-                        [ParConjunction | GoalsAfter])
-                ;
-                    MaybeParConjunction = error("There where goals left-over after "
-                        ++ "constructing the parallel conjunction")
-                )
+                    ParConjunction0),
+                ParConjunction = GoalsBefore ++ [ParConjunction0 | GoalsAfter],
+                MaybeParConjunction = ok(
+                    par_conjunction_and_remaining_goals(ParConjunction,
+                    !.Goals))
             ;
                 MaybeGoalsAfter = no,
                 MaybeParConjunction = error("The goals after the parallel "
Index: deep_profiler/coverage.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/coverage.m,v
retrieving revision 1.8
diff -u -p -b -r1.8 coverage.m
--- deep_profiler/coverage.m	7 Oct 2010 02:38:09 -0000	1.8
+++ deep_profiler/coverage.m	14 Oct 2010 03:58:50 -0000
@@ -46,6 +46,11 @@
     is semidet.
 :- pred get_coverage_after(coverage_info::in, int::out) is semidet.
 
+:- pred get_coverage_before_det(coverage_info::in, int::out) is det.
+:- pred get_coverage_before_and_after_det(coverage_info::in, 
+    int::out, int::out) is det.
+:- pred get_coverage_after_det(coverage_info::in, int::out) is det.
+
 %----------------------------------------------------------------------------%
     
     % This is similar to the coverage_point type in
@@ -117,6 +122,33 @@ get_coverage_after(coverage_known_zero, 
 get_coverage_after(coverage_known_same(After), After).
 get_coverage_after(coverage_known_after(After), After).
 
+get_coverage_before_det(Coverage, Before) :-
+    ( get_coverage_before(Coverage, BeforePrime) ->
+        Before = BeforePrime
+    ;
+        complete_coverage_error
+    ).
+
+get_coverage_before_and_after_det(Coverage, Before, After) :-
+    ( get_coverage_before_and_after(Coverage, BeforePrime, AfterPrime) ->
+        Before = BeforePrime,
+        After = AfterPrime
+    ;
+        complete_coverage_error
+    ).
+
+get_coverage_after_det(Coverage, After) :-
+    ( get_coverage_after(Coverage, AfterPrime) ->
+        After = AfterPrime
+    ;
+        complete_coverage_error
+    ).
+
+:- pred complete_coverage_error is erroneous.
+
+complete_coverage_error :-
+    error(this_file ++ "Expected complete coverage information").
+
 %-----------------------------------------------------------------------------%
 
 coverage_point_arrays_to_list(StaticArray, DynamicArray, CoveragePoints) :-
@@ -1227,3 +1259,9 @@ before_coverage(Count) = 
     ).
 
 %----------------------------------------------------------------------------%
+
+:- func this_file = string.
+
+this_file = "coverage.m: ".
+
+%----------------------------------------------------------------------------%
Index: deep_profiler/mdprof_fb.automatic_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/mdprof_fb.automatic_parallelism.m,v
retrieving revision 1.18
diff -u -p -b -r1.18 mdprof_fb.automatic_parallelism.m
--- deep_profiler/mdprof_fb.automatic_parallelism.m	10 Oct 2010 04:19:53 -0000	1.18
+++ deep_profiler/mdprof_fb.automatic_parallelism.m	14 Oct 2010 03:58:50 -0000
@@ -63,6 +63,8 @@
 
 :- import_module analysis_utils.
 :- import_module branch_and_bound.
+:- import_module coverage.
+:- import_module create_report.
 :- import_module measurement_units.
 :- import_module measurements.
 :- import_module program_representation_utils.
@@ -132,8 +134,9 @@ pard_goal_detail_to_pard_goal(!Goal) :-
 pard_goal_detail_annon_to_pard_goal_annon(PGD, PG) :-
     PGT = PGD ^ pgd_pg_type,
     (
-        PGT = pgt_call(CostCSQ, CostAboveThreshold, _, _),
-        CostPercall = cs_cost_get_percall(CostCSQ),
+        PGT = pgt_call(_, _),
+        CostPercall = goal_cost_get_percall(PGD ^ pgd_cost),
+        CostAboveThreshold = PGD ^ pgd_cost_above_threshold,
         PG = pard_goal_call(CostPercall, CostAboveThreshold)
     ;
         PGT = pgt_other_atomic_goal,
@@ -174,21 +177,25 @@ pard_goal_detail_annon_to_pard_goal_anno
                 pgd_inst_map_info       :: inst_map_info,
                     % The inst map info attached to the original goal.
 
-                pgd_original_path       :: goal_path
+                pgd_original_path           :: goal_path,
                     % The original goal path of this goal.
-            ).
 
-:- inst pard_goal_detail(T)
-    ---> pard_goal_detail(T, ground, ground).
+                pgd_coverage                :: coverage_info,
+                    % Coverage data for this goal.
 
-:- type pard_goal_type 
-    --->    pgt_call(
-                pgtc_cost                   :: cs_cost_csq,
+                pgd_cost                    :: goal_cost_csq,
                     % The per-call cost of this call in call sequence counts.
                 
-                pgtc_coat_above_threshold   :: cost_above_par_threshold,
+                pgd_cost_above_threshold    :: cost_above_par_threshold,
+
+                pgd_var_production_map      :: map(var_rep, lazy(var_use_info)),
+                pgd_var_consumption_map     :: map(var_rep, lazy(var_use_info))
+                    % Variable production and consumption information.
+            ).
             
-                pgtc_args                   :: list(var_mode_and_use),
+:- type pard_goal_type 
+    --->    pgt_call(
+                pgtc_args                   :: list(var_and_mode),
                     % The argument modes and use information.
 
                 pgtc_call_site              :: cost_and_callees
@@ -197,34 +204,17 @@ pard_goal_detail_annon_to_pard_goal_anno
     ;       pgt_other_atomic_goal
     ;       pgt_non_atomic_goal.
 
-:- inst pgt_call 
-    --->    pgt_call(ground, bound(cost_above_par_threshold), ground, 
-                ground).
-
-:- inst pgt_atomic_goal
-    --->    pgt_call(ground, ground, ground, ground)
-    ;       pgt_other_atomic_goal.
-    
-    % A variable, it's mode and it's usage in the callee.  The mode
-    % information is also summarised within the variable use information.
+    % A variable and its mode.
     %
-:- type var_mode_and_use
-    --->    var_mode_and_use(
+:- type var_and_mode
+    --->    var_and_mode(
                 vmu_var                 :: var_rep,
-                vmu_mode                :: var_mode_rep,
-                vmu_use                 :: lazy(var_use_info)
+                vmu_mode                :: var_mode_rep
             ).
 
 :- type candidate_par_conjunctions ==
     map(string_proc_label, candidate_par_conjunctions_proc(pard_goal_detail)).
 
-:- type pard_goals_partition
-    --->    pard_goals_partition(
-                pgp_goals               :: list(pard_goal_detail),
-                pgp_partition_num       :: int,
-                pgp_first_conj_num      :: int
-            ).
-
 %----------------------------------------------------------------------------%
 %
 % Recurse the call graph searching for parallelisation opportunities.
@@ -577,8 +567,11 @@ candidate_parallel_conjunctions_proc(Opt
             % find it's procedure representation.
             Candidates = map.init
         ;
-            progrep_search_proc(ProgRep, ProcLabel, ProcRep) 
-        ->
+            create_dynamic_procrep_coverage_report(Deep, PDPtr,
+                MaybeCoverageReport),
+            (
+                MaybeCoverageReport = ok(CoverageReport),
+                ProcRep = CoverageReport ^ prci_proc_rep,
             ProcRep ^ pr_defn = ProcDefnRep,
             ProcDefnRep ^ pdr_goal = Goal0,
             ProcDefnRep ^ pdr_var_table = VarTable,
@@ -590,17 +583,21 @@ candidate_parallel_conjunctions_proc(Opt
             Info = implicit_parallelism_info(Deep, ProgRep, Opts, CliquePtr,
                 CallSitesMap, RecursiveCallSiteCostMap, RecursionType,
                 VarTable, ProcLabel),
-            goal_annotate_with_instmap(Goal0, Goal,
+                some [!Goal] (
+                    !:Goal = Goal0,
+                    goal_annotate_with_instmap(!Goal,
                 initial_inst_map(ProcDefnRep), _FinalInstMap,
                 SeenDuplicateInstantiation, _ConsumedVars, _BoundVars),
-            goal_get_conjunctions_worth_parallelising(Info, Goal,
-                empty_goal_path, Candidates0, MessagesA),
-            !:Messages = !.Messages ++ MessagesA,
+                    goal_to_pard_goal(Info, empty_goal_path, !Goal, !Messages),
+                    goal_get_conjunctions_worth_parallelising(Info, 
+                        empty_goal_path, !.Goal, _, Candidates0, MessagesA),
+                    !:Messages = !.Messages ++ MessagesA
+                ),
             (
                 SeenDuplicateInstantiation =
                     have_not_seen_duplicate_instantiation,
-                list.foldl(
-                    build_candidate_par_conjunction_maps(ProcLabel, VarTable),
+                    list.foldl(build_candidate_par_conjunction_maps(ProcLabel,
+                            VarTable),
                     Candidates0, map.init, Candidates)
             ;
                 SeenDuplicateInstantiation = seen_duplicate_instantiation,
@@ -610,15 +607,27 @@ candidate_parallel_conjunctions_proc(Opt
                     !Messages)
             )
         ;
-            % Builtin procedures cannot be found in the program representation,
-            % and cannot be parallelised either.
+                MaybeCoverageReport = error(Error),
             Candidates = map.init,
-            append_message(proc(ProcLabel), warning_cannot_lookup_proc_defn,
-                !Messages)
+                append_message(proc(ProcLabel),
+                    error_coverage_procrep_error(Error), !Messages)
+            )
         ),
         Messages = !.Messages
     ).
 
+:- type coverage_and_instmap_info
+    --->    coverage_and_instmap_info(
+                cai_coverage                :: coverage_info,
+                cai_inst_map_info           :: inst_map_info
+            ).
+
+:- instance goal_annotation_add_instmap(coverage_info, 
+            coverage_and_instmap_info) where [
+        add_instmap(InstMap, Coverage, 
+            coverage_and_instmap_info(Coverage, InstMap))
+    ].
+
 :- pred build_candidate_par_conjunction_maps(string_proc_label::in,
     var_table::in, candidate_par_conjunction(pard_goal_detail)::in, 
     candidate_par_conjunctions::in, candidate_par_conjunctions::out) is det.
@@ -640,123 +649,142 @@ build_candidate_par_conjunction_maps(Pro
     svmap.set(ProcLabel, CandidateProc, !Map).
 
 :- pred goal_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in, goal_rep(inst_map_info)::in, goal_path::in,
+    implicit_parallelism_info::in, goal_path::in,
+    pard_goal_detail::in, pard_goal_detail::out,
     list(candidate_par_conjunction(pard_goal_detail))::out,
     cord(message)::out) is det.
 
-goal_get_conjunctions_worth_parallelising(Info, Goal, GoalPath, Candidates,
+goal_get_conjunctions_worth_parallelising(Info, GoalPath, !Goal, Candidates,
         Messages) :-
-    Goal = goal_rep(GoalExpr, _, _),
+    GoalExpr0 = !.Goal ^ goal_expr_rep,
     (
         (
-            GoalExpr = conj_rep(Conjuncts),
-            conj_get_conjunctions_worth_parallelising(Info, 
-                Conjuncts, GoalPath, 1, CandidatesA, MessagesA),
-            conj_build_candidate_conjunctions(Info, Conjuncts,
-                GoalPath, MessagesB, CandidatesB),
-            Messages = MessagesA ++ MessagesB,
-            Candidates = CandidatesA ++ CandidatesB
-        ;
-            GoalExpr = disj_rep(Disjuncts),
-            disj_get_conjunctions_worth_parallelising(Info,
-                Disjuncts, GoalPath, 1, Candidates, Messages)
-        ;
-            GoalExpr = switch_rep(_, _, Cases),
-            switch_case_get_conjunctions_worth_parallelising(Info, Cases,
-                GoalPath, 1, Candidates, Messages)
-        ;
-            GoalExpr = ite_rep(Cond, Then, Else),
-            ite_get_conjunctions_worth_parallelising(Info, Cond, Then, Else,
-                GoalPath, Candidates, Messages)
+            GoalExpr0 = conj_rep(Conjs0),
+            map3_foldl(conj_get_conjunctions_worth_parallelising(Info,
+                    GoalPath), 
+                Conjs0, Conjs, Candidatess, Messagess, 1, _),
+            conj_build_candidate_conjunctions(Info, GoalPath, Conjs,
+                Cost, MessagesB, MaybeCandidate),
+            GoalExpr = conj_rep(Conjs),
+            Messages = cord_list_to_cord(Messagess) ++ MessagesB,
+            (
+                MaybeCandidate = yes(Candidate),
+                Candidates = [Candidate | condense(Candidatess)]
+            ;
+                MaybeCandidate = no,
+                Candidates = condense(Candidatess)
+            )
+        ;
+            GoalExpr0 = disj_rep(Disjs0),
+            map3_foldl(disj_get_conjunctions_worth_parallelising(Info, 
+                    GoalPath),
+                Disjs0, Disjs, Candidatess, Messagess, 1, _),
+            disj_calc_cost(Disjs, Cost),
+            GoalExpr = disj_rep(Disjs),
+            Messages = cord_list_to_cord(Messagess),
+            Candidates = condense(Candidatess)
+        ;
+            GoalExpr0 = switch_rep(Var, CanFail, Cases0),
+            map3_foldl(switch_case_get_conjunctions_worth_parallelising(Info, 
+                    GoalPath),
+                Cases0, Cases, Candidatess, Messagess, 1, _),
+            get_coverage_before_det(!.Goal ^ goal_annotation ^ pgd_coverage, 
+                CountBefore),
+            switch_calc_cost(Cases, CountBefore, Cost),
+            GoalExpr = switch_rep(Var, CanFail, Cases),
+            Messages = cord_list_to_cord(Messagess),
+            Candidates = condense(Candidatess)
         ;
-            GoalExpr = scope_rep(SubGoal, MaybeCut),
+            GoalExpr0 = ite_rep(Cond0, Then0, Else0),
+            ite_get_conjunctions_worth_parallelising(Info, GoalPath,
+                Cond0, Cond, Then0, Then, Else0, Else, Candidates, Messages),
+            ite_calc_cost(Cond, Then, Else, Cost),
+            GoalExpr = ite_rep(Cond, Then, Else)
+        ;
+            GoalExpr0 = scope_rep(SubGoal0, MaybeCut),
             ScopeGoalPath = 
                 goal_path_add_at_end(GoalPath, step_scope(MaybeCut)),
-            goal_get_conjunctions_worth_parallelising(Info, SubGoal,
-                ScopeGoalPath, Candidates, Messages) 
+            goal_get_conjunctions_worth_parallelising(Info, ScopeGoalPath,
+                SubGoal0, SubGoal, Candidates, Messages),
+            Cost = SubGoal ^ goal_annotation ^ pgd_cost,
+            GoalExpr = scope_rep(SubGoal, MaybeCut)
         ;
-            GoalExpr = negation_rep(SubGoal),
+            GoalExpr0 = negation_rep(SubGoal0),
             NegGoalPath = goal_path_add_at_end(GoalPath, step_neg),
-            goal_get_conjunctions_worth_parallelising(Info, SubGoal, 
-                NegGoalPath, Candidates, Messages) 
-        )
+            goal_get_conjunctions_worth_parallelising(Info, NegGoalPath,
+                SubGoal0, SubGoal, Candidates, Messages),
+            Cost = SubGoal ^ goal_annotation ^ pgd_cost,
+            GoalExpr = negation_rep(SubGoal)
+        ),
+        !Goal ^ goal_annotation ^ pgd_cost := Cost
     ;
-        GoalExpr = atomic_goal_rep(_, _, _, _),
+        GoalExpr0 = atomic_goal_rep(_, _, _, _),
         Messages = cord.empty,
-        Candidates = []
-    ).
+        Candidates = [],
+        GoalExpr = GoalExpr0
+    ),
+    !Goal ^ goal_expr_rep := GoalExpr.
 
 :- pred conj_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in, list(goal_rep(inst_map_info))::in,
-    goal_path::in, int::in,
+    implicit_parallelism_info::in, goal_path::in, 
+    pard_goal_detail::in, pard_goal_detail::out,
     list(candidate_par_conjunction(pard_goal_detail))::out,
-    cord(message)::out) is det.
+    cord(message)::out, int::in, int::out) is det.
 
-conj_get_conjunctions_worth_parallelising(_, [], _, _, [], cord.empty).
-conj_get_conjunctions_worth_parallelising(Info, [Conj | Conjs], GoalPath,
-        ConjunctNum, Candidates, Messages) :-
-    ConjGoalPath = goal_path_add_at_end(GoalPath, step_conj(ConjunctNum)),
-    goal_get_conjunctions_worth_parallelising(Info, Conj, ConjGoalPath,
-        CandidatesHead, MessagesHead), 
-    
-    conj_get_conjunctions_worth_parallelising(Info, Conjs, GoalPath,
-        ConjunctNum+1, CandidatesTail, MessagesTail),
-
-    Candidates = CandidatesHead ++ CandidatesTail,
-    Messages = MessagesHead ++ MessagesTail.
+conj_get_conjunctions_worth_parallelising(Info, GoalPath, !Conj, Candidates,
+        Messages, !ConjNum) :-
+    ConjGoalPath = goal_path_add_at_end(GoalPath, step_conj(!.ConjNum)),
+    goal_get_conjunctions_worth_parallelising(Info, ConjGoalPath, !Conj,
+        Candidates, Messages),
+    !:ConjNum = !.ConjNum + 1.
 
 :- pred disj_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in, list(goal_rep(inst_map_info))::in,
-    goal_path::in, int::in,
-    list(candidate_par_conjunction(pard_goal_detail))::out, cord(message)::out) 
-    is det.
+    implicit_parallelism_info::in, goal_path::in, 
+    pard_goal_detail::in, pard_goal_detail::out,
+    list(candidate_par_conjunction(pard_goal_detail))::out, 
+    cord(message)::out, int::in, int::out) is det.
 
-disj_get_conjunctions_worth_parallelising(_, [], _, _, [], cord.empty).
-disj_get_conjunctions_worth_parallelising(Info, [Disj | Disjs], GoalPath, DisjNum,
-        Candidates, Messages) :-
-    DisjGoalPath = goal_path_add_at_end(GoalPath, step_disj(DisjNum)),
-    goal_get_conjunctions_worth_parallelising(Info, Disj, DisjGoalPath,
-        HeadCandidates, HeadMessages),
-    disj_get_conjunctions_worth_parallelising(Info, Disjs, GoalPath, 
-        DisjNum + 1, TailCandidates, TailMessages),
-    Candidates = HeadCandidates ++ TailCandidates,
-    Messages = HeadMessages ++ TailMessages.
+disj_get_conjunctions_worth_parallelising(Info, GoalPath, !Disj, Candidates,
+        Messages, !DisjNum) :-
+    DisjGoalPath = goal_path_add_at_end(GoalPath, step_disj(!.DisjNum)),
+    goal_get_conjunctions_worth_parallelising(Info, DisjGoalPath, !Disj,
+        Candidates, Messages),
+    !:DisjNum = !.DisjNum + 1.
 
 :- pred switch_case_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in, list(case_rep(inst_map_info))::in,
-    goal_path::in, int::in,
+    implicit_parallelism_info::in, goal_path::in,
+    case_rep(pard_goal_detail_annotation)::in,
+    case_rep(pard_goal_detail_annotation)::out,
     list(candidate_par_conjunction(pard_goal_detail))::out, 
-    cord(message)::out) is det.
+    cord(message)::out, int::in, int::out) is det.
 
-switch_case_get_conjunctions_worth_parallelising(_, [], _, _, [],
-        cord.empty).
-switch_case_get_conjunctions_worth_parallelising(Info, [Case | Cases], GoalPath,
-        CaseNum, Candidates, Messages) :-
-    Case = case_rep(_, _, Goal),
-    CaseGoalPath = goal_path_add_at_end(GoalPath, step_switch(CaseNum, no)),
-    goal_get_conjunctions_worth_parallelising(Info, Goal, CaseGoalPath,
-        HeadCandidates, HeadMessages),
-    switch_case_get_conjunctions_worth_parallelising(Info, Cases, GoalPath, 
-        CaseNum + 1, TailCandidates, TailMessages),
-    Candidates = HeadCandidates ++ TailCandidates,
-    Messages = HeadMessages ++ TailMessages.
+switch_case_get_conjunctions_worth_parallelising(Info, GoalPath, !Case, 
+        Candidates, Messages, !CaseNum) :-
+    Goal0 = !.Case ^ cr_case_goal,
+    CaseGoalPath = goal_path_add_at_end(GoalPath, step_switch(!.CaseNum, no)),
+    goal_get_conjunctions_worth_parallelising(Info, CaseGoalPath, Goal0, Goal,
+        Candidates, Messages),
+    !Case ^ cr_case_goal := Goal,
+    !:CaseNum = !.CaseNum + 1.
 
 :- pred ite_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in, goal_rep(inst_map_info)::in,
-    goal_rep(inst_map_info)::in, goal_rep(inst_map_info)::in, goal_path::in,
+    implicit_parallelism_info::in,  goal_path::in,
+    pard_goal_detail::in, pard_goal_detail::out, 
+    pard_goal_detail::in, pard_goal_detail::out, 
+    pard_goal_detail::in, pard_goal_detail::out, 
     list(candidate_par_conjunction(pard_goal_detail))::out, cord(message)::out)
     is det.
 
-ite_get_conjunctions_worth_parallelising(Info, Cond, Then, Else, GoalPath,
+ite_get_conjunctions_worth_parallelising(Info, GoalPath, !Cond, !Then, !Else, 
         Candidates, Messages) :-
     CondGoalPath = goal_path_add_at_end(GoalPath, step_ite_cond),
     ThenGoalPath = goal_path_add_at_end(GoalPath, step_ite_then),
     ElseGoalPath = goal_path_add_at_end(GoalPath, step_ite_else),
-    goal_get_conjunctions_worth_parallelising(Info, Cond, CondGoalPath, 
+    goal_get_conjunctions_worth_parallelising(Info, CondGoalPath, !Cond,
         CondCandidates, CondMessages),
-    goal_get_conjunctions_worth_parallelising(Info, Then, ThenGoalPath,
+    goal_get_conjunctions_worth_parallelising(Info, ThenGoalPath, !Then,
         ThenCandidates, ThenMessages),
-    goal_get_conjunctions_worth_parallelising(Info, Else, ElseGoalPath,
+    goal_get_conjunctions_worth_parallelising(Info, ElseGoalPath, !Else,
         ElseCandidates, ElseMessages),
     Candidates = CondCandidates ++ ThenCandidates ++ ElseCandidates,
     Messages = CondMessages ++ ThenMessages ++ ElseMessages.
@@ -765,142 +793,93 @@ ite_get_conjunctions_worth_parallelising
     % of calls we've found and make any parallelisation decisions.
     %
 :- pred conj_build_candidate_conjunctions(implicit_parallelism_info::in,
-    list(goal_rep(inst_map_info))::in, goal_path::in,
-    cord(message)::out, 
-    list(candidate_par_conjunction(pard_goal_detail))::out) is det.
+    goal_path::in, list(pard_goal_detail)::in, goal_cost_csq::out,
+    cord(message)::out, maybe(candidate_par_conjunction(pard_goal_detail))::out)
+    is det.
 
-conj_build_candidate_conjunctions(Info, Conjs, GoalPath, Messages, 
-        Candidates) :-
+conj_build_candidate_conjunctions(Info, GoalPath, Conjs, Cost, Messages, 
+        MaybeCandidate) :-
     ProcLabel = Info ^ ipi_proc_label,
     Location = goal(ProcLabel, GoalPath),
     some [!Messages] 
     (
         !:Messages = cord.empty,
 
-        map_foldl2(goal_to_pard_goal(Info, GoalPath, 
-            ( func(Num) = step_conj(Num) ) ), Conjs, PardGoals, 1, _,
-            !Messages),
-        foldl(count_costly_calls, PardGoals, 0, NumCostlyCalls),
-        ( NumCostlyCalls > 1 -> 
+        % Preprocess the conjunction to find the costly goals and where they
+        % are.
+        foldl2(identify_costly_goals, Conjs, 1, _,
+            no_costly_goals, CostlyGoalsInfo), 
+        (
+            ( CostlyGoalsInfo = no_costly_goals
+            ; CostlyGoalsInfo = one_costly_goal(_)
+            ),
+            conj_calc_cost(Conjs, Cost),
+            MaybeCandidate = no
+        ;
+            CostlyGoalsInfo = many_costly_goals(_, _, NumCostlyCalls),
+            
             append_message(Location,
                 info_found_conjs_above_callsite_threshold(NumCostlyCalls),
                 !Messages), 
-            % We don't parallelise across non-atomic goals, so split a list
-            % of pard goals into partitions where non-atomic goals separate
-            % the partitions.
-            partition_pard_goals(Location, PardGoals, [], _, 
-                1, _NumPartitions, 0, _, [], PartitionedGoals, !Messages),
-            map(pardgoals_build_candidate_conjunction(Info, Location,
-                    GoalPath), 
-                PartitionedGoals, MaybeCandidates),
-            filter_map(maybe_is_yes, MaybeCandidates, Candidates),
+
+            pardgoals_build_candidate_conjunction(Info, Location, GoalPath, 
+                Conjs, MaybeCandidate),
+            (
+                MaybeCandidate = yes(Candidate),
             append_message(Location,
-                info_found_n_conjunctions_with_positive_speedup(
-                    length(Candidates)), !Messages)
+                    info_found_n_conjunctions_with_positive_speedup(1), 
+                    !Messages),
+                ExecMetrics = Candidate ^ cpc_par_exec_metrics,
+                Cost = call_goal_cost(ExecMetrics ^ pem_num_calls,
+                    ExecMetrics ^ pem_par_time)
         ;
-            Candidates = []
+                MaybeCandidate = no,
+                conj_calc_cost(Conjs, Cost)
+            )
         ),
         Messages = !.Messages
     ).
 
-:- pred count_costly_calls(pard_goal_detail::in, int::in, int::out) is det.
-
-count_costly_calls(Goal, !NumCostlyCalls) :-
-    identify_costly_call(Goal, Costly),
-    (
-        Costly = is_costly_goal,
-        !:NumCostlyCalls = !.NumCostlyCalls + 1
-    ;
-        Costly = is_not_costly_goal
-    ;
-        Costly = is_non_atomic_goal
+:- type costly_goals_info
+    --->    no_costly_goals
+    ;       one_costly_goal(
+                ocg_index               :: int
+            )
+    ;       many_costly_goals(
+                ocg_first_index         :: int,
+                ocg_last_index          :: int,
+                ocg_num_goals           :: int
     ).
 
-:- pred partition_pard_goals(program_location::in, 
-    list(pard_goal_detail)::in,
-    list(pard_goal_detail)::in, list(pard_goal_detail)::out,
-    int::in, int::out, int::in, int::out,
-    list(pard_goals_partition)::in, list(pard_goals_partition)::out,
-    cord(message)::in, cord(message)::out) is det.
+:- pred identify_costly_goals(pard_goal_detail::in, int::in, int::out, 
+    costly_goals_info::in, costly_goals_info::out) is det.
 
-partition_pard_goals(Location, [], !Partition, !PartitionNum, !NumCostlyCalls,
-        !Partitions, !Messages) :-
-    ( !.NumCostlyCalls > 1 ->
-        partition_pard_goals_build_partition(!.Partition, !.PartitionNum,
-            Partition),
-        !:Partitions = [ Partition | !.Partitions ]
-    ;
-        true     
-    ),
-    ( !.PartitionNum \= 1 ->
-        append_message(Location,
-            info_split_conjunction_into_partitions(!.PartitionNum), !Messages)
-    ;
-        true
-    ),
-    !:Partition = [],
-    reverse(!Partitions).
-partition_pard_goals(Location, [ PG | PGs ], !Partition, !PartitionNum,
-        !NumCostlyCalls, !Partitions, !Messages) :-
-    PGType = PG ^ goal_annotation ^ pgd_pg_type,
-    (
+identify_costly_goals(Goal, !Index, !CostlyGoalsInfo) :-
+    identify_costly_goal(Goal, Costly),
         (
-            PGType = pgt_call(_, CostAboveThreshold, _, _),
-            (
-                CostAboveThreshold = cost_above_par_threshold,
-                !:NumCostlyCalls = !.NumCostlyCalls + 1
-            ;
-                CostAboveThreshold = cost_not_above_par_threshold
-            )
-        ;
-            PGType = pgt_other_atomic_goal
+        ( Costly = is_costly_atomic_goal
+        ; Costly = is_costly_compound_goal
         ),
-        !:Partition = [ PG | !.Partition ]
-    ;
-        PGType = pgt_non_atomic_goal,
-        ( !.NumCostlyCalls > 1 ->
-            partition_pard_goals_build_partition(!.Partition, !.PartitionNum,
-                Partition),
-            !:Partitions = [ Partition | !.Partitions ]
-        ;
-            append_message(Location,
-                notice_partition_does_not_have_costly_calls(!.PartitionNum,
-                    !.NumCostlyCalls), !Messages)
-        ),
-        !:PartitionNum = !.PartitionNum + 1,
-        !:NumCostlyCalls = 0,
-        !:Partition = [] 
-    ),
-    partition_pard_goals(Location, PGs, !Partition, !PartitionNum,
-        !NumCostlyCalls, !Partitions, !Messages).
-
-:- pred partition_pard_goals_build_partition(list(pard_goal_detail)::in,
-    int::in, pard_goals_partition::out) is det.
-
-partition_pard_goals_build_partition(RevGoals, PartitionNum, Partition) :-
-    reverse(RevGoals, Goals),
     (
-        Goals = [FirstGoal | _],
-        FirstGoalPath = FirstGoal ^ goal_annotation ^ pgd_original_path,
-        (
-            step_conj(ConjNumPrime) = goal_path_get_last(FirstGoalPath)
-        ->
-            ConjNum = ConjNumPrime
+            !.CostlyGoalsInfo = no_costly_goals,
+            !:CostlyGoalsInfo = one_costly_goal(!.Index)
         ;
-            error(this_file ++ "Expected goal to be part of a conjunction")
+            !.CostlyGoalsInfo = one_costly_goal(FirstIndex),
+            !:CostlyGoalsInfo = many_costly_goals(FirstIndex, !.Index, 2)
+        ;
+            !.CostlyGoalsInfo = many_costly_goals(FirstIndex, _, Num),
+            !:CostlyGoalsInfo = many_costly_goals(FirstIndex, !.Index, Num+1)
         )
     ;
-        Goals = [],
-        error(this_file ++ "Trying to build empty goal partition")
+        Costly = is_not_costly_goal
     ),
-    Partition = 
-        pard_goals_partition(Goals, PartitionNum, ConjNum).
+    !:Index = !.Index + 1.
 
 :- pred pardgoals_build_candidate_conjunction(implicit_parallelism_info::in,
-    program_location::in, goal_path::in, pard_goals_partition::in,
+    program_location::in, goal_path::in, list(pard_goal_detail)::in,
     maybe(candidate_par_conjunction(pard_goal_detail))::out) is det.
 
-pardgoals_build_candidate_conjunction(Info, Location, GoalPath, GoalsPartition,
+pardgoals_build_candidate_conjunction(Info, Location, GoalPath, Goals,
         MaybeCandidate) :-
     % Setting up the first parallel conjunct is a different algorithm to the
     % latter ones, at this point we have the option of moving goals from before
@@ -909,15 +888,14 @@ pardgoals_build_candidate_conjunction(In
     % efficient.  However if goals within other parallel conjuncts depend on
     % them and don't depend upon the first costly call then this would make the
     % conjunction dependent when it could be independent.
-    pard_goals_partition(Goals, PartNum, FirstConjNum) = GoalsPartition,
-    find_best_parallelisation(Info, Location, PartNum, Goals,
-        BestParallelisation),
+    find_best_parallelisation(Info, Location, Goals, BestParallelisation),
+    FirstConjNum = 1,
     ParalleliseDepConjs = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs,
     BestParallelisation = bp_parallel_execution(GoalsBefore, ParConjs,
         GoalsAfter, IsDependent, Metrics),
     Speedup = parallel_exec_metrics_get_speedup(Metrics),
     Candidate = candidate_par_conjunction(goal_path_to_string(GoalPath),
-        PartNum, FirstConjNum, IsDependent, GoalsBefore, ParConjs, GoalsAfter,
+        FirstConjNum, IsDependent, GoalsBefore, ParConjs, GoalsAfter,
         Metrics),
     (
         Speedup > 1.0,
@@ -967,11 +945,10 @@ pardgoals_build_candidate_conjunction(In
             ).
 
 :- pred find_best_parallelisation(implicit_parallelism_info::in, 
-    program_location::in, int::in, 
-    list(pard_goal_detail)::in, best_parallelisation::out) is det.
+    program_location::in, list(pard_goal_detail)::in, 
+    best_parallelisation::out) is det.
 
-find_best_parallelisation(Info, Location, PartNum, Goals,
-        BestParallelisation) :-
+find_best_parallelisation(Info, Location, Goals, BestParallelisation) :-
     % Decide which algorithm to use.
     ConjunctionSize = length(Goals),
     choose_algorithm(Info, ConjunctionSize, Algorithm),
@@ -979,11 +956,11 @@ find_best_parallelisation(Info, Location
     preprocess_conjunction(Goals, Algorithm, PreprocessedGoals),
     (
         Algorithm = bpa_complete_bnb(_),
-        find_best_parallelisation_complete_bnb(Info, Location, PartNum,
+        find_best_parallelisation_complete_bnb(Info, Location, 
             PreprocessedGoals, BestParallelisation)
     ;
         Algorithm = bpa_greedy,
-        find_best_parallelisation_greedy(Info, Location, PartNum,
+        find_best_parallelisation_greedy(Info, Location, 
             PreprocessedGoals, BestParallelisation)
     ).
 
@@ -1072,8 +1049,8 @@ new_group([G | Gs], P) = GoalGroup :-
                     list(pard_goal_detail),
 
                 gfp_dependency_graphs       :: dependency_graphs,
-                gfp_costly_call_indexes     :: list(int),
-                gfp_num_calls               :: float
+                gfp_costly_goal_indexes     :: list(int),
+                gfp_num_calls               :: int
             ).
 
 :- inst goals_for_parallelisation
@@ -1090,50 +1067,50 @@ preprocess_conjunction(Goals0, Algorithm
     % Phase 1: Build a dependency map.
     build_dependency_graphs(Goals0, DependencyGraphs),
     % Phase 2: Find the costly calls.
-    identify_costly_calls(Goals0, Algorithm, 1, GoalGroups,
-        CostlyCallsIndexes),
+    preprocess_conjunction_into_groups(Goals0, Algorithm, 1, GoalGroups,
+        CostlyGoalsIndexes),
 
     % Get the number of calls into this conjunction.
     (
-        CostlyCallsIndexes = [FirstCostlyCallIndex | _],
-        list.index1(Goals0, FirstCostlyCallIndex, FirstCostlyCall),
-        GoalType = FirstCostlyCall ^ goal_annotation ^ pgd_pg_type,
-        GoalType = pgt_call(_, _, _, CallSite)
+        CostlyGoalsIndexes = [FirstCostlyGoalIndex | _],
+        list.index1(Goals0, FirstCostlyGoalIndex, FirstCostlyGoal)
     ->
-        NumCalls = cs_cost_get_calls(CallSite ^ cac_cost)
+        Cost = FirstCostlyGoal ^ goal_annotation ^ pgd_cost,
+        NumCalls = goal_cost_get_calls(Cost)
     ;
         error(this_file ++ "Expected call goal")
     ),
 
     GoalsForParallelisation = goals_for_parallelisation(GoalGroups,
-        Goals0, DependencyGraphs, CostlyCallsIndexes, NumCalls).
+        Goals0, DependencyGraphs, CostlyGoalsIndexes, NumCalls).
 
-    % identify_costly_calls(Goals, 1, GoalGroups, SortedCostlyIndexes).
+    % preprocess_conjunction_into_groups(Goals, 1, GoalGroups, SortedCostlyIndexes).
     %
     % GoalGroups are Goals divided into groups of single costly calls and
     % multiple goals in-between these calls.  SortedCostlyIndexes are the
     % indexes of the costly calls in the original list (starting at 1).  This
     % predicate is undefined if any of the goals in Goals are non-atomic.
     %
-:- pred identify_costly_calls(list(pard_goal_detail)::in,
+:- pred preprocess_conjunction_into_groups(list(pard_goal_detail)::in,
     best_par_algorithm::in, int::in,
     list(goal_group(goal_classification))::out, list(int)::out) is det.
 
-identify_costly_calls([], _, _, [], []).
-identify_costly_calls([Goal | Goals], Alg, Index, GoalGroups, Indexes) :-
-    identify_costly_calls(Goals, Alg, Index+1, GoalGroups0, Indexes0),
-    identify_costly_call(Goal, Costly),
+preprocess_conjunction_into_groups([], _, _, [], []).
+preprocess_conjunction_into_groups([Goal | Goals], Alg, Index, GoalGroups,
+        Indexes) :-
+    preprocess_conjunction_into_groups(Goals, Alg, Index+1, GoalGroups0,
+        Indexes0),
+    identify_costly_goal(Goal, Costly),
     (
-        Costly = is_costly_goal,
+        ( Costly = is_costly_atomic_goal
+        ; Costly = is_costly_compound_goal
+        ),
         GoalClassification = gc_costly_goals,
         Indexes = [Index | Indexes0]
     ;
         Costly = is_not_costly_goal,
         GoalClassification = gc_cheap_goals,
         Indexes = Indexes0
-    ;
-        Costly = is_non_atomic_goal,
-        error(this_file ++ "Unexpected pgt_non_atomic_goal")
     ),
     (
         Alg = bpa_greedy,
@@ -1176,7 +1153,8 @@ start_building_parallelisation(Info, Num
     SparkCost = Info ^ ipi_opts ^ cpcp_sparking_cost,
     SparkDelay = Info ^ ipi_opts ^ cpcp_sparking_delay,
     ContextWakeupDelay = Info ^ ipi_opts ^ cpcp_context_wakeup_delay,
-    foldl(pardgoal_calc_cost, GoalsBefore, 0.0, CostBefore),
+    conj_calc_cost(GoalsBefore, CostBefore0),
+    CostBefore = goal_cost_get_percall(CostBefore0),
     Metrics = init_empty_parallel_exec_metrics(CostBefore, NumCalls, 
         float(SparkCost), float(SparkDelay), float(ContextWakeupDelay)),
     Overlap = peo_empty_conjunct, 
@@ -1192,7 +1170,8 @@ start_building_parallelisation(Info, Num
 finalise_parallelisation(GoalsAfter, !Parallelisation) :-
     !.Parallelisation = incomplete_parallelisation(GoalsBefore, Conjuncts,
         Overlap, Metrics0, _),
-    foldl(pardgoal_calc_cost, GoalsAfter, 0.0, CostAfter),
+    conj_calc_cost(GoalsAfter, CostAfter0),
+    CostAfter = goal_cost_get_percall(CostAfter0),
     Metrics = finalise_parallel_exec_metrics(Metrics0, CostAfter),
     par_conj_overlap_is_dependent(Overlap, IsDependent),
     !:Parallelisation = bp_parallel_execution(GoalsBefore, Conjuncts,
@@ -1203,10 +1182,10 @@ finalise_parallelisation(GoalsAfter, !Pa
     % Find the best parallelisation using the branch and bound algorithm.
     %
 :- pred find_best_parallelisation_complete_bnb(implicit_parallelism_info::in,
-    program_location::in, int::in, goals_for_parallelisation::in,
+    program_location::in, goals_for_parallelisation::in,
     best_parallelisation::out) is det.
 
-find_best_parallelisation_complete_bnb(Info, Location, PartNum,
+find_best_parallelisation_complete_bnb(Info, Location, 
         PreprocessedGoals, BestParallelisation) :-
     PreprocessedGoals = goals_for_parallelisation(GoalGroups, _, 
         DependencyMaps, CostlyCallsIndexes, NumCalls),
@@ -1217,8 +1196,8 @@ find_best_parallelisation_complete_bnb(I
     ),
     
     branch_and_bound(
-        generate_parallelisations(Info, Location, PartNum, LastCostlyCallIndex, 
-            round_to_int(NumCalls), DependencyMaps, GoalGroups),
+        generate_parallelisations(Info, Location, LastCostlyCallIndex, 
+            NumCalls, DependencyMaps, GoalGroups),
         parallelisation_get_objective_value,
         Solutions, Profile),
     
@@ -1253,7 +1232,7 @@ find_best_parallelisation_complete_bnb(I
             % used for guided parallelisation.
             TempInfo = Info ^ ipi_opts ^ cpcp_parallelise_dep_conjs := 
                 parallelise_dep_conjs_overlap,
-            find_best_parallelisation_complete_bnb(TempInfo, Location, PartNum,
+            find_best_parallelisation_complete_bnb(TempInfo, Location,
                 PreprocessedGoals, BestParallelisation)
         )
     ).
@@ -1269,11 +1248,11 @@ parallelisation_get_objective_value(Para
     Value = Metrics ^ pem_par_time + Metrics ^ pem_par_overheads * 2.0.
 
 :- semipure pred generate_parallelisations(implicit_parallelism_info::in,
-    program_location::in, int::in, int::in, int::in, dependency_graphs::in,
+    program_location::in, int::in, int::in, dependency_graphs::in,
     list(goal_group(goal_classification))::in, 
     bnb_state(best_parallelisation)::in, best_parallelisation::out) is nondet.
 
-generate_parallelisations(Info, _Location, _PartNum, LastCostlyCallIndex,
+generate_parallelisations(Info, _Location, LastCostlyCallIndex,
         NumCalls, DependencyMaps, !.GoalGroups, BNBState, 
         BestParallelisation) :-
     some [!GoalNum, !Parallelisation] (
@@ -1424,11 +1403,11 @@ generate_parallel_conjunct(Goals, !GoalN
     % conjunction.
     %
 :- pred find_best_parallelisation_greedy(implicit_parallelism_info::in,
-    program_location::in, int::in, 
+    program_location::in, 
     goals_for_parallelisation::in(goals_for_parallelisation), 
     best_parallelisation::out) is det.
 
-find_best_parallelisation_greedy(Info, _Location, _PartNum,
+find_best_parallelisation_greedy(Info, _Location, 
         PreprocessedGoals, !:Parallelisation) :-
     some [!GoalGroups, !ConjNum] (
         PreprocessedGoals = goals_for_parallelisation(!:GoalGroups, _,
@@ -1453,8 +1432,8 @@ find_best_parallelisation_greedy(Info, _
             !:GoalGroups = [ FirstGroup | !.GoalGroups ],
             GoalsBeforeConj = []
         ),
-        start_building_parallelisation(Info, round_to_int(NumCalls), 
-            GoalsBeforeConj, !:Parallelisation),
+        start_building_parallelisation(Info, NumCalls, GoalsBeforeConj,
+            !:Parallelisation),
 
         build_parallel_conjuncts_greedy(Info, DependencyMaps,
             CostlyCallIndexes, 0, [], [], LastParConj, !ConjNum,
@@ -1820,7 +1799,8 @@ calculate_parallel_cost_step(Info, IsInn
    
     ProductionsMap0 = !.Parallelisation ^ ip_productions_map,
 
-    foldl(pardgoal_calc_cost, Goals, 0.0, CostB),
+    conj_calc_cost(Goals, CostB0),
+    CostB = goal_cost_get_percall(CostB0),
     foldl(pardgoal_consumed_vars_accum, Goals, set.init,
         RightConsumedVars),
     ProducedVars = 
@@ -2000,32 +1980,6 @@ par_conj_overlap_is_dependent(peo_conjun
         )
     ).
 
-:- pred pardgoal_calc_cost(pard_goal_detail::in, float::in, float::out) 
-    is det.
-
-pardgoal_calc_cost(Goal, !Cost) :-
-    GoalType = Goal ^ goal_annotation ^ pgd_pg_type,
-    (
-        GoalType = pgt_call(Cost, _, _, _),
-        ( cs_cost_get_calls(Cost) > 0.0 ->
-            !:Cost = !.Cost + cs_cost_get_percall(Cost)
-        ;
-            % Goals that are never called have no cost
-            true
-        )
-    ;
-        GoalType = pgt_other_atomic_goal,
-        % Atomic goals are usually trivial but for the purposes of calculating
-        % the overlap of dependent conjunctions we'd like variable
-        % production/consumption information to be in order even among atomic
-        % goals.  Therefore atomic goals have a cost of 1.0.  This must be
-        % included here so we can compare costs properly.
-        !:Cost = !.Cost + 1.0
-    ;
-        GoalType = pgt_non_atomic_goal,
-        error(this_file ++ "unexpected non atomic goal")
-    ).
-
 :- type dependency_graphs
     ---> dependency_graphs(
             dm_forward              :: digraph(int),
@@ -2124,7 +2078,7 @@ get_productions_map(Goal, !Time, !Execut
     BoundVars = InstMapInfo ^ im_bound_vars,
     adjust_time_for_waits(!Time, !Executions),
     fold(var_production_time_to_map(!.Time, Goal), BoundVars, !Map),
-    pardgoal_calc_cost(Goal, !Time).
+    !:Time = !.Time + goal_cost_get_percall(Goal ^ goal_annotation ^ pgd_cost).
 
 :- pred adjust_time_for_waits(float::in, float::out, 
     assoc_list(float, float)::in, assoc_list(float, float)::out) is det.
@@ -2133,10 +2087,10 @@ adjust_time_for_waits(!Time, !Executions
     (
         !.Executions = [ Execution | NextExecution ],
         ( Start - End ) = Execution,
-        ( !.Time < Start ->
+        ( (!.Time + adjust_time_for_waits_epsilon) < Start ->
             error("adjust_time_for_waits: " ++
                 "Time occurs before the current execution")
-        ; !.Time < End ->
+        ; !.Time =< (End + adjust_time_for_waits_epsilon) ->
             % The production is within the current execution, no adjustment is
             % necessary.
             true
@@ -2161,13 +2115,11 @@ adjust_time_for_waits_2(LastEnd, !Time, 
         % Do the adjustment.
         !:Time = !.Time + (Start - LastEnd),
 
-% This has been commented out as it is easily triggered by floating point
-% rounding errors and I don't know enough about ieee754 to fix it.
-%        ( !.Time < Start ->
-%            error(format("adjust_time_for_waits: Adjustment didn't work, " ++
-%                "time occurs before the current execution. " ++
-%                "Time: %f, Start: %f.", [f(!.Time), f(Start)]))
-        ( !.Time < End ->
+        ( (!.Time + adjust_time_for_waits_epsilon) < Start ->
+            error(format("adjust_time_for_waits: Adjustment didn't work, " ++
+                "time occurs before the current execution. " ++
+                "Time: %f, Start: %f.", [f(!.Time), f(Start)]))
+        ; !.Time =< (End + adjust_time_for_waits_epsilon) ->
             % The adjustment worked.
             true
         ;
@@ -2180,6 +2132,10 @@ adjust_time_for_waits_2(LastEnd, !Time, 
         error("adjust_time_for_waits: Ran out of executions")
     ).
 
+:- func adjust_time_for_waits_epsilon = float.
+
+adjust_time_for_waits_epsilon = 0.0001.
+
     % var_production_time_to_map(TimeBefore, Goal, Var, !Map).
     %
     % Find the latest production time of Var in Goal, and add TimeBefore + the
@@ -2190,18 +2146,8 @@ adjust_time_for_waits_2(LastEnd, !Time, 
     var_rep::in, map(var_rep, float)::in, map(var_rep, float)::out) is det.
 
 var_production_time_to_map(TimeBefore, Goal, Var, !Map) :-
-    solutions(var_first_use_time(find_production, TimeBefore, Goal, Var), 
-        Times),
-    (
-        Times = [Time],
-        % A production can only occur once in a call's arguments, therefore
-        % there is only one solution here.
-        svmap.det_insert(Var, Time, !Map)
-    ;
-        Times = [_, _ | _],
-        error(this_file ++ 
-            "Too many solutions for var_first_use_time for a production")
-    ).
+    var_first_use_time(find_production, TimeBefore, Goal, Var, Time),
+    svmap.det_insert(Var, Time, !Map).
 
     % foldl(get_consumptions_list(Vars), Goals, 0.0, _, [], RevConsumptions),
     %
@@ -2228,21 +2174,13 @@ get_consumptions_list(Goal, !Vars, !Time
         ), ConsumptionTimes0, ConsumptionTimes),
     !:List = ConsumptionTimes ++ !.List,
     !:Vars = difference(!.Vars, ConsumptionVars),
-    pardgoal_calc_cost(Goal, !Time).
+    !:Time = !.Time + goal_cost_get_percall(Goal ^ goal_annotation ^ pgd_cost).
 
 :- pred var_consumptions(float::in, pard_goal_detail::in, var_rep::in,
     pair.pair(var_rep, float)::out) is det.
 
 var_consumptions(TimeBefore, Goal, Var, Var - Time) :-
-    % This will only have multiple solutions where one variable appears a list
-    % of call arguments more than once.
-    solutions(var_first_use_time(find_consumption, TimeBefore, Goal, Var),
-        Times),
-    % The earliest consumption is the consumption that matters.  solutions/2
-    % returns the solutions in ascending sorted order so the first one will be
-    % the earliest one.
-    Times = [FirstTime | OtherTimes],
-    Time = foldl(float.min, OtherTimes, FirstTime).
+    var_first_use_time(find_consumption, TimeBefore, Goal, Var, Time).
 
 :- type find_production_or_consumption
     --->    find_production
@@ -2256,17 +2194,17 @@ var_consumptions(TimeBefore, Goal, Var, 
     %   Time is Time0 + the time that Goal first consumes Var.
     %
 :- pred var_first_use_time(find_production_or_consumption::in, 
-    float::in, pard_goal_detail::in, var_rep::in, float::out) is multi.
+    float::in, pard_goal_detail::in, var_rep::in, float::out) is det.
 
 var_first_use_time(FindProdOrCons, TimeBefore, Goal, Var, Time) :-
-    GoalType = Goal ^ goal_annotation ^ pgd_pg_type,
     (
-        GoalType = pgt_call(Cost, _, Args, _),
-        (
-            member(Arg, Args),
-            Arg = var_mode_and_use(Var, _, LazyUse)
-        ->
-            CostPercall = cs_cost_get_percall(Cost),
+        FindProdOrCons = find_production,
+        Map = Goal ^ goal_annotation ^ pgd_var_production_map
+    ;
+        FindProdOrCons = find_consumption,
+        Map = Goal ^ goal_annotation ^ pgd_var_consumption_map
+    ),
+    map.lookup(Map, Var, LazyUse),
             Use = force(LazyUse),
             UseType = Use ^ vui_use_type,
             (
@@ -2297,36 +2235,11 @@ var_first_use_time(FindProdOrCons, TimeB
                 % XXX: How often does this occur?
                 (
                     FindProdOrCons = find_production,
-                    UseTime = CostPercall
-                ;
-                    FindProdOrCons = find_consumption,
-                    UseTime = 0.0
-                )
-            )
-        ;
-            (
-                FindProdOrCons = find_production,
-                error("var_first_use_time: "
-                    ++ "Couldn't find var in arguments of call")
-            ;
-                FindProdOrCons = find_consumption,
-                % This must be a higher order call where the variable being
-                % consued is the higher order value.
-                UseTime = 0.0
-            )
-        )
-    ;
-        GoalType = pgt_other_atomic_goal,
-        (
-            FindProdOrCons = find_production,
-            UseTime = 1.0
+            UseTime = goal_cost_get_percall(Goal ^ goal_annotation ^ pgd_cost)
         ;
             FindProdOrCons = find_consumption,
             UseTime = 0.0
         )
-    ;
-        GoalType = pgt_non_atomic_goal,
-        error("Auto parallelisation over non-atomic goals NIY")
     ),
     Time = TimeBefore + UseTime.
 
@@ -2342,140 +2255,187 @@ pardgoal_consumed_vars_accum(Goal, !Vars
     % Check if it is appropriate to parallelise this call.  That is it must be
     % model_det and have a cost above the call site cost threshold.
     %
-:- pred can_parallelise_call(implicit_parallelism_info::in,
-    detism_rep::in, cs_cost_csq::in) is semidet.
+:- pred can_parallelise_goal(implicit_parallelism_info::in,
+    detism_rep::in, goal_cost_csq::in) is semidet.
 
-can_parallelise_call(Info, Detism, Cost) :-
+can_parallelise_goal(Info, Detism, Cost) :-
     ( Detism = det_rep
     ; Detism = cc_multidet_rep ),
-    ( cs_cost_get_calls(Cost) > 0.0 ->
-        % This is conditional so that we can gauretee that it never causes a
-        % divide by zero error,
-        PercallCost = cs_cost_get_percall(Cost),
-        PercallCost > float(Info ^ ipi_opts ^ cpcp_call_site_threshold)
-    ;
-        fail 
-    ).
+    goal_cost_get_calls(Cost) > 0,
+    PercallCost = goal_cost_get_percall(Cost),
+    PercallCost > float(Info ^ ipi_opts ^ cpcp_call_site_threshold).
 
-:- pred maybe_costly_call(implicit_parallelism_info::in, goal_path::in,
-    atomic_goal_rep::in, detism_rep::in, inst_map_info::in,
-    pard_goal_type::out(pgt_atomic_goal), cord(message)::out) is det.
+:- pred atomic_pard_goal_type(implicit_parallelism_info::in, goal_path::in,
+    atomic_goal_rep::in, inst_map_info::in, pard_goal_type::out, 
+    cord(message)::out) is det.
 
-maybe_costly_call(Info, GoalPath, AtomicGoal, Detism,
-        InstMapInfo, GoalType, !:Messages) :-
+atomic_pard_goal_type(Info, GoalPath, AtomicGoal, InstMapInfo, GoalType,
+        !:Messages) :-
     !:Messages = cord.empty,
     InstMapBefore = InstMapInfo ^ im_before,
     InstMapAfter = InstMapInfo ^ im_after,
+    atomic_goal_is_call(AtomicGoal, IsCall),
     (
-        ( AtomicGoal = unify_construct_rep(_, _, _)
-        ; AtomicGoal = unify_deconstruct_rep(_, _, _)
-        ; AtomicGoal = partial_construct_rep(_, _, _)
-        ; AtomicGoal = partial_deconstruct_rep(_, _, _)
-        ; AtomicGoal = unify_assign_rep(_, _)
-        ; AtomicGoal = cast_rep(_, _)
-        ; AtomicGoal = unify_simple_test_rep(_, _)
-        % Don't bother parallelising foreign code, builtins or events.
-        ; AtomicGoal = pragma_foreign_code_rep(_)
-        ; AtomicGoal = builtin_call_rep(_, _, _)
-        ; AtomicGoal = event_call_rep(_, _)
-        ),
+        IsCall = atomic_goal_is_trivial,
         GoalType = pgt_other_atomic_goal 
     ;
-        ( AtomicGoal = higher_order_call_rep(_, Args)
-        ; AtomicGoal = method_call_rep(_, _, Args)
-        ; AtomicGoal = plain_call_rep(_, _, Args)
-        ),
-        
+        IsCall = atomic_goal_is_call(Args),
         % Lookup var use information.
         map.lookup(Info ^ ipi_call_sites, GoalPath, CallSite),
-        map_foldl(compute_var_modes_and_uses(Info, GoalPath, CallSite, 
-                InstMapBefore, InstMapAfter),
-            Args, VarsModesAndUses, 0, _),
+        map_foldl(compute_var_modes(InstMapBefore, InstMapAfter),
+            Args, VarsAndModes, 0, _),
+        GoalType = pgt_call(VarsAndModes, CallSite) 
+    ).
 
+:- pred atomic_pard_goal_cost(implicit_parallelism_info::in, goal_path::in,
+    atomic_goal_rep::in, goal_cost_csq::out) is det.
+
+atomic_pard_goal_cost(Info, GoalPath, AtomicGoal, Cost) :-
+    atomic_goal_is_call(AtomicGoal, IsCall),
+    (
+        IsCall = atomic_goal_is_trivial,
+        % XXX: Should include the number of calls here since the 0 makes this
+        % code appear to be dead when it probably isn't.
+        Cost = atomic_goal_cost
+    ;
+        IsCall = atomic_goal_is_call(_),
+        map.lookup(Info ^ ipi_call_sites, GoalPath, CallSite),
         (
             cost_and_callees_is_recursive(Info ^ ipi_clique, CallSite), 
             map.search(Info ^ ipi_rec_call_sites, GoalPath, RecCost) 
         ->
-            Cost = RecCost
+            CSCost = RecCost
         ;
-            Cost = CallSite ^ cac_cost
+            CSCost = CallSite ^ cac_cost
         ),
-        % XXX: The goal annotations cannot represent reasons why a goal
-        % can't be parallelised, for example it could be nondet, semidet or
-        % impure.
-        ( can_parallelise_call(Info, Detism, Cost) ->
-            CostAboveThreshold = cost_above_par_threshold
-        ;
-            CostAboveThreshold = cost_not_above_par_threshold
-        ),
-        GoalType = pgt_call(Cost, CostAboveThreshold, VarsModesAndUses,
-            CallSite) 
+        Cost = call_goal_cost(CSCost)
     ).
 
-:- pred compute_var_modes_and_uses(implicit_parallelism_info::in,
-    goal_path::in, cost_and_callees::in, inst_map::in, inst_map::in, 
-    var_rep::in, var_mode_and_use::out, int::in, int::out) is det.
+:- pred compute_var_modes(inst_map::in, inst_map::in, 
+    var_rep::in, var_and_mode::out, int::in, int::out) is det.
 
-compute_var_modes_and_uses(Info, GoalPath, CostAndCallee, InstMapBefore, InstMapAfter, Arg,
-        VarModeAndUse, !ArgNum) :-
+compute_var_modes(InstMapBefore, InstMapAfter, Arg, VarAndMode, !ArgNum) :-
     var_get_mode(InstMapBefore, InstMapAfter, Arg, Mode),
-    var_mode_to_var_use_type(Mode, VarUseType),
-    ArgNum = !.ArgNum,
-    LazyUse = delay((func) = compute_var_modes_and_uses_lazy(Info, GoalPath,
-        CostAndCallee, ArgNum, VarUseType)),
-    VarModeAndUse = var_mode_and_use(Arg, Mode, LazyUse),
+    VarAndMode = var_and_mode(Arg, Mode),
     !:ArgNum = !.ArgNum + 1.
 
-:- func compute_var_modes_and_uses_lazy(implicit_parallelism_info, 
-    goal_path, cost_and_callees, int, var_use_type) = var_use_info.
+:- pred atomic_goal_build_use_map(atomic_goal_rep::in, goal_path::in, 
+    implicit_parallelism_info::in, var_use_type::in, var_rep::in, 
+    map(var_rep, lazy(var_use_info))::in,
+    map(var_rep, lazy(var_use_info))::out) is det.
 
-compute_var_modes_and_uses_lazy(Info, GoalPath, CostAndCallee, ArgNum,
-        VarUseType) = Use :-
-    % Get cost
+atomic_goal_build_use_map(AtomicGoal, GoalPath, Info, VarUseType, Var,
+        !Map) :-
+    atomic_goal_is_call(AtomicGoal, IsCall), 
+    (
+        IsCall = atomic_goal_is_trivial,
     (
-        cost_and_callees_is_recursive(Info ^ ipi_clique, CostAndCallee),
+            VarUseType = var_use_consumption,
+            CostUntilUse = 0.0
+        ;
+            ( VarUseType = var_use_production
+            ; VarUseType = var_use_other
+            ),
+            CostUntilUse = 1.0
+        ),
+        LazyUse = val(var_use_info(CostUntilUse, 1.0, VarUseType))
+    ;
+        IsCall = atomic_goal_is_call(Args),
+        LazyUse = delay((func) = compute_var_use_lazy(Info, GoalPath, Var,
+            Args, VarUseType))
+    ),
+    svmap.det_insert(Var, LazyUse, !Map).
+
+:- func compute_var_use_lazy(implicit_parallelism_info, goal_path, 
+    var_rep, list(var_rep), var_use_type) = var_use_info.
+
+compute_var_use_lazy(Info, GoalPath, Var, Args, VarUseType) = Use :-
+    CliquePtr = Info ^ ipi_clique,
+    map.lookup(Info ^ ipi_call_sites, GoalPath, CostAndCallee),
+    ( 
+        cost_and_callees_is_recursive(CliquePtr, CostAndCallee),
         map.search(Info ^ ipi_rec_call_sites, GoalPath, RecCost)
     ->
-        % THe callsite is recursive and we know the cost of the
-        % recursive call.
-        Cost0 = RecCost
+        Cost = RecCost
     ;
-        Cost0 = CostAndCallee ^ cac_cost
+        Cost = CostAndCallee ^ cac_cost
     ),
-    Cost = cs_cost_get_percall(Cost0),
 
+    solutions(compute_var_use_lazy_arg(Info, Var, Args, CostAndCallee,
+            Cost, VarUseType), 
+        Uses),
+    (
+        VarUseType = var_use_consumption,
+        Uses = [FirstUse | OtherUses],
+        foldl(earliest_use, OtherUses, FirstUse, Use)
+    ;
+        ( VarUseType = var_use_production
+        ; VarUseType = var_use_other
+        ),
+        (
+            Uses = [Use]
+        ;
+            Uses = [_, _ | _],
+            error(this_file ++ "Too many solutions to compute_var_use_lazy_arg"
+                ++ " for a production")
+        )
+    ).
+
+:- pred earliest_use(var_use_info::in, var_use_info::in, var_use_info::out) is det.
+
+earliest_use(A, B, Earliest) :-
+    TimeA = A ^ vui_cost_until_use,
+    TimeB = B ^ vui_cost_until_use,
+    ( TimeA < TimeB ->
+        Earliest = A
+    ;
+        Earliest = B
+    ).
+
+:- pred compute_var_use_lazy_arg(implicit_parallelism_info::in, var_rep::in,
+    list(var_rep)::in, cost_and_callees::in, cs_cost_csq::in, var_use_type::in,
+    var_use_info::out) is multi.
+
+compute_var_use_lazy_arg(Info, Var, Args, CostAndCallee, Cost, VarUseType, Use) :-
+    CostPercall = cs_cost_get_percall(Cost),
+    ( member_index0(Var, Args, ArgNum) ->
     HigherOrder = CostAndCallee ^ cac_call_site_is_ho,
     (
         HigherOrder = higher_order_call,
         % We cannot push signals or waits into higher order calls.
-        pessimistic_var_use_info(VarUseType, Cost, Use)
+            pessimistic_var_use_info(VarUseType, CostPercall, Use)
     ;
         HigherOrder = first_order_call,
-        Callees = CostAndCallee ^ cac_callees,
-        ( singleton_set(Callees, Callee) ->
-            CSDPtr = Callee ^ c_csd
+            ( singleton_set(CostAndCallee ^ cac_callees, CalleePrime) ->
+                Callee = CalleePrime
         ;
             error(this_file ++ 
                 "First-order call site has wrong number of CSDs")
         ),
+            CSDPtr = Callee ^ c_csd,
         RecursionType = Info ^ ipi_recursion_type,
         recursion_type_get_interesting_parallelisation_depth(
             RecursionType, MaybeCurDepth),
-        compute_var_modes_and_uses_2(Info, ArgNum, RecursionType,
-            MaybeCurDepth, VarUseType, Cost, CSDPtr, Use, Messages),
+            compute_var_use_2(Info, ArgNum, RecursionType, MaybeCurDepth,
+                VarUseType, CostPercall, CSDPtr, Use, Messages),
         trace [io(!IO)] (
             stderr_stream(Stderr, !IO),
             write_out_messages(Stderr, Messages, !IO)
         )
+        )
+    ;
+        Use = var_use_info(0.0, CostPercall, VarUseType),
+        require(unify(VarUseType, var_use_consumption), this_file ++ 
+            "Var use type must be consumption if \\+ member(Var, Args)")
     ).
 
-:- pred compute_var_modes_and_uses_2(implicit_parallelism_info::in,
-    int::in, recursion_type::in, maybe(recursion_depth)::in, var_use_type::in,
+:- pred compute_var_use_2(implicit_parallelism_info::in, int::in,
+    recursion_type::in, maybe(recursion_depth)::in, var_use_type::in,
     float::in, call_site_dynamic_ptr::in, var_use_info::out,
     cord(message)::out) is det.
 
-compute_var_modes_and_uses_2(Info, ArgNum, RecursionType, MaybeCurDepth,
-        VarUseType, Cost, CSDPtr, Use, !:Messages) :-
+compute_var_use_2(Info, ArgNum, RecursionType, MaybeCurDepth, VarUseType, Cost,
+        CSDPtr, Use, !:Messages) :-
     !:Messages = empty,
     Deep = Info ^ ipi_deep,
     CliquePtr = Info ^ ipi_clique,
@@ -2487,12 +2447,66 @@ compute_var_modes_and_uses_2(Info, ArgNu
         MaybeUse = error(Error),
         pessimistic_var_use_info(VarUseType, Cost, Use),
         append_message(call_site_dynamic(CSDPtr), 
-            warning_cannot_compute_arg_first_use_time(Error),
+            warning_cannot_compute_first_use_time(Error),
             !Messages)
     ).
 
+:- pred goal_build_use_map(goal_rep(coverage_and_instmap_info)::in,
+    goal_path::in, goal_cost_csq::in, implicit_parallelism_info::in,
+    var_use_type::in, var_rep::in, 
+    map(var_rep, lazy(var_use_info))::in, 
+    map(var_rep, lazy(var_use_info))::out) is det.
+
+goal_build_use_map(Goal, GoalPath, Cost, Info, VarUseType, Var, !Map) :-
+    LazyUse = delay((func) = compute_goal_var_use_lazy(Goal, GoalPath, Cost,
+        Info, VarUseType, Var)),
+    svmap.det_insert(Var, LazyUse, !Map).
+
+:- func compute_goal_var_use_lazy(goal_rep(coverage_and_instmap_info),
+    goal_path, goal_cost_csq, implicit_parallelism_info, var_use_type,
+    var_rep) = var_use_info.
+
+compute_goal_var_use_lazy(Goal, GoalPath, Cost, Info, VarUseType, Var) = Use :-
+    Info = implicit_parallelism_info(Deep, _ProgRep, _Params, CliquePtr,
+        CallSiteMap, RecursiveCallSiteMap, RecursionType, _VarTable,
+        _ProcLabel),
+    CostPercall = goal_cost_get_percall(Cost),    
+    (
+        ( RecursionType = rt_not_recursive
+        ; RecursionType = rt_single(_, _, _, _, _)
+        ),
+        recursion_type_get_interesting_parallelisation_depth(RecursionType,
+            yes(RecDepth)),
+        var_first_use(Deep, CliquePtr, CallSiteMap, RecursiveCallSiteMap,
+            RecursionType, RecDepth, Goal, GoalPath, CostPercall, Var,
+            VarUseType, Use)
+    ;
+        ( RecursionType = rt_divide_and_conquer(_, _)
+        ; RecursionType = rt_mutual_recursion(_)
+        ; RecursionType = rt_other(_)
+        ; RecursionType = rt_errors(_)
+        ),
+        % var_first_use doesn't work for these recursion types.
+        pessimistic_var_use_info(VarUseType, CostPercall, Use),
+        append_message(clique(CliquePtr), warning_cannot_compute_first_use_time(
+            "Recursion type unknown for var_first_use/12"),
+            empty, Messages),
+        trace [io(!IO)] (
+            io.stderr_stream(Stderr, !IO),
+            write_out_messages(Stderr, Messages, !IO)
+        )
+    ).
+
+:- instance goal_annotation_with_coverage(coverage_and_instmap_info) where [
+        (get_coverage(Goal) = Goal ^ goal_annotation ^ cai_coverage)
+    ].
+
 :- pred recursion_type_get_interesting_parallelisation_depth(
-    recursion_type::in, maybe(recursion_depth)::out) is det.
+    recursion_type, maybe(recursion_depth)).
+:- mode recursion_type_get_interesting_parallelisation_depth(
+    in(recursion_type_known_costs), out(maybe_yes(ground))) is det.
+:- mode recursion_type_get_interesting_parallelisation_depth(
+    in, out) is det.
 
 recursion_type_get_interesting_parallelisation_depth(RecursionType,
         MaybeDepth) :-
@@ -2512,29 +2526,35 @@ recursion_type_get_interesting_paralleli
     ).
 
 :- type is_costly_goal
-    --->    is_costly_goal
-    ;       is_not_costly_goal
-    ;       is_non_atomic_goal.
+    --->    is_not_costly_goal
+    ;       is_costly_atomic_goal
+    ;       is_costly_compound_goal.
 
-:- pred identify_costly_call(pard_goal_detail::in, is_costly_goal::out) is det.
+:- pred identify_costly_goal(pard_goal_detail::in, is_costly_goal::out) is det.
 
-identify_costly_call(Goal, Costly) :-
-    GoalType = Goal ^ goal_annotation ^ pgd_pg_type,
-    (
-        GoalType = pgt_call(_, CostAboveThreshold, _, _),
+identify_costly_goal(Goal, Costly) :-
+    CostAboveThreshold = Goal ^ goal_annotation ^ pgd_cost_above_threshold,
         (
             CostAboveThreshold = cost_above_par_threshold,
-            Costly = is_costly_goal 
-        ;
-            CostAboveThreshold = cost_not_above_par_threshold,
-            Costly = is_not_costly_goal
-        )
+        GoalType = Goal ^ goal_annotation ^ pgd_pg_type,
+        (
+            GoalType = pgt_call(_, _),
+            Costly = is_costly_atomic_goal
     ;
         GoalType = pgt_other_atomic_goal,
-        Costly = is_not_costly_goal
+            error(this_file ++ "pgt_other_atomic_goal is never costly")
     ;
         GoalType = pgt_non_atomic_goal,
-        Costly = is_non_atomic_goal
+            % TODO: distinguish between compound goals with one branch that is
+            % costly, and compound goals where all branches are costly.
+            % TODO: Provide information about how many costly goals are within
+            % the goal so that we can try to parallelise each of those against
+            % an outer costly goal.
+            Costly = is_costly_compound_goal
+        )
+    ;
+        CostAboveThreshold = cost_not_above_par_threshold,
+        Costly = is_not_costly_goal
     ).
 
 :- pred var_get_mode(inst_map::in, inst_map::in, var_rep::in, var_mode_rep::out)
@@ -2548,78 +2568,188 @@ var_get_mode(InstMapBefore, InstMapAfter
     % Transform a goal in a conjunction into a pard_goal.
     %
 :- pred goal_to_pard_goal(implicit_parallelism_info::in, goal_path::in,
-    (func(int) = goal_path_step)::in,
-    goal_rep(inst_map_info)::in, pard_goal_detail::out,
-    int::in, int::out,
+    goal_rep(coverage_and_instmap_info)::in, pard_goal_detail::out,
     cord(message)::in, cord(message)::out) is det.
 
-goal_to_pard_goal(Info, GoalPath0, Step, !Goal, !GoalNum, !Messages) :-
-    !.Goal = goal_rep(GoalExpr0, Detism, InstMapInfo),
-    GoalPath = goal_path_add_at_end(GoalPath0, Step(!.GoalNum)),
-    !:GoalNum = !.GoalNum + 1,
+goal_to_pard_goal(Info, GoalPath, !Goal, !Messages) :-
+    !.Goal = goal_rep(GoalExpr0, Detism, CoverageAndInstMapInfo),
+    InstMapInfo = CoverageAndInstMapInfo ^ cai_inst_map_info,
+    Coverage = CoverageAndInstMapInfo ^ cai_coverage,
+    get_coverage_before_det(Coverage, Before),
     (
         (
             GoalExpr0 = conj_rep(Conjs0),
-            map_foldl2(goal_to_pard_goal(Info, GoalPath, 
-                ( func(Num) = step_conj(Num) ) ), Conjs0, Conjs, 1, _,
+            map_foldl2(conj_to_pard_goals(Info, GoalPath), Conjs0, Conjs, 1, _,
                 !Messages),
+            conj_calc_cost(Conjs, Cost),
             GoalExpr = conj_rep(Conjs)
         ;
             GoalExpr0 = disj_rep(Disjs0),
-            map_foldl2(goal_to_pard_goal(Info, GoalPath,
-                ( func(Num) = step_disj(Num) ) ), Disjs0, Disjs, 1, _,
+            map_foldl2(disj_to_pard_goals(Info, GoalPath), Disjs0, Disjs, 1, _,
                 !Messages),
+            disj_calc_cost(Disjs, Cost),
             GoalExpr = disj_rep(Disjs)
         ;
             GoalExpr0 = switch_rep(Var, CanFail, Cases0),
             map_foldl2(case_to_pard_goal(Info, GoalPath), Cases0, Cases, 1, _,
                 !Messages),
+            switch_calc_cost(Cases, Before, Cost),
             GoalExpr = switch_rep(Var, CanFail, Cases)
         ; 
             GoalExpr0 = ite_rep(Cond0, Then0, Else0),
-            goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_cond, 
-                Cond0, Cond, 1, _, !Messages),
-            goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_then, 
-                Then0, Then, 1, _, !Messages),
-            goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_else, 
-                Else0, Else, 1, _, !Messages),
+            goal_to_pard_goal(Info, 
+                goal_path_add_at_end(GoalPath, step_ite_cond), Cond0, Cond, 
+                !Messages),
+            goal_to_pard_goal(Info, 
+                goal_path_add_at_end(GoalPath, step_ite_then), Then0, Then, 
+                !Messages),
+            goal_to_pard_goal(Info, 
+                goal_path_add_at_end(GoalPath, step_ite_else), Else0, Else,
+                !Messages),
+            ite_calc_cost(Cond, Then, Else, Cost),
             GoalExpr = ite_rep(Cond, Then, Else)
         ; 
             GoalExpr0 = negation_rep(SubGoal0),
-            goal_to_pard_goal(Info, GoalPath, func(_) = step_neg,
-                SubGoal0, SubGoal, 1, _, !Messages),
+            goal_to_pard_goal(Info, goal_path_add_at_end(GoalPath, step_neg),
+                SubGoal0, SubGoal, !Messages),
+            Cost = SubGoal ^ goal_annotation ^ pgd_cost,
             GoalExpr = negation_rep(SubGoal)
         ; 
             GoalExpr0 = scope_rep(SubGoal0, MaybeCut),
-            goal_to_pard_goal(Info, GoalPath, func(_) = step_scope(MaybeCut),
-                SubGoal0, SubGoal, 1, _, !Messages),
+            goal_to_pard_goal(Info, 
+                goal_path_add_at_end(GoalPath, step_scope(MaybeCut)),
+                SubGoal0, SubGoal, !Messages),
+            Cost = SubGoal ^ goal_annotation ^ pgd_cost,
             GoalExpr = scope_rep(SubGoal, MaybeCut)
         ),
-        % XXX: We my consider lifting calls out of non-atomic goals so that
-        % they can be parallelised,  or parallelising the whole non-atomic
-        % goal.
-        PardGoalType = pgt_non_atomic_goal
+        PardGoalType = pgt_non_atomic_goal,
+
+        BoundVars = to_sorted_list(InstMapInfo ^ im_bound_vars),
+        foldl(goal_build_use_map(!.Goal, GoalPath, Cost, Info,
+                var_use_production),
+            BoundVars, map.init, ProductionUseMap),
+        ConsumedVars = to_sorted_list(InstMapInfo ^ im_consumed_vars),
+        foldl(goal_build_use_map(!.Goal, GoalPath, Cost, Info,
+                var_use_consumption),
+            ConsumedVars, map.init, ConsumptionUseMap)
     ;
         GoalExpr0 = atomic_goal_rep(Context, Line, BoundVars, AtomicGoal),
         GoalExpr = atomic_goal_rep(Context, Line, BoundVars, AtomicGoal),
-        maybe_costly_call(Info, GoalPath, AtomicGoal, Detism,
-            InstMapInfo, PardGoalType, Messages),
+        atomic_pard_goal_type(Info, GoalPath, AtomicGoal, InstMapInfo,
+            PardGoalType, Messages),
+        atomic_pard_goal_cost(Info, GoalPath, AtomicGoal, Cost),
+        
+        foldl(atomic_goal_build_use_map(AtomicGoal, GoalPath, Info, 
+                var_use_production),
+            BoundVars, map.init, ProductionUseMap),
+        ConsumedVars = InstMapInfo ^ im_consumed_vars,
+        foldl(atomic_goal_build_use_map(AtomicGoal, GoalPath, Info, 
+                var_use_consumption), 
+            to_sorted_list(ConsumedVars), map.init, ConsumptionUseMap),
+        
         !:Messages = !.Messages ++ Messages
     ),
-    PardGoalAnnotation = pard_goal_detail(PardGoalType, InstMapInfo, GoalPath),
+    % XXX: The goal annotations cannot represent reasons why a goal
+    % can't be parallelised, for example it could be nondet, semidet or
+    % impure.
+    ( can_parallelise_goal(Info, Detism, Cost) ->
+        CostAboveThreshold = cost_above_par_threshold
+    ;
+        CostAboveThreshold = cost_not_above_par_threshold
+    ),
+    PardGoalAnnotation = pard_goal_detail(PardGoalType, InstMapInfo, GoalPath, 
+        Coverage, Cost, CostAboveThreshold, ProductionUseMap,
+        ConsumptionUseMap),
     !:Goal = goal_rep(GoalExpr, Detism, PardGoalAnnotation).
 
-:- pred case_to_pard_goal(implicit_parallelism_info::in, goal_path::in,
-    case_rep(inst_map_info)::in, case_rep(pard_goal_detail_annotation)::out, 
+:- pred conj_to_pard_goals(implicit_parallelism_info::in, goal_path::in,
+    goal_rep(coverage_and_instmap_info)::in, pard_goal_detail::out, 
+    int::in, int::out, cord(message)::in, cord(message)::out) is det.
+
+conj_to_pard_goals(Info, GoalPath0, !Goal, !ConjNum, !Messages) :-
+    GoalPath = goal_path_add_at_end(GoalPath0, step_conj(!.ConjNum)),
+    goal_to_pard_goal(Info, GoalPath, !Goal, !Messages),
+    !:ConjNum = !.ConjNum + 1.
+
+:- pred disj_to_pard_goals(implicit_parallelism_info::in, goal_path::in,
+    goal_rep(coverage_and_instmap_info)::in, pard_goal_detail::out, 
     int::in, int::out, cord(message)::in, cord(message)::out) is det.
 
-case_to_pard_goal(Info, GoalPath0, !Case, !GoalNum, !Messages) :-
+disj_to_pard_goals(Info, GoalPath0, !Goal, !DisjNum, !Messages) :-
+    GoalPath = goal_path_add_at_end(GoalPath0, step_disj(!.DisjNum)),
+    goal_to_pard_goal(Info, GoalPath, !Goal, !Messages),
+    !:DisjNum = !.DisjNum + 1.
+
+:- pred case_to_pard_goal(implicit_parallelism_info::in, goal_path::in,
+    case_rep(coverage_and_instmap_info)::in, 
+    case_rep(pard_goal_detail_annotation)::out, int::in, int::out, 
+    cord(message)::in, cord(message)::out) is det.
+
+case_to_pard_goal(Info, GoalPath0, !Case, !CaseNum, !Messages) :-
     !.Case = case_rep(ConsId, OtherConsId, Goal0),
-    goal_to_pard_goal(Info, GoalPath0, 
-        ( func(Num) = step_switch(Num, no) ), Goal0, Goal, !GoalNum, !Messages),
+    GoalPath = goal_path_add_at_end(GoalPath0, step_switch(!.CaseNum, no)),
+    goal_to_pard_goal(Info, GoalPath, Goal0, Goal, !Messages),
+    !:CaseNum = !.CaseNum + 1,
     !:Case = case_rep(ConsId, OtherConsId, Goal).
 
 %----------------------------------------------------------------------------%
+
+:- pred conj_calc_cost(list(pard_goal_detail)::in, goal_cost_csq::out) 
+    is det.
+
+conj_calc_cost([], zero_goal_cost).
+conj_calc_cost([Conj | Conjs], Cost) :-
+    conj_calc_cost(Conjs, ConjsCost),
+    ConjCost = Conj ^ goal_annotation ^ pgd_cost,
+    Cost = add_goal_costs(ConjsCost, ConjCost).
+
+:- pred disj_calc_cost(list(pard_goal_detail)::in, goal_cost_csq::out) 
+    is det.
+
+disj_calc_cost([], zero_goal_cost).
+disj_calc_cost([Disj | Disjs], Cost) :-
+    Coverage = Disj ^ goal_annotation ^ pgd_coverage,
+    get_coverage_before_det(Coverage, Before), 
+    ( Before = 0 ->
+        % Avoid a divide by zero.
+        Cost = zero_goal_cost
+    ;
+        DisjCost = Disj ^ goal_annotation ^ pgd_cost,
+        disj_calc_cost(Disjs, DisjsCost),
+        % XXX: We assume this is a semidet disjunction
+        Branch = add_goal_costs_branch(Before, DisjsCost, zero_goal_cost),
+        Cost = add_goal_costs(DisjCost, Branch)
+    ).
+
+:- pred switch_calc_cost(list(case_rep(pard_goal_detail_annotation))::in,
+    int::in, goal_cost_csq::out) is det.
+
+switch_calc_cost([], _, zero_goal_cost).
+switch_calc_cost([Case | Cases], TotalCalls, Cost) :-
+    ( TotalCalls = 0 ->
+        % Avoid a divide by zero.
+        Cost = zero_goal_cost
+    ;
+        Coverage = Case ^ cr_case_goal ^ goal_annotation ^ pgd_coverage,
+        get_coverage_before_det(Coverage, CaseCalls),
+        switch_calc_cost(Cases, TotalCalls - CaseCalls, CasesCost),
+        CaseCost = Case ^ cr_case_goal ^ goal_annotation ^ pgd_cost,
+        Cost = add_goal_costs_branch(TotalCalls, CaseCost, CasesCost)
+    ).
+
+:- pred ite_calc_cost(pard_goal_detail::in, pard_goal_detail::in, 
+    pard_goal_detail::in, goal_cost_csq::out) is det.
+
+ite_calc_cost(Cond, Then, Else, Cost) :-
+    CondCost = Cond ^ goal_annotation ^ pgd_cost,
+    ThenCost = Then ^ goal_annotation ^ pgd_cost,
+    ElseCost = Else ^ goal_annotation ^ pgd_cost,
+    Coverage = Cond ^ goal_annotation ^ pgd_coverage,
+    get_coverage_before_det(Coverage, Before),
+    ThenElseCost = add_goal_costs_branch(Before, ThenCost, ElseCost),
+    Cost = add_goal_costs(CondCost, ThenElseCost).
+
+%----------------------------------------------------------------------------%
 %
 % Annotate a goal with instantiation information.
 %
@@ -2645,6 +2775,11 @@ case_to_pard_goal(Info, GoalPath0, !Case
                     % The variables produced by this goal.
             ).
 
+:- typeclass goal_annotation_add_instmap(A, B) where [
+        pred add_instmap(inst_map_info, A, B),
+        mode add_instmap(in, in, out) is det
+    ].
+
     % Note: It may be useful to add other annotations such as goal path or cost 
     % information.
     %
@@ -2654,13 +2789,14 @@ case_to_pard_goal(Info, GoalPath0, !Case
     % Vars is the set of variables used by this goal, both consumed and
     % produced.
     %
-:- pred goal_annotate_with_instmap(goal_rep::in, goal_rep(inst_map_info)::out, 
+:- pred goal_annotate_with_instmap(goal_rep(A)::in, goal_rep(B)::out, 
     inst_map::in, inst_map::out, seen_duplicate_instantiation::out, 
-    set(var_rep)::out, set(var_rep)::out) is det.
+    set(var_rep)::out, set(var_rep)::out) is det 
+    <= goal_annotation_add_instmap(A, B).
 
 goal_annotate_with_instmap(Goal0, Goal, !InstMap, SeenDuplicateInstantiation,
         ConsumedVars, BoundVars) :-
-    Goal0 = goal_rep(GoalExpr0, Detism, _),
+    Goal0 = goal_rep(GoalExpr0, Detism, Ann0),
     InstMapBefore = !.InstMap,
     (
         GoalExpr0 = conj_rep(Conjs0),
@@ -2718,12 +2854,13 @@ goal_annotate_with_instmap(Goal0, Goal, 
     InstMapAfter = !.InstMap,
     InstMapInfo = inst_map_info(InstMapBefore, InstMapAfter, ConsumedVars,
         BoundVars),
-    Goal = goal_rep(GoalExpr, Detism, InstMapInfo).
+    add_instmap(InstMapInfo, Ann0, Ann), 
+    Goal = goal_rep(GoalExpr, Detism, Ann).
 
-:- pred conj_annotate_with_instmap(list(goal_rep)::in,
-    list(goal_rep(inst_map_info))::out, inst_map::in, inst_map::out,
+:- pred conj_annotate_with_instmap(list(goal_rep(A))::in,
+    list(goal_rep(B))::out, inst_map::in, inst_map::out,
     seen_duplicate_instantiation::out, set(var_rep)::out, set(var_rep)::out)
-    is det.
+    is det <= goal_annotation_add_instmap(A, B).
 
 conj_annotate_with_instmap([], [], !InstMap,
     have_not_seen_duplicate_instantiation, set.init, set.init).
@@ -2739,10 +2876,10 @@ conj_annotate_with_instmap([Conj0 | Conj
         SeenDuplicateInstantiationHead,
         SeenDuplicateInstantiationTail).
 
-:- pred disj_annotate_with_instmap(list(goal_rep)::in,
-    list(goal_rep(inst_map_info))::out, inst_map::in, inst_map::out,
+:- pred disj_annotate_with_instmap(list(goal_rep(A))::in,
+    list(goal_rep(B))::out, inst_map::in, inst_map::out,
     seen_duplicate_instantiation::out, set(var_rep)::out, set(var_rep)::out)
-    is det.
+    is det <= goal_annotation_add_instmap(A, B).
 
 disj_annotate_with_instmap([], [], !InstMap,
         have_not_seen_duplicate_instantiation, set.init, set.init).
@@ -2776,10 +2913,10 @@ disj_annotate_with_instmap([Disj0 | Disj
         SeenDuplicateInstantiationHead,
         SeenDuplicateInstantiationTail).
 
-:- pred switch_annotate_with_instmap(list(case_rep)::in, 
-    list(case_rep(inst_map_info))::out, inst_map::in, inst_map::out,
+:- pred switch_annotate_with_instmap(list(case_rep(A))::in, 
+    list(case_rep(B))::out, inst_map::in, inst_map::out,
     seen_duplicate_instantiation::out, set(var_rep)::out, set(var_rep)::out) 
-    is det.
+    is det <= goal_annotation_add_instmap(A, B).
 
 switch_annotate_with_instmap([], [], !InstMap,
         have_not_seen_duplicate_instantiation, set.init, set.init).
@@ -2810,12 +2947,12 @@ switch_annotate_with_instmap([Case0 | Ca
         SeenDuplicateInstantiationHead,
         SeenDuplicateInstantiationTail).
 
-:- pred ite_annotate_with_instmap(goal_rep::in, goal_rep(inst_map_info)::out,
-    goal_rep::in, goal_rep(inst_map_info)::out,
-    goal_rep::in, goal_rep(inst_map_info)::out,
+:- pred ite_annotate_with_instmap(goal_rep(A)::in, goal_rep(B)::out,
+    goal_rep(A)::in, goal_rep(B)::out,
+    goal_rep(A)::in, goal_rep(B)::out,
     inst_map::in, inst_map::out, 
     seen_duplicate_instantiation::out, set(var_rep)::out, set(var_rep)::out) 
-    is det.
+    is det <= goal_annotation_add_instmap(A, B).
 
 ite_annotate_with_instmap(Cond0, Cond, Then0, Then, Else0, Else, InstMap0, InstMap,
         SeenDuplicateInstantiation, ConsumedVars, BoundVars) :-
@@ -2986,7 +3123,7 @@ create_candidate_parallel_conj_report(Va
         Report) :-
     print_proc_label_to_string(Proc, ProcString),
     CandidateParConjunction = candidate_par_conjunction(GoalPathString,
-        PartNum, FirstConjNum, IsDependent, GoalsBefore, Conjs, GoalsAfter,
+        FirstConjNum, IsDependent, GoalsBefore, Conjs, GoalsAfter,
         ParExecMetrics),
     ParExecMetrics = parallel_exec_metrics(NumCalls, SeqTime, ParTime,
         ParOverheads, FirstConjDeadTime, FutureDeadTime),
@@ -3004,21 +3141,29 @@ create_candidate_parallel_conj_report(Va
     TimeSaving = parallel_exec_metrics_get_time_saving(ParExecMetrics),
     TotalDeadTime = FirstConjDeadTime + FutureDeadTime,
     format("      %s\n" ++
-           "      Path and Partition Num: %s, %d\n" ++
+           "      Path: %s\n" ++
            "      Dependent: %s\n" ++
-           "      NumCalls: %d\n" ++
-           "      SeqTime: %f\n" ++
-           "      ParTime: %f\n" ++
-           "      ParOverheads: %f\n" ++
-           "      Speedup: %f\n" ++
-           "      Time saving: %f\n" ++
-           "      First conj dead time: %f\n" ++
-           "      Future dead time: %f\n" ++
-           "      Total dead time: %f\n\n", 
-        [s(ProcString), s(GoalPathString), i(PartNum), s(DependanceString),
-            i(NumCalls), f(SeqTime), f(ParTime), f(ParOverheads), f(Speedup),
-            f(TimeSaving), f(FirstConjDeadTime), f(FutureDeadTime),
-            f(TotalDeadTime)],
+           "      NumCalls: %s\n" ++
+           "      SeqTime: %s\n" ++
+           "      ParTime: %s\n" ++
+           "      ParOverheads: %s\n" ++
+           "      Speedup: %s\n" ++
+           "      Time saving: %s\n" ++
+           "      First conj dead time: %s\n" ++
+           "      Future dead time: %s\n" ++
+           "      Total dead time: %s\n\n", 
+        [s(ProcString), 
+         s(GoalPathString), 
+         s(DependanceString),
+         s(commas(NumCalls)), 
+         s(two_decimal_fraction(SeqTime)),
+         s(two_decimal_fraction(ParTime)),
+         s(two_decimal_fraction(ParOverheads)),
+         s(four_decimal_fraction(Speedup)),
+         s(two_decimal_fraction(TimeSaving)),
+         s(two_decimal_fraction(FirstConjDeadTime)),
+         s(two_decimal_fraction(FutureDeadTime)),
+         s(two_decimal_fraction(TotalDeadTime))],
         ReportHeaderStr),
     ReportHeader = singleton(ReportHeaderStr),
 
@@ -3137,7 +3282,8 @@ format_pard_goal_annotation(GoalAnnotati
             CostAboveThreshold = cost_not_above_par_threshold,
             CostAboveThresholdStr = "not above threshold"
         ),
-        Report = singleton(format("cost: %f ", [f(CostPercall)])) ++ 
+        Report = singleton(format("cost: %s ", 
+                [s(two_decimal_fraction(CostPercall))])) ++ 
             singleton(CostAboveThresholdStr) ++ singleton(")")
     ;
         ( GoalAnnotation = pard_goal_other_atomic
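
Note on the consumption case of compute_var_use_lazy in the diff above: when
a variable has several candidate uses, the patch folds earliest_use over them
and keeps the use with the smallest vui_cost_until_use. The module below is a
minimal standalone sketch of that fold, not code from the patch. The module
name earliest_use_sketch, the type var_use and its field names are
hypothetical stand-ins for var_use_info.

```mercury
:- module earliest_use_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module float.
:- import_module list.
:- import_module string.

    % Hypothetical, simplified stand-in for var_use_info.
    %
:- type var_use
    --->    var_use(
                cost_until_use  :: float,
                goal_cost       :: float
            ).

    % Keep whichever of the two uses happens first.
    %
:- pred earliest_use(var_use::in, var_use::in, var_use::out) is det.

earliest_use(A, B, Earliest) :-
    ( A ^ cost_until_use < B ^ cost_until_use ->
        Earliest = A
    ;
        Earliest = B
    ).

main(!IO) :-
    % Made-up numbers: three uses of one variable in a goal of cost 100.0.
    FirstUse = var_use(30.0, 100.0),
    OtherUses = [var_use(12.5, 100.0), var_use(45.0, 100.0)],
    % Same shape as the patch: seed the fold with the first solution.
    list.foldl(earliest_use, OtherUses, FirstUse, Use),
    io.format("earliest use after %.1f\n", [f(Use ^ cost_until_use)], !IO).
```

With these inputs the fold should select the use at 12.5. Seeding the fold
with the first element avoids needing an artificial "no use yet" value.
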
Index: deep_profiler/measurements.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/measurements.m,v
retrieving revision 1.22
diff -u -p -b -r1.22 measurements.m
--- deep_profiler/measurements.m	10 Oct 2010 04:19:53 -0000	1.22
+++ deep_profiler/measurements.m	14 Oct 2010 03:58:50 -0000
@@ -143,6 +143,35 @@
 
 :- func cs_cost_per_proc_call(cs_cost_csq, proc_cost_csq) = cs_cost_csq.
 
+%----------------------------------------------------------------------------%
+
+    % The cost of a goal.
+    %
+:- type goal_cost_csq.
+
+:- func atomic_goal_cost = goal_cost_csq.
+
+:- func zero_goal_cost = goal_cost_csq.
+
+    % call_goal_cost(NumCalls, PerCallCost) = Cost
+    %
+:- func call_goal_cost(int, float) = goal_cost_csq.
+
+:- func call_goal_cost(cs_cost_csq) = goal_cost_csq.
+
+:- func add_goal_costs(goal_cost_csq, goal_cost_csq) = goal_cost_csq.
+
+    % add_goal_costs_branch(TotalCalls, BranchA, BranchB) = Cost.
+    %
+    % Add the costs of goals across the arms of a branch.
+    %
+:- func add_goal_costs_branch(int, goal_cost_csq, goal_cost_csq) = 
+    goal_cost_csq.
+
+:- func goal_cost_get_percall(goal_cost_csq) = float.
+
+:- func goal_cost_get_calls(goal_cost_csq) = int.
+
 %-----------------------------------------------------------------------------%
 
 :- type recursion_depth.
@@ -673,6 +702,76 @@ cs_cost_per_proc_call(cs_cost_csq(CSCall
 
 %----------------------------------------------------------------------------%
 
+:- type goal_cost_csq
+    --->    trivial_goal
+    ;       non_trivial_goal(
+                tg_avg_cost             :: cost,
+                tg_calls                :: int
+            ).
+
+atomic_goal_cost = trivial_goal.
+
+zero_goal_cost = trivial_goal.
+
+call_goal_cost(Calls, PercallCost) = non_trivial_goal(Cost, Calls) :-
+    Cost = cost_per_call(PercallCost).
+
+call_goal_cost(CSCost) = non_trivial_goal(Cost, Calls) :-
+    Calls = round_to_int(cs_cost_get_calls(CSCost)),
+    Cost = CSCost ^ cscc_csq_cost. 
+
+add_goal_costs(trivial_goal, trivial_goal) = 
+    trivial_goal.
+add_goal_costs(trivial_goal, R@non_trivial_goal(_, _)) = R.
+add_goal_costs(R@non_trivial_goal(_, _), trivial_goal) = R.
+add_goal_costs(non_trivial_goal(CostA, CallsA), non_trivial_goal(CostB, CallsB)) 
+        = non_trivial_goal(Cost, Calls) :-
+    Calls = max(CallsA, CallsB),
+    Cost = cost_total(cost_get_total(float(CallsA), CostA) + 
+        cost_get_total(float(CallsB), CostB)).
+
+add_goal_costs_branch(TotalCalls, A, B) = R :-
+    ( TotalCalls = 0 ->
+        R = zero_goal_cost
+    ;
+        (
+            A = trivial_goal,
+            (
+                B = trivial_goal,
+                R = trivial_goal
+            ;
+                B = non_trivial_goal(Cost, _),
+                R = non_trivial_goal(Cost, TotalCalls)
+            )
+        ;
+            A = non_trivial_goal(CostA, CallsA),
+            (
+                B = trivial_goal,
+                R = non_trivial_goal(CostA, TotalCalls)
+            ;
+                B = non_trivial_goal(CostB, CallsB),
+                Cost = sum_costs(float(CallsA), CostA, float(CallsB), CostB),
+                Calls = CallsA + CallsB,
+                require(unify(Calls, TotalCalls), 
+                    this_file ++ "TotalCalls \\= CallsA + CallsB"),
+                R = non_trivial_goal(Cost, Calls)
+            )
+        )
+    ).
+
+goal_cost_get_percall(trivial_goal) = 0.0.
+goal_cost_get_percall(non_trivial_goal(Cost, Calls)) =
+    ( Calls = 0 ->
+        0.0
+    ;
+        cost_get_percall(float(Calls), Cost)
+    ).
+
+goal_cost_get_calls(trivial_goal) = 0.
+goal_cost_get_calls(non_trivial_goal(_, Calls)) = Calls.
+
+%----------------------------------------------------------------------------%
+
 :- type cost
     --->    cost_per_call(float)
     ;       cost_total(float).
@@ -698,6 +797,18 @@ Cost0 / Denom = Cost :-
         Cost = cost_per_call(Percall / float(Denom))
     ).
 
+:- func cost_by_weight(float, cost) = cost.
+
+cost_by_weight(Weight, cost_total(Total)) = cost_total(Total * Weight).
+cost_by_weight(Weight, cost_per_call(PC)) = cost_per_call(PC * Weight).
+
+:- func sum_costs(float, cost, float, cost) = cost.
+
+sum_costs(CallsA, CostA, CallsB, CostB) = cost_total(Sum) :-
+    Sum = CostTotalA + CostTotalB,
+    CostTotalA = cost_get_total(CallsA, CostA),
+    CostTotalB = cost_get_total(CallsB, CostB).
+
 %----------------------------------------------------------------------------%
 
 :- type recursion_depth
Index: deep_profiler/message.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/message.m,v
retrieving revision 1.8
diff -u -p -b -r1.8 message.m
--- deep_profiler/message.m	7 Oct 2010 02:38:09 -0000	1.8
+++ deep_profiler/message.m	14 Oct 2010 03:58:50 -0000
@@ -139,11 +139,11 @@
                 %
     ;       warning_cannot_compute_cost_of_recursive_calls(string)
             
-                % Couldn't compute the time at which a call site's argument is
-                % produced or consumed.
+                % Couldn't compute the time at which a variable is produced or
+                % consumed.
                 %
                 % The parameter contains extra information about this error.
-    ;       warning_cannot_compute_arg_first_use_time(string)
+    ;       warning_cannot_compute_first_use_time(string)
 
                 % We don't yet handle clique_proc_reports with multiple proc
                 % dynamics.
@@ -281,7 +281,7 @@ message_type_to_level(warning_cannot_com
     message_warning.
 message_type_to_level(warning_cannot_compute_cost_of_recursive_calls(_)) = 
     message_warning.
-message_type_to_level(warning_cannot_compute_arg_first_use_time(_)) = 
+message_type_to_level(warning_cannot_compute_first_use_time(_)) = 
     message_warning.
 message_type_to_level(error_extra_proc_dynamics_in_clique_proc) = 
     message_error.
@@ -351,9 +351,9 @@ message_type_to_string(MessageType) = Co
             Template = "Cannot compute cost of recursive calls: %s"
         ;
             MessageType = 
-                warning_cannot_compute_arg_first_use_time(ErrorStr),
+                warning_cannot_compute_first_use_time(ErrorStr),
             Template = "Cannot compute the production or consumption time of a"
-                ++ " call site's argument: %s"
+                ++ " variable: %s"
         ),
         string.format(Template, [s(ErrorStr)], String)
     ),
Index: deep_profiler/program_representation_utils.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/program_representation_utils.m,v
retrieving revision 1.25
diff -u -p -b -r1.25 program_representation_utils.m
--- deep_profiler/program_representation_utils.m	4 Aug 2010 02:25:02 -0000	1.25
+++ deep_profiler/program_representation_utils.m	14 Oct 2010 03:58:50 -0000
@@ -86,7 +86,7 @@
 
     % Build the initial inst for a procedure.
     %
-:- func initial_inst_map(proc_defn_rep) = inst_map.
+:- func initial_inst_map(proc_defn_rep(T)) = inst_map.
 
     % inst_map_ground_vars(Vars, DepVars, !InstMap, SeenDuplicateInstantiaton).
     %
@@ -167,9 +167,17 @@
 
 %----------------------------------------------------------------------------%
 
+:- type atomic_goal_is_call
+    --->    atomic_goal_is_call(list(var_rep))
+    ;       atomic_goal_is_trivial.
+
+:- pred atomic_goal_is_call(atomic_goal_rep::in, atomic_goal_is_call::out) 
+    is det.
+
+%----------------------------------------------------------------------------%
+
 :- implementation.
 
-% :- import_module create_report.
 :- import_module mdbcomp.prim_data.
 
 :- import_module array.
@@ -918,6 +926,30 @@ merge_seen_duplicate_instantiation(A, B)
 
 %----------------------------------------------------------------------------%
 
+atomic_goal_is_call(AtomicGoal, IsCall) :-
+    (
+        ( AtomicGoal = unify_construct_rep(_, _, _)
+        ; AtomicGoal = unify_deconstruct_rep(_, _, _)
+        ; AtomicGoal = partial_construct_rep(_, _, _)
+        ; AtomicGoal = partial_deconstruct_rep(_, _, _)
+        ; AtomicGoal = unify_assign_rep(_, _)
+        ; AtomicGoal = cast_rep(_, _)
+        ; AtomicGoal = unify_simple_test_rep(_, _)
+        ; AtomicGoal = pragma_foreign_code_rep(_)
+        ; AtomicGoal = builtin_call_rep(_, _, _)
+        ; AtomicGoal = event_call_rep(_, _)
+        ),
+        IsCall = atomic_goal_is_trivial
+    ;
+        ( AtomicGoal = higher_order_call_rep(_, Args)
+        ; AtomicGoal = method_call_rep(_, _, Args)
+        ; AtomicGoal = plain_call_rep(_, _, Args)
+        ),
+        IsCall = atomic_goal_is_call(Args)
+    ).
+
+%----------------------------------------------------------------------------%
+
 :- func this_file = string.
 
 this_file = "program_representation_utils: ".
Index: deep_profiler/var_use_analysis.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/var_use_analysis.m,v
retrieving revision 1.6
diff -u -p -b -r1.6 var_use_analysis.m
--- deep_profiler/var_use_analysis.m	10 Oct 2010 04:19:53 -0000	1.6
+++ deep_profiler/var_use_analysis.m	14 Oct 2010 03:58:50 -0000
@@ -17,13 +17,16 @@
 
 :- interface.
 
+:- import_module analysis_utils.
 :- import_module mdbcomp.
 :- import_module mdbcomp.program_representation.
+:- import_module coverage.
 :- import_module measurements.
 :- import_module profile.
 :- import_module report.
 
 :- import_module list.
+:- import_module map.
 :- import_module maybe.
 :- import_module set.
 
@@ -109,10 +112,25 @@
 
 %-----------------------------------------------------------------------------%
 
+:- typeclass goal_annotation_with_coverage(T) where [
+        (func get_coverage(goal_rep(T)) = coverage_info)
+    ].
+
+:- instance goal_annotation_with_coverage(coverage_info).
+
+    % Find the first use of a variable in an arbitrary goal.
+    %
+:- pred var_first_use(deep::in, clique_ptr::in,
+    map(goal_path, cost_and_callees)::in, map(goal_path, cs_cost_csq)::in, 
+    recursion_type::in(recursion_type_known_costs), recursion_depth::in,
+    goal_rep(T)::in, goal_path::in, float::in, var_rep::in, 
+    var_use_type::in, var_use_info::out) is det 
+    <= goal_annotation_with_coverage(T).
+
+%-----------------------------------------------------------------------------%
+
 :- implementation.
 
-:- import_module analysis_utils.
-:- import_module coverage.
 :- import_module create_report.
 :- import_module program_representation_utils.
 :- import_module recursion_patterns.
@@ -120,7 +138,6 @@
 :- import_module float.
 :- import_module int.
 :- import_module io.
-:- import_module map.
 :- import_module require.
 :- import_module solutions.
 :- import_module string.
@@ -335,6 +352,8 @@ proc_dynamic_var_use_info(Deep, CliquePt
         MaybeVarUseInfo = error(Error)
     ).
 
+%----------------------------------------------------------------------------%
+
     % This type represents whether the first use of a variable has been found
     % or not. If it has then the call sequence counts since it was found is
     % stored in this type also.
@@ -382,12 +401,14 @@ proc_dynamic_var_use_info(Deep, CliquePt
     % follow call the calls seen during profiling and aggregate their variable
     % use information based on how often they are called from that call site.
     %
-:- pred goal_var_first_use(goal_path::in, goal_rep(coverage_info)::in,
+:- pred goal_var_first_use(goal_path::in, goal_rep(T)::in,
     var_first_use_static_info::in(var_first_use_static_info), float::in, 
-    float::out, found_first_use::out) is det.
+    float::out, found_first_use::out) is det 
+    <= goal_annotation_with_coverage(T).
 
 goal_var_first_use(GoalPath, Goal, StaticInfo, !CostSoFar, FoundFirstUse) :-
-    Goal = goal_rep(GoalExpr, Detism, Coverage),
+    Goal = goal_rep(GoalExpr, Detism, _),
+    Coverage = get_coverage(Goal),
     (
         % Do not bother exploring this goal if it is never entered.  Or never
         % finishes and we're looking for a production.
@@ -662,9 +683,10 @@ atomic_trivial_var_first_use(AtomicGoal,
     % an execution order, namely disjunctions and if-then-elses.
     %
 :- pred conj_var_first_use(goal_path::in, int::in,
-    list(goal_rep(coverage_info))::in, 
+    list(goal_rep(T))::in, 
     var_first_use_static_info::in(var_first_use_static_info),
-    float::in, float::out, found_first_use::out) is det.
+    float::in, float::out, found_first_use::out) is det 
+    <= goal_annotation_with_coverage(T).
 
 conj_var_first_use(_, _, [], _, !Cost, have_not_found_first_use).
 conj_var_first_use(GoalPath, ConjNum, [Conj | Conjs], StaticInfo, !CostSoFar,
@@ -686,9 +708,10 @@ conj_var_first_use(GoalPath, ConjNum, [C
         FoundFirstUse = TailFoundFirstUse
     ).
 
-:- pred disj_var_first_use(goal_path::in, list(goal_rep(coverage_info))::in,
+:- pred disj_var_first_use(goal_path::in, list(goal_rep(T))::in,
     detism_rep::in, var_first_use_static_info::in(var_first_use_static_info), 
-    float::in, float::out, found_first_use::out) is det.
+    float::in, float::out, found_first_use::out) is det
+    <= goal_annotation_with_coverage(T).
 
 disj_var_first_use(GoalPath, Disjuncts, Detism, StaticInfo,
         !CostSoFar, FoundFirstUse) :-
@@ -721,9 +744,10 @@ disj_var_first_use(GoalPath, Disjuncts, 
     ).
 
 :- pred disj_var_first_use_2(goal_path::in, int::in,
-    list(goal_rep(coverage_info))::in,
+    list(goal_rep(T))::in,
     var_first_use_static_info::in(var_first_use_static_info),
-    float::in, float::out, found_first_use::out) is det.
+    float::in, float::out, found_first_use::out) is det
+    <= goal_annotation_with_coverage(T).
 
 disj_var_first_use_2(_, _, [], _, !CostSoFar, have_not_found_first_use).
 disj_var_first_use_2(GoalPath, DisjNum, [Disj | Disjs], StaticInfo, !CostSoFar,
@@ -760,7 +784,7 @@ disj_var_first_use_2(GoalPath, DisjNum, 
             ),
             % Use a weighted average to reflect the likely success of the first
             % disjunct.
-            ( get_coverage_before(Disj ^ goal_annotation, HeadCount) ->
+            ( get_coverage_before(get_coverage(Disj), HeadCount) ->
                 HeadWeight = float(HeadCount)
             ;
                 error(this_file ++ " unknown coverage before disjunct")
@@ -770,7 +794,7 @@ disj_var_first_use_2(GoalPath, DisjNum, 
                 TailWeight = 0.0
             ;
                 Disjs = [FirstTailDisj | _],
-                FirstTailCoverage = FirstTailDisj ^ goal_annotation,
+                FirstTailCoverage = get_coverage(FirstTailDisj),
                 ( get_coverage_before(FirstTailCoverage, TailCount) ->
                     TailWeight = float(TailCount)
                 ;
@@ -784,9 +808,10 @@ disj_var_first_use_2(GoalPath, DisjNum, 
     ).
 
 :- pred switch_var_first_use(goal_path::in, var_rep::in,
-    list(case_rep(coverage_info))::in, 
+    list(case_rep(T))::in, 
     var_first_use_static_info::in(var_first_use_static_info),
-    float::in, float::out, found_first_use::out) is det.
+    float::in, float::out, found_first_use::out) is det
+    <= goal_annotation_with_coverage(T).
 
 switch_var_first_use(GoalPath, SwitchedOnVar, Cases, StaticInfo,
         CostBeforeSwitch, CostAfterSwitch, FoundFirstUse) :-
@@ -821,9 +846,9 @@ switch_var_first_use(GoalPath, SwitchedO
 
 :- pred switch_var_first_use_2(goal_path::in, int::in,
     var_first_use_static_info::in(var_first_use_static_info), 
-    list(case_rep(coverage_info))::in, list(float)::out, float::in, 
+    list(case_rep(T))::in, list(float)::out, float::in, 
     list(float)::out, list(found_first_use)::out)
-    is det.
+    is det <= goal_annotation_with_coverage(T).
 
 switch_var_first_use_2(_, _, _, [], [], _, [], []).
 switch_var_first_use_2(GoalPath, CaseNum, StaticInfo, [Case | Cases],
@@ -835,24 +860,23 @@ switch_var_first_use_2(GoalPath, CaseNum
     Case = case_rep(_, _, Goal),
     goal_var_first_use(CaseGoalPath, Goal, StaticInfo, Cost0, Cost,
         FoundFirstUse),
-    Goal = goal_rep(_, _, Coverage),
-    ( get_coverage_before(Coverage, BeforeCount) ->
+    ( get_coverage_before(get_coverage(Goal), BeforeCount) ->
         Weight = float(BeforeCount)
     ;
         error(this_file ++ "unknown coverage before switch case")
     ).
 
-:- pred ite_var_first_use(goal_path::in, goal_rep(coverage_info)::in,
-    goal_rep(coverage_info)::in, goal_rep(coverage_info)::in,
+:- pred ite_var_first_use(goal_path::in, 
+    goal_rep(T)::in, goal_rep(T)::in, goal_rep(T)::in,
     var_first_use_static_info::in(var_first_use_static_info),
     float::in, float::out, found_first_use::out)
-    is det.
+    is det <= goal_annotation_with_coverage(T).
 
 ite_var_first_use(GoalPath, Cond, Then, Else, StaticInfo,
         !CostSoFar, FoundFirstUse) :-
     (
-        get_coverage_before(Then ^ goal_annotation, CountBeforeThen),
-        get_coverage_before(Else ^ goal_annotation, CountBeforeElse)
+        get_coverage_before(get_coverage(Then), CountBeforeThen),
+        get_coverage_before(get_coverage(Else), CountBeforeElse)
     ->
         Weights = [float(CountBeforeThen), float(CountBeforeElse)]
     ;
@@ -930,9 +954,28 @@ goal_var_first_use_wrapper(Deep, CliqueP
             RecursiveCallSiteMap, Var, VarUseType, CallStack, RT,
             CurDepth),
         0.0, _Cost, FoundFirstUse),
+    found_first_use_to_use_info(FoundFirstUse, ProcCost, VarUseType,
+        VarUseInfo).
+
+:- instance goal_annotation_with_coverage(coverage_info) where [
+        (get_coverage(Goal) = Goal ^ goal_annotation)
+    ].
+
+var_first_use(Deep, CliquePtr, CallSiteMap, RecursiveCallSiteMap, RT, CurDepth,
+        Goal, GoalPath, Cost, Var, VarUseType, VarUseInfo) :-
+    goal_var_first_use(GoalPath, Goal,
+        var_first_use_static_info(Deep, CliquePtr, CallSiteMap,
+            RecursiveCallSiteMap, Var, VarUseType, set.init, RT, CurDepth), 
+        0.0, _, FoundFirstUse),
+    found_first_use_to_use_info(FoundFirstUse, Cost, VarUseType, VarUseInfo).
+
+:- pred found_first_use_to_use_info(found_first_use::in, float::in,
+    var_use_type::in, var_use_info::out) is det.
+
+found_first_use_to_use_info(FoundFirstUse, Cost, VarUseType, VarUseInfo) :-
     (
         FoundFirstUse = found_first_use(CostUntilUse),
-        VarUseInfo = var_use_info(CostUntilUse, ProcCost, VarUseType)
+        VarUseInfo = var_use_info(CostUntilUse, Cost, VarUseType)
     ;
         FoundFirstUse = have_not_found_first_use,
         % If the first use has not been found then:
@@ -946,10 +989,10 @@ goal_var_first_use_wrapper(Deep, CliqueP
                 ": Goal did not produce a variable that it should have")
         ;
             VarUseType = var_use_consumption,
-            VarUseInfo = var_use_info(ProcCost, ProcCost, VarUseType)
+            VarUseInfo = var_use_info(Cost, Cost, VarUseType)
         ;
             VarUseType = var_use_other,
-            pessimistic_var_use_info(VarUseType, ProcCost, VarUseInfo)
+            pessimistic_var_use_info(VarUseType, Cost, VarUseInfo)
         )
     ).
 
Index: library/list.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/list.m,v
retrieving revision 1.195
diff -u -p -b -r1.195 list.m
--- library/list.m	7 Oct 2010 02:38:10 -0000	1.195
+++ library/list.m	14 Oct 2010 03:58:50 -0000
@@ -1097,6 +1097,31 @@
 :- mode list.map2_foldl(pred(in, out, out, di, uo) is cc_multi, in, out, out,
     di, uo) is cc_multi.
 
+    % Same as list.map2_foldl, but with three mapped outputs.
+    %
+:- pred list.map3_foldl(pred(L, M, N, O, A, A), list(L), list(M), list(N),
+    list(O), A, A).
+:- mode list.map3_foldl(pred(in, out, out, out, in, out) is det, in, out, out,
+    out, in, out) is det.
+:- mode list.map3_foldl(pred(in, out, out, out, mdi, muo) is det, in, out, out,
+    out, mdi, muo) is det.
+:- mode list.map3_foldl(pred(in, out, out, out, di, uo) is det, in, out, out,
+    out, di, uo) is det.
+:- mode list.map3_foldl(pred(in, out, out, out, in, out) is semidet, in, out, 
+    out, out, in, out) is semidet.
+:- mode list.map3_foldl(pred(in, out, out, out, mdi, muo) is semidet, in, out,
+    out, out, mdi, muo) is semidet.
+:- mode list.map3_foldl(pred(in, out, out, out, di, uo) is semidet, in, out, 
+    out, out, di, uo) is semidet.
+:- mode list.map3_foldl(pred(in, out, out, out, in, out) is nondet, in, out, 
+    out, out, in, out) is nondet.
+:- mode list.map3_foldl(pred(in, out, out, out, mdi, muo) is nondet, in, out,
+    out, out, mdi, muo) is nondet.
+:- mode list.map3_foldl(pred(in, out, out, out, in, out) is cc_multi, in, out, 
+    out, out, in, out) is cc_multi.
+:- mode list.map3_foldl(pred(in, out, out, out, di, uo) is cc_multi, in, out, 
+    out, out, di, uo) is cc_multi.
+
     % Same as list.map_foldl, but with two accumulators.
     %
 :- pred list.map_foldl2(pred(L, M, A, A, B, B), list(L), list(M), A, A, B, B).
@@ -2416,6 +2441,11 @@ list.map2_foldl(P, [H0 | T0], [H1 | T1],
     P(H0, H1, H2, !A),
     list.map2_foldl(P, T0, T1, T2, !A).
 
+list.map3_foldl(_, [], [], [], [], !A).
+list.map3_foldl(P, [H0 | T0], [H1 | T1], [H2 | T2], [H3 | T3], !A) :-
+    P(H0, H1, H2, H3, !A),
+    list.map3_foldl(P, T0, T1, T2, T3, !A).
+
 list.map_foldl2(_, [], [], !A, !B).
 list.map_foldl2(P, [H0 | T0], [H | T], !A, !B) :-
     P(H0, H, !A, !B),
Index: mdbcomp/feedback.automatic_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/mdbcomp/feedback.automatic_parallelism.m,v
retrieving revision 1.3
diff -u -p -b -r1.3 feedback.automatic_parallelism.m
--- mdbcomp/feedback.automatic_parallelism.m	7 Oct 2010 02:38:10 -0000	1.3
+++ mdbcomp/feedback.automatic_parallelism.m	14 Oct 2010 03:58:50 -0000
@@ -140,14 +140,8 @@
                 % The path within the procedure to this conjunuction.
                 cpc_goal_path           :: goal_path_string,
                
-                % Used to locate the goals to be parallelised within the
-                % conjunction. Partitions are separated by non-atomic goals,
-                % the first partition has the number 1.
-                cpc_partition_number    :: int,
-
-                % The first conjunct number in the partition. This is only
-                % used for pretty-printing these reports with meaningful
-                % goal paths.
+                % The position within the original conjunction that this
+                % parallelisation starts.
                 cpc_first_conj_num      :: int,
 
                 cpc_is_dependent        :: conjuncts_are_dependent,
@@ -322,12 +316,12 @@ convert_candidate_par_conjunctions_proc(
     CPCProcB = candidate_par_conjunctions_proc(VarTable, CPCB).
 
 convert_candidate_par_conjunction(Conv, CPC0, CPC) :-
-    CPC0 = candidate_par_conjunction(GoalPath, PartNum, FirstGoalNum,
+    CPC0 = candidate_par_conjunction(GoalPath, FirstGoalNum,
         IsDependent, GoalsBefore0, Conjs0, GoalsAfter0, Metrics),
     map(convert_seq_conj(Conv), Conjs0, Conjs),
     map(Conv, GoalsBefore0, GoalsBefore),
     map(Conv, GoalsAfter0, GoalsAfter),
-    CPC = candidate_par_conjunction(GoalPath, PartNum, FirstGoalNum,
+    CPC = candidate_par_conjunction(GoalPath, FirstGoalNum,
         IsDependent, GoalsBefore, Conjs, GoalsAfter, Metrics).
 
 convert_seq_conj(Conv, seq_conj(Conjs0), seq_conj(Conjs)) :-
Index: mdbcomp/feedback.m
===================================================================
RCS file: /home/mercury1/repository/mercury/mdbcomp/feedback.m,v
retrieving revision 1.17
diff -u -p -b -r1.17 feedback.m
--- mdbcomp/feedback.m	24 Aug 2010 06:36:39 -0000	1.17
+++ mdbcomp/feedback.m	14 Oct 2010 03:58:50 -0000
@@ -535,7 +535,7 @@ feedback_first_line = "Mercury Compiler 
 
 :- func feedback_version = string.
 
-feedback_version = "12".
+feedback_version = "13".
 
 %-----------------------------------------------------------------------------%
 