[m-rev.] for review: Support implicit parallelism in the compiler.

Paul Bone pbone at csse.unimelb.edu.au
Fri Jan 23 16:47:18 AEDT 2009


For review by anyone.

I'm not comfortable with the name goal_path_consable for a goal path
type where the goal_path_steps are stored in order so that accesses to
the head of the list are efficient.


Estimated hours taken: 20 
Branches: main

Support implicit parallelism in the compiler.

The compiler now uses the deep profiler feedback information to build a
parallel version of a program.

Changes have also been made to the feedback format for candidate parallel
conjunctions and the analysis that recommends opportunities for parallelism to
the compiler.

compiler/implicit_parallelism.m:
	Mark Tannier's implementation as deprecated (it also crashes the
	compiler).
	Introduce new implicit parallelism transformation.
	apply_implicit_parallelism_transformation now returns maybe_error rather
	than maybe so that errors can be described.

compiler/goal_util.m:
	Add a predicate to transform a goal referenced by a goal path within a
	larger goal structure and rebuild that structure.

compiler/mercury_compile.m:
	Conform to changes in implicit_parallelism.m.

deep_profiler/mdprof_feedback.m:
	Return a cord of warnings from many predicates.  These warnings describe
	cases where parallelism might be profitable but it is not (yet) possible
	to transform the code into parallel code.
	Fix a bug whereby the wrong deep profiling statistic was used to calculate
	the cost of a call.
	Do not attempt to parallelise calls with other goals between them.

mdbcomp/feedback.m:
	Remove the intermediate goals information from the candidate parallel
	conjunctions feedback data.

mdbcomp/program_representation.m:
	Provide an in-order alternative to the goal_path type so that operations on
	the start of the goal path occur in constant time and goal_path itself
	remains usable as a key in arrays because it doesn't use the cord type
	internally.
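
To make the goal_util.m and program_representation.m entries above easier to
review, here is a minimal standalone sketch of the idea: follow a goal path
whose steps are stored outermost first (so removing the first step is cheap),
transform the goal found there, and rebuild the enclosing goals on the way
back.  The goal and step types, transform_at_path and the module name are
hypothetical simplifications for illustration only; they are not the
compiler's hlds_goal or goal_path_consable types.

```mercury
:- module path_transform_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.
:- import_module maybe.

    % A toy goal tree (hypothetical): leaves stand in for atomic goals.
:- type goal
    --->    leaf(string)
    ;       conj(list(goal))
    ;       neg(goal).

    % Hypothetical goal path steps; conjunct numbers are 1-based.
:- type step
    --->    step_conj(int)
    ;       step_neg.

    % Follow the steps (outermost first) down from Goal0, apply Transform
    % to the goal found there, and rebuild the enclosing goals.  The result
    % is `no' if the path does not match the goal structure or if Transform
    % returns `no'.
    %
:- pred transform_at_path(pred(goal, maybe(goal))::in(pred(in, out) is det),
    list(step)::in, goal::in, maybe(goal)::out) is det.

transform_at_path(Transform, [], Goal0, MaybeGoal) :-
    Transform(Goal0, MaybeGoal).
transform_at_path(Transform, [Step | Steps], Goal0, MaybeGoal) :-
    (
        Goal0 = conj(Conjs0),
        Step = step_conj(N),
        list.index1(Conjs0, N, Conj0)
    ->
        transform_at_path(Transform, Steps, Conj0, MaybeConj),
        (
            MaybeConj = yes(Conj),
            list.replace_nth(Conjs0, N, Conj, Conjs)
        ->
            MaybeGoal = yes(conj(Conjs))
        ;
            MaybeGoal = no
        )
    ;
        Goal0 = neg(SubGoal0),
        Step = step_neg
    ->
        transform_at_path(Transform, Steps, SubGoal0, MaybeSubGoal),
        (
            MaybeSubGoal = yes(SubGoal),
            MaybeGoal = yes(neg(SubGoal))
        ;
            MaybeSubGoal = no,
            MaybeGoal = no
        )
    ;
        % The step does not match the shape of this goal.
        MaybeGoal = no
    ).

main(!IO) :-
    Goal0 = conj([leaf("a"), neg(conj([leaf("b"), leaf("c")]))]),
    Rename = (pred(G0::in, MaybeG::out) is det :-
        ( G0 = leaf(_) ->
            MaybeG = yes(leaf("renamed"))
        ;
            MaybeG = no
        )
    ),
    % Second conjunct, under the negation, first conjunct: leaf("b").
    transform_at_path(Rename, [step_conj(2), step_neg, step_conj(1)],
        Goal0, MaybeGoal),
    io.write(MaybeGoal, !IO),
    io.nl(!IO).
```

In this sketch a path that runs past a leaf, or names a step that does not
match the goal, yields `no' and leaves the caller's original goal untouched.
The real predicate in the diff reports such mismatches the same way.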


Index: NEWS
===================================================================
RCS file: /home/mercury1/repository/mercury/NEWS,v
retrieving revision 1.499
diff -u -p -b -r1.499 NEWS
--- NEWS	12 Jan 2009 02:28:45 -0000	1.499
+++ NEWS	23 Jan 2009 04:25:44 -0000
@@ -100,6 +100,7 @@ Changes to the Mercury standard library:
 	list.foldl4_corresponding3/12
 	list.split_upto/4
 	list.contains/2
+	list.find_index_of_match/4
 
    We have also added versions of list.foldl/4 and list.foldr/4 that have
    determinism multi.
@@ -119,6 +120,11 @@ Changes to the Mercury standard library:
 	bag.member/2
 	bag.member/3
 
+* A unique mode has been added to cord.foldl_pred/4
+
+* The following function has been added to the pqueue module
+	pqueue.length/1
+
 * We have changed the interface of the ops module to make lookups of operators
   more efficient.
 
Index: compiler/goal_util.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/goal_util.m,v
retrieving revision 1.162
diff -u -p -b -r1.162 goal_util.m
--- compiler/goal_util.m	12 Jan 2009 02:01:02 -0000	1.162
+++ compiler/goal_util.m	23 Jan 2009 03:37:58 -0000
@@ -26,6 +26,7 @@
 :- import_module hlds.instmap.
 :- import_module hlds.pred_table.
 :- import_module mdbcomp.prim_data.
+:- import_module mdbcomp.program_representation.
 :- import_module parse_tree.prog_data.
 
 :- import_module assoc_list.
@@ -391,6 +392,18 @@
 :- func maybe_strip_equality_pretest(hlds_goal) = hlds_goal.
 
 %-----------------------------------------------------------------------------%
+
+    % Locate the goal described by the goal path, apply the higher order
+    % argument to transform that goal, then rebuild the enclosing goal tree and
+    % return it.  If the goal is not found the result is no.  If the result of
+    % the higher order argument is no then the result is no.
+    %
+:- pred maybe_transform_goal_at_goal_path(pred(hlds_goal, maybe(hlds_goal)),
+    goal_path_consable, hlds_goal, maybe(hlds_goal)). 
+:- mode maybe_transform_goal_at_goal_path(pred(in, out) is det,
+    in, in, out) is det.
+
+%-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
 :- implementation.
@@ -1843,6 +1856,165 @@ maybe_strip_equality_pretest_case(Case0)
 
 %-----------------------------------------------------------------------------%
 
+maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath0, Goal0,
+        MaybeGoal) :-
+    (
+        goal_path_consable_remove_first(TargetGoalPath0, Step, TargetGoalPath)
+    ->
+        GoalExpr0 = Goal0 ^ hlds_goal_expr,  
+        (
+            ( GoalExpr0 = unify(_, _, _, _, _) 
+            ; GoalExpr0 = plain_call(_, _, _, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
+            ),
+            % This search should never reach an atomic goal.
+            MaybeGoalExpr = no
+        ;
+            GoalExpr0 = conj(ConjType, Conjs0),
+            (
+                Step = step_conj(ConjNum),
+                list.index1(Conjs0, ConjNum, Conj0)
+            ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    Conj0, MaybeConj),
+                (
+                    MaybeConj = yes(Conj),
+                    list.replace_nth_det(Conjs0, ConjNum, Conj, Conjs),
+                    MaybeGoalExpr = yes(conj(ConjType, Conjs))
+                ;
+                    MaybeConj = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = disj(Disjs0),
+            (
+                Step = step_disj(DisjNum),
+                list.index1(Disjs0, DisjNum, Disj0)
+            ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    Disj0, MaybeDisj),
+                (
+                    MaybeDisj = yes(Disj),
+                    list.replace_nth_det(Disjs0, DisjNum, Disj, Disjs),
+                    MaybeGoalExpr = yes(disj(Disjs))
+                ;
+                    MaybeDisj = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = switch(Var, CanFail, Cases0),
+            (
+                Step = step_switch(CaseNum, _MaybeNumConstructors),
+                list.index1(Cases0, CaseNum, Case0)
+            ->
+                CaseGoal0 = Case0 ^ case_goal,
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    CaseGoal0, MaybeCaseGoal),
+                (
+                    MaybeCaseGoal = yes(CaseGoal),
+                    Case = Case0 ^ case_goal := CaseGoal,
+                    list.replace_nth_det(Cases0, CaseNum, Case, Cases),
+                    MaybeGoalExpr = yes(switch(Var, CanFail, Cases))
+                ;
+                    MaybeCaseGoal = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = negation(SubGoal0),
+            ( Step = step_neg ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    SubGoal0, MaybeSubGoal),
+                (
+                    MaybeSubGoal = yes(SubGoal),
+                    MaybeGoalExpr = yes(negation(SubGoal))
+                ;
+                    MaybeSubGoal = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = scope(Reason, SubGoal0),
+            ( Step = step_scope(_MaybeCut) ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    SubGoal0, MaybeSubGoal),
+                (
+                    MaybeSubGoal = yes(SubGoal),
+                    MaybeGoalExpr = yes(scope(Reason, SubGoal))
+                ;
+                    MaybeSubGoal = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = if_then_else(ExistVars, Cond0, Then0, Else0),
+            ( Step = step_ite_cond ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    Cond0, MaybeCond),
+                (
+                    MaybeCond = yes(Cond),
+                    MaybeGoalExpr = yes(if_then_else(ExistVars, Cond, Then0,
+                        Else0))
+                ;
+                    MaybeCond = no,
+                    MaybeGoalExpr = no
+                )
+            ; Step = step_ite_then ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    Then0, MaybeThen),
+                (
+                    MaybeThen = yes(Then),
+                    MaybeGoalExpr = yes(if_then_else(ExistVars, Cond0, Then,
+                        Else0))
+                ;
+                    MaybeThen = no,
+                    MaybeGoalExpr = no
+                )
+            ; Step = step_ite_else ->
+                maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
+                    Else0, MaybeElse),
+                (
+                    MaybeElse = yes(Else),
+                    MaybeGoalExpr = yes(if_then_else(ExistVars, Cond0, Then0,
+                        Else))
+                ;
+                    MaybeElse = no,
+                    MaybeGoalExpr = no
+                )
+            ;
+                MaybeGoalExpr = no
+            )
+        ;
+            GoalExpr0 = shorthand(_),
+            unexpected(this_file, 
+                "Shorthand goals should have been eliminated already")
+        ),
+        (
+            MaybeGoalExpr = yes(GoalExpr),
+            MaybeGoal = yes(Goal0 ^ hlds_goal_expr := GoalExpr)
+        ;
+            MaybeGoalExpr = no,
+            MaybeGoal = no
+        )
+    ;
+        TransformP(Goal0, MaybeGoal)
+    ).
+
+%-----------------------------------------------------------------------------%
+
 :- func this_file = string.
 
 this_file = "goal_util.m".
Index: compiler/implicit_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/implicit_parallelism.m,v
retrieving revision 1.10
diff -u -p -b -r1.10 implicit_parallelism.m
--- compiler/implicit_parallelism.m	23 Dec 2008 01:37:34 -0000	1.10
+++ compiler/implicit_parallelism.m	23 Jan 2009 05:32:59 -0000
@@ -14,15 +14,6 @@
 % worthwhile (implicit parallelism). It deals with both independent and
 % dependent parallelism.
 %
-% TODO
-%   -   Once a call which is a candidate for implicit parallelism is found,
-%       search forward AND backward for the closest goal which is also a
-%       candidate for implicit parallelism/parallel conjunction and determine
-%       which side is the best (on the basis of the number of shared variables).
-%
-% XXX Several predicates in this module repeatedly add goals to the ends of
-% lists of goals, yielding quadratic behavior. This should be fixed.
-%
 %-----------------------------------------------------------------------------%
 
 :- module transform_hlds.implicit_parallelism.
@@ -40,7 +31,7 @@
     % feedback file.
     %
 :- pred apply_implicit_parallelism_transformation(
-    module_info::in, maybe(module_info)::out) is det.
+    module_info::in, maybe_error(module_info)::out) is det.
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
@@ -53,9 +44,11 @@
 :- import_module hlds.hlds_goal.
 :- import_module hlds.hlds_pred.
 :- import_module hlds.instmap.
+:- import_module hlds.pred_table.
 :- import_module hlds.quantification.
 :- import_module libs.compiler_util.
 :- import_module libs.globals.
+:- import_module libs.options.
 :- import_module mdbcomp.feedback.
 :- import_module mdbcomp.prim_data.
 :- import_module mdbcomp.program_representation.
@@ -63,18 +56,374 @@
 :- import_module parse_tree.prog_data.
 :- import_module transform_hlds.dep_par_conj.
 
+:- import_module assoc_list.
 :- import_module bool.
 :- import_module char.
 :- import_module counter.
 :- import_module int.
 :- import_module list.
+:- import_module map.
 :- import_module pair.
 :- import_module require.
 :- import_module set.
 :- import_module string.
+:- import_module svmap.
+:- import_module varset.
+
+%-----------------------------------------------------------------------------%
+
+apply_implicit_parallelism_transformation(ModuleInfo0, MaybeModuleInfo) :-
+    module_info_get_globals(ModuleInfo0, Globals),
+    lookup_bool_option(Globals, old_implicit_parallelism,
+        UseOldImplicitParallelism), 
+    (
+        UseOldImplicitParallelism = yes,
+        apply_old_implicit_parallelism_transformation(ModuleInfo0,
+            MaybeModuleInfo)
+    ;
+        UseOldImplicitParallelism = no,
+        apply_new_implicit_parallelism_transformation(ModuleInfo0,
+            MaybeModuleInfo)
+    ).
 
 %-----------------------------------------------------------------------------%
 
+    % This type is used to track whether parallelism has been introduced by a
+    % predicate.
+    %
+:- type introduced_parallelism
+    --->    have_not_introduced_parallelism
+    ;       introduced_parallelism.
+
+:- pred apply_new_implicit_parallelism_transformation(module_info::in,
+    maybe_error(module_info)::out) is det.
+
+apply_new_implicit_parallelism_transformation(ModuleInfo0, MaybeModuleInfo) :-
+    module_info_get_globals(ModuleInfo0, Globals0),
+    globals.get_feedback_info(Globals0, FeedbackInfo),
+    module_info_get_name(ModuleInfo0, ModuleName),
+    (
+        get_implicit_parallelism_feedback(ModuleName, FeedbackInfo,
+            ParallelismInfo)
+    ->
+        some [!ModuleInfo]
+        (
+            !:ModuleInfo = ModuleInfo0,
+            
+            % Retrieve and process predicates.
+            module_info_predids(PredIds, !ModuleInfo),
+            module_info_get_predicate_table(!.ModuleInfo, PredTable0),
+            predicate_table_get_preds(PredTable0, PredMap0),
+            list.foldl2(maybe_parallelise_pred(!.ModuleInfo, ParallelismInfo), 
+                PredIds, PredMap0, PredMap, 
+                have_not_introduced_parallelism, IntroducedParallelism),
+            (
+                IntroducedParallelism = have_not_introduced_parallelism
+            ;
+                IntroducedParallelism = introduced_parallelism,
+                predicate_table_set_preds(PredMap, PredTable0, PredTable),
+                module_info_set_predicate_table(PredTable, !ModuleInfo),
+                module_info_set_contains_par_conj(!ModuleInfo)
+            ),
+            MaybeModuleInfo = ok(!.ModuleInfo)
+        )
+    ;
+        MaybeModuleInfo =
+            error("Insufficient feedback information for implicit parallelism") 
+    ).
+
+    % Information retrieved from the feedback system to be used for
+    % parallelising this module.
+    %
+:- type parallelism_info
+    --->    parallelism_info(
+                pi_desired_parallelism  :: float,
+                    % The number of desired busy sparks.
+
+                pi_sparking_cost        :: int,
+                    % The cost of creating a spark in call sequence counts.
+
+                pi_locking_cost         :: int,
+                    % The cost of maintaining a lock on a single dependent
+                    % variable in call sequence counts.
+
+                pi_cpc_map              :: module_candidate_par_conjs_map
+                    % A map of candidate parallel conjunctions in this module
+                    % indexed by their procedure.
+            ).
+
+:- type intra_module_proc_label
+    --->    intra_module_proc_label(
+                im_pred_name            :: string,
+                im_arity                :: int,
+                im_pred_or_func         :: pred_or_func,
+                im_mode                 :: int
+            ).
+
+    % A map of the candidate parallel conjunctions indexed by the procedure
+    % label for a given module.
+    %
+:- type module_candidate_par_conjs_map
+    == map(intra_module_proc_label, candidate_par_conjunction).
+
+:- pred get_implicit_parallelism_feedback(module_name::in, feedback_info::in,
+    parallelism_info::out) is semidet.
+
+get_implicit_parallelism_feedback(ModuleName, FeedbackInfo, ParallelismInfo) :-
+    FeedbackData = 
+        feedback_data_candidate_parallel_conjunctions(_, _, _, _),
+    get_feedback_data(FeedbackInfo, FeedbackData),
+    FeedbackData = feedback_data_candidate_parallel_conjunctions(
+        DesiredParallelism, SparkingCost, LockingCost, AssocList), 
+    make_module_candidate_par_conjs_map(ModuleName, AssocList,
+        CandidateParConjsMap),
+    ParallelismInfo = parallelism_info(DesiredParallelism, SparkingCost,
+        LockingCost, CandidateParConjsMap).
+
+:- pred make_module_candidate_par_conjs_map(module_name::in,
+    assoc_list(string_proc_label, candidate_par_conjunction)::in,
+    module_candidate_par_conjs_map::out) is det.
+
+make_module_candidate_par_conjs_map(ModuleName,
+        CandidateParConjsAssocList0, CandidateParConjsMap) :-
+    ModuleNameStr = sym_name_to_string(ModuleName),
+    filter_map(cpc_proc_is_in_module(ModuleNameStr),
+        CandidateParConjsAssocList0, CandidateParConjsAssocList),
+    CandidateParConjsMap = map.from_assoc_list(CandidateParConjsAssocList).
+
+:- pred cpc_proc_is_in_module(string::in, 
+    pair(string_proc_label, candidate_par_conjunction)::in,
+    pair(intra_module_proc_label, candidate_par_conjunction)::out) is semidet.
+
+cpc_proc_is_in_module(ModuleName, ProcLabel - CPC, IMProcLabel - CPC) :-
+    ( 
+        ProcLabel = str_ordinary_proc_label(PredOrFunc, _, DefModule, Name,
+            Arity, Mode)
+    ; 
+        ProcLabel = str_special_proc_label(_, _, DefModule, Name, Arity, Mode),
+        PredOrFunc = pf_predicate 
+    ),
+    ModuleName = DefModule,
+    IMProcLabel = intra_module_proc_label(Name, Arity, PredOrFunc, Mode).
+
+%-----------------------------------------------------------------------------%
+
+:- pred maybe_parallelise_pred(module_info::in, parallelism_info::in, 
+    pred_id::in, pred_table::in, pred_table::out,
+    introduced_parallelism::in, introduced_parallelism::out) is det.
+
+maybe_parallelise_pred(ModuleInfo, ParallelismInfo, PredId, !PredTable, 
+        !IntroducedParallelism) :-
+    map.lookup(!.PredTable, PredId, PredInfo0),
+    ProcIds = pred_info_non_imported_procids(PredInfo0),
+    pred_info_get_procedures(PredInfo0, ProcTable0),
+    list.foldl2(maybe_parallelise_proc(ModuleInfo, ParallelismInfo, PredId),
+        ProcIds, ProcTable0, ProcTable, have_not_introduced_parallelism,
+        ProcIntroducedParallelism),
+    (
+        ProcIntroducedParallelism = have_not_introduced_parallelism
+    ;
+        ProcIntroducedParallelism = introduced_parallelism,
+        !:IntroducedParallelism = introduced_parallelism,
+        pred_info_set_procedures(ProcTable, PredInfo0, PredInfo),
+        svmap.det_update(PredId, PredInfo, !PredTable)
+    ).
+
+:- pred maybe_parallelise_proc(module_info::in, parallelism_info::in, 
+    pred_id::in, proc_id::in, proc_table::in, proc_table::out, 
+    introduced_parallelism::in, introduced_parallelism::out) is det.
+
+maybe_parallelise_proc(ModuleInfo, ParallelismInfo, PredId, ProcId, !ProcTable,
+        !IntroducedParallelism) :-
+    module_info_pred_proc_info(ModuleInfo, PredId, ProcId, 
+        PredInfo, ProcInfo0),
+    
+    % Lookup the Candidate Parallel Conjunction (CPC) Map for this procedure.
+    Name = pred_info_name(PredInfo),
+    Arity = pred_info_orig_arity(PredInfo),
+    PredOrFunc = pred_info_is_pred_or_func(PredInfo),
+    Mode = proc_id_to_int(ProcId),
+    IMProcLabel = intra_module_proc_label(Name, Arity, PredOrFunc, Mode),
+    CPCMap = ParallelismInfo ^ pi_cpc_map,
+    ( map.search(CPCMap, IMProcLabel, CPC) ->
+        proc_info_get_goal(ProcInfo0, Goal0),
+        TargetGoalPathString = CPC ^ goal_path,
+        ( goal_path_from_string(TargetGoalPathString, TargetGoalPathPrime) ->
+            TargetGoalPath = TargetGoalPathPrime
+        ;
+            unexpected(this_file, 
+                "Invalid goal path in CPC Feedback Information")
+        ),
+        maybe_parallelise_goal(ProcInfo0, CPC, TargetGoalPath, 
+            Goal0, MaybeGoal),
+        (
+            MaybeGoal = yes(Goal),
+            % In the future we'll specialise the procedure for parallelism.  We
+            % don't do that now so simply replace the procedure's body.
+            proc_info_set_goal(Goal, ProcInfo0, ProcInfo1),
+            proc_info_set_has_parallel_conj(yes, ProcInfo1, ProcInfo),
+            svmap.det_update(ProcId, ProcInfo, !ProcTable),
+            !:IntroducedParallelism = introduced_parallelism
+        ;
+            MaybeGoal = no
+        )
+    ;
+        true 
+    ).
+
+    % maybe_parallelise_goal(ProcInfo, CPC, GoalPath, Goal, MaybeGoal).
+    %
+    % Parallelise a goal addressed by GoalPath within Goal producing MaybeGoal
+    % if found.  The goal to parallelise must be a conjunction with conjuncts
+    % matching those described in CPC.
+    %
+    % As this predicate recurses deeper into the goal tree GoalPath becomes
+    % smaller as goal path steps are popped off the end and followed.
+    %
+    % Try to parallelise the given conjunction within this goal.
+    %
+:- pred maybe_parallelise_goal(proc_info::in, candidate_par_conjunction::in,
+    goal_path::in, hlds_goal::in, maybe(hlds_goal)::out) is det.
+
+maybe_parallelise_goal(ProcInfo, CPC, TargetGoalPath, 
+        Goal0, MaybeGoal) :-
+    goal_path_consable(TargetGoalPath, TargetGoalPathC),
+    maybe_transform_goal_at_goal_path(maybe_parallelise_conj(ProcInfo, CPC),
+        TargetGoalPathC, Goal0, MaybeGoal).
+
+:- pred maybe_parallelise_conj(proc_info::in, candidate_par_conjunction::in,
+    hlds_goal::in, maybe(hlds_goal)::out) is det.
+
+maybe_parallelise_conj(ProcInfo, CPC, Goal0, MaybeGoal) :-
+    GoalExpr0 = Goal0 ^ hlds_goal_expr,
+    % We've reached the point indicated by the goal path.  Find the
+    % conjuncts that we wish to parallelise.
+    ( 
+        GoalExpr0 = conj(plain_conj, Conjs0),
+        index1_of_candidate_par_conjunct(ProcInfo, CPC ^ par_conjunct_a,
+            Conjs0, AIdx),
+        index1_of_candidate_par_conjunct(ProcInfo, CPC ^ par_conjunct_b,
+            Conjs0, BIdx),
+        AIdx \= BIdx,
+        % In the future there may be some goals between the two calls to
+        % parallelise.  The analysis will say how many of these goals should be
+        % run with each of the two calls, but it may be incorrect in cases
+        % where the code has changed.  Thus we need to ensure that the code is
+        % still mode-correct.
+        GoalA = list.det_index1(Conjs0, AIdx),
+        GoalB = list.det_index1(Conjs0, BIdx),
+        model_det_and_at_least_semipure(GoalA),
+        model_det_and_at_least_semipure(GoalB),
+        ( BIdx - AIdx = 1 ->
+            % The conjuncts are adjacent with GoalA occurring first.
+            FirstParGoal = GoalA,
+            SecondParGoal = GoalB,
+            MaxIdx = BIdx,
+            MinIdx = AIdx
+        ; AIdx - BIdx = 1 ->
+            FirstParGoal = GoalB,
+            SecondParGoal = GoalA,
+            MaxIdx = AIdx,
+            MinIdx = BIdx
+        ;
+            fail
+        )
+    ->
+        ParConjExprList = [FirstParGoal, SecondParGoal],
+        ParConjExpr = conj(parallel_conj, ParConjExprList),
+        goal_list_nonlocals(ParConjExprList, ParConjNonLocals),
+        goal_list_instmap_delta(ParConjExprList, ParConjInstmapDelta),
+        goal_list_determinism(ParConjExprList, ParConjDetism),
+        goal_list_purity(ParConjExprList, ParConjPurity),
+        goal_info_init(ParConjNonLocals, ParConjInstmapDelta, ParConjDetism,
+            ParConjPurity, ParConjInfo),
+        ParConj = hlds_goal(ParConjExpr, ParConjInfo),
+        (
+            take(MinIdx - 1, Conjs0, GoalsBeforeParPrime),
+            drop(MaxIdx, Conjs0, GoalsAfterParPrime)
+        ->
+            GoalsBeforePar = GoalsBeforeParPrime,
+            GoalsAfterPar = GoalsAfterParPrime
+        ;
+            unexpected(this_file, "Miscalculated conjunct list operations.")
+        ),
+        Conjs = GoalsBeforePar ++ [ ParConj | GoalsAfterPar ],
+        GoalExpr = conj(plain_conj, Conjs),
+        MaybeGoal = yes(hlds_goal(GoalExpr, Goal0 ^ hlds_goal_info))
+    ;
+        MaybeGoal = no
+    ).
+
+:- pred index1_of_candidate_par_conjunct(proc_info::in, 
+    candidate_par_conjunct::in, list(hlds_goal)::in, int::out) is semidet.
+
+index1_of_candidate_par_conjunct(ProcInfo, CPC, Goals, Index) :-
+    MaybeCallee = CPC ^ callee,
+    (
+        MaybeCallee = yes(Callee),
+        NamePt1 - NamePt2 = Callee,
+        MaybeName = yes(string_to_sym_name(NamePt1 ++ "." ++ NamePt2))
+    ;
+        MaybeCallee = no,
+        MaybeName = no
+    ),
+    CPC ^ vars = CPCArgs,
+    proc_info_get_varset(ProcInfo, VarSet),
+
+    find_index_of_match((pred(Goal::in) is semidet :-
+            Goal = hlds_goal(CallGoal, _),
+            % Some calls know the name of the callee, others don't.  Match
+            % whether the name is known, and what it is if known, against the
+            % candidate parallel conjunct.
+            % Note: we don't (yet) allow parallelisation of foreign code.
+            (
+                CallGoal = plain_call(_, _, Args, _, _, Name),
+                MaybeName = yes(Name)
+            ; 
+                CallGoal = generic_call(_, Args, _, _),
+                MaybeName = no
+            ),
+            % Match arguments which have user defined names in the profiled
+            % program against the names in the current program.
+            list.map(args_match(VarSet), Args, CPCArgs)
+        ), Goals, 1, Index).
+
+:- pred args_match(prog_varset::in, prog_var::in, maybe(string)::in) 
+    is semidet.
+
+args_match(_, _, no).
+args_match(VarSet, Var, yes(Name)) :-
+    varset.search_name(VarSet, Var, Name).
+
+:- pred model_det_and_at_least_semipure(hlds_goal::in) is semidet.
+
+model_det_and_at_least_semipure(Goal) :-
+    GoalInfo = Goal ^ hlds_goal_info,
+    Determinism = goal_info_get_determinism(GoalInfo),
+    ( Determinism = detism_det
+    ; Determinism = detism_cc_multi
+    ),
+    Purity = goal_info_get_purity(GoalInfo),
+    ( Purity = purity_pure
+    ; Purity = purity_semipure
+    ).
+
+%-----------------------------------------------------------------------------%
+%
+% The following code is deprecated.  It is the older implicit parallelisation
+% transformation developed by Jérôme.
+%
+% TODO
+%   -   Once a call which is a candidate for implicit parallelism is found,
+%       search forward AND backward for the closest goal which is also a
+%       candidate for implicit parallelism/parallel conjunction and determine
+%       which side is the best (on the basis of the number of shared variables).
+%
+% XXX Several predicates in this module repeatedly add goals to the ends of
+% lists of goals, yielding quadratic behavior. This should be fixed.
+%
+%-----------------------------------------------------------------------------%
+
     % Represent a call site static which is a candidate for introducing
     % implicit parallelism.
     %
@@ -107,7 +456,10 @@ construct_call_site_kind("callback",    
 
 %-----------------------------------------------------------------------------%
 
-apply_implicit_parallelism_transformation(ModuleInfo0, MaybeModuleInfo) :-
+:- pred apply_old_implicit_parallelism_transformation(
+    module_info::in, maybe_error(module_info)::out) is det.
+
+apply_old_implicit_parallelism_transformation(ModuleInfo0, MaybeModuleInfo) :-
     module_info_get_globals(ModuleInfo0, Globals),
     globals.get_feedback_info(Globals, FeedbackInfo),
     (
@@ -123,10 +475,11 @@ apply_implicit_parallelism_transformatio
             list.map(call_site_convert, Calls, CandidateCallSites),
             process_preds_for_implicit_parallelism(PredIds, CandidateCallSites,
                 !ModuleInfo),
-            MaybeModuleInfo = yes(!.ModuleInfo)
+            MaybeModuleInfo = ok(!.ModuleInfo)
         )
     ;
-        MaybeModuleInfo = no
+        MaybeModuleInfo =
+            error("Insufficient feedback information for implicit parallelism") 
     ).
 
     % This predicate isn't really necessary as this entire module should use
Index: compiler/mercury_compile.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/mercury_compile.m,v
retrieving revision 1.486
diff -u -p -b -r1.486 mercury_compile.m
--- compiler/mercury_compile.m	20 Jan 2009 06:24:02 -0000	1.486
+++ compiler/mercury_compile.m	23 Jan 2009 02:43:59 -0000
@@ -4170,12 +4170,10 @@ maybe_implicit_parallelism(Verbose, Stat
         maybe_flush_output(Verbose, !IO),
         apply_implicit_parallelism_transformation(!.HLDS, MaybeHLDS),
         (
-            MaybeHLDS = yes(!:HLDS)
+            MaybeHLDS = ok(!:HLDS)
         ;
-            MaybeHLDS = no,
-            io.write_string(
-                "Insufficiant feedback information for implicit parallelism.",
-                !IO),
+            MaybeHLDS = error(Error),
+            io.write_string(Error ++ "\n", !IO),
             io.set_exit_status(1, !IO)
         ),
         maybe_write_string(Verbose, "% done.\n", !IO),
Index: deep_profiler/mdprof_feedback.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/mdprof_feedback.m,v
retrieving revision 1.14
diff -u -p -b -r1.14 mdprof_feedback.m
--- deep_profiler/mdprof_feedback.m	5 Nov 2008 03:38:40 -0000	1.14
+++ deep_profiler/mdprof_feedback.m	23 Jan 2009 05:34:33 -0000
@@ -101,7 +101,7 @@ main(!IO) :-
                 (
                     FeedbackReadResult = ok(Feedback0),
                     process_deep_to_feedback(RequestedFeedbackInfo,
-                        Deep, Feedback0, Feedback),
+                        Deep, Warnings, Feedback0, Feedback),
                     write_feedback_file(OutputFileName, ProfileProgName,
                         Feedback, WriteResult, !IO),
                     (
@@ -115,6 +115,16 @@ main(!IO) :-
                         io.format(Stderr, "%s: %s\n",
                             [s(OutputFileName), s(ErrorMessage)], !IO),
                         io.set_exit_status(1, !IO)
+                    ),
+                    lookup_bool_option(Options, show_warnings, ShowWarnings),
+                    (
+                        ShowWarnings = yes,
+                        cord.foldl_pred(
+                            (pred(Warning::in, IO0::di, IO::uo) is det :-
+                                io.format("W: %s\n", [s(Warning)], IO0, IO)
+                            ), Warnings, !IO)
+                    ;
+                        ShowWarnings = no
                     )
                 ;
                     FeedbackReadResult = error(FeedbackReadError),
@@ -254,6 +264,7 @@ read_deep_file(Input, Verbose, MaybeDeep
     ;       program_name
     ;       verbose
     ;       version
+    ;       show_warnings
 
             % The calls above threshold sorted feedback information, this is
             % used for the old implicit parallelism implementation.
@@ -282,6 +293,7 @@ short('h',  help).
 short('p',  program_name).
 short('V',  verbose).
 short('v',  version).
+short('w',  show_warnings).
 
 :- pred long(string::in, option::out) is semidet.
 
@@ -289,6 +301,7 @@ long("help",                            
 long("verbose",                             verbose).
 long("version",                             version).
 long("program-name",                        program_name).
+long("show-warnings",                       show_warnings).
 
 long("calls-above-threshold-sorted",        calls_above_threshold_sorted).
 long("calls-above-threshold-sorted-measure",
@@ -312,6 +325,7 @@ defaults(help,              bool(no)).
 defaults(program_name,      string("")).
 defaults(verbose,           bool(no)).
 defaults(version,           bool(no)).
+defaults(show_warnings,     bool(no)).
 
 defaults(calls_above_threshold_sorted,                      bool(no)).
 defaults(calls_above_threshold_sorted_measure,              string("mean")).
@@ -468,15 +482,17 @@ set_option(Option, Value, !Options) :-
 
 %----------------------------------------------------------------------------%
 
-    % process_deep_to_feedback(RequestedFeedbackInfo, Deep, !Feedback)
+    % process_deep_to_feedback(RequestedFeedbackInfo, Deep, Warnings,
+    %   !Feedback)
     %
     % Process a deep profiling structure and update the feedback information
     % according to the RequestedFeedbackInfo parameter.
     %
 :- pred process_deep_to_feedback(requested_feedback_info::in, deep::in,
-    feedback_info::in, feedback_info::out) is det.
+    cord(string)::out, feedback_info::in, feedback_info::out) is det.
 
-process_deep_to_feedback(RequestedFeedbackInfo, Deep, !Feedback) :-
+process_deep_to_feedback(RequestedFeedbackInfo, Deep, WarningStrs, 
+        !Feedback) :-
     MaybeCallsAboveThresholdSortedOpts =
         RequestedFeedbackInfo ^ maybe_calls_above_threshold_sorted,
     (
@@ -494,11 +510,44 @@ process_deep_to_feedback(RequestedFeedba
         MaybeCandidateParallelConjunctionsOpts = 
             yes(CandidateParallelConjunctionsOpts),
         candidate_parallel_conjunctions(CandidateParallelConjunctionsOpts,
-            Deep, !Feedback)
+            Deep, Warnings, !Feedback)
     ;
-        MaybeCandidateParallelConjunctionsOpts = no
+        MaybeCandidateParallelConjunctionsOpts = no,
+        Warnings = cord.empty
+    ),
+    cord.map_pred(warning_to_string, Warnings, WarningStrs).
+
+%----------------------------------------------------------------------------%
+
+:- type warning
+    --->    warning(
+                warning_proc        :: string_proc_label,
+                warning_maybe_gp    :: maybe(goal_path),
+                warning_string      :: string
     ).
 
+:- pred warning_to_string(warning::in, string::out) is det.
+
+warning_to_string(warning(Proc, MaybeGoalPath, Warning), String) :-
+    ProcString = string(Proc), 
+    (
+        MaybeGoalPath = yes(GoalPath),
+        GoalPathString = goal_path_to_string(GoalPath),
+        string.format("In %s at %s: %s", 
+            [s(ProcString), s(GoalPathString), s(Warning)], String)
+    ;
+        MaybeGoalPath = no,
+        string.format("In %s: %s",
+            [s(ProcString), s(Warning)], String)
+    ).
+
+:- pred append_warning(string_proc_label::in, maybe(goal_path)::in, string::in,
+    cord(warning)::in, cord(warning)::out) is det.
+
+append_warning(StringProcLabel, MaybeGoalPath, WarningStr, !Warnings) :-
+    Warning = warning(StringProcLabel, MaybeGoalPath, WarningStr),
+    !:Warnings = cord.snoc(!.Warnings, Warning).
+
 %----------------------------------------------------------------------------%
 %
 % Build the candidate parallel conjunctions feedback information used for
@@ -512,10 +561,10 @@ process_deep_to_feedback(RequestedFeedba
 %
 
 :- pred candidate_parallel_conjunctions(
-    candidate_parallel_conjunctions_opts::in, deep::in,
+    candidate_parallel_conjunctions_opts::in, deep::in, cord(warning)::out,
     feedback_info::in, feedback_info::out) is det.
 
-candidate_parallel_conjunctions(Opts, Deep, !Feedback) :-
+candidate_parallel_conjunctions(Opts, Deep, Warnings, !Feedback) :-
     Opts = candidate_parallel_conjunctions_opts(DesiredParallelism,
         SparkingCost, LockingCost, ProcThreshold, _CallSiteThreshold), 
     % First retrieve the top procedures above the configured threshold.
@@ -544,10 +593,8 @@ candidate_parallel_conjunctions(Opts, De
     % Take the top procs list and look for conjunctions that can be
     % parallelised and give an estimated speedup when parallelised.  There may
     % be more than one opportunity for parallelism in any given procedure.
-    list.map(
-        candidate_parallel_conjunctions_proc(Opts, Deep), 
-        TopProcsList, Conjunctions0),
-    list.condense(Conjunctions0, Conjunctions),
+    candidate_parallel_conjunctions_procs(Opts, Deep, TopProcsList,
+        [], Conjunctions, cord.empty, Warnings),
 
     % XXX: Analysing the clique tree to reduce the amount of nested parallel
     % execution should be done here.
@@ -566,15 +613,35 @@ candidate_parallel_conjunctions(Opts, De
                 ipi_var_table   :: var_table
             ).
 
+:- pred candidate_parallel_conjunctions_procs(
+    candidate_parallel_conjunctions_opts::in, deep::in,
+    list(perf_row_data(proc_desc))::in,
+    assoc_list(string_proc_label, candidate_par_conjunction)::in,
+    assoc_list(string_proc_label, candidate_par_conjunction)::out,
+    cord(warning)::in, cord(warning)::out) is det.
+
+candidate_parallel_conjunctions_procs(_, _, [], !Candidates, !Warnings).
+candidate_parallel_conjunctions_procs(Opts, Deep, [PrefRowData | PrefRowDatas],
+        !Candidates, !Warnings) :-
+    candidate_parallel_conjunctions_proc(Opts, Deep, PrefRowData, Candidates,
+        Warnings),
+    % This partially reverses the list of candidates, but order shouldn't be
+    % important.
+    !:Candidates = Candidates ++ !.Candidates,
+    !:Warnings = !.Warnings ++ Warnings,
+    candidate_parallel_conjunctions_procs(Opts, Deep, PrefRowDatas,
+        !Candidates, !Warnings).
+
     % Find candidate parallel conjunctions within the given procedure.
     %
 :- pred candidate_parallel_conjunctions_proc(
     candidate_parallel_conjunctions_opts::in, deep::in,
     perf_row_data(proc_desc)::in,
-    assoc_list(string_proc_label, candidate_par_conjunction)::out) is det.
+    assoc_list(string_proc_label, candidate_par_conjunction)::out,
+    cord(warning)::out) is det.
 
-candidate_parallel_conjunctions_proc(Opts, Deep, PrefRowData, 
-        Candidates) :-
+candidate_parallel_conjunctions_proc(Opts, Deep, PrefRowData, Candidates,
+        Warnings) :-
     % Lookup the proc static to find the ProcLabel.
     PSPtr = PrefRowData ^ perf_row_subject ^ pdesc_ps_ptr,  
     deep_lookup_proc_statics(Deep, PSPtr, PS),
@@ -612,60 +679,64 @@ candidate_parallel_conjunctions_proc(Opt
         Info = implicit_parallelism_info(Deep, ProgRep, Opts, CallSitesMap,
             VarTable),
         goal_get_conjunctions_worth_parallelising(Goal, empty_goal_path, Info,
-            initial_inst_map(ProcDefnRep), _, Candidates0,
-            SeenDuplicateInstantiation, _),
-        (
-            SeenDuplicateInstantiation = seen_duplicate_instantiation,
-            Candidates = []
-        ;
-            SeenDuplicateInstantiation = have_not_seen_duplicate_instantiation,
+            ProcLabel, Candidates0, _, _,
+            Warnings, initial_inst_map(ProcDefnRep), _),
             list.map((pred(Candidate0::in, Candidate::out) is det :-
                     Candidate = (ProcLabel - Candidate0)
                 ), Candidates0, Candidates)
-        )
     ;
         % Builtin procedures cannot be found in the program representation, and
         % cannot be parallelised either.
-        Candidates = []
+        Candidates = [],
+        append_warning(ProcLabel, no, 
+            warning_cannot_lookup_proc_defn,
+            cord.empty, Warnings)
     ).
     
 :- pred goal_get_conjunctions_worth_parallelising(goal_rep::in, goal_path::in,
-    implicit_parallelism_info::in, inst_map::in, inst_map::out,
+    implicit_parallelism_info::in, string_proc_label::in, 
     list(candidate_par_conjunction)::out, seen_duplicate_instantiation::out,
-    maybe_call_conjunct::out) is det.
+    maybe_call_conjunct::out, cord(warning)::out, inst_map::in, inst_map::out) 
+    is det.
 
-goal_get_conjunctions_worth_parallelising(Goal, GoalPath, Info, !InstMap,
-        Candidates, SeenDuplicateInstantiation, MaybeCallConjunct) :-
+goal_get_conjunctions_worth_parallelising(Goal, GoalPath, Info, ProcLabel, 
+        Candidates, SeenDuplicateInstantiation, MaybeCallConjunct,
+        Warnings, !InstMap) :-
     Goal = goal_rep(GoalExpr, Detism, _),
     (
         (
             GoalExpr = conj_rep(Conjuncts),
             conj_get_conjunctions_worth_parallelising(Conjuncts, GoalPath,
-                Info, !InstMap, Candidates, SeenDuplicateInstantiation)
+                Info, ProcLabel, Candidates, SeenDuplicateInstantiation,
+                Warnings, !InstMap) 
         ;
             GoalExpr = disj_rep(Disjuncts),
             disj_get_conjunctions_worth_parallelising(Disjuncts, GoalPath, 1,
-                Info, !InstMap, Candidates, SeenDuplicateInstantiation)
+                Info, ProcLabel, Candidates, SeenDuplicateInstantiation,
+                Warnings, !InstMap)
         ;
             GoalExpr = switch_rep(_, _, Cases),
             switch_case_get_conjunctions_worth_parallelising(Cases, GoalPath, 1,
-                Info, !InstMap, Candidates, SeenDuplicateInstantiation)
+                Info, ProcLabel, Candidates, SeenDuplicateInstantiation,
+                Warnings, !InstMap)
         ;
             GoalExpr = ite_rep(Cond, Then, Else),
             ite_get_conjunctions_worth_parallelising(Cond, Then, Else,
-                GoalPath, Info, !InstMap, Candidates,
-                SeenDuplicateInstantiation)
+                GoalPath, Info, ProcLabel, Candidates,
+                SeenDuplicateInstantiation, Warnings, !InstMap)
         ;
             GoalExpr = scope_rep(SubGoal, MaybeCut),
             ScopeGoalPath = 
                 goal_path_add_at_end(GoalPath, step_scope(MaybeCut)),
             goal_get_conjunctions_worth_parallelising(SubGoal, ScopeGoalPath,
-                Info, !InstMap, Candidates, SeenDuplicateInstantiation, _) 
+                Info, ProcLabel, Candidates, SeenDuplicateInstantiation, _,
+                Warnings, !InstMap) 
         ;
             GoalExpr = negation_rep(SubGoal),
             NegGoalPath = goal_path_add_at_end(GoalPath, step_neg),
             goal_get_conjunctions_worth_parallelising(SubGoal, NegGoalPath,
-                Info, !InstMap, Candidates, SeenDuplicateInstantiation, _) 
+                Info, ProcLabel, Candidates, SeenDuplicateInstantiation, _,
+                Warnings, !InstMap) 
         ),
         % TODO: Parallelising conjunctions like 
         %   ( call1(A, B) , not call2(C, D) )
@@ -677,9 +748,11 @@ goal_get_conjunctions_worth_parallelisin
         GoalExpr = atomic_goal_rep(_, _, BoundVars, AtomicGoal),
         InstMapBeforeCall = !.InstMap,
         % The binding of a variable may depend on any number of other
-        % variables, and of course the variables they depend upon.  Except that
-        % variables involved in control flow (switched on vars, vars in ITE
-        % conds) are not recorded here, but should be for completeness.
+        % variables, and recursively the variables that those depended-on
+        % variables depend upon.  
+        % Variables involved in control flow (switched-on vars, vars in ITE
+        % conds) are not recorded here; however, this never comes up because
+        % for now we only consider atomic goals.
         atomic_goal_get_vars(AtomicGoal, AtomicGoalVars0),
         list.foldl((pred(Var::in, Set0::in, Set::out) is det :-
                 ( set.remove(Set0, Var, SetPrime) ->
@@ -692,88 +765,115 @@ goal_get_conjunctions_worth_parallelisin
             SeenDuplicateInstantiation),
         maybe_call_site_conjunct(Info, GoalPath, AtomicGoal, Detism,
             InstMapBeforeCall, !.InstMap, BoundVars, MaybeCallConjunct),
-        Candidates = []
+        Candidates = [],
+        Warnings = cord.empty
     ).
 
 :- pred conj_get_conjunctions_worth_parallelising(list(goal_rep)::in,
-    goal_path::in, implicit_parallelism_info::in, inst_map::in,
-    inst_map::out, list(candidate_par_conjunction)::out,
-    seen_duplicate_instantiation::out) is det.
+    goal_path::in, implicit_parallelism_info::in, string_proc_label::in,
+    list(candidate_par_conjunction)::out, seen_duplicate_instantiation::out, 
+    cord(warning)::out, inst_map::in, inst_map::out) is det.
 
 conj_get_conjunctions_worth_parallelising(Conjs, GoalPath, Info,
-        !InstMap, Candidates, SeenDuplicateInstantiation) :-
+        ProcLabel, Candidates, SeenDuplicateInstantiation, Warnings, 
+        !InstMap) :-
     % Note: it will be better to look at each pair of conjuncts, determine if
     % they are parallelisable (perhaps by placing middle goals in either of the
     % the parallel conjuncts to create the optimum amount of parallelism.  This
     % will need to have an in-order representation of goals, and for each
     % variable seen have a tree of variables it depends upon.
     %
-    % For now consider parallelising conjuncts that seperated only by other
+    % For now consider parallelising conjuncts that are separated only by other
     % atomic goals.
     conj_get_conjunctions_worth_parallelising_2(Conjs, GoalPath, 1, Info,
-        !InstMap, Candidates0, CallSiteConjuncts, 
-        SeenDuplicateInstantiation),
+        ProcLabel, Candidates0, CallSiteConjuncts, 
+        SeenDuplicateInstantiation, WarningsA, !InstMap),
+        
+    build_candidate_conjunctions(Info, !.InstMap, GoalPath, ProcLabel,
+        list(CallSiteConjuncts), WarningsB, pqueue.init, CandidatesQueue),
+    Warnings0 = WarningsA ++ WarningsB,
+    % Pick best candidate from queue.
     (
-        % Only perform analysis at this point if it's not going to be
-        % thrown away later due to unhandled use of partial instantiation.
         SeenDuplicateInstantiation = have_not_seen_duplicate_instantiation,
-        build_candidate_conjunctions(Info, !.InstMap, GoalPath,
-            list(CallSiteConjuncts), pqueue.init, CandidatesQueue),
-        % Pick best candidate from queue.
         ( pqueue.remove(CandidatesQueue, _, Candidate, _) ->
-            Candidates = [Candidate | Candidates0]
+            Candidates = [Candidate | Candidates0],
+            (
+                pqueue.length(CandidatesQueue) = Length,
+                Length > 0
+            ->
+                append_warning(ProcLabel, yes(GoalPath),
+                    warning_extra_callpairs_in_conjunction(Length), 
+                    Warnings0, Warnings)
         ;
-            Candidates = Candidates0
+                Warnings = Warnings0
+            )
+        ;
+            Candidates = Candidates0,
+            Warnings = Warnings0
         )
     ;
         SeenDuplicateInstantiation = seen_duplicate_instantiation,
-        Candidates = Candidates0
+        Candidates = Candidates0,
+        (
+            pqueue.length(CandidatesQueue) = Length,
+            Length >= 1 
+        ->
+            append_warning(ProcLabel, yes(GoalPath), 
+                warning_duplicate_instantiation(Length), 
+                Warnings0, Warnings)
+        ;
+            Warnings = Warnings0
+        )
     ). 
 
 :- pred conj_get_conjunctions_worth_parallelising_2(list(goal_rep)::in,
     goal_path::in, int::in, implicit_parallelism_info::in, 
-    inst_map::in, inst_map::out, list(candidate_par_conjunction)::out,
-    cord(maybe_call_conjunct)::out,
-    seen_duplicate_instantiation::out) is det.
+    string_proc_label::in, list(candidate_par_conjunction)::out,
+    cord(maybe_call_conjunct)::out, seen_duplicate_instantiation::out,
+    cord(warning)::out, inst_map::in, inst_map::out) is det.
 
-conj_get_conjunctions_worth_parallelising_2([], _, _, _, !InstMap, [], cord.empty, 
-        have_not_seen_duplicate_instantiation).
+conj_get_conjunctions_worth_parallelising_2([], _, _, _, _, [], cord.empty, 
+        have_not_seen_duplicate_instantiation, cord.empty, !InstMap).
 conj_get_conjunctions_worth_parallelising_2([Conj | Conjs], GoalPath,
-        ConjunctNum, Info, !InstMap, Candidates, CallSiteConjuncts,
-        SeenDuplicateInstantiation) :-
+        ConjunctNum, Info, ProcLabel, Candidates, CallSiteConjuncts,
+        SeenDuplicateInstantiation, Warnings, !InstMap) :-
     ConjGoalPath = goal_path_add_at_end(GoalPath, step_conj(ConjunctNum)),
     goal_get_conjunctions_worth_parallelising(Conj, ConjGoalPath, Info,
-        !InstMap, CandidatesHead, SeenDuplicateInstantiationHead,
-        MaybeCallSiteConjunct), 
+        ProcLabel, CandidatesHead, SeenDuplicateInstantiationHead,
+        MaybeCallSiteConjunct, WarningsHead, !InstMap), 
     
     conj_get_conjunctions_worth_parallelising_2(Conjs, GoalPath, ConjunctNum+1,
-        Info, !InstMap, CandidatesTail, CallSiteConjuncts0,
-        SeenDuplicateInstantiationTail),
+        Info, ProcLabel, CandidatesTail, CallSiteConjuncts0,
+        SeenDuplicateInstantiationTail, WarningsTail, !InstMap),
 
     Candidates = CandidatesHead ++ CandidatesTail,
+    Warnings = WarningsHead ++ WarningsTail,
     CallSiteConjuncts = cord.cons(MaybeCallSiteConjunct, CallSiteConjuncts0),
     SeenDuplicateInstantiation = merge_seen_duplicate_instantiation(
         SeenDuplicateInstantiationHead,
         SeenDuplicateInstantiationTail).
 
 :- pred disj_get_conjunctions_worth_parallelising(list(goal_rep)::in,
-    goal_path::in, int::in, implicit_parallelism_info::in, inst_map::in,
-    inst_map::out, list(candidate_par_conjunction)::out,
-    seen_duplicate_instantiation::out) is det.
+    goal_path::in, int::in, implicit_parallelism_info::in, 
+    string_proc_label::in, list(candidate_par_conjunction)::out,
+    seen_duplicate_instantiation::out, cord(warning)::out,
+    inst_map::in, inst_map::out) is det.
 
-disj_get_conjunctions_worth_parallelising([], _, _, _, !InstMap, [],
-    have_not_seen_duplicate_instantiation).
+disj_get_conjunctions_worth_parallelising([], _, _, _, _, [],
+    have_not_seen_duplicate_instantiation, cord.empty, !InstMap).
 disj_get_conjunctions_worth_parallelising([Disj | Disjs], GoalPath, DisjNum,
-        Info, InstMap0, InstMap, Candidates, SeenDuplicateInstantiation) :-
+        Info, ProcLabel, Candidates, SeenDuplicateInstantiation, 
+        Warnings, InstMap0, InstMap) :-
     DisjGoalPath = goal_path_add_at_end(GoalPath, step_disj(DisjNum)),
     HeadDetism = Disj ^ goal_detism_rep,
     goal_get_conjunctions_worth_parallelising(Disj, DisjGoalPath, Info,
-        InstMap0, HeadInstMap, HeadCandidates, HeadSeenDuplicateInstantiation, 
-        _MaybeCallConjunct),
+        ProcLabel, HeadCandidates, HeadSeenDuplicateInstantiation, 
+        _MaybeCallConjunct, HeadWarnings, InstMap0, HeadInstMap),
     disj_get_conjunctions_worth_parallelising(Disjs, GoalPath, DisjNum + 1,
-        Info, InstMap0, TailInstMap, TailCandidates,
-        TailSeenDuplicateInstantiation),
+        Info, ProcLabel, TailCandidates, TailSeenDuplicateInstantiation,
+        TailWarnings, InstMap0, TailInstMap),
     Candidates = HeadCandidates ++ TailCandidates,
+    Warnings = HeadWarnings ++ TailWarnings,
     % merge_inst_map requires the detism of goals that produce both inst maps,
     % we can create fake values that satisfy merge_inst_map easily.
     (
@@ -789,25 +889,27 @@ disj_get_conjunctions_worth_parallelisin
         TailSeenDuplicateInstantiation).
 
 :- pred switch_case_get_conjunctions_worth_parallelising(list(case_rep)::in,
-    goal_path::in, int::in, implicit_parallelism_info::in, inst_map::in,
-    inst_map::out, list(candidate_par_conjunction)::out,
-    seen_duplicate_instantiation::out) is det.
+    goal_path::in, int::in, implicit_parallelism_info::in,
+    string_proc_label::in, list(candidate_par_conjunction)::out,
+    seen_duplicate_instantiation::out, cord(warning)::out, 
+    inst_map::in, inst_map::out) is det.
 
-switch_case_get_conjunctions_worth_parallelising([], _, _, _, !InstMap, [],
-    have_not_seen_duplicate_instantiation).
+switch_case_get_conjunctions_worth_parallelising([], _, _, _, _, [],
+    have_not_seen_duplicate_instantiation, cord.empty, !InstMap).
 switch_case_get_conjunctions_worth_parallelising([Case | Cases], GoalPath,
-        CaseNum, Info, InstMap0, InstMap, Candidates,
-        SeenDuplicateInstantiation) :-
+        CaseNum, Info, ProcLabel, Candidates, SeenDuplicateInstantiation,
+        Warnings, InstMap0, InstMap) :-
     Case = case_rep(_, _, Goal),
     HeadDetism = Goal ^ goal_detism_rep,
     CaseGoalPath = goal_path_add_at_end(GoalPath, step_switch(CaseNum, no)),
     goal_get_conjunctions_worth_parallelising(Goal, CaseGoalPath, Info,
-        InstMap0, HeadInstMap, HeadCandidates, HeadSeenDuplicateInstantiation, 
-        _MaybeCallConjs),
+        ProcLabel, HeadCandidates, HeadSeenDuplicateInstantiation, 
+        _MaybeCallConjs, HeadWarnings, InstMap0, HeadInstMap),
     switch_case_get_conjunctions_worth_parallelising(Cases, GoalPath, 
-        CaseNum + 1, Info, InstMap0, TailInstMap, TailCandidates,
-        TailSeenDuplicateInstantiation),
+        CaseNum + 1, Info, ProcLabel, TailCandidates,
+        TailSeenDuplicateInstantiation, TailWarnings, InstMap0, TailInstMap),
     Candidates = HeadCandidates ++ TailCandidates,
+    Warnings = HeadWarnings ++ TailWarnings,
     % merge_inst_map requires the detism of goals that produce both inst maps,
     % we can create fake values that satisfy merge_inst_map easily.
     (
@@ -823,24 +925,25 @@ switch_case_get_conjunctions_worth_paral
         TailSeenDuplicateInstantiation).
 
 :- pred ite_get_conjunctions_worth_parallelising(goal_rep::in, goal_rep::in,
-    goal_rep::in, goal_path::in, implicit_parallelism_info::in, inst_map::in,
-    inst_map::out, list(candidate_par_conjunction)::out,
-    seen_duplicate_instantiation::out) is det.
+    goal_rep::in, goal_path::in, implicit_parallelism_info::in,
+    string_proc_label::in, list(candidate_par_conjunction)::out,
+    seen_duplicate_instantiation::out, cord(warning)::out,
+    inst_map::in, inst_map::out) is det.
 
 ite_get_conjunctions_worth_parallelising(Cond, Then, Else, GoalPath, Info,
-        !InstMap, Candidates, SeenDuplicateInstantiation) :-
+        ProcLabel, Candidates, SeenDuplicateInstantiation, Warnings, !InstMap) :-
     CondGoalPath = goal_path_add_at_end(GoalPath, step_ite_cond),
     ThenGoalPath = goal_path_add_at_end(GoalPath, step_ite_then),
     ElseGoalPath = goal_path_add_at_end(GoalPath, step_ite_else),
     goal_get_conjunctions_worth_parallelising(Cond, CondGoalPath, Info,
-        !.InstMap, PostCondInstMap, CondCandidates,
-        CondSeenDuplicateInstantiation, _),
+        ProcLabel, CondCandidates, CondSeenDuplicateInstantiation, _,
+        CondWarnings, !.InstMap, PostCondInstMap),
     goal_get_conjunctions_worth_parallelising(Then, ThenGoalPath, Info, 
-        PostCondInstMap, PostThenInstMap, ThenCandidates,
-        ThenSeenDuplicateInstantiation, _),
+        ProcLabel, ThenCandidates, ThenSeenDuplicateInstantiation, _,
+        ThenWarnings, PostCondInstMap, PostThenInstMap),
     goal_get_conjunctions_worth_parallelising(Else, ElseGoalPath, Info, 
-        PostCondInstMap, PostElseInstMap, ElseCandidates,
-        ElseSeenDuplicateInstantiation, _),
+        ProcLabel, ElseCandidates, ElseSeenDuplicateInstantiation, _, 
+        ElseWarnings, PostCondInstMap, PostElseInstMap),
     Candidates = CondCandidates ++ ThenCandidates ++ ElseCandidates,
     (
         CondSeenDuplicateInstantiation = have_not_seen_duplicate_instantiation,
@@ -851,6 +954,7 @@ ite_get_conjunctions_worth_parallelising
     ;
         SeenDuplicateInstantiation = seen_duplicate_instantiation
     ),
+    Warnings = CondWarnings ++ ThenWarnings ++ ElseWarnings,
     ThenDetism = Then ^ goal_detism_rep,
     ElseDetism = Else ^ goal_detism_rep,
     !:InstMap = merge_inst_map(PostThenInstMap, ThenDetism, 
@@ -976,86 +1080,106 @@ var_get_mode(InstMapBefore, InstMapAfter
     % Note: this runs in quadratic time.
     %
 :- pred build_candidate_conjunctions(implicit_parallelism_info::in,
-    inst_map::in, goal_path::in, list(maybe_call_conjunct)::in,
+    inst_map::in, goal_path::in, string_proc_label::in, 
+    list(maybe_call_conjunct)::in, cord(warning)::out,
     pqueue(float, candidate_par_conjunction)::in, 
     pqueue(float, candidate_par_conjunction)::out) is det.
 
-build_candidate_conjunctions(_, _, _, [], !Candidates).
-build_candidate_conjunctions(Info, InstMap, GoalPath, [MaybeCall | MaybeCalls],
-        !Candidates) :-
+build_candidate_conjunctions(_, _, _, _, [], cord.empty, !Candidates).
+build_candidate_conjunctions(Info, InstMap, GoalPath, ProcLabel,
+        [MaybeCall | MaybeCalls], Warnings, !Candidates) :-
     (
         MaybeCall = call(_, _, _, CallSitePerf),
-        Cost = CallSitePerf ^ csf_summary_perf ^ perf_row_self 
-            ^ perf_row_callseqs_percall,
+        Cost = get_call_site_cost(CallSitePerf),
         ( Cost > float(Info ^ ipi_opts ^ cpc_call_site_threshold) ->
             % This conjunction is a call and is expensive enough to
             % parallelise, find some later conjunct to parallelise against it.
-            build_candidate_conjunctions_2(Info, InstMap, GoalPath, MaybeCall,
-                cord.empty, MaybeCalls, !Candidates)
+            build_candidate_conjunctions_2(Info, InstMap, GoalPath, ProcLabel,
+                MaybeCall, cord.empty, MaybeCalls, Warnings0, !Candidates)
             % XXX: pick the most expensive non-overlapping candidates from the
             % result.
         ;
-            true
+            Warnings0 = cord.empty
         )
     ;
-        MaybeCall = non_atomic_goal
+        MaybeCall = non_atomic_goal,
+        Warnings0 = cord.empty
     ;
-        MaybeCall = trivial_atomic_goal(_, _)
+        MaybeCall = trivial_atomic_goal(_, _),
+        Warnings0 = cord.empty
     ),
-    build_candidate_conjunctions(Info, InstMap, GoalPath, MaybeCalls,
-        !Candidates).
+    build_candidate_conjunctions(Info, InstMap, GoalPath, ProcLabel, MaybeCalls,
+        WarningsTail, !Candidates),
+    Warnings = Warnings0 ++ WarningsTail.
 
 :- pred build_candidate_conjunctions_2(implicit_parallelism_info::in,
-    inst_map::in, goal_path::in, maybe_call_conjunct::in(call),
-    cord(maybe_call_conjunct)::in, list(maybe_call_conjunct)::in,
-    pqueue(float, candidate_par_conjunction)::in, pqueue(float,
-    candidate_par_conjunction)::out) is det.
-
-build_candidate_conjunctions_2(_, _, _, _, _, [], !Candidates).
-build_candidate_conjunctions_2(Info, InstMap, GoalPath, CallA,
-        IntermediateGoals0, [MaybeCall | MaybeCalls], !Candidates) :-
+    inst_map::in, goal_path::in, string_proc_label::in, 
+    maybe_call_conjunct::in(call), cord(maybe_call_conjunct)::in,
+    list(maybe_call_conjunct)::in, cord(warning)::out,
+    pqueue(float, candidate_par_conjunction)::in, 
+    pqueue(float, candidate_par_conjunction)::out) is det.
+
+build_candidate_conjunctions_2(_, _, _, _, _, _, [], cord.empty, !Candidates).
+build_candidate_conjunctions_2(Info, InstMap, GoalPath, ProcLabel, CallA,
+        IntermediateGoals, [MaybeCall | MaybeCalls], Warnings, !Candidates) :-
+    (
+        some [!Warnings]
     (
         MaybeCall = call(_, _, _, CallSitePerf),
+            !:Warnings = cord.empty,
         CallB = MaybeCall,
-        Cost = call_site_perf_self_callseqs_percall(CallSitePerf),
+            Cost = get_call_site_cost(CallSitePerf),
         ( Cost > float(Info ^ ipi_opts ^ cpc_call_site_threshold) ->
-            % This conjunction is a call and is expensive enough to
+                % This conjunct is a call and is expensive enough to
             % parallelise.
             are_conjuncts_dependant(CallA, CallB, InstMap, Dependance),
             (
                 Dependance = conjuncts_are_dependant(DepVars),
-                compute_optimal_dependant_parallelisation(Info, CallA, CallB,
-                    DepVars, IntermediateGoals0, InstMap, CPCA, CPCB, Speedup)
+                    compute_optimal_dependant_parallelisation(Info, 
+                        CallA, CallB, DepVars, IntermediateGoals, InstMap,
+                        CPCA, CPCB, Speedup)
             ;
                 Dependance = conjuncts_are_independent,
-                compute_independant_parallelisation_speedup(Info, CallA, CallB, 
-                    length(IntermediateGoals0), CPCA, CPCB, Speedup)
+                    compute_independent_parallelisation_speedup(Info, 
+                        CallA, CallB, CPCA, CPCB, Speedup)
             ),
             % XXX: This threshold should be configurable.
             ( Speedup > 0.0 ->
-                % XXX: I think this should be a priority queue or somesuch.
+                    ( length(IntermediateGoals) = 0 -> 
                 GoalPathString = goal_path_to_string(GoalPath),
                 Candidate = candidate_par_conjunction(GoalPathString, 
                     CPCA, CPCB, Dependance, Speedup),
-                % So that the candidates with the greatest speedup are removed
-                % first multiply speedup by -1.0.
+                        % So that the candidates with the greatest speedup are
+                        % removed first, multiply speedup by -1.0.
                 pqueue.insert(!.Candidates, Speedup * -1.0, Candidate,
                     !:Candidates)
             ;
+                        append_warning(ProcLabel, yes(GoalPath),
+                            warning_candidate_callpairs_not_adjacent,
+                            !Warnings)
+                    )
+                ;
                 true
             )
         ;
             % Don't recurse here, we don't parallelise over call goals.
-            true
+                append_warning(ProcLabel, yes(GoalPath),
+                    warning_cannot_parallelise_over_cheap_call_goal, !Warnings)
+            ),
+            Warnings = !.Warnings
         )
     ;
         MaybeCall = trivial_atomic_goal(_, _),
-        build_candidate_conjunctions_2(Info, InstMap, GoalPath, CallA,
-            cord.snoc(IntermediateGoals0, MaybeCall), MaybeCalls, !Candidates)
-    ;
-        MaybeCall = non_atomic_goal
-        % Don't recurse in this case, we don't parallelise over non-atomic goals
-        % yet.
+        build_candidate_conjunctions_2(Info, InstMap, GoalPath, ProcLabel, 
+            CallA, cord.snoc(IntermediateGoals, MaybeCall), MaybeCalls,
+            Warnings, !Candidates)
+    ;
+        MaybeCall = non_atomic_goal,
+        % Don't recurse in this case, we don't parallelise over non-atomic
+        % goals yet.
+        append_warning(ProcLabel, yes(GoalPath),
+            warning_cannot_parallelise_over_nonatomic_goal,
+            cord.empty, Warnings)
     ).
 
 :- pred are_conjuncts_dependant(maybe_call_conjunct::in(call),
@@ -1108,45 +1232,56 @@ add_output_var_to_set(var_mode_and_use(V
         true
     ).
 
-:- pred compute_independant_parallelisation_speedup(
+    % Retrieve the average cost of a call site.
+    %
+:- func get_call_site_cost(call_site_perf) = float.
+
+get_call_site_cost(CallSitePerf) = Cost :-
+    % XXX: This selects self csq; we need to include descendants.  And do we
+    % want percall?
+    CSFSummary = CallSitePerf ^ csf_summary_perf,
+    CostSelf = CSFSummary ^ perf_row_self 
+        ^ perf_row_callseqs_percall,
+    MaybePerfTotal = CSFSummary ^ perf_row_maybe_total, 
+    (
+        MaybePerfTotal = yes(PerfTotal),
+        CostTotal = PerfTotal ^ perf_row_callseqs_percall
+    ;
+        MaybePerfTotal = no,
+        CostTotal = 0.0
+    ),
+    Cost = CostSelf + CostTotal.  
+
+:- pred compute_independent_parallelisation_speedup(
     implicit_parallelism_info::in, 
     maybe_call_conjunct::in(call), maybe_call_conjunct::in(call),
-    int::in, candidate_par_conjunct::out, candidate_par_conjunct::out,
+    candidate_par_conjunct::out, candidate_par_conjunct::out,
     float::out) is det.
 
-compute_independant_parallelisation_speedup(Info, CallA, CallB, NumUnifications,
+compute_independent_parallelisation_speedup(Info, CallA, CallB, 
         CPCA, CPCB, Speedup) :-
-    CostA = call_site_perf_self_callseqs_percall(CallA ^ mccc_perf),
-    CostB = call_site_perf_self_callseqs_percall(CallB ^ mccc_perf),
+    CostA = get_call_site_cost(CallA ^ mccc_perf),
+    CostB = get_call_site_cost(CallB ^ mccc_perf),
     SequentialCost = CostA + CostB,
     ParallelCost = max(CostA, CostB) + 
         float(Info ^ ipi_opts ^ cpc_sparking_cost),
     Speedup = SequentialCost - ParallelCost,
-    ( CostA < CostB ->
-        NumUnificationsA = NumUnifications,
-        NumUnificationsB = 0
-    ;
-        NumUnificationsA = 0,
-        NumUnificationsB = NumUnifications
-    ),
-    call_site_conj_to_candidate_par_conjunct(Info, CallA, NumUnificationsA,
-        CPCA),
-    call_site_conj_to_candidate_par_conjunct(Info, CallB, NumUnificationsB,
-        CPCB).
+    call_site_conj_to_candidate_par_conjunct(Info, CallA, CPCA),
+    call_site_conj_to_candidate_par_conjunct(Info, CallB, CPCB).
 
 :- pred compute_optimal_dependant_parallelisation(
     implicit_parallelism_info::in,
     maybe_call_conjunct::in(call), maybe_call_conjunct::in(call),
     set(var_rep)::in, cord(maybe_call_conjunct)::in, inst_map::in,
-    candidate_par_conjunct::out, candidate_par_conjunct::out, float::out) 
-    is det.
+    candidate_par_conjunct::out, candidate_par_conjunct::out,
+    float::out) is det.
 
 compute_optimal_dependant_parallelisation(Info, CallA, CallB,
-        DepVars, IntermediateGoals, InstMap, CPCA, CPCB, Speedup) :-
-    CostA = call_site_perf_self_callseqs_percall(CallA ^ mccc_perf),
-    CostB = call_site_perf_self_callseqs_percall(CallB ^ mccc_perf),
+        DepVars, _IntermediateGoals, InstMap, CPCA, CPCB,
+        Speedup) :-
+    CostA = get_call_site_cost(CallA ^ mccc_perf),
+    CostB = get_call_site_cost(CallB ^ mccc_perf),
     SequentialCost = CostA + CostB,
-    NumUnifications = length(IntermediateGoals),
     ( singleton_set(DepVars, DepVar) ->
         % Only parallelise conjunctions with a single dependant variable for
         % now.
@@ -1162,16 +1297,7 @@ compute_optimal_dependant_parallelisatio
                 CostBeforeConsume = 
                     cost_until_to_cost_since_start(CostUntilConsume, CostB),
                 CostAfterConsume = 
-                    cost_until_to_cost_before_end(CostUntilConsume, CostB),
-                % Unfications between the calls don't bind any variables useful
-                % for the calls, so serialise them with the cheaper call.
-                ( CostA < CostB ->
-                    NumUnificationsA = NumUnifications,
-                    NumUnificationsB = 0
-                ;
-                    NumUnificationsA = 0,
-                    NumUnificationsB = NumUnifications
-                )
+                    cost_until_to_cost_before_end(CostUntilConsume, CostB)
             ;
                 inst_map_get_var_deps(InstMap, DepVar, DepVarDeps),
                 set.fold(get_var_use_add_to_queue(CallA ^ mccc_args), 
@@ -1197,14 +1323,10 @@ compute_optimal_dependant_parallelisatio
                 % lesser one should have the unifications added to it.  This
                 % maximises the amount of parallelism.
                 ( CostBeforeConsume0 > CostAfterProduction0 ->
-                    NumUnificationsA = 0,
-                    NumUnificationsB = NumUnifications,
                     CostBeforeProduction = CostBeforeProduction0,
                     CostBeforeConsume = CostA,
                     CostAfterConsume = 0.0
                 ;
-                    NumUnificationsA = NumUnifications,
-                    NumUnificationsB = 0,
                     CostBeforeProduction = 0.0,
                     CostBeforeConsume = CostA - CostAfterConsume0,
                     CostAfterConsume = CostAfterConsume0 
@@ -1221,14 +1343,10 @@ compute_optimal_dependant_parallelisatio
             error("Dependant var not in consumer's arguments")
         )
     ;
-        Speedup = -1.0,
-        NumUnificationsA = NumUnifications,
-        NumUnificationsB = 0
-    ),
-    call_site_conj_to_candidate_par_conjunct(Info, CallA, NumUnificationsA,
-        CPCA),
-    call_site_conj_to_candidate_par_conjunct(Info, CallB, NumUnificationsB,
-        CPCB).
+        Speedup = -1.0
+    ),
+    call_site_conj_to_candidate_par_conjunct(Info, CallA, CPCA),
+    call_site_conj_to_candidate_par_conjunct(Info, CallB, CPCB).
 
 :- pred get_var_use_from_args(list(var_mode_and_use)::in, var_rep::in, 
     var_use_info::out) is semidet.
@@ -1257,21 +1375,16 @@ get_var_use_add_to_queue(VarsModeAndUse,
         true
     ).
 
-:- func call_site_perf_self_callseqs_percall(call_site_perf) = float.
-
-call_site_perf_self_callseqs_percall(CSP) = 
-    CSP ^ csf_summary_perf ^ perf_row_self ^ perf_row_callseqs_percall.
-
 :- pred call_site_conj_to_candidate_par_conjunct(
     implicit_parallelism_info::in, maybe_call_conjunct::in(call),
-    int::in, candidate_par_conjunct::out) is det.
+    candidate_par_conjunct::out) is det.
 
-call_site_conj_to_candidate_par_conjunct(Info, Call, NumUnifications, CPC) :-
+call_site_conj_to_candidate_par_conjunct(Info, Call, CPC) :-
     Call = call(MaybeCallee, _Detism, Args, Perf),
     VarTable = Info ^ ipi_var_table,
     list.map(var_mode_use_to_var_in_par_conj(VarTable), Args, Vars),
-    Cost = call_site_perf_self_callseqs_percall(Perf),
-    CPC = candidate_par_conjunct(MaybeCallee, Vars, Cost, NumUnifications).
+    Cost = get_call_site_cost(Perf),
+    CPC = candidate_par_conjunct(MaybeCallee, Vars, Cost).
 
 :- pred var_mode_use_to_var_in_par_conj(var_table::in, var_mode_and_use::in,
     maybe(string)::out) is det.
@@ -1284,6 +1397,42 @@ var_mode_use_to_var_in_par_conj(VarTable
         MaybeName = no
     ).
 
+:- func warning_duplicate_instantiation(int) = string.
+
+warning_duplicate_instantiation(CandidateConjuncts) = 
+    string.format(
+        "%d conjunctions not parallelised: Seen duplicate instantiations",
+        [i(CandidateConjuncts)]).
+
+:- func warning_extra_callpairs_in_conjunction(int) = string.
+
+warning_extra_callpairs_in_conjunction(NumCPCs) =
+    string.format(
+        "%d potential call pairs not parallelised in this conjunction",
+        [i(NumCPCs)]).
+
+:- func warning_cannot_lookup_proc_defn = string.
+
+warning_cannot_lookup_proc_defn = 
+    "Could not look up proc defn, perhaps this procedure is built-in".
+
+:- func warning_candidate_callpairs_not_adjacent = string.
+
+warning_candidate_callpairs_not_adjacent =
+    "Two callpairs are difficult to parallelise because they are not adjacent".
+
+:- func warning_cannot_parallelise_over_cheap_call_goal = string.
+
+warning_cannot_parallelise_over_cheap_call_goal =
+    "Parallelising expensive call goals with cheap call goals between them is"
+    ++ " not supported".
+
+:- func warning_cannot_parallelise_over_nonatomic_goal = string.
+
+warning_cannot_parallelise_over_nonatomic_goal =
+    "Parallelising call goals with non-atomic goals between them is"
+    ++ " not supported".
+
 %----------------------------------------------------------------------------%
 %
 % Jerome's implicit parallelism feedback information.
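
Note for reviewers: the independent case in
compute_independent_parallelisation_speedup above reduces to a small
calculation, and "Speedup" there is a saving in call sequence counts rather
than a ratio.  The standalone sketch below mirrors that arithmetic with plain
floats.  The module name, the function independent_speedup and the sample costs
are hypothetical and exist only for illustration; the sparking cost is passed
in directly instead of being read from Info ^ ipi_opts ^ cpc_sparking_cost.

```mercury
:- module speedup_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module float.
:- import_module list.
:- import_module string.

    % independent_speedup(CostA, CostB, SparkingCost) = Speedup.
    %
    % Hypothetical helper: the cost saved by running two independent calls
    % in parallel, using the same formula as the patch.
    %
:- func independent_speedup(float, float, float) = float.

independent_speedup(CostA, CostB, SparkingCost) = Speedup :-
    SequentialCost = CostA + CostB,
    % In parallel we wait for the slower call and pay for one spark.
    ParallelCost = float.max(CostA, CostB) + SparkingCost,
    Speedup = SequentialCost - ParallelCost.

main(!IO) :-
    % Sample costs, made up for this sketch.
    Speedup = independent_speedup(1000.0, 400.0, 100.0),
    io.format("speedup = %.1f\n", [f(Speedup)], !IO).
```

With these sample values the sequential cost is 1400 and the parallel cost is
1100, so the saving is 300 call sequence counts.  A sparking cost larger than
the cheaper call makes the result negative, which is the case the analysis
should reject.
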
Index: library/cord.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/cord.m,v
retrieving revision 1.14
diff -u -p -b -r1.14 cord.m
--- library/cord.m	12 Jan 2009 02:28:51 -0000	1.14
+++ library/cord.m	20 Jan 2009 04:35:50 -0000
@@ -156,8 +156,9 @@
     % foldl(F, C, A) = list.foldl(F, list(C), A).
     %
 :- func foldl(func(T, U) = U, cord(T), U) = U.
-:- pred foldl_pred(pred(T, U, U)::in(pred(in, in, out) is det), cord(T)::in,
-    U::in, U::out) is det.
+:- pred foldl_pred(pred(T, U, U), cord(T), U, U).
+:- mode foldl_pred(in(pred(in, in, out) is det), in, in, out) is det.
+:- mode foldl_pred(in(pred(in, di, uo) is det), in, di, uo) is det.
 
     % foldr(F, C, A) = list.foldr(F, list(C), A).
     %
Index: library/list.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/list.m,v
retrieving revision 1.177
diff -u -p -b -r1.177 list.m
--- library/list.m	13 Dec 2008 12:59:02 -0000	1.177
+++ library/list.m	22 Jan 2009 04:48:13 -0000
@@ -1309,6 +1309,14 @@
     pred(X, A, B, C)::in(pred(in, out, out, out) is semidet),
     list(X)::in, A::out, B::out, C::out) is semidet.
 
+    % find_index_of_match(Match, List, Index0, Index)
+    %
+    % Find the index of the first item in List for which Match is true, where
+    % the first element of List has the index Index0.  Fails if none match.
+    %
+:- pred list.find_index_of_match(pred(T), list(T), int, int).
+:- mode list.find_index_of_match(pred(in) is semidet, in, in, out) is semidet.
+
     % list.takewhile(Predicate, List, UptoList, AfterList) takes a
     % closure with one input argument, and calls it on successive members
     % of List as long as the calls succeed. The elements for which
@@ -2431,6 +2439,15 @@ list.find_first_map3(P, [X | Xs], A, B, 
         list.find_first_map3(P, Xs, A, B, C)
     ).
 
+list.find_index_of_match(Match, [X | Xs], Index0, Index) :-
+    ( Match(X) ->
+        Index = Index0
+    ;
+        find_index_of_match(Match, Xs, Index0 + 1, Index)
+    ).
+
+%----------------------------------------------------------------------------%
+
 list.takewhile(_, [], [], []).
 list.takewhile(P, [X | Xs], Ins, Outs) :-
     ( P(X) ->
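
Note for reviewers: a usage sketch for the new list predicate.  So that it
compiles without this patch applied, the sketch carries a local copy under the
hypothetical name index_of_first_match; is_negative and the sample list are
also made up for illustration.

```mercury
:- module find_index_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

    % Local copy of the predicate added by the patch, renamed so that it
    % cannot clash with a library version.
    %
:- pred index_of_first_match(pred(T), list(T), int, int).
:- mode index_of_first_match(pred(in) is semidet, in, in, out) is semidet.

index_of_first_match(Match, [X | Xs], Index0, Index) :-
    ( Match(X) ->
        Index = Index0
    ;
        index_of_first_match(Match, Xs, Index0 + 1, Index)
    ).

:- pred is_negative(int::in) is semidet.

is_negative(X) :- X < 0.

main(!IO) :-
    List = [3, 7, -2, 9, -5],
    % Index0 = 0 gives zero-based indices; pass 1 for one-based indices.
    ( index_of_first_match(is_negative, List, 0, Index) ->
        io.format("first negative at index %d\n", [i(Index)], !IO)
    ;
        io.write_string("no negative element\n", !IO)
    ).
```

With Index0 = 0 the call binds Index to 2, the position of -2.  There is no
clause for the empty list, so the predicate simply fails when nothing matches.
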
Index: library/pqueue.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/pqueue.m,v
retrieving revision 1.28
diff -u -p -b -r1.28 pqueue.m
--- library/pqueue.m	10 Sep 2007 04:44:06 -0000	1.28
+++ library/pqueue.m	21 Jan 2009 02:41:49 -0000
@@ -68,6 +68,12 @@
     %
 :- func pqueue.from_assoc_list(assoc_list(K, V)) = pqueue(K, V).
 
+    % length(PQueue) = Length.
+    %
+    % Length is the number of items in PQueue.
+    %
+:- func pqueue.length(pqueue(K, V)) = int.
+
 %---------------------------------------------------------------------------%
 %---------------------------------------------------------------------------%
 
@@ -167,6 +173,11 @@ pqueue.from_assoc_list(List) = PQueue :-
     pqueue.assoc_list_to_pqueue(List, PQueue).
 
 %---------------------------------------------------------------------------%
+
+pqueue.length(empty) = 0.
+pqueue.length(pqueue(D, _, _, _, _)) = D + 1.
+
+%---------------------------------------------------------------------------%
 %---------------------------------------------------------------------------%
 % Ralph Becket <rwab1 at cl.cam.ac.uk> 29/04/99
 %   Functional forms added.
Index: mdbcomp/feedback.m
===================================================================
RCS file: /home/mercury1/repository/mercury/mdbcomp/feedback.m,v
retrieving revision 1.4
diff -u -p -b -r1.4 feedback.m
--- mdbcomp/feedback.m	20 Oct 2008 06:31:28 -0000	1.4
+++ mdbcomp/feedback.m	23 Jan 2009 04:27:32 -0000
@@ -81,8 +81,8 @@
 
                 conjunctions        :: assoc_list(string_proc_label,
                                             candidate_par_conjunction)
-                    % Assoclist of module name and an assoclist of procedure
-                    % labels and candidate parallel conjunctions.
+                    % Assoclist of procedure labels and candidate parallel
+                    % conjunctions.
             ).
 
 :- inst feedback_data_query
@@ -122,11 +122,17 @@
 :- type candidate_par_conjunct
     --->    candidate_par_conjunct(
                 callee                  :: maybe(pair(string, string)),
+                    % If the name of the callee is known (it's not a HO call),
+                    % then store the module and symbol names here.
+                    % Note: arity and mode are not represented.
+                    
                 vars                    :: list(maybe(string)),
-                cost                    :: float,
-                include_unifications    :: int
-                    % The number of unifications between conjuncts that should
-                    % be executed in sequence with this call.
+                    % The names of variables (if user-defined) given as
+                    % arguments to this call.
+                    
+                cost                    :: float
+                    % The cost of this call in call sequence counts.
+                   
             ).
 
 :- type conjuncts_are_dependant
@@ -241,7 +247,11 @@
 %-----------------------------------------------------------------------------%
 
 :- type feedback_info
-    ---> feedback_info(map(feedback_type, feedback_data)).
+    --->    feedback_info(
+                fi_map                          :: map(feedback_type, 
+                                                    feedback_data)
+                    % The actual feedback data as read from the feedback file.
+            ).
 
     % This type is used as a key for the data that may be fed back into the
     % compiler.
@@ -254,7 +264,7 @@
 
 get_feedback_data(Info, Data) :-
     feedback_data_type(Type, Data),
-    Info = feedback_info(Map),
+    Map = Info ^ fi_map,
     map.search(Map, Type, DataPrime),
     % This disjunction will either unify Data to DataPrime, or throw an
     % exception, the impure annotation is required so to avoid a compiler
@@ -273,9 +283,9 @@ get_feedback_data(Info, Data) :-
 put_feedback_data(Data, !Info) :-
     feedback_data_type(Type, Data),
     some [!Map] (
-        !.Info = feedback_info(!:Map),
+        !:Map = !.Info ^ fi_map,
         svmap.set(Type, Data, !Map),
-        !:Info = feedback_info(!.Map)
+        !:Info = !.Info ^ fi_map := !.Map
     ).
 
 %----------------------------------------------------------------------------%
@@ -526,7 +536,7 @@ write_feedback_file_2(Stream, ProgName, 
     io.nl(Stream, !IO),
     io.write_string(Stream, ProgName, !IO),
     io.nl(Stream, !IO),
-    Feedback = feedback_info(Map),
+    Map = Feedback ^ fi_map,
     map.values(Map, FeedbackList),
     io.write(Stream, FeedbackList, !IO),
     io.write_string(Stream, ".\n", !IO),
Index: mdbcomp/program_representation.m
===================================================================
RCS file: /home/mercury1/repository/mercury/mdbcomp/program_representation.m,v
retrieving revision 1.45
diff -u -p -b -r1.45 program_representation.m
--- mdbcomp/program_representation.m	6 Nov 2008 05:47:59 -0000	1.45
+++ mdbcomp/program_representation.m	16 Jan 2009 04:47:04 -0000
@@ -540,6 +540,30 @@
     %
 :- pred is_goal_path_separator(char::in) is semidet.
 
+    % A goal path stored in order for constant time access to elements at the
+    % start of the goal path.  Recall that the start of a goal path is the root
+    % of the tree of goals.
+    %
+    % XXX: Review the name of this type and related predicates.
+    %
+:- type goal_path_consable.
+
+    % Convert between a goal_path and a goal_path_consable.
+    %
+:- pred goal_path_consable(goal_path, goal_path_consable).
+:- mode goal_path_consable(in, out) is det.
+:- mode goal_path_consable(out, in) is det.
+
+    % goal_path_consable_remove_first(GP, GPHead, GPTail).
+    %
+    % GPHead is the first goal path step in GP and GPTail is the tail (the
+    % steps other than the first).  This predicate fails if GP is empty.
+    %
+:- pred goal_path_consable_remove_first(goal_path_consable::in, 
+    goal_path_step::out, goal_path_consable::out) is semidet.
+
+%----------------------------------------------------------------------------%
+
     % User-visible head variables are represented by a number from 1..N,
     % where N is the user-visible arity.
     %
@@ -919,6 +943,18 @@ goal_path_step_to_string(step_atomic_ore
 goal_path_step_to_string(step_first) = "f;".
 goal_path_step_to_string(step_later) = "l;".
 
+:- type goal_path_consable
+    --->    goal_path_consable(
+                list(goal_path_step)
+                    % The list of goal path steps is stored in-order.
+            ).
+
+goal_path_consable(goal_path(ListRev), goal_path_consable(List)) :-
+    reverse(List, ListRev).
+
+goal_path_consable_remove_first(goal_path_consable([H | T]), H,
+    goal_path_consable(T)).
+
 %-----------------------------------------------------------------------------%
 
 detism_rep(Detism) = Rep :-
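
Note for reviewers: the idea behind goal_path_consable is that goal_path keeps
its steps in reverse order (as the ListRev naming above suggests), so reaching
the root step means walking the whole list.  Reversing once gives constant-time
access to the root end.  The sketch below shows this with simplified stand-in
types; step, rev_path, fwd_path, to_fwd and remove_first are hypothetical names
and are not the types or predicates from program_representation.m.

```mercury
:- module goal_path_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.

    % Hypothetical, simplified stand-in for goal_path_step.
:- type step
    --->    step_conj(int)
    ;       step_ite_then
    ;       step_ite_else.

    % Innermost step first, in the manner of goal_path.
:- type rev_path
    --->    rev_path(list(step)).

    % Root step first, in the manner of goal_path_consable.
:- type fwd_path
    --->    fwd_path(list(step)).

:- func to_fwd(rev_path) = fwd_path.

to_fwd(rev_path(RevSteps)) = fwd_path(list.reverse(RevSteps)).

    % Fails on an empty path because there is no clause for [].
:- pred remove_first(fwd_path::in, step::out, fwd_path::out) is semidet.

remove_first(fwd_path([H | T]), H, fwd_path(T)).

main(!IO) :-
    % From the root: conjunct 2, then the then-branch, then conjunct 1.
    Rev = rev_path([step_conj(1), step_ite_then, step_conj(2)]),
    ( remove_first(to_fwd(Rev), First, Rest) ->
        io.write(First, !IO),
        io.nl(!IO),
        io.write(Rest, !IO),
        io.nl(!IO)
    ;
        io.write_string("empty goal path\n", !IO)
    ).
```

Here First is step_conj(2), the step nearest the root, and Rest holds the
remaining two steps in root-to-leaf order.  The one-off reversal is linear in
the path length; each later remove_first is constant time, which suits a
transformation that descends a goal structure one step at a time.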