[m-rev.] For post-commit review: Correct problems in the automatic parallelism analysis.

Paul Bone pbone at csse.unimelb.edu.au
Thu Oct 7 13:42:27 AEDT 2010


For post commit review by Zoltan.

Anyone may review the changes to the standard library and news file.

Thanks.

---

Correct problems in the automatic parallelism analysis.

This patch fixes various problems, the most significant is the calculation of
variable use information.  The parallelisation analysis uses deep profiling
data.  In other words, profiling data that is attached to context information
referring to not just the procedure but the chain of calls leading to that
invocation of that procedure (modulo recursion).  The variable use analysis did
not use deep profiling data, therefore comparing the time that a variable is
produced with a call to the time in total of that call was not sound, and
sometimes resulted in information that is not possible, such as a variable
being produced or consumed after the call that produces or consumes it has
exited.

This change-set updates the variable use analysis to use deep profiling data to
avoid these problems.  At the same time it provides more accurate information
to the automatic parallelisation pass.  This is possible because of an earlier
change that allowed the coverage data to use deep profiling data.

In its current state, the parallelisation analysis now finishes without errors
and computes meaningful results when analysing a profile of the mercury
compiler's execution.

deep_profiler/report.m:
    The proc var use report is now a call site dynamic var use report.
       1) It now uses deep profiling data.
       2) It makes more sense from the callers perspective so it's now based
          around a call site rather than a proc.

    Add inst subtypes to the recursion_type type.

deep_profiler/query.m:
    The proc var use query is now a call site dynamic var use query, see
    report.m.

deep_profiler/var_use_analysis.m:
    Fix a bug here and in mdprof_fb.automatic_parallelism.m: If a
    variable is consumed by a call and appears in it's argument list more than
    once, take the earliest consumption time rather than the one for the
    earliest argument.

    Variable use analysis now uses recursion_patterns.m to correctly compute
    the cost of recursive calls.  It also uses 'deep' profiler data.

    Only measure variable use relative to the entry into a procedure, rather
    than either relative to the entry or exit.  This allows us to simplify a
    lot of code.

deep_profiler/create_report.m:
    The proc var use info report is now a call site dynamic var use info
    report.

    Move some utility code from here to the new analysis_utils.m module.

deep_profiler/display_report.m:
    Conform to changes in report.m.

    Improve the information displayed for variable first-use time
    reports.

deep_profiler/mdprof_fb.automatic_parallelism.m:
    Conform to changes in report.m

    Refactored the walk down the clique tree.  This no-longer uses the
    clique reports from the deep profiling tool.

    We now explore the same static procedure more than once.  It may be best to
    parallelise it in some contexts rather than others but for now we assume
    that the benefits in some context are worth the costs without benefit in
    the other contexts.  This is better than reaching a context where it is
    undesirable first and never visiting a case where parallelisation is
    desirable.

    Fix a bug in the calculation of how much parallelisation is used by
    parallelisations in a clique's parents.  This used to trigger an
    assertion.

    Don't try to parallelize anything in the "exception" module.
    There's probably other builtin code we should skip over here.

    Removed an overzealous assertion that was too easily triggered by the
    inaccuracies of IEEE-754 arithmetic.

    Compute variable use information lazily for each variable in each call.  I
    believe that this has made our implementation much faster as it no-longer
    computes information that is never used.

    Refactor and move build_recursive_call_site_cost_map to the new
    module analysis_utils.m where it can be used by other analyses.

    Call site cost maps now use the cs_cost_csq type to store costs,
    code in this module now conforms to this change.

    Conform to changes in messages.m

deep_profiler/recursion_patterns.m:
    Export a new predicate, recursion_type_get_maybe_avg_max_depth/2.  This
    retrieves the average maximum recursion depth from recursion types that know
    this information.

    Move code that builds a call site cost map for a procedure to
    analysis_utils.m where it can be used by other analyses.

deep_profiler/analysis_utils.m:
    Added a new module containing various utility predicates for profile
    analysis.

deep_profiler/coverage.m:
    Added an extra utility predicate get_coverage_after/2.

deep_profiler/message.m:
    Each message has a location that it refers to, a new location type has
    been added: call_site_dynamic.

    Added a new warning that can be used to describe when a call site's
    argument's use time cannot be computed.

    Added new predicates for printing out messages whose level is below a
    certain threshold.  These predicates can be called from io trace goals.
    Message levels start at 0 and currently go to 4, more critical messages
    have lower levels.  The desired verbosity level is stored in a module local
    mutable.

deep_profiler/mdprof_feedback.m:
    Move the message printing code from here to message.m.

deep_profiler/old_html_format.m:
deep_profiler/old_query.m:
    Conform to changes in query.m.

mdbcomp/feedback.automatic_parallelism.m:
    Added a new function for computing the 'cpu time' of a parallel
    computation.

library/lazy.m:
    Moved lazy.m from extras to the standard library.

library/list.m:
    Add a new predicate member_index0/3.  Like member/2 except it also gives
    the zero-based index of the current element within the list.

library/maybe.m:
    Add two new insts.
        maybe_yes(I) for the maybe type's yes/1 constructor.
        maybe_error_ok(I) for the maybe_error type's ok/1 constructor.

library/Mercury.options:
    Add a work around for compiling lazy.m with intermodule optimisations.

NEWS:
    Update news file for the addition of lazy.m and the member_index0 predicate
    in list.m

deep_profiler/.cvsignore:
    Ignore feedback.automatic_parallelism.m which is copied by Mmakefile from
    the mdbcomp/ directory.

Index: NEWS
===================================================================
RCS file: /home/mercury1/repository/mercury/NEWS,v
retrieving revision 1.535
diff -u -p -b -r1.535 NEWS
--- NEWS	6 Oct 2010 04:01:31 -0000	1.535
+++ NEWS	7 Oct 2010 02:18:48 -0000
@@ -16,6 +16,13 @@ Changes to the Mercury standard library:
   to the predicates of the same name in the list module, but do the relevant
   operations on keys instead of entire list elements.
 
+* We have added a new module, lazy,to the standard library.  It facilitates
+  the construction of lazy data structures.
+
+* We have added a new predicate to the list module of the standard library.
+  list.member_index0/3.  It is like list.member/2 except that it also takes a
+  parameter representing the zero-based index of the element within the list.
+
 Changes to the Mercury compiler:
 
 * Support for building and linking against frameworks on Mac OS X has
Index: deep_profiler/.cvsignore
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/.cvsignore,v
retrieving revision 1.15
diff -u -p -b -r1.15 .cvsignore
--- deep_profiler/.cvsignore	5 May 2009 08:05:29 -0000	1.15
+++ deep_profiler/.cvsignore	7 Oct 2010 02:04:49 -0000
@@ -31,6 +31,7 @@ DEEP_FLAGS
 *.opt
 *.optdate
 config.log
+feedback.automatic_parallelism.m
 feedback.m
 mdbcomp.m
 prim_data.m
Index: deep_profiler/analysis_utils.m
===================================================================
RCS file: deep_profiler/analysis_utils.m
diff -N deep_profiler/analysis_utils.m
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ deep_profiler/analysis_utils.m	6 Oct 2010 23:29:36 -0000
@@ -0,0 +1,377 @@
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et
+%-----------------------------------------------------------------------------%
+% Copyright (C) 2008-2010 The University of Melbourne.
+% This file may only be copied under the terms of the GNU General
+% Public License - see the file COPYING in the Mercury distribution.
+%-----------------------------------------------------------------------------%
+%
+% File: analysis_utils.m.
+% Author: pbone.
+%
+% This module contains utility code that is useful for writing profile
+% analyses.
+%
+%-----------------------------------------------------------------------------%
+
+:- module analysis_utils.
+:- interface.
+
+:- import_module mdbcomp.
+:- import_module mdbcomp.program_representation.
+:- import_module measurements.
+:- import_module profile.
+:- import_module report.
+
+:- import_module assoc_list.
+:- import_module list.
+:- import_module map.
+:- import_module maybe.
+:- import_module pair.
+:- import_module set.
+
+%----------------------------------------------------------------------------%
+    
+    % Instead of using the clique report above to find proc dynamics for a
+    % clique, use this as it is much faster.
+    %
+:- pred find_clique_first_and_other_procs(deep::in, clique_ptr::in, 
+    maybe(proc_dynamic_ptr)::out, list(proc_dynamic_ptr)::out) is det.
+
+%----------------------------------------------------------------------------%
+    
+    % Lookup a procedure representation from the deep structure.
+    %
+    % (Perhaps this should be a new report).
+    %
+:- pred deep_get_maybe_procrep(deep::in, proc_static_ptr::in, 
+    maybe_error(proc_rep)::out) is det.
+    
+%----------------------------------------------------------------------------%
+
+:- type cost_and_callees
+    --->    cost_and_callees(
+                cac_cost            :: cs_cost_csq,
+                cac_callees         :: set(callee),
+                cac_call_site_is_ho :: higher_order
+            ).
+
+:- type callee
+    --->    callee(
+                c_clique            :: clique_ptr,
+                c_csd               :: call_site_dynamic_ptr
+            ).
+
+:- type higher_order
+    --->    first_order_call
+    ;       higher_order_call.
+
+:- pred build_call_site_cost_and_callee_map(deep::in, 
+    pair(call_site_static_ptr, call_site_array_slot)::in,
+    map(goal_path, cost_and_callees)::in, map(goal_path, cost_and_callees)::out)
+    is det.
+
+%----------------------------------------------------------------------------%
+
+    % Estimate the cost of the recursive calls under the assumption that
+    % current call to this procedure had a particular cost.
+    %
+:- pred build_recursive_call_site_cost_map(deep, clique_ptr,
+    proc_dynamic_ptr, recursion_type, maybe(float), maybe_error(map(goal_path,
+    cs_cost_csq))) is det.
+:- mode build_recursive_call_site_cost_map(in, in, in,
+    in(recursion_type_known_costs), in(maybe_yes(ground)), 
+    out(maybe_error_ok(ground))) is det.
+:- mode build_recursive_call_site_cost_map(in, in, in, in, in, out) is det.
+
+%----------------------------------------------------------------------------%
+
+:- pred proc_dynamic_paired_call_site_slots(deep::in, proc_dynamic_ptr::in,
+    assoc_list(call_site_static_ptr, call_site_array_slot)::out) is det.
+
+%----------------------------------------------------------------------------%
+            
+:- pred cost_and_callees_is_recursive(clique_ptr::in, cost_and_callees::in) 
+    is semidet.
+
+%----------------------------------------------------------------------------%
+%----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module program_representation_utils.
+
+:- import_module array.
+:- import_module bool.
+:- import_module cord.
+:- import_module float.
+:- import_module int.
+:- import_module io.
+:- import_module message.
+:- import_module pair.
+:- import_module require.
+:- import_module string.
+:- import_module svmap.
+
+%----------------------------------------------------------------------------%
+
+find_clique_first_and_other_procs(Deep, CliquePtr, MaybeFirstPDPtr,
+        OtherPDPtrs) :-
+    deep_lookup_clique_members(Deep, CliquePtr, PDPtrs),
+    deep_lookup_clique_parents(Deep, CliquePtr, EntryCSDPtr),
+    ( valid_call_site_dynamic_ptr(Deep, EntryCSDPtr) ->
+        deep_lookup_call_site_dynamics(Deep, EntryCSDPtr, EntryCSD),
+        FirstPDPtr = EntryCSD ^ csd_callee,
+        MaybeFirstPDPtr = yes(FirstPDPtr),
+        list.negated_filter(unify(FirstPDPtr), PDPtrs,
+            OtherPDPtrs)
+    ;
+        MaybeFirstPDPtr = no,
+        OtherPDPtrs = PDPtrs
+    ).
+
+%----------------------------------------------------------------------------%
+
+deep_get_maybe_procrep(Deep, PSPtr, MaybeProcRep) :-
+    deep_get_maybe_progrep(Deep, MaybeProgRep),
+    (
+        MaybeProgRep = ok(ProgRep),
+        deep_lookup_proc_statics(Deep, PSPtr, PS),
+        ProcLabel = PS ^ ps_id,
+        ( progrep_search_proc(ProgRep, ProcLabel, ProcRep) ->
+            MaybeProcRep = ok(ProcRep)
+        ;
+            MaybeProcRep = error("Cannot find procedure representation")
+        )
+    ;
+        MaybeProgRep = error(Error),
+        MaybeProcRep = error(Error)
+    ).
+
+%----------------------------------------------------------------------------%
+
+build_call_site_cost_and_callee_map(Deep, CSSPtr - Slot, !CallSitesMap) :-
+    (
+        Slot = slot_normal(CSDPtr),
+        ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
+            call_site_dynamic_get_callee_and_costs(Deep, CSDPtr, Callee, 
+                Own, Inherit),
+            CostCsq = build_cs_cost_csq(calls(Own), 
+                float(callseqs(Own) + inherit_callseqs(Inherit))),
+            Callees = [Callee]
+        ;
+            CostCsq = build_cs_cost_csq(0, 0.0),
+            Callees = []
+        )
+    ;
+        Slot = slot_multi(_, CSDPtrsArray),
+        to_list(CSDPtrsArray, CSDPtrs),
+        map3(call_site_dynamic_get_callee_and_costs(Deep), CSDPtrs, 
+            Callees, Owns, Inherits),
+        Own = sum_own_infos(Owns),
+        Inherit = sum_inherit_infos(Inherits),
+        CostCsq = build_cs_cost_csq(calls(Own), 
+            float(callseqs(Own) + inherit_callseqs(Inherit)))
+    ),
+    CostAndCallees = cost_and_callees(CostCsq, set(Callees), HigherOrder),
+    lookup_call_site_statics(Deep ^ call_site_statics, CSSPtr, CSS),
+    call_site_kind_to_higher_order(CSS ^ css_kind, HigherOrder),
+    goal_path_from_string_det(CSS ^ css_goal_path, GoalPath),
+    svmap.det_insert(GoalPath, CostAndCallees, !CallSitesMap).
+
+:- pred call_site_dynamic_get_callee_and_costs(deep::in, 
+    call_site_dynamic_ptr::in, callee::out, own_prof_info::out, 
+    inherit_prof_info::out) is det.
+
+call_site_dynamic_get_callee_and_costs(Deep, CSDPtr, 
+        callee(CalleeCliquePtr, CSDPtr), Own, Inherit) :-
+    lookup_call_site_dynamics(Deep ^ call_site_dynamics, CSDPtr, CSD),
+    lookup_csd_desc(Deep ^ csd_desc, CSDPtr, Inherit),
+    PDPtr = CSD ^ csd_callee,
+    lookup_clique_index(Deep ^ clique_index, PDPtr, CalleeCliquePtr), 
+    Own = CSD ^ csd_own_prof.
+
+:- pred call_site_kind_to_higher_order(call_site_kind_and_callee::in,
+    higher_order::out) is det.
+
+call_site_kind_to_higher_order(CallSiteKind, HigherOrder) :-
+    (
+        ( CallSiteKind = normal_call_and_callee(_, _)
+        ; CallSiteKind = callback_and_no_callee
+        ),
+        HigherOrder = first_order_call
+    ;
+        ( CallSiteKind = special_call_and_no_callee
+        ; CallSiteKind = higher_order_call_and_no_callee
+        ; CallSiteKind = method_call_and_no_callee
+        ),
+        HigherOrder = higher_order_call
+    ).
+
+%----------------------------------------------------------------------------%
+
+build_recursive_call_site_cost_map(Deep, CliquePtr, PDPtr, RecursionType,
+        MaybeDepth, MaybeRecursiveCallSiteCostMap) :-
+    (
+        RecursionType = rt_not_recursive,
+        MaybeRecursiveCallSiteCostMap = ok(map.init)
+    ;
+        RecursionType = rt_single(_, _, _AvgMaxDepth, _AvgRecCost, CostFn),
+        (
+            MaybeDepth = yes(DepthF),
+            Depth = round_to_int(DepthF)
+        ;
+            MaybeDepth = no,
+            error(this_file ++ "Expected valid depth for known recursion type")
+        ),
+    
+        get_recursive_calls_and_counts(Deep, CliquePtr, PDPtr,
+            CallCountsMap),
+        RecursiveCallSiteCostMap = map_values_only(
+            (func(Count) = 
+                build_cs_cost_csq_percall(float(Count), CostFn(Depth))),
+            CallCountsMap),
+        MaybeRecursiveCallSiteCostMap = ok(RecursiveCallSiteCostMap),
+        
+        trace [compile_time(flag("debug_recursive_call_costs")), io(!IO)] (
+            format_recursive_call_site_cost_map(
+                RecursiveCallSiteCostMap, PrettyCostMapCord),
+            PrettyCostMap = append_list(cord.list(PrettyCostMapCord)),
+            io.format("D: In clique %s recursive call site cost map"
+                    ++ " is:\n%s\n",
+                [s(string(CliquePtr)), s(PrettyCostMap)],
+                !IO),
+            io.flush_output(!IO)
+        )
+    ;
+        ( RecursionType = rt_divide_and_conquer(_, _)
+        ; RecursionType = rt_mutual_recursion(_)
+        ; RecursionType = rt_other(_)
+        ; RecursionType = rt_errors(_)
+        ),
+        (
+            ( RecursionType = rt_divide_and_conquer(_, _)
+            ; RecursionType = rt_mutual_recursion(_)
+            ),
+            Error = "No average recursion depth for this recursion type"
+        ;
+            RecursionType = rt_other(_),
+            Error = "Could not detect recursion type"
+        ;
+            RecursionType = rt_errors(Errors),
+            Error = join_list("; and ", Errors)
+        ),
+        MaybeRecursiveCallSiteCostMap = error(Error)
+    ).
+
+:- pred get_recursive_calls_and_counts(deep::in, clique_ptr::in, 
+    proc_dynamic_ptr::in, map(goal_path, int)::out) is det.
+
+get_recursive_calls_and_counts(Deep, CliquePtr, PDPtr, CallCountsMap) :-
+    proc_dynamic_paired_call_site_slots(Deep, PDPtr, SiteSlots),
+    foldl(build_recursive_call_site_counts_map(Deep, CliquePtr), SiteSlots, 
+        map.init, CallCountsMap).
+
+:- pred build_recursive_call_site_counts_map(deep::in, clique_ptr::in, 
+    pair(call_site_static_ptr, call_site_array_slot)::in,
+    map(goal_path, int)::in, map(goal_path, int)::out) is det.
+
+build_recursive_call_site_counts_map(Deep, CliquePtr, CSSPtr - CSDSlot, !Map) :-
+    (
+        CSDSlot = slot_normal(CSDPtr),
+        ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
+            call_site_dynamic_get_count_and_callee(Deep, CSDPtr, Count,
+                MaybeCallee),
+            ( maybe_equals_or_is_no(CliquePtr, MaybeCallee) ->
+                Recursive = yes
+            ;
+                Recursive = no
+            )
+        ;
+            Recursive = no,
+            Count = 0 
+        )
+    ;
+        CSDSlot = slot_multi(_, CSDPtrs),
+        map2(call_site_dynamic_get_count_and_callee(Deep), to_list(CSDPtrs),
+            Counts, MaybeCallees),
+        Count = foldl(plus, Counts, 0),
+        (
+            member(MaybeCallee, MaybeCallees),
+            maybe_equals_or_is_no(CliquePtr, MaybeCallee)
+        ->
+            Recursive = yes
+        ;
+            Recursive = no
+        )
+    ),
+    (
+        Recursive = yes,
+        deep_lookup_call_site_statics(Deep, CSSPtr, CSS),
+        goal_path_from_string_det(CSS ^ css_goal_path, GoalPath),
+        svmap.det_insert(GoalPath, Count, !Map)
+    ;
+        Recursive = no
+    ).
+
+:- pred maybe_equals_or_is_no(T::in, maybe(T)::in) is semidet.
+
+maybe_equals_or_is_no(_, no).
+maybe_equals_or_is_no(X, yes(X)).
+
+:- pred call_site_dynamic_get_count_and_callee(deep::in, 
+    call_site_dynamic_ptr::in, int::out, maybe(clique_ptr)::out) is det.
+
+call_site_dynamic_get_count_and_callee(Deep, CSDPtr, Count, MaybeCallee) :-
+    ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
+        deep_lookup_csd_own(Deep, CSDPtr, Own),
+        Count = calls(Own),
+        deep_lookup_clique_maybe_child(Deep, CSDPtr, MaybeCallee)
+    ;
+        Count = 0,
+        MaybeCallee = no
+    ).
+
+:- pred format_recursive_call_site_cost_map(map(goal_path, cs_cost_csq)::in, 
+    cord(string)::out) is det.
+
+format_recursive_call_site_cost_map(Map, Result) :-
+    map.foldl(format_recursive_call_site_cost, Map, cord.empty, Result).
+
+:- pred format_recursive_call_site_cost(goal_path::in, cs_cost_csq::in, 
+    cord(string)::in, cord(string)::out) is det.
+
+format_recursive_call_site_cost(GoalPath, Cost, !Result) :-
+    !:Result = cord.snoc(!.Result ++ indent(1), String),
+    String = format("%s -> Percall cost: %f Calls: %f\n",
+        [s(GoalPathString), f(PerCallCost), f(Calls)]),
+    GoalPathString = goal_path_to_string(GoalPath),
+    PerCallCost = cs_cost_get_percall(Cost),
+    Calls = cs_cost_get_calls(Cost).
+
+%----------------------------------------------------------------------------%
+
+proc_dynamic_paired_call_site_slots(Deep, PDPtr, PairedSlots) :-
+    deep_lookup_proc_dynamics(Deep, PDPtr, PD),
+    PSPtr = PD ^ pd_proc_static,
+    CSDArray = PD ^ pd_sites,
+    deep_lookup_proc_statics(Deep, PSPtr, PS),
+    CSSArray = PS ^ ps_sites,
+    array.to_list(CSDArray, CSDSlots),
+    array.to_list(CSSArray, CSSSlots),
+    assoc_list.from_corresponding_lists(CSSSlots, CSDSlots, PairedSlots).
+
+%----------------------------------------------------------------------------%
+
+cost_and_callees_is_recursive(ParentCliquePtr, CostAndCallees) :-
+    Callees = CostAndCallees ^ cac_callees,
+    member(Callee, Callees),
+    ParentCliquePtr = Callee ^ c_clique.
+
+%----------------------------------------------------------------------------%
+
+:- func this_file = string.
+
+this_file = "analysis_utils.m: ".
+
+%----------------------------------------------------------------------------%
Index: deep_profiler/coverage.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/coverage.m,v
retrieving revision 1.7
diff -u -p -b -r1.7 coverage.m
--- deep_profiler/coverage.m	23 Sep 2010 04:33:19 -0000	1.7
+++ deep_profiler/coverage.m	6 Oct 2010 23:29:36 -0000
@@ -44,6 +44,7 @@
 :- pred get_coverage_before(coverage_info::in, int::out) is semidet.
 :- pred get_coverage_before_and_after(coverage_info::in, int::out, int::out)
     is semidet.
+:- pred get_coverage_after(coverage_info::in, int::out) is semidet.
 
 %----------------------------------------------------------------------------%
     
@@ -95,14 +96,7 @@
 
 :- implementation.
 
-:- import_module message.
-:- import_module profile.
-:- import_module program_representation_utils.
-:- import_module report.
-
 :- import_module bool.
-:- import_module cord.
-:- import_module exception.
 :- import_module int.
 :- import_module io.
 :- import_module require.
@@ -118,6 +112,11 @@ get_coverage_before_and_after(coverage_k
 get_coverage_before_and_after(coverage_known_same(Count), Count, Count).
 get_coverage_before_and_after(coverage_known_zero, 0, 0).
 
+get_coverage_after(coverage_known(_, After), After).
+get_coverage_after(coverage_known_zero, 0).
+get_coverage_after(coverage_known_same(After), After).
+get_coverage_after(coverage_known_after(After), After).
+
 %-----------------------------------------------------------------------------%
 
 coverage_point_arrays_to_list(StaticArray, DynamicArray, CoveragePoints) :-
@@ -675,10 +674,6 @@ ite_annotate_coverage(Info, GoalPath, Be
         )
     ).
 
-:- pred not_unify(T::in, T::in) is semidet.
-
-not_unify(A, B) :- not unify(A, B).
-
     % Get the coverage information from a coverage point about the branch
     % referenced by the given goal path.
     %
@@ -958,59 +953,6 @@ check_coverage_complete(coverage_known_z
 %    ),
 %    goal_expr_is_trivial(GoalExpr).
 
-:- func goal_is_trivial(goal_rep(T)) = bool.
-
-goal_is_trivial(Goal) = IsTrivial:-
-    GoalExpr = Goal ^ goal_expr_rep,
-    IsTrivial = goal_expr_is_trivial(GoalExpr).
-
-:- func goal_expr_is_trivial(goal_expr_rep(T)) = bool.
-
-goal_expr_is_trivial(GoalRep) = IsTrivial :-
-    (
-        (
-            GoalRep = conj_rep(SubGoalReps)
-        ;
-            GoalRep = disj_rep(SubGoalReps)
-        ;
-            GoalRep = switch_rep(_, _, CaseReps),
-            SubGoalReps = list.map(project_case_rep_goal, CaseReps)
-        ;
-            GoalRep = ite_rep(CondRep, ThenRep, ElseRep),
-            SubGoalReps = [CondRep, ThenRep, ElseRep]
-        ),
-        SubGoalIsTrivials = list.map(goal_is_trivial, SubGoalReps),
-        bool.and_list(SubGoalIsTrivials, IsTrivial)
-    ;
-        ( GoalRep = negation_rep(SubGoalRep)
-        ; GoalRep = scope_rep(SubGoalRep, _)
-        ),
-        IsTrivial = goal_is_trivial(SubGoalRep)
-    ;
-        GoalRep = atomic_goal_rep(_, _, _, AtomicGoalRep),
-        (
-            ( AtomicGoalRep = plain_call_rep(_, _, _)
-            ; AtomicGoalRep = higher_order_call_rep(_, _)
-            ; AtomicGoalRep = method_call_rep(_, _, _)
-            ),
-            IsTrivial = no
-        ;
-            ( AtomicGoalRep = unify_construct_rep(_, _, _)
-            ; AtomicGoalRep = unify_deconstruct_rep(_, _, _)
-            ; AtomicGoalRep = partial_deconstruct_rep(_, _, _)
-            ; AtomicGoalRep = partial_construct_rep(_, _, _)
-            ; AtomicGoalRep = unify_assign_rep(_, _)
-            ; AtomicGoalRep = cast_rep(_, _)
-            ; AtomicGoalRep = unify_simple_test_rep(_, _)
-            % Built-in calls are cheap enough to consider to be trivial.
-            ; AtomicGoalRep = builtin_call_rep(_, _, _)
-            ; AtomicGoalRep = pragma_foreign_code_rep(_)
-            ; AtomicGoalRep = event_call_rep(_, _)
-            ),
-            IsTrivial = yes
-        )
-    ).
-
 %----------------------------------------------------------------------------%
 %
 % Coverage information helper predicates.
Index: deep_profiler/create_report.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/create_report.m,v
retrieving revision 1.25
diff -u -p -b -r1.25 create_report.m
--- deep_profiler/create_report.m	23 Sep 2010 04:33:19 -0000	1.25
+++ deep_profiler/create_report.m	6 Oct 2010 23:29:36 -0000
@@ -16,14 +16,11 @@
 :- module create_report.
 :- interface.
 
-:- import_module mdbcomp.
-:- import_module mdbcomp.program_representation.
 :- import_module measurements.
 :- import_module profile.
 :- import_module query.
 :- import_module report.
 
-:- import_module list.
 :- import_module maybe.
 
 %-----------------------------------------------------------------------------%
@@ -32,10 +29,11 @@
 
 %-----------------------------------------------------------------------------%
 
-    % Create a proc var use dump report.
+    % Create a CSD var use report.
     %
-:- pred create_proc_var_use_dump_report(deep::in, proc_static_ptr::in,
-    maybe_error(proc_var_use_dump_info)::out) is det.
+:- pred create_call_site_dynamic_var_use_report(deep::in,
+    call_site_dynamic_ptr::in,
+    maybe_error(call_site_dynamic_var_use_info)::out) is det.
 
     % Create a proc report.
     %
@@ -68,12 +66,6 @@
 :- pred create_clique_report(deep::in, clique_ptr::in,
     maybe_error(clique_report)::out) is det.
 
-    % Instead of using the clique report above to find proc dynamics for a
-    % clique, use this as it is much faster.
-    %
-:- pred find_clique_first_and_other_procs(deep::in, clique_ptr::in, 
-    maybe(proc_dynamic_ptr)::out, list(proc_dynamic_ptr)::out) is det.
-    
     % Create a proc_desc structure for a given proc static pointer.
     %
 :- func describe_proc(deep, proc_static_ptr) = proc_desc.
@@ -85,41 +77,33 @@
 :- pred own_and_inherit_to_perf_row_data(deep::in, T::in,
     own_prof_info::in, inherit_prof_info::in, perf_row_data(T)::out) is det.
 
-    % Lookup a procedure representation from the deep structure.
-    %
-    % (Perhaps this should be a new report).
-    %
-:- pred deep_get_maybe_procrep(deep::in, proc_static_ptr::in, 
-    maybe_error(proc_rep)::out) is det.
-
 %----------------------------------------------------------------------------%
 
 :- implementation.
 
+:- import_module analysis_utils.
 :- import_module apply_exclusion.
 :- import_module coverage.
 :- import_module exclude.
+:- import_module mdbcomp.
+:- import_module mdbcomp.program_representation.
 :- import_module measurement_units.
-:- import_module program_representation_utils.
 :- import_module recursion_patterns.
 :- import_module top_procs.
 :- import_module var_use_analysis.
 
 :- import_module array.
-:- import_module assoc_list.
 :- import_module char.
 :- import_module exception.
 :- import_module float.
 :- import_module int.
+:- import_module list.
 :- import_module map.
-:- import_module math.
 :- import_module pair.
 :- import_module require.
-:- import_module set.
 :- import_module string.
 :- import_module svmap.
 :- import_module unit.
-:- import_module univ.
 
 %-----------------------------------------------------------------------------%
 
@@ -225,9 +209,10 @@ create_report(Cmd, Deep, Report) :-
         create_clique_dump_report(Deep, CliquePtr, MaybeCliqueDump),
         Report = report_clique_dump(MaybeCliqueDump)
     ;
-        Cmd = deep_cmd_dump_proc_var_use(PSPtr),
-        create_proc_var_use_dump_report(Deep, PSPtr, MaybeProcVarUseDump),
-        Report = report_proc_var_use_dump(MaybeProcVarUseDump)
+        Cmd = deep_cmd_call_site_dynamic_var_use(CSDPtr),
+        create_call_site_dynamic_var_use_report(Deep, CSDPtr,
+            MaybeVarUse),
+        Report = report_call_site_dynamic_var_use(MaybeVarUse)
     ;
         Cmd = deep_cmd_restart,
         error("create_report/3", "unexpected restart command")
@@ -354,21 +339,6 @@ create_clique_report(Deep, CliquePtr, Ma
     CliqueReport = clique_report(CliquePtr, AncestorRowDatas, CliqueProcs),
     MaybeCliqueReport = ok(CliqueReport).
 
-find_clique_first_and_other_procs(Deep, CliquePtr, MaybeFirstPDPtr,
-        OtherPDPtrs) :-
-    deep_lookup_clique_members(Deep, CliquePtr, PDPtrs),
-    deep_lookup_clique_parents(Deep, CliquePtr, EntryCSDPtr),
-    ( valid_call_site_dynamic_ptr(Deep, EntryCSDPtr) ->
-        deep_lookup_call_site_dynamics(Deep, EntryCSDPtr, EntryCSD),
-        FirstPDPtr = EntryCSD ^ csd_callee,
-        MaybeFirstPDPtr = yes(FirstPDPtr),
-        list.negated_filter(unify(FirstPDPtr), PDPtrs,
-            OtherPDPtrs)
-    ;
-        MaybeFirstPDPtr = no,
-        OtherPDPtrs = PDPtrs
-    ).
-
 :- func find_clique_ancestors(deep, clique_ptr) =
     list(perf_row_data(ancestor_desc)).
 
@@ -474,14 +444,7 @@ create_clique_proc_dynamic_report(Deep, 
     list(clique_call_site_report)::out) is det.
 
 create_child_call_site_reports(Deep, PDPtr, CliqueCallSiteReports) :-
-    deep_lookup_proc_dynamics(Deep, PDPtr, PD),
-    PSPtr = PD ^ pd_proc_static,
-    CSDArray = PD ^ pd_sites,
-    deep_lookup_proc_statics(Deep, PSPtr, PS),
-    CSSArray = PS ^ ps_sites,
-    array.to_list(CSDArray, CSDSlots),
-    array.to_list(CSSArray, CSSSlots),
-    assoc_list.from_corresponding_lists(CSSSlots, CSDSlots, PairedSlots),
+    proc_dynamic_paired_call_site_slots(Deep, PDPtr, PairedSlots),
     list.map(create_child_call_site_report(Deep), PairedSlots,
         CliqueCallSiteReports).
 
@@ -1396,11 +1359,125 @@ create_clique_dump_report(Deep, CliquePt
         MaybeCliqueDumpInfo = error("invalid clique_ptr")
     ).
 
-:- pragma memo(create_proc_var_use_dump_report/3,
-    [specified([addr, value, output])]).
-create_proc_var_use_dump_report(Deep, PSPtr, MaybeProcVarUseDump) :-
-    generic_vars_first_use(head_vars_all, Deep, PSPtr, set.init,
-        MaybeProcVarUseDump).
+create_call_site_dynamic_var_use_report(Deep, CSDPtr, MaybeVarUseInfo) :-
+    ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
+        deep_lookup_call_site_dynamics(Deep, CSDPtr, CSD),
+        CalleePDPtr = CSD ^ csd_callee,
+        CallerPDPtr = CSD ^ csd_caller,
+        deep_lookup_proc_dynamics(Deep, CalleePDPtr, CalleePD),
+        CalleePSPtr = CalleePD ^ pd_proc_static,
+        deep_get_maybe_procrep(Deep, CalleePSPtr, MaybeProcrep),
+        (
+            MaybeProcrep = ok(Procrep),
+            HeadVars = Procrep ^ pr_defn ^ pdr_head_vars,
+            VarTable = Procrep ^ pr_defn ^ pdr_var_table,
+            deep_lookup_clique_index(Deep, CallerPDPtr, ParentCliquePtr),
+            deep_lookup_clique_index(Deep, CalleePDPtr, CalleeCliquePtr),
+            create_clique_recursion_costs_report(Deep, ParentCliquePtr,
+                MaybeRecursiveCostsReport),
+            (
+                MaybeRecursiveCostsReport = ok(RecursiveCostsReport),
+                RecursionType = RecursiveCostsReport ^ crr_recursion_type,
+                ( ParentCliquePtr = CalleeCliquePtr ->
+                    get_recursive_csd_cost(Deep, CSDPtr, RecursionType,
+                        MaybeCost)
+                ;
+                    deep_lookup_csd_desc(Deep, CSDPtr, Desc),
+                    deep_lookup_csd_own(Deep, CSDPtr, Own),
+                    Cost0 = callseqs(Own) + inherit_callseqs(Desc),
+                    MaybeCost = ok(float(Cost0))
+                ),
+                (
+                    MaybeCost = ok(Cost),
+                    map_foldl(call_site_dynamic_var_use_arg(Deep, CSDPtr,
+                            RecursionType, Cost, VarTable), 
+                        HeadVars, Uses0, 0, _),
+                    list_maybe_error_to_maybe_error_list(Uses0, MaybeUses),
+                    (
+                        MaybeUses = ok(Uses),
+                        VarUseInfo = call_site_dynamic_var_use_info(Cost, Uses),
+                        MaybeVarUseInfo = ok(VarUseInfo)
+                    ;
+                        MaybeUses = error(Error),
+                        MaybeVarUseInfo = error(Error)
+                    )
+                ;
+                    MaybeCost = error(Error),
+                    MaybeVarUseInfo = error(Error)
+                )
+            ;
+                MaybeRecursiveCostsReport = error(Error),
+                MaybeVarUseInfo = error(Error)
+            )
+        ;
+            MaybeProcrep = error(Error),
+            MaybeVarUseInfo = error(Error)
+        )
+    ;
+        CSDPtr = call_site_dynamic_ptr(CSDNum),
+        MaybeVarUseInfo = error(format(
+            "Invalid call site dynamic %d", [i(CSDNum)]))
+    ).
+
+:- pred call_site_dynamic_var_use_arg(deep::in, call_site_dynamic_ptr::in,
+    recursion_type::in, float::in, var_table::in, head_var_rep::in,
+    maybe_error(var_use_and_name)::out, int::in, int::out) is det.
+
+call_site_dynamic_var_use_arg(Deep, CSDPtr, RecursionType, Cost, VarTable,
+        HeadVar, MaybeUseAndName, !ArgNum) :-
+    HeadVar = head_var_rep(Var, Mode),
+    var_mode_to_var_use_type(Mode, UseType),
+    call_site_dynamic_var_use_info(Deep, CSDPtr, !.ArgNum, RecursionType, 
+        Cost, UseType, MaybeUse),
+    (
+        MaybeUse = ok(Use),
+        lookup_var_name(VarTable, Var, Name),
+        MaybeUseAndName = ok(var_use_and_name(Name, Use))
+    ;
+        MaybeUse = error(Error),
+        MaybeUseAndName = error(Error)
+    ),
+    !:ArgNum = !.ArgNum + 1.
+
+:- pred list_maybe_error_to_maybe_error_list(list(maybe_error(T))::in,
+    maybe_error(list(T))::out) is det.
+
+list_maybe_error_to_maybe_error_list([], ok([])).
+list_maybe_error_to_maybe_error_list([error(E) | _], error(E)).
+list_maybe_error_to_maybe_error_list([ok(X) | MaybeXs0], MaybeXs) :-
+    list_maybe_error_to_maybe_error_list(MaybeXs0, MaybeXs1),
+    (
+        MaybeXs1 = ok(Xs1),
+        MaybeXs = ok([X | Xs1])
+    ;
+        MaybeXs1 = error(E),
+        MaybeXs = error(E)
+    ).
+
+:- pred get_recursive_csd_cost(deep::in, call_site_dynamic_ptr::in,
+    recursion_type::in, maybe_error(float)::out) is det.
+
+get_recursive_csd_cost(Deep, CSDPtr, RecursionType, MaybeCost) :-
+    (
+        RecursionType = rt_not_recursive,
+        MaybeCost = error(
+            "get_recursive_csd_cost called for non-recursive clique")
+    ;
+        RecursionType = rt_single(_, _, AvgMaxDepth, _, CostFn),
+        deep_lookup_csd_own(Deep, CSDPtr, Own),
+        Calls = float(calls(Own)),
+        MaybeCost = ok(CostFn(round_to_int(AvgMaxDepth) - 1) * Calls)
+    ;
+        ( RecursionType = rt_divide_and_conquer(_, _)
+        ; RecursionType = rt_mutual_recursion(_)
+        ; RecursionType = rt_other(_)
+        ),
+        MaybeCost = error(
+            "get_recursive_csd_cost: Unhandled recursion type")
+    ;
+        RecursionType = rt_errors(Errors),
+        MaybeCost = error(join_list("\n", Errors))
+    ).
 
 %-----------------------------------------------------------------------------%
 
@@ -1514,24 +1591,6 @@ own_and_maybe_inherit_to_perf_row_data(D
 
 %----------------------------------------------------------------------------%
 
-deep_get_maybe_procrep(Deep, PSPtr, MaybeProcRep) :-
-    deep_get_maybe_progrep(Deep, MaybeProgRep),
-    (
-        MaybeProgRep = ok(ProgRep),
-        deep_lookup_proc_statics(Deep, PSPtr, PS),
-        ProcLabel = PS ^ ps_id,
-        ( progrep_search_proc(ProgRep, ProcLabel, ProcRep) ->
-            MaybeProcRep = ok(ProcRep)
-        ;
-            MaybeProcRep = error("Cannot find procedure representation")
-        )
-    ;
-        MaybeProgRep = error(Error),
-        MaybeProcRep = error(Error)
-    ).
-
-%-----------------------------------------------------------------------------%
-
     % int_per_call(Num, Calls) is the quotient of Nom and Calls, after they've
     % both been cast to float.
     %
Index: deep_profiler/display_report.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/display_report.m,v
retrieving revision 1.29
diff -u -p -b -r1.29 display_report.m
--- deep_profiler/display_report.m	24 Sep 2010 00:00:12 -0000	1.29
+++ deep_profiler/display_report.m	6 Oct 2010 23:29:36 -0000
@@ -219,12 +219,12 @@ report_to_display(Deep, Prefs, Report) =
             Display = display(no, [display_heading(Msg)])
         )
     ;
-        Report = report_proc_var_use_dump(MaybeProcVarUseDumpInfo),
+        Report = report_call_site_dynamic_var_use(MaybeVarUseInfo),
         (
-            MaybeProcVarUseDumpInfo = ok(ProcVarUseDumpInfo),
-            display_report_proc_var_use_dump(Prefs, ProcVarUseDumpInfo, Display)
+            MaybeVarUseInfo = ok(VarUseInfo),
+            display_report_call_site_dynamic_var_use(Prefs, VarUseInfo, Display)
         ;
-            MaybeProcVarUseDumpInfo = error(Msg),
+            MaybeVarUseInfo = error(Msg),
             Display = display(no, [display_heading(Msg)])
         )
     ).
@@ -2346,61 +2346,53 @@ display_report_clique_dump_member(Prefs,
     PDData = td_l(Link),
     Pair = PDLabel - PDData.
 
-:- pred display_report_proc_var_use_dump(preferences::in,
-    proc_var_use_dump_info::in, display::out) is det.
+:- pred display_report_call_site_dynamic_var_use(preferences::in,
+    call_site_dynamic_var_use_info::in, display::out) is det.
 
-display_report_proc_var_use_dump(_Prefs, ProcVarUseDumpInfo, Display) :-
-    ProcVarUseDumpInfo = proc_var_use_dump_info(AverageProcCost, VarUses),
-    HeaderCell = table_cell(td_s("Average Proc Cost: ")),
-    DataCell = table_cell(td_f(AverageProcCost)),
-    ProcCostRow = table_row([HeaderCell, DataCell]),
-    format_proc_var_uses(VarUses, 1, VarUseRows),
-    Rows = [ProcCostRow | VarUseRows],
-    Table = table(table_class_do_not_box, 2, no, Rows),
-    Title = "Dump of procedure's var use info",
-    Display = display(yes(Title), [display_table(Table)]).
+display_report_call_site_dynamic_var_use(_Prefs, CSDVarUseInfo, Display) :-
+    CSDVarUseInfo = call_site_dynamic_var_use_info(AverageCost, VarUses),
+    AverageCostItem = display_text(
+        format("Average Cost: %f", [f(AverageCost)])),
+
+    format_var_uses(VarUses, 1, VarUseRows),
+    Header = table_header(
+        map((func({Text, Class}) = table_header_group(
+                table_header_group_single(td_s(Text)),
+                Class, column_do_not_colour)
+            ), [{"Argument", table_column_class_ordinal_rank},
+                {"Name", table_column_class_no_class},
+                {"Type", table_column_class_no_class},
+                {"Percent into proc", table_column_class_no_class},
+                {"Time into proc (Callseqs)", table_column_class_callseqs}])), 
+    Table = table(table_class_box_if_pref, 5, yes(Header), VarUseRows),
+
+    Title = "Dump of var use info",
+    Display = display(yes(Title), [AverageCostItem, display_table(Table)]).
 
-:- pred format_proc_var_uses(list(var_use_info)::in, int::in,
+:- pred format_var_uses(list(var_use_and_name)::in, int::in,
     list(table_row)::out) is det.
 
-format_proc_var_uses([], _, []).
-format_proc_var_uses([VarUse | VarUses], RowNum, [Row | Rows]) :-
+format_var_uses([], _, []).
+format_var_uses([VarUse | VarUses], RowNum, [Row | Rows]) :-
     HeaderCell = table_cell(td_s(format("Argument: %i", [i(RowNum)]))),
-    VarUse = var_use_info(CostUntilUse, UseType),
-    (
-        CostUntilUse = cost_since_proc_start(CostSince),
-        (
-            UseType = var_use_production,
-            string.format("%f from start until production",
-                [f(CostSince)], UseStr)
-        ;
-            UseType = var_use_consumption,
-            string.format("%f from start until consumption",
-                [f(CostSince)], UseStr)
-        ;
-            UseType = var_use_other,
-            string.format("%f from start until other use",
-                [f(CostSince)], UseStr)
-        )
-    ;
-        CostUntilUse = cost_before_proc_end(CostBefore),
+    VarUse = var_use_and_name(Name, 
+        var_use_info(CostUntilUse, ProcCost, UseType)),
+    NameCell = table_cell(td_s(Name)),
         (
             UseType = var_use_production,
-            string.format("%f from production until end",
-                [f(CostBefore)], UseStr)
+        TypeText = "production"
         ;
             UseType = var_use_consumption,
-            string.format("%f from consumption until end",
-                [f(CostBefore)], UseStr)
+        TypeText = "consumption"
         ;
             UseType = var_use_other,
-            string.format("%f from other use until end",
-                [f(CostBefore)], UseStr)
-        )
+        TypeText = "other/unknown"
     ),
-    DataCell = table_cell(td_s(UseStr)),
-    Row = table_row([HeaderCell, DataCell]),
-    format_proc_var_uses(VarUses, RowNum + 1, Rows).
+    TypeCell = table_cell(td_s(TypeText)),
+    PercentCell = table_cell(td_p(percent(CostUntilUse / ProcCost))),
+    TimeCell = table_cell(td_f(CostUntilUse)),
+    Row = table_row([HeaderCell, NameCell, TypeCell, PercentCell, TimeCell]),
+    format_var_uses(VarUses, RowNum + 1, Rows).
 
 %-----------------------------------------------------------------------------%
 %
@@ -3825,10 +3817,6 @@ proc_reports_controls(Prefs, Proc, NotCm
                 Cmd = deep_cmd_dump_proc_static(Proc),
                 Label = "Unprocessed proc static data",
                 Developer = yes
-            ;
-                Cmd = deep_cmd_dump_proc_var_use(Proc),
-                Label = "Var use report",
-                Developer = yes
             ),
             Cmd \= NotCmd,
             make_control(yes(Prefs), Cmd, Label, Developer, Control)
Index: deep_profiler/mdprof_fb.automatic_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/mdprof_fb.automatic_parallelism.m,v
retrieving revision 1.16
diff -u -p -b -r1.16 mdprof_fb.automatic_parallelism.m
--- deep_profiler/mdprof_fb.automatic_parallelism.m	27 Sep 2010 05:51:02 -0000	1.16
+++ deep_profiler/mdprof_fb.automatic_parallelism.m	7 Oct 2010 02:23:33 -0000
@@ -61,14 +61,14 @@
 
 :- implementation.
 
+:- import_module analysis_utils.
 :- import_module branch_and_bound.
-:- import_module coverage.
-:- import_module create_report.
 :- import_module measurement_units.
 :- import_module measurements.
 :- import_module program_representation_utils.
 :- import_module recursion_patterns.
 :- import_module report.
+:- import_module solutions.
 :- import_module var_use_analysis.
 
 :- import_module array.
@@ -78,16 +78,14 @@
 :- import_module float.
 :- import_module io.
 :- import_module int.
+:- import_module lazy.
 :- import_module list.
 :- import_module map.
 :- import_module maybe.
-:- import_module multi_map.
 :- import_module require.
 :- import_module set.
-:- import_module set_tree234.
 :- import_module string.
 :- import_module svmap.
-:- import_module svset.
 
 %----------------------------------------------------------------------------%
 %
@@ -95,9 +93,6 @@
 %	--trace-flag=debug_cpc_search
 %	  Debug the traversal through the clique tree.
 %
-%	--trace-flag=debug_recursive_costs 
-%     Debug the calculation of the costs of recursive call sites.
-%
 %   --trace-flag=debug_parallel_conjunction_speedup
 %     Print candidate parallel conjunctions that are pessimisations.
 %
@@ -114,7 +109,7 @@ candidate_parallel_conjunctions(Params, 
     % exist.
     RootParallelism = no_parallelism,
     candidate_parallel_conjunctions_clique(Params, Deep,
-        RootParallelism, RootCliquePtr, ConjunctionsMap, Messages, init, _),
+        RootParallelism, RootCliquePtr, ConjunctionsMap, Messages),
 
     map.to_assoc_list(ConjunctionsMap, ConjunctionsAssocList0),
     map_values_only(convert_candidate_par_conjunctions_proc(
@@ -155,8 +150,10 @@ pard_goal_detail_annon_to_pard_goal_anno
                 ipi_deep            :: deep,
                 ipi_progrep         :: prog_rep,
                 ipi_opts            :: candidate_par_conjunctions_params,
-                ipi_call_sites      :: map(goal_path, clique_call_site_report),
+                ipi_clique          :: clique_ptr,
+                ipi_call_sites      :: map(goal_path, cost_and_callees),
                 ipi_rec_call_sites  :: map(goal_path, cs_cost_csq),
+                ipi_recursion_type  :: recursion_type,
                 ipi_var_table       :: var_table,
                 ipi_proc_label      :: string_proc_label
             ).
@@ -194,7 +191,7 @@ pard_goal_detail_annon_to_pard_goal_anno
                 pgtc_args                   :: list(var_mode_and_use),
                     % The argument modes and use information.
 
-                pgtc_call_site              :: clique_call_site_report
+                pgtc_call_site              :: cost_and_callees
                     % The call site report from the deep profiler.
             )
     ;       pgt_other_atomic_goal
@@ -215,7 +212,7 @@ pard_goal_detail_annon_to_pard_goal_anno
     --->    var_mode_and_use(
                 vmu_var                 :: var_rep,
                 vmu_mode                :: var_mode_rep,
-                vmu_use                 :: var_use_info
+                vmu_use                 :: lazy(var_use_info)
             ).
 
 :- type candidate_par_conjunctions ==
@@ -245,21 +242,28 @@ pard_goal_detail_annon_to_pard_goal_anno
 :- pred candidate_parallel_conjunctions_clique(
     candidate_par_conjunctions_params::in, deep::in, 
     parallelism_amount::in, clique_ptr::in, candidate_par_conjunctions::out,
-    cord(message)::out,
-    set_tree234(proc_static_ptr)::in, set_tree234(proc_static_ptr)::out) 
-    is det.
+    cord(message)::out) is det.
 
 candidate_parallel_conjunctions_clique(Opts, Deep, ParentParallelism,
-        CliquePtr, Candidates, Messages, !VisitedProcs) :-
-    create_clique_report(Deep, CliquePtr, MaybeCliqueReport),
+        CliquePtr, Candidates, Messages) :-
+    find_clique_first_and_other_procs(Deep, CliquePtr, MaybeFirstPDPtr,
+        OtherPDPtrs),
     (
-        MaybeCliqueReport = ok(CliqueReport)
+        MaybeFirstPDPtr = yes(FirstPDPtr),
+        PDPtrs = [FirstPDPtr | OtherPDPtrs]
     ;
-        MaybeCliqueReport = error(ErrorA),
-        error(this_file ++ ErrorA)
+        MaybeFirstPDPtr = no,
+        deep_lookup_clique_index(Deep, Deep ^ root, RootCliquePtr),
+        ( CliquePtr = RootCliquePtr ->
+            % It's okay, this clique never has an entry procedure.
+            PDPtrs = OtherPDPtrs
+        ;
+            CliquePtr = clique_ptr(CliqueNum),
+            error(format("%sClique %d has no entry proc", 
+                [s(this_file), i(CliqueNum)]))
+        )
     ),
     
-    CliqueProcs = CliqueReport ^ cr_clique_procs,
     create_clique_recursion_costs_report(Deep, CliquePtr,
         MaybeRecursiveCostsReport),
     (
@@ -271,21 +275,21 @@ candidate_parallel_conjunctions_clique(O
     ),
     
     % Look for parallelisation opportunities in this clique.
-    map2_foldl(candidate_parallel_conjunctions_clique_proc(Opts, Deep,
-            RecursionType, ParentParallelism, CliquePtr), 
-        CliqueProcs, Candidatess, Messagess, !VisitedProcs),
+    map2(candidate_parallel_conjunctions_clique_proc(Opts, Deep,
+            RecursionType, CliquePtr), 
+        PDPtrs, Candidatess, Messagess),
     foldl(union(merge_candidate_par_conjs_proc), Candidatess,
         map.init, CliqueCandidates), 
     CliqueMessages = cord_list_to_cord(Messagess),
 
     % Look in descendent cliques.
     some [!ChildCliques] (
-        map(clique_proc_callees(Deep, ParentParallelism), CliqueProcs,
+        map(proc_dynamic_callees(Deep, ParentParallelism), PDPtrs,
             ChildCliquess),
         !:ChildCliques = cord_list_to_cord(ChildCliquess),
-        map2_foldl(candidate_parallel_conjunctions_callee(Opts, Deep,
+        map2(candidate_parallel_conjunctions_callee(Opts, Deep,
                 CliquePtr, CliqueCandidates),
-            list(!.ChildCliques), CSCandidatess, CSMessagess, !VisitedProcs)
+            list(!.ChildCliques), CSCandidatess, CSMessagess)
     ),
     foldl(union(merge_candidate_par_conjs_proc), CSCandidatess,
         map.init, CSCandidates), 
@@ -297,7 +301,8 @@ candidate_parallel_conjunctions_clique(O
     
 :- type candidate_child_clique
     --->    candidate_child_clique(
-                ccc_perf                :: perf_row_data(clique_desc),
+                ccc_clique              :: clique_ptr,
+                ccc_cs_cost             :: cs_cost_csq,
 
                 % The context of the call site that calls this clique.
                 ccc_proc                :: string_proc_label,
@@ -308,47 +313,70 @@ candidate_parallel_conjunctions_clique(O
                 ccc_parallelism         :: parallelism_amount
             ).
 
-:- pred clique_proc_callees(deep::in, parallelism_amount::in,
-    clique_proc_report::in, cord(candidate_child_clique)::out) is det.
-
-clique_proc_callees(Deep, Parallelism, CliqueProc, ChildCliques) :-
-    FirstProcDynamic = CliqueProc ^ cpr_first_proc_dynamic,
-    OtherProcDynamics = CliqueProc ^ cpr_other_proc_dynamics,
-    ProcDynamics = [FirstProcDynamic | OtherProcDynamics],
-    map(proc_dynamic_callees(Deep, Parallelism), ProcDynamics, ChildCliquess),
-    ChildCliques = cord_list_to_cord(ChildCliquess).
-
 :- pred proc_dynamic_callees(deep::in, parallelism_amount::in, 
-    clique_proc_dynamic_report::in, cord(candidate_child_clique)::out) is det.
+    proc_dynamic_ptr::in, cord(candidate_child_clique)::out) is det.
 
-proc_dynamic_callees(Deep, Parallelism, PD, ChildCliques) :-
-    CCSRs = PD ^ cpdr_call_sites,
-    ProcDesc = PD ^ cpdr_proc_summary ^ perf_row_subject,
-    proc_label_from_proc_desc(Deep, ProcDesc, ProcLabel),
-    map(call_site_callees(ProcLabel, Parallelism), CCSRs, ChildCliquess),
+proc_dynamic_callees(Deep, Parallelism, PDPtr, ChildCliques) :-
+    deep_lookup_proc_dynamics(Deep, PDPtr, PD),
+    PSPtr = PD ^ pd_proc_static,
+    deep_lookup_proc_statics(Deep, PSPtr, PS),
+    ProcLabel = PS ^ ps_id, 
+    proc_dynamic_paired_call_site_slots(Deep, PDPtr, Slots),
+    map(pd_slot_callees(Deep, Parallelism, ProcLabel), Slots, ChildCliquess),
     ChildCliques = cord_list_to_cord(ChildCliquess).
 
-:- pred call_site_callees(string_proc_label::in, parallelism_amount::in, 
-    clique_call_site_report::in, cord(candidate_child_clique)::out) is det.
-
-call_site_callees(ProcLabel, Parallelism, CCSR, ChildCliques) :-
-    GoalPath = CCSR ^ ccsr_call_site_summary ^ perf_row_subject ^
-        csdesc_goal_path, 
-    ChildCliquess = map(
-        (func(Callee) = 
-            candidate_child_clique(Callee, ProcLabel, GoalPath, Parallelism)),
-        CCSR ^ ccsr_callee_perfs),
-    ChildCliques = from_list(ChildCliquess).
+:- pred pd_slot_callees(deep::in, parallelism_amount::in,
+    string_proc_label::in, 
+    pair(call_site_static_ptr, call_site_array_slot)::in, 
+    cord(candidate_child_clique)::out) is det.
+
+pd_slot_callees(Deep, Parallelism, ProcLabel, CSSPtr - Slot, ChildCliques) :-
+    deep_lookup_call_site_statics(Deep, CSSPtr, CSS),
+    goal_path_from_string_det(CSS ^ css_goal_path, GoalPath),
+    (
+        Slot = slot_normal(CSDPtr),
+        call_site_dynamic_callees(Deep, Parallelism, ProcLabel, GoalPath,
+            CSDPtr, ChildCliques)
+    ;
+        Slot = slot_multi(_, CSDPtrs),
+        map(call_site_dynamic_callees(Deep, Parallelism, ProcLabel, GoalPath),
+            to_list(CSDPtrs), ChildCliquess),
+        ChildCliques = cord_list_to_cord(ChildCliquess)
+    ).
+
+:- pred call_site_dynamic_callees(deep::in, parallelism_amount::in,
+    string_proc_label::in, goal_path::in, call_site_dynamic_ptr::in,
+    cord(candidate_child_clique)::out) is det.
+
+call_site_dynamic_callees(Deep, Parallelism, ProcLabel, GoalPath, CSDPtr,
+        ChildCliques) :-
+    ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
+        deep_lookup_clique_maybe_child(Deep, CSDPtr, MaybeClique),
+        (
+            MaybeClique = yes(CliquePtr),
+            deep_lookup_csd_own(Deep, CSDPtr, Own),
+            deep_lookup_csd_desc(Deep, CSDPtr, Desc),
+            Cost = build_cs_cost_csq(calls(Own), 
+                float(callseqs(Own) + inherit_callseqs(Desc))),
+            ChildCliques = singleton(
+                candidate_child_clique(CliquePtr, Cost, ProcLabel, GoalPath,
+                    Parallelism))
+        ;
+            MaybeClique = no,
+            ChildCliques = empty 
+        )
+    ;
+        ChildCliques = empty 
+    ).
 
 :- pred candidate_parallel_conjunctions_callee(
     candidate_par_conjunctions_params::in, deep::in,
     clique_ptr::in, candidate_par_conjunctions::in, 
     candidate_child_clique::in, candidate_par_conjunctions::out,
-    cord(message)::out,
-    set_tree234(proc_static_ptr)::in, set_tree234(proc_static_ptr)::out) is det.
+    cord(message)::out) is det.
 
 candidate_parallel_conjunctions_callee(Opts, Deep, CliquePtr, CliqueCandidates,
-        !.Callee, Candidates, Messages, !VisitedProcs) :-
+        !.Callee, Candidates, Messages) :-
     ( not_callee(CliquePtr, !.Callee) -> 
         ( cost_threshold(Opts, !.Callee) -> 
             update_parallelism_available(CliqueCandidates, !Callee),
@@ -374,9 +402,9 @@ candidate_parallel_conjunctions_callee(O
         AnalyzeChild = yes,
         Parallelism = !.Callee ^ ccc_parallelism,
         ChildCliquePtr = 
-            !.Callee ^ ccc_perf ^ perf_row_subject ^ cdesc_clique_ptr,
+            !.Callee ^ ccc_clique,
         candidate_parallel_conjunctions_clique(Opts, Deep, Parallelism,
-            ChildCliquePtr, Candidates, Messages, !VisitedProcs)
+            ChildCliquePtr, Candidates, Messages)
     ;
         AnalyzeChild = no,
         Candidates = map.init,
@@ -388,20 +416,13 @@ candidate_parallel_conjunctions_callee(O
 
 cost_threshold(Opts, ChildClique) :-
     Threshold = Opts ^ cpcp_clique_threshold,
-    Pref = ChildClique ^ ccc_perf,
-    MaybeCostChildren = Pref ^ perf_row_maybe_total,
-    (
-        MaybeCostChildren = yes(CostChildren),
-        CostChildren ^ perf_row_callseqs_percall >= float(Threshold)
-    ;
-        MaybeCostChildren = no
-        % What does this mean?  For now we recurse into these call sites.
-    ).
+    Cost = ChildClique ^ ccc_cs_cost,
+    cs_cost_get_percall(Cost) >= float(Threshold).
 
 :- pred not_callee(clique_ptr::in, candidate_child_clique::in) is semidet.
 
 not_callee(CliquePtr, ChildClique) :-
-    CliquePtr \= ChildClique ^ ccc_perf ^ perf_row_subject ^ cdesc_clique_ptr.
+    CliquePtr \= ChildClique ^ ccc_clique.
 
 :- pred exceeded_parallelism(candidate_par_conjunctions_params::in,
     candidate_child_clique::in) is semidet.
@@ -449,10 +470,10 @@ update_parallelism_available_conj(Conj, 
         % we use as many cores as the loop has iterations.  (Except for dead
         % time).
         Metrics = Conj ^ cpc_par_exec_metrics,
-        SeqTime = Metrics ^ pem_seq_time,
+        CPUTime = parallel_exec_metrics_get_cpu_time(Metrics),
         DeadTime = Metrics ^ pem_first_conj_dead_time + 
             Metrics ^ pem_future_dead_time,
-        Efficiency = (SeqTime - DeadTime) / SeqTime,
+        Efficiency = CPUTime / (CPUTime + DeadTime),
         Parallelism0 = !.ChildClique ^ ccc_parallelism,
         sub_computation_parallelism(Parallelism0, probable(Efficiency),
             Parallelism),
@@ -462,67 +483,44 @@ update_parallelism_available_conj(Conj, 
     ).
 
     % candidate_parallel_conjunctions_clique_proc(Opts, Deep, 
-    %   CliqueIsRecursive, ParentCostInfo, ProcsAnalysed, CliquePtr,
-    %   CliqueProc, Candidates, Messages) :-
-    %
-    % Find candidate parallel conjunctions within a clique_proc_report.
-    %
-    % ParentCostInfo gives the cost of the call site calling this clique so
-    % that we may correctly calculate the per-call costs of recursive cliques
-    % and their call sites.
+    %   RecursionType, CliquePtr, PDPtr, Candidates, Messages).
     %
-    % ProcsAnalysed keeps a set of procs we've visited to prevent unbound
-    % recursion in this algorithm.
+    % Find candidate parallel conjunctions within a proc dynamic.
     %
-    % CliqueProcMap is a map of proc_desc to clique_proc_report structures
-    % extracted from the clique_report.
+    % RecursionType: The type of recursion used in the current clique.
     %
-    % CliquePtr is the clique that this proc belongs to.
+    % CliquePtr: The current clique, 
     %
 :- pred candidate_parallel_conjunctions_clique_proc(
     candidate_par_conjunctions_params::in, deep::in, recursion_type::in,
-    parallelism_amount::in, clique_ptr::in, clique_proc_report::in, 
-    candidate_par_conjunctions::out, cord(message)::out,
-    set_tree234(proc_static_ptr)::in, set_tree234(proc_static_ptr)::out) 
-    is det.
+    clique_ptr::in, proc_dynamic_ptr::in, candidate_par_conjunctions::out,
+    cord(message)::out) is det.
 
 candidate_parallel_conjunctions_clique_proc(Opts, Deep, RecursionType,
-        _ParentParallelism, CliquePtr, CliqueProc, Candidates, Messages,
-        !VisitedProcs) :-
+        CliquePtr, PDPtr, Candidates, Messages) :-
     some [!Messages]
     (
         !:Messages = cord.empty,
             
         % Determine the costs of the call sites in the procedure.
-        build_recursive_call_site_cost_map(Deep, CliquePtr, CliqueProc,
-            RecursionType, RecursiveCallSiteCostMap, CSCMMessages),
-        !:Messages = !.Messages ++ CSCMMessages,
-        trace [compile_time(flag("debug_cpc_search")), io(!IO)] (
-            format_recursive_call_site_cost_map(
-                RecursiveCallSiteCostMap, PrettyCostMapCord),
-            PrettyCostMap = append_list(cord.list(PrettyCostMapCord)),
-            io.format("D: In clique %s recursive call site cost map"
-                    ++ " is:\n%s\n",
-                [s(string(CliquePtr)), s(PrettyCostMap)],
-                !IO),
-            io.flush_output(!IO)
-        ),
-
-        ProcDesc = CliqueProc ^ cpr_proc_summary ^ perf_row_subject,
-        PsPtr = ProcDesc ^ pdesc_ps_ptr,
+        recursion_type_get_interesting_parallelisation_depth(RecursionType,
+            MaybeDepth), 
+        build_recursive_call_site_cost_map(Deep, CliquePtr, PDPtr,
+            RecursionType, MaybeDepth, MaybeRecursiveCallSiteCostMap),
         (
-            not member(!.VisitedProcs, PsPtr)
-        ->
-            insert(PsPtr, !VisitedProcs),
-
-            % Analyse this procedure for parallelism opportunities.
-            candidate_parallel_conjunctions_proc(Opts, Deep, CliqueProc,
-                RecursiveCallSiteCostMap, Candidates,
-                _ProcCandidatesByGoalPath, ProcMessages),
-            !:Messages = !.Messages ++ ProcMessages
+            MaybeRecursiveCallSiteCostMap = ok(RecursiveCallSiteCostMap)
         ;
-            Candidates = map.init
+            MaybeRecursiveCallSiteCostMap = error(Error),
+            append_message(clique(CliquePtr),
+                warning_cannot_compute_cost_of_recursive_calls(Error),
+                !Messages),
+            RecursiveCallSiteCostMap = map.init
         ),
+
+        % Analyse this procedure for parallelism opportunities.
+        candidate_parallel_conjunctions_proc(Opts, Deep, PDPtr, RecursionType,
+            RecursiveCallSiteCostMap, Candidates, ProcMessages),
+        !:Messages = !.Messages ++ ProcMessages,
         Messages = !.Messages
     ).
 
@@ -545,200 +543,53 @@ merge_candidate_par_conjs_proc(A, B, Res
 
 %----------------------------------------------------------------------------%
 %
-% Estimate the cost of the recursive calls under the assumption that current
-% call to this procedure had a particular cost.
-%
-
-:- pred build_recursive_call_site_cost_map(deep::in, clique_ptr::in, 
-    clique_proc_report::in, recursion_type::in, 
-    map(goal_path, cs_cost_csq)::out, cord(message)::out)
-    is det.
-
-build_recursive_call_site_cost_map(Deep, CliquePtr, CliqueProc, RecursionType,
-        RecursiveCallSiteCostMap, Messages) :-
-    some [!Messages] (
-        !:Messages = cord.empty,
-
-        (
-            RecursionType = rt_not_recursive,
-            RecursiveCallSiteCostMap = map.init
-        ;
-            RecursionType = rt_single(_, _, _AvgMaxDepth, AvgRecCost, _),
-        
-            get_recursive_calls_and_counts(Deep, CliquePtr, CliqueProc,
-                CallCountsMap, NewMessagesA),
-            !:Messages = !.Messages ++ NewMessagesA,
-            RecursiveCallSiteCostMap = map_values_only(
-                (func(Count) = 
-                    build_cs_cost_csq_percall(float(Count), AvgRecCost)),
-                CallCountsMap)
-        ;
-            ( RecursionType = rt_divide_and_conquer(_, _)
-            ; RecursionType = rt_mutual_recursion(_)
-            ; RecursionType = rt_other(_)
-            ; RecursionType = rt_errors(_)
-            ),
-            (
-                ( RecursionType = rt_divide_and_conquer(_, _)
-                ; RecursionType = rt_mutual_recursion(_)
-                ),
-                Error = "No average recursion depth for this recursion type"
-            ;
-                RecursionType = rt_other(_),
-                Error = "Could not detect recursion type"
-            ;
-                RecursionType = rt_errors(Errors),
-                Error = join_list("; and ", Errors)
-            ),
-            RecursiveCallSiteCostMap = map.init,
-        
-            % Lookup the proc static to find the ProcLabel.
-            PerfRowData = CliqueProc ^ cpr_proc_summary,
-            ProcDesc = PerfRowData ^ perf_row_subject,
-            proc_label_from_proc_desc(Deep, ProcDesc, ProcLabel),
-
-            append_message(proc(ProcLabel),
-                warning_cannot_compute_cost_of_recursive_calls(Error),
-                !Messages)
-        ),
-        Messages = !.Messages
-    ).
-
-:- pred get_recursive_calls_and_counts(deep::in, clique_ptr::in, 
-    clique_proc_report::in, map(goal_path, int)::out, cord(message)::out) 
-    is det.
-
-get_recursive_calls_and_counts(Deep, CliquePtr, CliqueProc, CallCountsMap,
-        !:Messages) :-
-    !:Messages = cord.empty,
-    % XXX: Handle mutual recursion for the cases that we know how to
-    % analyse.
-    OtherProcDynamics = CliqueProc ^ cpr_other_proc_dynamics,
-    (
-        OtherProcDynamics = [_ | _],
-        % We don't yet handle mutual recursion, and this can only occur during
-        % mutual recursion for the non-entry procedure.
-        
-        % Lookup the proc static to find the ProcLabel.
-        PerfRowData = CliqueProc ^ cpr_proc_summary,
-        ProcDesc = PerfRowData ^ perf_row_subject,
-        proc_label_from_proc_desc(Deep, ProcDesc, ProcLabel),
-        
-        append_message(proc(ProcLabel),
-            error_extra_proc_dynamics_in_clique_proc, !Messages)
-    ;
-        OtherProcDynamics = []
-    ),
-    CallSites = CliqueProc ^ cpr_first_proc_dynamic ^ cpdr_call_sites,
-    foldl(build_recursive_call_site_counts_map(CliquePtr), CallSites, map.init, 
-        CallCountsMap).
-
-:- pred build_recursive_call_site_counts_map(clique_ptr::in, 
-    clique_call_site_report::in,
-    map(goal_path, int)::in, map(goal_path, int)::out) is det.
-
-build_recursive_call_site_counts_map(CliquePtr, CallSite, !Map) :-
-    Callees = CallSite ^ ccsr_callee_perfs,
-    (
-        some [Callee, CalleeClique] (
-            member(Callee, Callees),
-            CalleeClique = Callee ^ perf_row_subject ^ cdesc_clique_ptr,
-            CalleeClique = CliquePtr
-        )
-    ->
-        % This call site calls the same clique at least once, therefore we
-        % treat it as recursive.  Usually a call site has exactly one callee,
-        % the exception is higher order and method calls, and I've never seen a
-        % recursive higher order or method call.
-        GoalPath = CallSite ^ ccsr_call_site_summary ^ perf_row_subject ^
-            csdesc_goal_path,
-        Count = CallSite ^ ccsr_call_site_summary ^ perf_row_calls,
-        svmap.det_insert(GoalPath, Count, !Map)
-    ;
-        true
-    ).
-
-:- pred format_recursive_call_site_cost_map(map(goal_path, cs_cost_csq)::in, 
-    cord(string)::out) is det.
-
-format_recursive_call_site_cost_map(Map, Result) :-
-    map.foldl(format_recursive_call_site_cost, Map, cord.empty, Result).
-
-:- pred format_recursive_call_site_cost(goal_path::in, cs_cost_csq::in, 
-    cord(string)::in, cord(string)::out) is det.
-
-format_recursive_call_site_cost(GoalPath, Cost, !Result) :-
-    !:Result = cord.snoc(!.Result ++ indent(1), String),
-    String = format("%s -> Percall cost: %f Calls: %f\n",
-        [s(GoalPathString), f(PerCallCost), f(Calls)]),
-    GoalPathString = goal_path_to_string(GoalPath),
-    PerCallCost = cs_cost_get_percall(Cost),
-    Calls = cs_cost_get_calls(Cost).
-
-    % The stateful data of the goal_get_callsite_cost_info code below.
-    %
-:- type callsite_cost_info
-    --->    callsite_cost_info(
-                csci_non_rec_cs_cost    :: float,
-                csci_rec_calls          :: float,
-                csci_rec_cs             :: set(clique_call_site_report),
-                csci_cs_prob_map        :: map(goal_path, float)
-            ).
-
-%----------------------------------------------------------------------------%
-%
 % Search for paralleliation opportunities within a procedure.
 %
 
     % Find candidate parallel conjunctions within the given procedure.
     %
 :- pred candidate_parallel_conjunctions_proc(
-    candidate_par_conjunctions_params::in, deep::in, clique_proc_report::in,
-    map(goal_path, cs_cost_csq)::in, candidate_par_conjunctions::out,
-    multi_map(goal_path_string, 
-        candidate_par_conjunction(pard_goal_detail))::out,
-    cord(message)::out) is det.
+    candidate_par_conjunctions_params::in, deep::in, proc_dynamic_ptr::in,
+    recursion_type::in, map(goal_path, cs_cost_csq)::in,
+    candidate_par_conjunctions::out, cord(message)::out) is det.
 
-candidate_parallel_conjunctions_proc(Opts, Deep, CliqueProc,
-        RecursiveCallSiteCostMap, Candidates, CandidatesByGoalPath,
-        Messages) :-
+candidate_parallel_conjunctions_proc(Opts, Deep, PDPtr, RecursionType,
+        RecursiveCallSiteCostMap, Candidates, Messages) :-
     some [!Messages] (
         !:Messages = cord.empty,
 
         % Lookup the proc static to find the ProcLabel.
-        PerfRowData = CliqueProc ^ cpr_proc_summary,
-        ProcDesc = PerfRowData ^ perf_row_subject,
-        proc_label_from_proc_desc(Deep, ProcDesc, ProcLabel),
-        
-        CallSites = CliqueProc ^ cpr_first_proc_dynamic ^ cpdr_call_sites,
-        ( CliqueProc ^ cpr_other_proc_dynamics = [_ | _] ->
-            append_message(proc(ProcLabel), 
-                error_extra_proc_dynamics_in_clique_proc,
-                !Messages)
-        ;
-            true
-        ),
-        list.foldl(add_call_site_report_to_map,
-            CallSites, map.init, CallSitesMap),
-        deep_get_progrep_det(Deep, ProgRep),
+        deep_lookup_proc_dynamics(Deep, PDPtr, PD),
+        deep_lookup_proc_statics(Deep, PD ^ pd_proc_static, PS),
+        ProcLabel = PS ^ ps_id,
         ( ProcLabel = str_ordinary_proc_label(_, ModuleName, _, _, _, _)
         ; ProcLabel = str_special_proc_label(_, ModuleName, _, _, _, _)
         ),
+        
+        deep_get_progrep_det(Deep, ProgRep),
+        
         (
-            ModuleName = "Mercury runtime"
+            ( ModuleName = "Mercury runtime"
+            ; ModuleName = "exception"
+            )
         ->
             % Silently skip over any code from the runtime, we can't expect to
             % find it's procedure representation.
-            Candidates = map.init,
-            CandidatesByGoalPath = map.init
+            Candidates = map.init
         ;
             progrep_search_proc(ProgRep, ProcLabel, ProcRep) 
         ->
             ProcRep ^ pr_defn = ProcDefnRep,
             ProcDefnRep ^ pdr_goal = Goal0,
             ProcDefnRep ^ pdr_var_table = VarTable,
-            Info = implicit_parallelism_info(Deep, ProgRep, Opts,
-                CallSitesMap, RecursiveCallSiteCostMap, VarTable, ProcLabel),
+           
+            deep_lookup_clique_index(Deep, PDPtr, CliquePtr),
+            proc_dynamic_paired_call_site_slots(Deep, PDPtr, Slots),
+            foldl(build_call_site_cost_and_callee_map(Deep), Slots, 
+                map.init, CallSitesMap), 
+            Info = implicit_parallelism_info(Deep, ProgRep, Opts, CliquePtr,
+                CallSitesMap, RecursiveCallSiteCostMap, RecursionType,
+                VarTable, ProcLabel),
             goal_annotate_with_instmap(Goal0, Goal,
                 initial_inst_map(ProcDefnRep), _FinalInstMap,
                 SeenDuplicateInstantiation, _ConsumedVars, _BoundVars),
@@ -748,14 +599,12 @@ candidate_parallel_conjunctions_proc(Opt
             (
                 SeenDuplicateInstantiation =
                     have_not_seen_duplicate_instantiation,
-                list.foldl2(
+                list.foldl(
                     build_candidate_par_conjunction_maps(ProcLabel, VarTable),
-                    Candidates0, map.init, Candidates, 
-                    multi_map.init, CandidatesByGoalPath)
+                    Candidates0, map.init, Candidates)
             ;
                 SeenDuplicateInstantiation = seen_duplicate_instantiation,
                 Candidates = map.init,
-                CandidatesByGoalPath = map.init,
                 append_message(proc(ProcLabel), 
                     notice_duplicate_instantiation(length(Candidates0)),
                     !Messages)
@@ -764,7 +613,6 @@ candidate_parallel_conjunctions_proc(Opt
             % Builtin procedures cannot be found in the program representation,
             % and cannot be parallelised either.
             Candidates = map.init,
-            CandidatesByGoalPath = map.init,
             append_message(proc(ProcLabel), warning_cannot_lookup_proc_defn,
                 !Messages)
         ),
@@ -773,14 +621,9 @@ candidate_parallel_conjunctions_proc(Opt
 
 :- pred build_candidate_par_conjunction_maps(string_proc_label::in,
     var_table::in, candidate_par_conjunction(pard_goal_detail)::in, 
-    candidate_par_conjunctions::in, candidate_par_conjunctions::out,
-    multi_map(goal_path_string, 
-        candidate_par_conjunction(pard_goal_detail))::in,
-    multi_map(goal_path_string, 
-        candidate_par_conjunction(pard_goal_detail))::out)
-    is det.
+    candidate_par_conjunctions::in, candidate_par_conjunctions::out) is det.
 
-build_candidate_par_conjunction_maps(ProcLabel, VarTable, Candidate, !Map, !GPMap) :-
+build_candidate_par_conjunction_maps(ProcLabel, VarTable, Candidate, !Map) :-
     ( map.search(!.Map, ProcLabel, CandidateProc0) ->
         CandidateProc0 = candidate_par_conjunctions_proc(VarTablePrime, CPCs0),
         CPCs = [ Candidate | CPCs0 ],
@@ -794,13 +637,10 @@ build_candidate_par_conjunction_maps(Pro
         CPCs = [ Candidate ]
     ),
     CandidateProc = candidate_par_conjunctions_proc(VarTable, CPCs),
-    svmap.set(ProcLabel, CandidateProc, !Map),
-    GoalPath = Candidate ^ cpc_goal_path,
-    multi_map.set(!.GPMap, GoalPath, Candidate, !:GPMap).
+    svmap.set(ProcLabel, CandidateProc, !Map).
 
 :- pred goal_get_conjunctions_worth_parallelising(
-    implicit_parallelism_info::in,
-    goal_rep(inst_map_info)::in, goal_path::in, 
+    implicit_parallelism_info::in, goal_rep(inst_map_info)::in, goal_path::in,
     list(candidate_par_conjunction(pard_goal_detail))::out,
     cord(message)::out) is det.
 
@@ -924,7 +764,7 @@ ite_get_conjunctions_worth_parallelising
     % At the end of every conjunction we call this predicate to check the list
     % of calls we've found and make any parallelisation decisions.
     %
-:- pred conj_build_candidate_conjunctions( implicit_parallelism_info::in,
+:- pred conj_build_candidate_conjunctions(implicit_parallelism_info::in,
     list(goal_rep(inst_map_info))::in, goal_path::in,
     cord(message)::out, 
     list(candidate_par_conjunction(pard_goal_detail))::out) is det.
@@ -933,13 +773,13 @@ conj_build_candidate_conjunctions(Info, 
         Candidates) :-
     ProcLabel = Info ^ ipi_proc_label,
     Location = goal(ProcLabel, GoalPath),
-    promise_equivalent_solutions [Messages, Candidates]
     some [!Messages] 
     (
         !:Messages = cord.empty,
 
-        map_foldl(goal_to_pard_goal(Info, GoalPath, 
-            ( func(Num) = step_conj(Num) ) ), Conjs, PardGoals, 1, _),
+        map_foldl2(goal_to_pard_goal(Info, GoalPath, 
+            ( func(Num) = step_conj(Num) ) ), Conjs, PardGoals, 1, _,
+            !Messages),
         foldl(count_costly_calls, PardGoals, 0, NumCostlyCalls),
         ( NumCostlyCalls > 1 -> 
             append_message(Location,
@@ -1098,8 +938,10 @@ pardgoals_build_candidate_conjunction(In
                 ; Location = goal(ProcLabel, _)
                 )
             ;
-                Location = clique(_),
-                error("Location is a clique when it should be a proc " ++
+                ( Location = clique(_)
+                ; Location = call_site_dynamic(_)
+                ),
+                error("Location is a clique or CSD when it should be a proc " ++
                     "or goal")
             ),
 
@@ -1231,7 +1073,7 @@ new_group([G | Gs], P) = GoalGroup :-
 
                 gfp_dependency_graphs       :: dependency_graphs,
                 gfp_costly_call_indexes     :: list(int),
-                gfp_num_calls               :: int
+                gfp_num_calls               :: float
             ).
 
 :- inst goals_for_parallelisation
@@ -1258,7 +1100,7 @@ preprocess_conjunction(Goals0, Algorithm
         GoalType = FirstCostlyCall ^ goal_annotation ^ pgd_pg_type,
         GoalType = pgt_call(_, _, _, CallSite)
     ->
-        NumCalls = CallSite ^ ccsr_call_site_summary ^ perf_row_calls
+        NumCalls = cs_cost_get_calls(CallSite ^ cac_cost)
     ;
         error(this_file ++ "Expected call goal")
     ),
@@ -1376,7 +1218,7 @@ find_best_parallelisation_complete_bnb(I
     
     branch_and_bound(
         generate_parallelisations(Info, Location, PartNum, LastCostlyCallIndex, 
-            NumCalls, DependencyMaps, GoalGroups),
+            round_to_int(NumCalls), DependencyMaps, GoalGroups),
         parallelisation_get_objective_value,
         Solutions, Profile),
     
@@ -1611,8 +1453,8 @@ find_best_parallelisation_greedy(Info, _
             !:GoalGroups = [ FirstGroup | !.GoalGroups ],
             GoalsBeforeConj = []
         ),
-        start_building_parallelisation(Info, NumCalls, GoalsBeforeConj,
-            !:Parallelisation),
+        start_building_parallelisation(Info, round_to_int(NumCalls), 
+            GoalsBeforeConj, !:Parallelisation),
 
         build_parallel_conjuncts_greedy(Info, DependencyMaps,
             CostlyCallIndexes, 0, [], [], LastParConj, !ConjNum,
@@ -2319,10 +2161,13 @@ adjust_time_for_waits_2(LastEnd, !Time, 
         % Do the adjustment.
         !:Time = !.Time + (Start - LastEnd),
 
-        ( !.Time < Start ->
-            error("adjust_time_for_waits: Adjustment didn't work, " ++
-                "time occurs before the current execution")
-        ; !.Time < End ->
+% This has been commented out as it is easily triggered by floating point
+% rounding errors and I don't know enough about ieee754 to fix it.
+%        ( !.Time < Start ->
+%            error(format("adjust_time_for_waits: Adjustment didn't work, " ++
+%                "time occurs before the current execution. " ++
+%                "Time: %f, Start: %f.", [f(!.Time), f(Start)]))
+        ( !.Time < End ->
             % The adjustment worked.
             true
         ;
@@ -2345,8 +2190,18 @@ adjust_time_for_waits_2(LastEnd, !Time, 
     var_rep::in, map(var_rep, float)::in, map(var_rep, float)::out) is det.
 
 var_production_time_to_map(TimeBefore, Goal, Var, !Map) :-
-    var_first_use_time(find_production, TimeBefore, Goal, Var, Time),
-    svmap.det_insert(Var, Time, !Map).
+    solutions(var_first_use_time(find_production, TimeBefore, Goal, Var), 
+        Times),
+    (
+        Times = [Time],
+        % A production can only occur once in a call's arguments, therefore
+        % there is only one solution here.
+        svmap.det_insert(Var, Time, !Map)
+    ;
+        Times = [_, _ | _],
+        error(this_file ++ 
+            "Too many solutions for var_first_use_time for a production")
+    ).
 
     % foldl(get_consumptions_list(Vars), Goals, 0.0, _, [], RevConsumptions),
     %
@@ -2379,7 +2234,15 @@ get_consumptions_list(Goal, !Vars, !Time
     pair.pair(var_rep, float)::out) is det.
 
 var_consumptions(TimeBefore, Goal, Var, Var - Time) :-
-    var_first_use_time(find_consumption, TimeBefore, Goal, Var, Time).
+    % This will only have multiple solutions where one variable appears a list
+    % of call arguments more than once.
+    solutions(var_first_use_time(find_consumption, TimeBefore, Goal, Var),
+        Times),
+    % The earliest consumption is the consumption that matters.  solutions/2
+    % returns the solutions in ascending sorted order so the first one will be
+    % the earliest one.
+    Times = [FirstTime | OtherTimes],
+    Time = foldl(float.min, OtherTimes, FirstTime).
 
 :- type find_production_or_consumption
     --->    find_production
@@ -2393,17 +2256,18 @@ var_consumptions(TimeBefore, Goal, Var, 
     %   Time is Time0 + the time that Goal first consumes Var.
     %
 :- pred var_first_use_time(find_production_or_consumption::in, 
-    float::in, pard_goal_detail::in, var_rep::in, float::out) is det.
+    float::in, pard_goal_detail::in, var_rep::in, float::out) is multi.
 
 var_first_use_time(FindProdOrCons, TimeBefore, Goal, Var, Time) :-
     GoalType = Goal ^ goal_annotation ^ pgd_pg_type,
     (
         GoalType = pgt_call(Cost, _, Args, _),
-        CostPercall = cs_cost_get_percall(Cost),
-        find_arg_by_var(Args, Var, MaybeArg),
         (
-            MaybeArg = yes(Arg),
-            Use = Arg ^ vmu_use,
+            member(Arg, Args),
+            Arg = var_mode_and_use(Var, _, LazyUse)
+        ->
+            CostPercall = cs_cost_get_percall(Cost),
+            Use = force(LazyUse),
             UseType = Use ^ vui_use_type,
             (
                 (
@@ -2425,19 +2289,7 @@ var_first_use_time(FindProdOrCons, TimeB
                         FindProdOrCons = find_consumption
                     )
                 ),
-                CostUntilUse = Use ^ vui_cost_until_use,
-                (
-                    CostUntilUse = cost_since_proc_start(UseTime)
-                ;
-                    CostUntilUse = cost_before_proc_end(CostBeforeProcEnd),
-                    UseTime = CostPercall - CostBeforeProcEnd,
-                    ( UseTime < 0.0 ->
-                        error("var_first_use_time: "
-                            ++ "CostPercall - CostBeforeProcEnd < 0.0")
-                    ;
-                        true
-                    )
-                )
+                UseTime = Use ^ vui_cost_until_use
             ;
                 UseType = var_use_other,
                 % The analysis didn't recognise the instantiation here, so use a
@@ -2452,7 +2304,6 @@ var_first_use_time(FindProdOrCons, TimeB
                 )
             )
         ;
-            MaybeArg = no,
             (
                 FindProdOrCons = find_production,
                 error("var_first_use_time: "
@@ -2479,17 +2330,6 @@ var_first_use_time(FindProdOrCons, TimeB
     ),
     Time = TimeBefore + UseTime.
 
-:- pred find_arg_by_var(list(var_mode_and_use)::in, var_rep::in,
-    maybe(var_mode_and_use)::out) is det. 
-
-find_arg_by_var([], _, no).
-find_arg_by_var([Arg | Args], Var, MaybeArg) :-
-    ( Arg = var_mode_and_use(Var, _, _) ->
-        MaybeArg = yes(Arg)
-    ;
-        find_arg_by_var(Args, Var, MaybeArg)
-    ).
-
 %----------------------------------------------------------------------------%
 
 :- pred pardgoal_consumed_vars_accum(pard_goal_detail::in,
@@ -2503,16 +2343,15 @@ pardgoal_consumed_vars_accum(Goal, !Vars
     % model_det and have a cost above the call site cost threshold.
     %
 :- pred can_parallelise_call(implicit_parallelism_info::in,
-    detism_rep::in, clique_call_site_report::in) is semidet.
+    detism_rep::in, cs_cost_csq::in) is semidet.
 
-can_parallelise_call(Info, Detism, CallSiteReport) :-
+can_parallelise_call(Info, Detism, Cost) :-
     ( Detism = det_rep
     ; Detism = cc_multidet_rep ),
-    CallSiteCost = get_call_site_cost(Info, CallSiteReport),
-    ( cs_cost_get_calls(CallSiteCost) > 0.0 ->
+    ( cs_cost_get_calls(Cost) > 0.0 ->
         % This is conditional so that we can gauretee that it never causes a
         % divide by zero error,
-        PercallCost = cs_cost_get_percall(CallSiteCost),
+        PercallCost = cs_cost_get_percall(Cost),
         PercallCost > float(Info ^ ipi_opts ^ cpcp_call_site_threshold)
     ;
         fail 
@@ -2520,10 +2359,11 @@ can_parallelise_call(Info, Detism, CallS
 
 :- pred maybe_costly_call(implicit_parallelism_info::in, goal_path::in,
     atomic_goal_rep::in, detism_rep::in, inst_map_info::in,
-    pard_goal_type::out(pgt_atomic_goal)) is det.
+    pard_goal_type::out(pgt_atomic_goal), cord(message)::out) is det.
 
 maybe_costly_call(Info, GoalPath, AtomicGoal, Detism,
-        InstMapInfo, GoalType) :-
+        InstMapInfo, GoalType, !:Messages) :-
+    !:Messages = cord.empty,
     InstMapBefore = InstMapInfo ^ im_before,
     InstMapAfter = InstMapInfo ^ im_after,
     (
@@ -2545,47 +2385,132 @@ maybe_costly_call(Info, GoalPath, Atomic
         ; AtomicGoal = method_call_rep(_, _, Args)
         ; AtomicGoal = plain_call_rep(_, _, Args)
         ),
-        map.lookup(Info ^ ipi_call_sites, GoalPath, CallSite),
+        
         % Lookup var use information.
-        CallSiteKind = CallSite ^ ccsr_kind_and_callee,
+        map.lookup(Info ^ ipi_call_sites, GoalPath, CallSite),
+        map_foldl(compute_var_modes_and_uses(Info, GoalPath, CallSite, 
+                InstMapBefore, InstMapAfter),
+            Args, VarsModesAndUses, 0, _),
+
         (
-            CallSiteKind = normal_call_and_callee(NormalCalleeId, _),
-            PSPtr = NormalCalleeId ^ pdesc_ps_ptr,
-            Deep = Info ^ ipi_deep,
-            create_proc_var_use_dump_report(Deep, PSPtr,
-                MaybeVarUseDumpInfo),
-            MaybeVarUseDumpInfo = ok(VarUseDumpInfo)
+            cost_and_callees_is_recursive(Info ^ ipi_clique, CallSite), 
+            map.search(Info ^ ipi_rec_call_sites, GoalPath, RecCost) 
         ->
-            VarUseInfos = VarUseDumpInfo ^ pvui_var_uses, 
-            list.map_corresponding((pred(Arg::in, VarUseInfo::in, 
-                        VarModeAndUse::out) is det :-
-                    var_get_mode(InstMapBefore, InstMapAfter, Arg, ArgMode),
-                    VarModeAndUse = var_mode_and_use(Arg, ArgMode,
-                        VarUseInfo)
-                ), Args, VarUseInfos, VarModeAndUses)
-        ;
-            list.map((pred(Arg::in, VarModeAndUse::out) is det :-
-                    var_get_mode(InstMapBefore, InstMapAfter, Arg, ArgMode),
-                    var_mode_to_var_use(ArgMode, VarUseType),
-                    pessimistic_var_use_info(VarUseType, VarUseInfo),
-                    VarModeAndUse = var_mode_and_use(Arg, ArgMode,
-                        VarUseInfo)
-                ), Args, VarModeAndUses)
+            Cost = RecCost
+        ;
+            Cost = CallSite ^ cac_cost
         ),
-
-        % XXX: The goal annotations cannot represent reasons why a goal can't
-        % be parallelised, for example it could be nondet, semidet or
+        % XXX: The goal annotations cannot represent reasons why a goal
+        % can't be parallelised, for example it could be nondet, semidet or
         % impure.
-        ( can_parallelise_call(Info, Detism, CallSite) ->
+        ( can_parallelise_call(Info, Detism, Cost) ->
             CostAboveThreshold = cost_above_par_threshold
         ;
             CostAboveThreshold = cost_not_above_par_threshold
         ),
-        CallSiteCost = get_call_site_cost(Info, CallSite),
-        GoalType = pgt_call(CallSiteCost, CostAboveThreshold, VarModeAndUses,
+        GoalType = pgt_call(Cost, CostAboveThreshold, VarsModesAndUses,
             CallSite) 
     ).
 
+:- pred compute_var_modes_and_uses(implicit_parallelism_info::in,
+    goal_path::in, cost_and_callees::in, inst_map::in, inst_map::in, 
+    var_rep::in, var_mode_and_use::out, int::in, int::out) is det.
+
+compute_var_modes_and_uses(Info, GoalPath, CostAndCallee, InstMapBefore, InstMapAfter, Arg,
+        VarModeAndUse, !ArgNum) :-
+    var_get_mode(InstMapBefore, InstMapAfter, Arg, Mode),
+    var_mode_to_var_use_type(Mode, VarUseType),
+    ArgNum = !.ArgNum,
+    LazyUse = delay((func) = compute_var_modes_and_uses_lazy(Info, GoalPath,
+        CostAndCallee, ArgNum, VarUseType)),
+    VarModeAndUse = var_mode_and_use(Arg, Mode, LazyUse),
+    !:ArgNum = !.ArgNum + 1.
+
+:- func compute_var_modes_and_uses_lazy(implicit_parallelism_info, 
+    goal_path, cost_and_callees, int, var_use_type) = var_use_info.
+
+compute_var_modes_and_uses_lazy(Info, GoalPath, CostAndCallee, ArgNum,
+        VarUseType) = Use :-
+    % Get cost
+    (
+        cost_and_callees_is_recursive(Info ^ ipi_clique, CostAndCallee),
+        map.search(Info ^ ipi_rec_call_sites, GoalPath, RecCost)
+    ->
+        % THe callsite is recursive and we know the cost of the
+        % recursive call.
+        Cost0 = RecCost
+    ;
+        Cost0 = CostAndCallee ^ cac_cost
+    ),
+    Cost = cs_cost_get_percall(Cost0),
+
+    HigherOrder = CostAndCallee ^ cac_call_site_is_ho,
+    (
+        HigherOrder = higher_order_call,
+        % We cannot push signals or waits into higher order calls.
+        pessimistic_var_use_info(VarUseType, Cost, Use)
+    ;
+        HigherOrder = first_order_call,
+        Callees = CostAndCallee ^ cac_callees,
+        ( singleton_set(Callees, Callee) ->
+            CSDPtr = Callee ^ c_csd
+        ;
+            error(this_file ++ 
+                "First-order call site has wrong number of CSDs")
+        ),
+        RecursionType = Info ^ ipi_recursion_type,
+        recursion_type_get_interesting_parallelisation_depth(
+            RecursionType, MaybeCurDepth),
+        compute_var_modes_and_uses_2(Info, ArgNum, RecursionType,
+            MaybeCurDepth, VarUseType, Cost, CSDPtr, Use, Messages),
+        trace [io(!IO)] (
+            stderr_stream(Stderr, !IO),
+            write_out_messages(Stderr, Messages, !IO)
+        )
+    ).
+
+:- pred compute_var_modes_and_uses_2(implicit_parallelism_info::in,
+    int::in, recursion_type::in, maybe(float)::in, var_use_type::in,
+    float::in, call_site_dynamic_ptr::in, var_use_info::out, 
+    cord(message)::out) is det.
+
+compute_var_modes_and_uses_2(Info, ArgNum, RecursionType, MaybeCurDepth,
+        VarUseType, Cost, CSDPtr, Use, !:Messages) :-
+    !:Messages = empty,
+    Deep = Info ^ ipi_deep,
+    CliquePtr = Info ^ ipi_clique,
+    call_site_dynamic_var_use_info(Deep, CliquePtr, CSDPtr, ArgNum,
+        RecursionType, MaybeCurDepth, Cost, set.init, VarUseType, MaybeUse),
+    (
+        MaybeUse = ok(Use)
+    ;
+        MaybeUse = error(Error),
+        pessimistic_var_use_info(VarUseType, Cost, Use),
+        append_message(call_site_dynamic(CSDPtr), 
+            warning_cannot_compute_arg_first_use_time(Error),
+            !Messages)
+    ).
+
+:- pred recursion_type_get_interesting_parallelisation_depth(
+    recursion_type::in, maybe(float)::out) is det.
+
+recursion_type_get_interesting_parallelisation_depth(RecursionType,
+        MaybeDepth) :-
+    (
+        RecursionType = rt_not_recursive,
+        MaybeDepth = yes(0.0)
+    ;
+        RecursionType = rt_single(_, _, Depth, _, _),
+        MaybeDepth = yes(Depth / 2.0)
+    ;
+        ( RecursionType = rt_divide_and_conquer(_, _)
+        ; RecursionType = rt_mutual_recursion(_)
+        ; RecursionType = rt_other(_)
+        ; RecursionType = rt_errors(_)
+        ),
+        MaybeDepth = no
+    ).
+
 :- type is_costly_goal
     --->    is_costly_goal
     ;       is_not_costly_goal
@@ -2625,45 +2550,49 @@ var_get_mode(InstMapBefore, InstMapAfter
 :- pred goal_to_pard_goal(implicit_parallelism_info::in, goal_path::in,
     (func(int) = goal_path_step)::in,
     goal_rep(inst_map_info)::in, pard_goal_detail::out,
-    int::in, int::out) is det.
+    int::in, int::out,
+    cord(message)::in, cord(message)::out) is det.
 
-goal_to_pard_goal(Info, GoalPath0, Step, !Goal, !GoalNum) :-
+goal_to_pard_goal(Info, GoalPath0, Step, !Goal, !GoalNum, !Messages) :-
     !.Goal = goal_rep(GoalExpr0, Detism, InstMapInfo),
     GoalPath = goal_path_add_at_end(GoalPath0, Step(!.GoalNum)),
     !:GoalNum = !.GoalNum + 1,
     (
         (
             GoalExpr0 = conj_rep(Conjs0),
-            map_foldl(goal_to_pard_goal(Info, GoalPath, 
-                ( func(Num) = step_conj(Num) ) ), Conjs0, Conjs, 1, _),
+            map_foldl2(goal_to_pard_goal(Info, GoalPath, 
+                ( func(Num) = step_conj(Num) ) ), Conjs0, Conjs, 1, _,
+                !Messages),
             GoalExpr = conj_rep(Conjs)
         ;
             GoalExpr0 = disj_rep(Disjs0),
-            map_foldl(goal_to_pard_goal(Info, GoalPath,
-                ( func(Num) = step_disj(Num) ) ), Disjs0, Disjs, 1, _),
+            map_foldl2(goal_to_pard_goal(Info, GoalPath,
+                ( func(Num) = step_disj(Num) ) ), Disjs0, Disjs, 1, _,
+                !Messages),
             GoalExpr = disj_rep(Disjs)
         ;
             GoalExpr0 = switch_rep(Var, CanFail, Cases0),
-            map_foldl(case_to_pard_goal(Info, GoalPath), Cases0, Cases, 1, _),
+            map_foldl2(case_to_pard_goal(Info, GoalPath), Cases0, Cases, 1, _,
+                !Messages),
             GoalExpr = switch_rep(Var, CanFail, Cases)
         ; 
             GoalExpr0 = ite_rep(Cond0, Then0, Else0),
             goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_cond, 
-                Cond0, Cond, 1, _),
+                Cond0, Cond, 1, _, !Messages),
             goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_then, 
-                Then0, Then, 1, _),
+                Then0, Then, 1, _, !Messages),
             goal_to_pard_goal(Info, GoalPath, func(_) = step_ite_else, 
-                Else0, Else, 1, _),
+                Else0, Else, 1, _, !Messages),
             GoalExpr = ite_rep(Cond, Then, Else)
         ; 
             GoalExpr0 = negation_rep(SubGoal0),
             goal_to_pard_goal(Info, GoalPath, func(_) = step_neg,
-                SubGoal0, SubGoal, 1, _),
+                SubGoal0, SubGoal, 1, _, !Messages),
             GoalExpr = negation_rep(SubGoal)
         ; 
             GoalExpr0 = scope_rep(SubGoal0, MaybeCut),
             goal_to_pard_goal(Info, GoalPath, func(_) = step_scope(MaybeCut),
-                SubGoal0, SubGoal, 1, _),
+                SubGoal0, SubGoal, 1, _, !Messages),
             GoalExpr = scope_rep(SubGoal, MaybeCut)
         ),
         % XXX: We my consider lifting calls out of non-atomic goals so that
@@ -2674,50 +2603,22 @@ goal_to_pard_goal(Info, GoalPath0, Step,
         GoalExpr0 = atomic_goal_rep(Context, Line, BoundVars, AtomicGoal),
         GoalExpr = atomic_goal_rep(Context, Line, BoundVars, AtomicGoal),
         maybe_costly_call(Info, GoalPath, AtomicGoal, Detism,
-            InstMapInfo, PardGoalType)
+            InstMapInfo, PardGoalType, Messages),
+        !:Messages = !.Messages ++ Messages
     ),
     PardGoalAnnotation = pard_goal_detail(PardGoalType, InstMapInfo, GoalPath),
     !:Goal = goal_rep(GoalExpr, Detism, PardGoalAnnotation).
 
 :- pred case_to_pard_goal(implicit_parallelism_info::in, goal_path::in,
     case_rep(inst_map_info)::in, case_rep(pard_goal_detail_annotation)::out, 
-    int::in, int::out) is det.
+    int::in, int::out, cord(message)::in, cord(message)::out) is det.
 
-case_to_pard_goal(Info, GoalPath0, !Case, !GoalNum) :-
+case_to_pard_goal(Info, GoalPath0, !Case, !GoalNum, !Messages) :-
     !.Case = case_rep(ConsId, OtherConsId, Goal0),
     goal_to_pard_goal(Info, GoalPath0, 
-        ( func(Num) = step_switch(Num, no) ), Goal0, Goal, !GoalNum),
+        ( func(Num) = step_switch(Num, no) ), Goal0, Goal, !GoalNum, !Messages),
     !:Case = case_rep(ConsId, OtherConsId, Goal).
 
-    % are_conjuncts_dependent(CallOutputs, InstMap, VarModeAndUse, !DepVars),
-    %
-    % Determine if a variables depends on an output from an earlier call either
-    % directly or indirectly.  If it does add it to the DepVars set.
-    %
-    % Retrieve the average cost of a call site.
-    %
-:- func get_call_site_cost(implicit_parallelism_info, clique_call_site_report) 
-    = cs_cost_csq.
-
-get_call_site_cost(Info, CallSite) = Cost :-
-    CSSummary = CallSite ^ ccsr_call_site_summary,
-    GoalPath = CSSummary ^ perf_row_subject ^ csdesc_goal_path,
-    ( map.search(Info ^ ipi_rec_call_sites, GoalPath, CostPrime) ->
-        Cost = CostPrime
-    ;
-        MaybePerfTotal = CSSummary ^ perf_row_maybe_total, 
-        (
-            MaybePerfTotal = yes(PerfTotal),
-            TotalCost = PerfTotal ^ perf_row_callseqs
-        ;
-            MaybePerfTotal = no,
-            error(this_file ++ 
-                "Could not retrieve total callseqs cost from call site")
-        ),
-        Calls = CSSummary ^ perf_row_calls,
-        Cost = build_cs_cost_csq(Calls, float(TotalCost))
-    ).
-
 %----------------------------------------------------------------------------%
 %
 % Annotate a goal with instantiation information.
@@ -3251,28 +3152,20 @@ format_pard_goal_annotation(GoalAnnotati
     io::di, io::uo) is det.
 
 debug_cliques_below_threshold(Clique, !IO) :-
-    Perf = Clique ^ ccc_perf,
-    CliquePtr = Perf ^ perf_row_subject ^ cdesc_clique_ptr,
+    CliquePtr = Clique ^ ccc_clique,
     CliquePtr = clique_ptr(CliqueNum),
-    Calls = Perf ^ perf_row_calls,
-    MaybeTotals = Perf ^ perf_row_maybe_total,
-    (
-        MaybeTotals = yes(Totals),
-        PercallCost = Totals ^ perf_row_callseqs_percall
-    ;
-        MaybeTotals = no,
-        PercallCost = Perf ^ perf_row_self ^ perf_row_callseqs_percall
-    ),
+    Calls = cs_cost_get_calls(Clique ^ ccc_cs_cost),
+    PercallCost = cs_cost_get_percall(Clique ^ ccc_cs_cost),
     io.format("D: Not entering clique: %d, " ++
         "it is below the clique threashold\n  " ++
-        "It has per-call cost %f and is called %d times\n\n",
-        [i(CliqueNum), f(PercallCost), i(Calls)], !IO).
+        "It has per-call cost %f and is called %f times\n\n",
+        [i(CliqueNum), f(PercallCost), f(Calls)], !IO).
 
 :- pred debug_cliques_exeeded_parallelism(candidate_child_clique::in,
     io::di, io::uo) is det.
 
 debug_cliques_exeeded_parallelism(Clique, !IO) :-
-    CliquePtr = Clique ^ ccc_perf ^ perf_row_subject ^ cdesc_clique_ptr,
+    CliquePtr = Clique ^ ccc_clique,
     CliquePtr = clique_ptr(CliqueNum),
     io.format("D: Not entiring clique %d, " ++
         "no-more parallelisation resources available at this context\n\n",
Index: deep_profiler/mdprof_feedback.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/mdprof_feedback.m,v
retrieving revision 1.27
diff -u -p -b -r1.27 mdprof_feedback.m
--- deep_profiler/mdprof_feedback.m	24 Aug 2010 00:01:47 -0000	1.27
+++ deep_profiler/mdprof_feedback.m	6 Oct 2010 23:29:36 -0000
@@ -131,17 +131,8 @@ main(!IO) :-
                             [s(OutputFileName), s(ErrorMessage)], !IO),
                         io.set_exit_status(1, !IO)
                     ),
-                    cord.foldl_pred(
-                        (pred(Message::in, IO0::di, IO::uo) is det :-
-                            Level = message_get_level(Message),
-                            ( message_level_to_int(Level) =< VerbosityLevel ->
-                                message_to_string(Message, MessageStr),
-                                io.write_string(Stderr, MessageStr, IO0, IO1),
-                                io.nl(Stderr, IO1, IO)
-                            ;
-                                IO = IO0
-                            )
-                        ), Messages, !IO)
+                    set_verbosity_level(VerbosityLevel, !IO),
+                    write_out_messages(Stderr, Messages, !IO)
                 ;
                     FeedbackReadResult = error(FeedbackReadError),
                     feedback.read_error_message_string(OutputFileName,
Index: deep_profiler/message.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/message.m,v
retrieving revision 1.7
diff -u -p -b -r1.7 message.m
--- deep_profiler/message.m	27 Sep 2010 05:51:02 -0000	1.7
+++ deep_profiler/message.m	6 Oct 2010 23:29:36 -0000
@@ -24,6 +24,7 @@
 
 :- import_module cord.
 :- import_module int.
+:- import_module io.
 :- import_module string.
 
 %-----------------------------------------------------------------------------%
@@ -49,7 +50,8 @@
 :- type program_location
     --->    proc(string_proc_label)
     ;       goal(string_proc_label, goal_path)
-    ;       clique(clique_ptr).
+    ;       clique(clique_ptr)
+    ;       call_site_dynamic(call_site_dynamic_ptr).
 
 %-----------------------------------------------------------------------------%
 
@@ -137,6 +139,12 @@
                 %
     ;       warning_cannot_compute_cost_of_recursive_calls(string)
 
+                % Couldn't compute the time at which a call site's argument is
+                % produced or consumed.
+                %
+                % The parameter contains extra information about this error.
+    ;       warning_cannot_compute_arg_first_use_time(string)
+
                 % We don't yet handle clique_proc_reports with multiple proc
                 % dynamics.
                 %
@@ -167,6 +175,22 @@
 :- func indent_size(int) = int.
 
 %-----------------------------------------------------------------------------%
+    
+    % Write out messages.
+    %
+:- pred write_out_messages(io.output_stream::in, cord(message)::in, 
+    io::di, io::uo) is det.
+
+    % Set the verbosity level to use above.  Higher levels print out more
+    % information.  Levels are in the inclusive range 0..4.
+    %
+:- pred set_verbosity_level(int::in, io::di, io::uo) is det.
+
+    % The default verbosity level if set_verbosity_level is never called.
+    %
+:- func default_verbosity_level = int.
+
+%-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
 :- implementation.
@@ -209,6 +233,10 @@ location_to_string(Level, goal(ProcLabel
 location_to_string(Level, clique(clique_ptr(Id)), String) :-
     format("clique %d", [i(Id)], String0),
     String = indent(Level) ++ singleton(String0).
+location_to_string(Level, call_site_dynamic(CSDPtr), String) :-
+    CSDPtr = call_site_dynamic_ptr(CSDNum),
+    format("call site dynamic %d", [i(CSDNum)], String0),
+    String = indent(Level) ++ singleton(String0).
 
 %-----------------------------------------------------------------------------%
 
@@ -253,6 +281,8 @@ message_type_to_level(warning_cannot_com
     message_warning.
 message_type_to_level(warning_cannot_compute_cost_of_recursive_calls(_)) = 
     message_warning.
+message_type_to_level(warning_cannot_compute_arg_first_use_time(_)) = 
+    message_warning.
 message_type_to_level(error_extra_proc_dynamics_in_clique_proc) = 
     message_error.
 message_type_to_level(error_coverage_procrep_error(_)) =
@@ -261,7 +291,6 @@ message_type_to_level(error_exception_th
 
 %-----------------------------------------------------------------------------%
 
-
 :- func message_type_to_string(message_type) = cord(string).
 
 message_type_to_string(MessageType) = Cord :-
@@ -320,6 +349,11 @@ message_type_to_string(MessageType) = Co
             MessageType =
                 warning_cannot_compute_cost_of_recursive_calls(ErrorStr),
             Template = "Cannot compute cost of recursive calls: %s"
+        ;
+            MessageType = 
+                warning_cannot_compute_arg_first_use_time(ErrorStr),
+            Template = "Cannot compute the production or consumption time of a"
+                ++ " call site's argument: %s"
         ),
         string.format(Template, [s(ErrorStr)], String)
     ),
@@ -342,6 +376,33 @@ nl = singleton("\n").
 
 indent_size(N) = 2 * N.
 
+%----------------------------------------------------------------------------%
+
+:- mutable(verbosity_level_mut, int, default_verbosity_level, ground, 
+    [attach_to_io_state, untrailed]).
+
+write_out_messages(Stream, Messages, !IO) :-
+    cord.foldl_pred(write_out_message(Stream), Messages, !IO).
+
+:- pred write_out_message(output_stream::in, message::in, io::di, io::uo) 
+    is det.
+
+write_out_message(Stream, Message, !IO) :-
+    Level = message_get_level(Message),
+    get_verbosity_level_mut(VerbosityLevel, !IO),
+    ( message_level_to_int(Level) =< VerbosityLevel ->
+        message_to_string(Message, MessageStr),
+        io.write_string(Stream, MessageStr, !IO),
+        io.nl(Stream, !IO)
+    ;
+        true
+    ).
+
+set_verbosity_level(VerbosityLevel, !IO) :-
+    set_verbosity_level_mut(VerbosityLevel, !IO).
+
+default_verbosity_level = 2.
+
 %-----------------------------------------------------------------------------%
 :- end_module message.
 %-----------------------------------------------------------------------------%
Index: deep_profiler/old_html_format.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/old_html_format.m,v
retrieving revision 1.7
diff -u -p -b -r1.7 old_html_format.m
--- deep_profiler/old_html_format.m	22 Sep 2010 12:56:54 -0000	1.7
+++ deep_profiler/old_html_format.m	6 Oct 2010 23:29:36 -0000
@@ -408,7 +408,7 @@ command_relevant_toggles(deep_cmd_dump_p
 command_relevant_toggles(deep_cmd_dump_call_site_static(_)) = [].
 command_relevant_toggles(deep_cmd_dump_call_site_dynamic(_)) = [].
 command_relevant_toggles(deep_cmd_dump_clique(_)) = [].
-command_relevant_toggles(deep_cmd_dump_proc_var_use(_)) = [].
+command_relevant_toggles(deep_cmd_call_site_dynamic_var_use(_)) = [].
 command_relevant_toggles(Cmd) = [] :-
     ( Cmd = deep_cmd_static_procrep_coverage(_)
     ; Cmd = deep_cmd_dynamic_procrep_coverage(_)
Index: deep_profiler/old_query.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/old_query.m,v
retrieving revision 1.7
diff -u -p -b -r1.7 old_query.m
--- deep_profiler/old_query.m	22 Sep 2010 12:56:54 -0000	1.7
+++ deep_profiler/old_query.m	6 Oct 2010 23:29:36 -0000
@@ -142,7 +142,7 @@ old_exec(deep_cmd_dump_clique(CliquePtr)
 old_exec(Cmd, _, _, HTML, !IO) :-
     ( Cmd = deep_cmd_static_procrep_coverage(_)
     ; Cmd = deep_cmd_dynamic_procrep_coverage(_)
-    ; Cmd = deep_cmd_dump_proc_var_use(_)
+    ; Cmd = deep_cmd_call_site_dynamic_var_use(_)
     ; Cmd = deep_cmd_module_getter_setters(_)
     ; Cmd = deep_cmd_clique_recursive_costs(_)
     ; Cmd = deep_cmd_recursion_types_frequency
Index: deep_profiler/query.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/query.m,v
retrieving revision 1.37
diff -u -p -b -r1.37 query.m
--- deep_profiler/query.m	22 Sep 2010 12:56:54 -0000	1.37
+++ deep_profiler/query.m	6 Oct 2010 23:29:36 -0000
@@ -146,8 +146,8 @@
     ;       deep_cmd_dump_clique(
                 cmd_dcl_id                      :: clique_ptr
             )
-    ;       deep_cmd_dump_proc_var_use(
-                cmd_dpvu_id                     :: proc_static_ptr
+    ;       deep_cmd_call_site_dynamic_var_use(
+                cmd_csdvu_id                    :: call_site_dynamic_ptr 
             ).
 
 :- type caller_groups
@@ -460,7 +460,7 @@ exec(Cmd, Prefs, Deep, HTMLStr, !IO) :-
         ( Cmd = deep_cmd_static_procrep_coverage(_)
         ; Cmd = deep_cmd_dynamic_procrep_coverage(_)
         ; Cmd = deep_cmd_module_getter_setters(_)
-        ; Cmd = deep_cmd_dump_proc_var_use(_)
+        ; Cmd = deep_cmd_call_site_dynamic_var_use(_)
         ; Cmd = deep_cmd_clique_recursive_costs(_)
         ; Cmd = deep_cmd_recursion_types_frequency
         ),
@@ -708,10 +708,11 @@ cmd_to_string(Cmd) = CmdStr :-
         CmdStr = string.format("%s%c%d",
             [s(cmd_str_dump_raw_clique), c(cmd_separator_char), i(CliqueNum)])
     ;
-        Cmd = deep_cmd_dump_proc_var_use(PSPtr),
-        PSPtr = proc_static_ptr(PSI),
+        Cmd = deep_cmd_call_site_dynamic_var_use(CSDPtr),
+        CSDPtr = call_site_dynamic_ptr(CSDI),
         CmdStr = string.format("%s%c%d",
-            [s(cmd_str_dump_proc_var_use), c(cmd_separator_char), i(PSI)])
+            [s(cmd_str_call_site_dynamic_var_use), c(cmd_separator_char),
+             i(CSDI)])
     ).
 
 :- func preferences_to_string(preferences) = string.
@@ -885,11 +886,11 @@ string_to_maybe_cmd(QueryString) = Maybe
         Cmd = deep_cmd_dump_clique(CliquePtr),
         MaybeCmd = yes(Cmd)
     ;
-        Pieces = [cmd_str_dump_proc_var_use, PSIStr],
-        string.to_int(PSIStr, PSI)
+        Pieces = [cmd_str_call_site_dynamic_var_use, CSDIStr],
+        string.to_int(CSDIStr, CSDI)
     ->
-        PSPtr = proc_static_ptr(PSI),
-        Cmd = deep_cmd_dump_proc_var_use(PSPtr),
+        CSDPtr = call_site_dynamic_ptr(CSDI),
+        Cmd = deep_cmd_call_site_dynamic_var_use(CSDPtr),
         MaybeCmd = yes(Cmd)
     ;
         Pieces = [cmd_str_timeout, TimeOutStr],
@@ -1405,8 +1406,8 @@ cmd_str_dump_call_site_dynamic = "dump_c
 :- func cmd_str_dump_raw_clique = string.
 cmd_str_dump_raw_clique = "dump_raw_clique".
 
-:- func cmd_str_dump_proc_var_use = string.
-cmd_str_dump_proc_var_use = "dump_proc_var_use".
+:- func cmd_str_call_site_dynamic_var_use = string.
+cmd_str_call_site_dynamic_var_use = "call_site_dynamic_var_use".
 
 %----------------------------------------------------------------------------%
 :- end_module query.
Index: deep_profiler/recursion_patterns.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/recursion_patterns.m,v
retrieving revision 1.4
diff -u -p -b -r1.4 recursion_patterns.m
--- deep_profiler/recursion_patterns.m	27 Sep 2010 05:51:02 -0000	1.4
+++ deep_profiler/recursion_patterns.m	6 Oct 2010 23:29:36 -0000
@@ -20,6 +20,7 @@
 :- import_module report.
 :- import_module profile.
 
+:- import_module float.
 :- import_module maybe.
             
 %----------------------------------------------------------------------------%
@@ -31,10 +32,16 @@
     maybe_error(recursion_types_frequency_report)::out) is det.
 
 %----------------------------------------------------------------------------%
+
+:- pred recursion_type_get_maybe_avg_max_depth(recursion_type::in,
+    maybe(float)::out) is det.
+
+%----------------------------------------------------------------------------%
 %----------------------------------------------------------------------------%
 
 :- implementation.
 
+:- import_module analysis_utils.
 :- import_module array_util.
 :- import_module coverage.
 :- import_module create_report.
@@ -108,7 +115,6 @@ create_clique_recursion_costs_report(Dee
 
 proc_get_recursion_type(Deep, ThisClique, PDPtr, ParentCalls,
         MaybeRecursionType) :-
-    lookup_proc_dynamics(Deep ^ proc_dynamics, PDPtr, PD),
     deep_lookup_pd_own(Deep, PDPtr, PDOwn),
     TotalCalls = calls(PDOwn),
     create_dynamic_procrep_coverage_report(Deep, PDPtr, MaybeCoverageReport),
@@ -116,8 +122,9 @@ proc_get_recursion_type(Deep, ThisClique
         MaybeCoverageReport = ok(CoverageReport),
         ProcRep = CoverageReport ^ prci_proc_rep, 
         Goal = ProcRep ^ pr_defn ^ pdr_goal,
-        array.foldl(build_call_site_cost_and_callee_map(Deep), 
-            PD ^ pd_sites, map.init, CallSitesMap),
+        proc_dynamic_paired_call_site_slots(Deep, PDPtr, Slots),
+        foldl(build_call_site_cost_and_callee_map(Deep), 
+            Slots, map.init, CallSitesMap),
         goal_recursion_data(ThisClique, CallSitesMap, empty_goal_path,
             Goal, RecursionData),
         recursion_data_to_recursion_type(ParentCalls, TotalCalls,
@@ -128,73 +135,6 @@ proc_get_recursion_type(Deep, ThisClique
         MaybeRecursionType = error(Error)
     ).
 
-:- type cost_and_callees
-    --->    cost_and_callees(
-                cac_cost            :: cs_cost_csq,
-                cac_callees         :: set(clique_ptr)
-            ).
-
-:- pred build_call_site_cost_and_callee_map(deep::in, 
-    call_site_array_slot::in,
-    map(goal_path, cost_and_callees)::in, map(goal_path, cost_and_callees)::out)
-    is det.
-
-build_call_site_cost_and_callee_map(Deep, slot_normal(CSDPtr), !CallSitesMap)
-        :-
-    ( valid_call_site_dynamic_ptr(Deep, CSDPtr) ->
-        call_site_dynamic_get_callee_and_costs(Deep, CSDPtr, CalleeCliquePtr, 
-            Own, Inherit),
-        CostCsq = build_cs_cost_csq(calls(Own), 
-            float(callseqs(Own) + inherit_callseqs(Inherit))),
-        CostAndCallees = cost_and_callees(CostCsq, set([CalleeCliquePtr])),
-        lookup_call_site_static_map(Deep ^ call_site_static_map, CSDPtr,
-            CSSPtr),
-        lookup_call_site_statics(Deep ^ call_site_statics, CSSPtr, CSS),
-        goal_path_from_string_det(CSS ^ css_goal_path, GoalPath),
-        svmap.det_insert(GoalPath, CostAndCallees, !CallSitesMap)
-    ;
-        true
-    ).
-build_call_site_cost_and_callee_map(Deep, slot_multi(_, CSDPtrsArray),
-        !CallSitesMap) :-
-    to_list(CSDPtrsArray, CSDPtrs),
-    (
-        CSDPtrs = []
-        % There is no way of finding the goal path, so we can't put such a goal
-        % in our map.  This probably can't happen in reality anyway.
-    ;
-        CSDPtrs = [FirstCSDPtr | _],
-
-        map3(call_site_dynamic_get_callee_and_costs(Deep), CSDPtrs, 
-            CalleeCliquePtrs, Owns, Inherits),
-        Own = sum_own_infos(Owns),
-        Inherit = sum_inherit_infos(Inherits),
-        CostCsq = build_cs_cost_csq(calls(Own), 
-            float(callseqs(Own) + inherit_callseqs(Inherit))),
-        CostAndCallees = cost_and_callees(CostCsq, set(CalleeCliquePtrs)),
-
-        % The goal path of the call site will be the same regardless of the
-        % callee, so we get it from the first.
-        lookup_call_site_static_map(Deep ^ call_site_static_map, FirstCSDPtr,
-            FirstCSSPtr),
-        lookup_call_site_statics(Deep ^ call_site_statics, FirstCSSPtr,
-            FirstCSS),
-        goal_path_from_string_det(FirstCSS ^ css_goal_path, GoalPath),
-        svmap.det_insert(GoalPath, CostAndCallees, !CallSitesMap)
-    ).
-
-:- pred call_site_dynamic_get_callee_and_costs(deep::in, 
-    call_site_dynamic_ptr::in, clique_ptr::out, own_prof_info::out, 
-    inherit_prof_info::out) is det.
-
-call_site_dynamic_get_callee_and_costs(Deep, CSDPtr, CalleeCliquePtr, Own,
-        Inherit) :-
-    lookup_call_site_dynamics(Deep ^ call_site_dynamics, CSDPtr, CSD),
-    lookup_csd_desc(Deep ^ csd_desc, CSDPtr, Inherit),
-    PDPtr = CSD ^ csd_callee,
-    lookup_clique_index(Deep ^ clique_index, PDPtr, CalleeCliquePtr), 
-    Own = CSD ^ csd_own_prof.
-
 :- pred recursion_data_to_recursion_type(int::in, int::in, recursion_data::in,
     recursion_type::out) is det.
 
@@ -553,19 +493,13 @@ atomic_goal_recursion_data(ThisClique, C
         ),
 
         % Get the cost of the call.
-        ( map.search(CallSiteMap, GoalPath, CostAndCallees) ->
-            CostAndCallees = cost_and_callees(Cost, Callees),
-            CostPercall = cs_cost_get_percall(Cost)
-        ;
-            CostPercall = 0.0,
-            set.init(Callees)
-        ),
-
-        ( member(ThisClique, Callees) ->
+        map.lookup(CallSiteMap, GoalPath, CostAndCallees),
+        ( cost_and_callees_is_recursive(ThisClique, CostAndCallees) ->
             % Cost will be 1.0 for for each call to recursive calls but we
             % calculate this later.
             RecursionLevel = 1 - recursion_level(0.0, certain)
         ;
+            CostPercall = cs_cost_get_percall(CostAndCallees ^ cac_cost),
             RecursionLevel = 0 - recursion_level(CostPercall, certain)
         )
     ),
@@ -987,6 +921,17 @@ finalize_histogram_proc_rec_type(Deep, N
     own_and_inherit_to_perf_row_data(Deep, ProcDesc, Own, Inherit, Summary).
 
 %----------------------------------------------------------------------------%
+%----------------------------------------------------------------------------%
+
+recursion_type_get_maybe_avg_max_depth(rt_not_recursive, yes(0.0)).
+recursion_type_get_maybe_avg_max_depth(rt_single(_, _, Depth, _, _),
+    yes(Depth)).
+recursion_type_get_maybe_avg_max_depth(rt_divide_and_conquer(_, _), no).
+recursion_type_get_maybe_avg_max_depth(rt_mutual_recursion(_), no).
+recursion_type_get_maybe_avg_max_depth(rt_other(_), no).
+recursion_type_get_maybe_avg_max_depth(rt_errors(_), no).
+
+%----------------------------------------------------------------------------%
 
 :- func this_file = string.
 
Index: deep_profiler/report.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/report.m,v
retrieving revision 1.25
diff -u -p -b -r1.25 report.m
--- deep_profiler/report.m	27 Sep 2010 05:51:02 -0000	1.25
+++ deep_profiler/report.m	6 Oct 2010 23:29:36 -0000
@@ -35,7 +35,6 @@
 :- import_module map.
 :- import_module maybe.
 :- import_module set.
-:- import_module string.
 :- import_module unit.
 
 %-----------------------------------------------------------------------------%
@@ -92,8 +91,8 @@
     ;       report_clique_dump(
                 maybe_error(clique_dump_info)
             )
-    ;       report_proc_var_use_dump(
-                maybe_error(proc_var_use_dump_info)
+    ;       report_call_site_dynamic_var_use(
+                maybe_error(call_site_dynamic_var_use_info)
             ).
 
 :- type message_report
@@ -243,6 +242,20 @@
                 rte_errors                  :: list(string)
             ).
 
+    % Recursion types for which call costs at any level can be calculated.
+    %
+:- inst recursion_type_known_costs
+    --->    rt_not_recursive
+    ;       rt_single(ground, ground, ground, ground, ground).
+
+    % Recursion types for which call costs at any level can not be calculated.
+    %
+:- inst recursion_type_unknown_costs
+    --->    rt_divide_and_conquer(ground, ground)
+    ;       rt_mutual_recursion(ground)
+    ;       rt_other(ground)
+    ;       rt_errors(ground).
+
 :- type recursion_level_report
     --->    recursion_level_report(
                 rlr_level                       :: int,
@@ -499,10 +512,16 @@
                 cdi_member_pdptrs           :: list(proc_dynamic_ptr)
             ).
 
-:- type proc_var_use_dump_info
-    --->    proc_var_use_dump_info(
-                pvui_total_proc_time        :: float,
-                pvui_var_uses               :: list(var_use_info)
+:- type call_site_dynamic_var_use_info
+    --->    call_site_dynamic_var_use_info(
+                csdvui_total_cost           :: float,
+                csdvui_var_uses             :: list(var_use_and_name)
+            ).
+
+:- type var_use_and_name
+    --->    var_use_and_name(
+                vun_var_name                :: string,
+                vun_use                     :: var_use_info
             ).
 
     % This type represents information about the performance of the subject.
Index: deep_profiler/var_use_analysis.m
===================================================================
RCS file: /home/mercury1/repository/mercury/deep_profiler/var_use_analysis.m,v
retrieving revision 1.4
diff -u -p -b -r1.4 var_use_analysis.m
--- deep_profiler/var_use_analysis.m	22 Sep 2010 12:56:54 -0000	1.4
+++ deep_profiler/var_use_analysis.m	7 Oct 2010 01:52:23 -0000
@@ -31,7 +31,8 @@
     %
 :- type var_use_info
     --->    var_use_info(
-                vui_cost_until_use          :: cost_until_var_use,
+                vui_cost_until_use          :: float,
+                vui_proc_cost               :: float,
                 vui_use_type                :: var_use_type
             ).
 
@@ -43,65 +44,272 @@
     ;       var_use_other.
             % The variable is used in some other way.
 
-:- type cost_until_var_use
-    --->    cost_since_proc_start(float)
-    ;       cost_before_proc_end(float).
+%-----------------------------------------------------------------------------%
 
-:- func cost_until_to_cost_since_start(cost_until_var_use, float) = float.
-:- func cost_until_to_cost_before_end(cost_until_var_use, float) = float.
+:- func average_var_use(list(var_use_info)) = var_use_info.
 
-    % generic_vars_first_use(HeadVarsToVars, Deep, PSPtr, CallStack,
-    %   VarUseInfos, ProcAverageCost).
-    %
-    % Find the first uses of the given variables.
-    %
-    % CallStack is used to prevent unbound recursion; initialise it to
-    % set.init.
-    %
-:- pred generic_vars_first_use(
-    pred(list(head_var_rep), list(var_rep), list(var_use_type))
-        ::in(pred(in, out, out) is det),
-    deep::in, proc_static_ptr::in, set(proc_static_ptr)::in,
-    maybe_error(proc_var_use_dump_info)::out) is det.
+:- pred var_mode_to_var_use_type(var_mode_rep::in, var_use_type::out) is det.
 
-:- pred head_vars_all(list(head_var_rep)::in, list(var_rep)::out,
-    list(var_use_type)::out) is det.
+:- pred pessimistic_var_use_info(var_use_type::in, float::in, 
+    var_use_info::out) is det.
 
-:- pred var_mode_to_var_use(var_mode_rep::in, var_use_type::out) is det.
+:- pred pessimistic_var_use_time(var_use_type::in, float::in, float::out) 
+    is det.
 
-:- pred pessimistic_var_use_info(var_use_type::in, var_use_info::out) is det.
+%-----------------------------------------------------------------------------%
+
+    % call_site_dynamic_var_use_info(Deep, CSDPtr, ArgPos, RT, VarUseType,
+    %   MaybeVarUseInfo):
+    %
+    % Lookup when the call site CSDPtr will produce ArgPos.  RT is the type of
+    % recursion in the call site's parent clique and VarUseType is the type of
+    % variable use that is expected, VarUseType is used to produce conservative
+    % defaults if a callee cannot be analyzed.
+    %
+:- pred call_site_dynamic_var_use_info(deep::in, call_site_dynamic_ptr::in,
+    int::in, recursion_type::in, float::in, var_use_type::in,
+    maybe_error(var_use_info)::out) is det.
+
+    % call_site_dynamic_var_use_info(Deep, CliquePtr, CSDPtr, ArgPos, RT,
+    %   MaybeCurDepth, Cost, CallStack, VarUseType, MaybeVarUseInfo).
+    %
+    % As above.  This alternative should be used if searching starts at a
+    % different recursion level or from within a current variable use analysis.
+    %
+    % CliquePtr is the current clique, MaybeCurDepth is the current recursion
+    % depth within the clique if known, CallStack is the set of proc dynamics
+    % that have already been explored.
+    %
+    % Cost is the cost of the call.
+    %
+:- pred call_site_dynamic_var_use_info(deep::in, clique_ptr::in,
+    call_site_dynamic_ptr::in, int::in, recursion_type::in, maybe(float)::in,
+    float::in, set(proc_dynamic_ptr)::in,
+    var_use_type::in, maybe_error(var_use_info)::out) is det.
+
+:- pred clique_var_use_info(deep::in, clique_ptr::in, int::in,
+    var_use_type::in, maybe_error(var_use_info)::out) is det.
+
+:- pred proc_dynamic_var_use_info(deep::in, clique_ptr::in,
+    proc_dynamic_ptr::in, int::in, recursion_type::in, maybe(float)::in,
+    float::in, set(proc_dynamic_ptr)::in, var_use_type::in,
+    maybe_error(var_use_info)::out) is det.
 
 %-----------------------------------------------------------------------------%
 
 :- implementation.
 
+:- import_module analysis_utils.
 :- import_module coverage.
 :- import_module create_report.
 :- import_module measurements.
 :- import_module program_representation_utils.
+:- import_module recursion_patterns.
 
 :- import_module float.
 :- import_module int.
 :- import_module io.
 :- import_module map.
 :- import_module require.
-:- import_module set.
+:- import_module solutions.
 :- import_module string.
 
 %-----------------------------------------------------------------------------%
 
-cost_until_to_cost_since_start(cost_since_proc_start(Cost), _WholeCost) =
-    Cost.
-cost_until_to_cost_since_start(cost_before_proc_end(Cost), WholeCost) =
-    WholeCost - Cost.
-
-cost_until_to_cost_before_end(cost_since_proc_start(Cost), WholeCost) =
-    WholeCost - Cost.
-cost_until_to_cost_before_end(cost_before_proc_end(Cost), _WholeCost) =
-    Cost.
+average_var_use(Uses) = var_use_info(CostUntilUse, AvgProcCost, Type) :-
+    (
+        Uses = [],
+        error(this_file ++ "average_var_use: Cannot average zero items")
+    ;
+        Uses = [var_use_info(_, _, Type) | _],
+        foldl2(sum_use_info_costs, Uses, 0.0, SumCost, 0.0, SumProcCost),
+        Num = float(length(Uses)),
+        CostUntilUse = SumCost / Num,
+        AvgProcCost = SumProcCost / Num, 
+        require(all_true((pred(var_use_info(_, _, TypeI)::in) is semidet :- 
+                    Type = TypeI
+                ), Uses),
+            "average_var_use: Use types do not match")
+    ).
+
+:- pred sum_use_info_costs(var_use_info::in, float::in, float::out, 
+    float::in, float::out) is det.
+
+sum_use_info_costs(var_use_info(Cost, ProcCost, _), !AccCost, !AccProcCost) :-
+    !:AccCost = !.AccCost + Cost,
+    !:AccProcCost = !.AccProcCost + ProcCost.
 
 %-----------------------------------------------------------------------------%
 
+var_mode_to_var_use_type(var_mode_rep(InitialInst, FinalInst), VarUseType) :-
+    (
+        InitialInst = ir_ground_rep,
+        FinalInst = ir_ground_rep
+    ->
+        VarUseType = var_use_consumption
+    ;
+        InitialInst = ir_free_rep,
+        FinalInst = ir_ground_rep
+    ->
+        VarUseType = var_use_production
+    ;
+        VarUseType = var_use_other
+    ).
+
+%-----------------------------------------------------------------------------%
+
+pessimistic_var_use_info(VarUseType, ProcCost, VarUseInfo) :-
+    pessimistic_var_use_time(VarUseType, ProcCost, CostUntilUse),
+    VarUseInfo = var_use_info(CostUntilUse, ProcCost, VarUseType).
+
+pessimistic_var_use_time(VarUseType, ProcCost, CostUntilUse) :-
+    (
+        VarUseType = var_use_consumption,
+        CostUntilUse = 0.0
+    ;
+        ( VarUseType = var_use_production
+        ; VarUseType = var_use_other
+        ),
+        CostUntilUse = ProcCost
+    ).
+
+%-----------------------------------------------------------------------------%
+
+    % XXX: If the CSD represents a closure then the argument position will be
+    % incorrect.  This is not currently important as we assume that the
+    % compiler will not push signals and waits into higher order calls.
+    % Therefore this should never be called for a higher order call site.
+    %
+call_site_dynamic_var_use_info(Deep, CSDPtr, ArgPos, RT, Cost, VarUseType,
+        MaybeVarUseInfo) :-
+    deep_lookup_call_site_dynamics(Deep, CSDPtr, CSD),
+    deep_lookup_clique_index(Deep, CSD ^ csd_caller, ParentCliquePtr),
+    recursion_type_get_maybe_avg_max_depth(RT, MaybeCurDepth),
+    call_site_dynamic_var_use_info(Deep, ParentCliquePtr, CSDPtr, ArgPos, RT, 
+        MaybeCurDepth, Cost, set.init, VarUseType, MaybeVarUseInfo).
+
+call_site_dynamic_var_use_info(Deep, ParentCliquePtr, CSDPtr, ArgNum,
+        RecursionType, MaybeDepth0, Cost, CallStack, VarUseType,
+        MaybeVarUseInfo) :-
+    deep_lookup_clique_maybe_child(Deep, CSDPtr, MaybeCalleeCliquePtr),
+    ( 
+        MaybeCalleeCliquePtr = yes(CalleeCliquePtr),
+        % This is a non-recursive call site.
+        clique_var_use_info(Deep, CalleeCliquePtr, ArgNum, VarUseType,
+            MaybeVarUseInfo)
+    ;
+        MaybeCalleeCliquePtr = no,
+        % This is a recursive call site.
+        deep_lookup_call_site_dynamics(Deep, CSDPtr, CSD),
+        CalleePDPtr = CSD ^ csd_callee,
+        ( member(CalleePDPtr, CallStack) ->
+            % Don't follow this recursive call, doing so would make this
+            % analysis recurse forever.  XXX: There should be a way to
+            % compute and solve a finite series for this.
+            pessimistic_var_use_info(VarUseType, Cost, VarUseInfo),
+            MaybeVarUseInfo = ok(VarUseInfo)
+        ;
+            (
+                MaybeDepth0 = yes(Depth0),
+                % Depth is measured from the bottom, hence this is -1
+                MaybeDepth = yes(Depth0 - 1.0)
+            ;
+                MaybeDepth0 = no,
+                MaybeDepth = no
+            ),
+            proc_dynamic_var_use_info(Deep, ParentCliquePtr, CalleePDPtr,
+                ArgNum, RecursionType, MaybeDepth, Cost, CallStack,
+                VarUseType, MaybeVarUseInfo)
+        )
+    ).
+
+clique_var_use_info(Deep, CliquePtr, ArgNum, VarUseType, MaybeVarUseInfo) :-
+    deep_lookup_clique_parents(Deep, CliquePtr, CSDPtr),
+    deep_lookup_csd_desc(Deep, CSDPtr, CSDDesc),
+    Cost = float(inherit_callseqs(CSDDesc)),
+    find_clique_first_and_other_procs(Deep, CliquePtr, MaybeFirstProc,
+        _OtherProcs),
+    (
+        MaybeFirstProc = yes(FirstPDPtr),
+        create_clique_recursion_costs_report(Deep, CliquePtr,
+            MaybeRecursionReport),
+        (
+            MaybeRecursionReport = ok(RecursionReport),
+            RecursionType = RecursionReport ^ crr_recursion_type
+        ;
+            MaybeRecursionReport = error(Error),
+            RecursionType = rt_errors([Error])
+        ),
+        recursion_type_get_maybe_avg_max_depth(RecursionType, MaybeDepth),
+        proc_dynamic_var_use_info(Deep, CliquePtr, FirstPDPtr, ArgNum,
+            RecursionType, MaybeDepth, Cost, set.init, VarUseType,
+            MaybeVarUseInfo)
+    ;
+        MaybeFirstProc = no,
+        error(this_file ++ "Clique has no first procedure")
+    ).
+
+proc_dynamic_var_use_info(Deep, CliquePtr, PDPtr, ArgNum, RecursionType,
+        MaybeDepth0, ProcCost, CallStack0, VarUseType, MaybeVarUseInfo) :-
+    set.insert(CallStack0, PDPtr, CallStack),
+    create_dynamic_procrep_coverage_report(Deep, PDPtr, MaybeProcrepCoverage),
+    (
+        MaybeProcrepCoverage = ok(ProcrepCoverage),
+        ProcDefn = ProcrepCoverage ^ prci_proc_rep ^ pr_defn,
+        HeadVars = ProcDefn ^ pdr_head_vars,
+        ( index0(HeadVars, ArgNum, head_var_rep(Var, Mode)) ->
+            var_mode_to_var_use_type(Mode, ComputedUse),
+            ( VarUseType = ComputedUse ->
+                true
+            ;
+                PDPtr = proc_dynamic_ptr(PDNum),
+                error(format(
+                    "%s: Var uses do not match, passed: %s calculated from "
+                    ++ "procrep: %s, Arg %d in proc dynamic %d",
+                    [s(this_file), s(string(VarUseType)),
+                     s(string(ComputedUse)), i(ArgNum), i(PDNum)]))
+            ),
+
+            % Prepare callsite information.
+            proc_dynamic_paired_call_site_slots(Deep, PDPtr, Slots),
+            foldl(build_call_site_cost_and_callee_map(Deep),
+                Slots, map.init, CallSiteCostMap),
+            
+            % We're following a recursive call, therefore we descend one level.
+            (
+                MaybeDepth0 = yes(Depth0),
+                MaybeDepth = yes(Depth0 - 1.0)
+            ;
+                MaybeDepth0 = no,
+                MaybeDepth = no
+            ),
+            build_recursive_call_site_cost_map(Deep, CliquePtr, PDPtr,
+                RecursionType, MaybeDepth, MaybeRecursiveCallSiteCostMap),
+            (
+                MaybeRecursiveCallSiteCostMap = ok(RecursiveCallSiteCostMap)
+            ;
+                MaybeRecursiveCallSiteCostMap = error(_),
+                % Try to compute var use information without recursion costs.
+                RecursiveCallSiteCostMap = map.init
+            ),
+
+            % Do the actual computation.
+            Goal = ProcDefn ^ pdr_goal,
+            goal_var_first_use_wrapper(Deep, CliquePtr, CallStack,
+                CallSiteCostMap, RecursiveCallSiteCostMap, RecursionType,
+                MaybeDepth, Goal, ProcCost, Var, VarUseType, VarUseInfo),
+            MaybeVarUseInfo = ok(VarUseInfo)
+        ;
+            PDPtr = proc_dynamic_ptr(PDNum),
+            MaybeVarUseInfo = error(format(
+                "proc_dynamic_var_use_info: ArgNum %d out of range for PD %d",
+                [i(ArgNum), i(PDNum)]))
+        )
+    ;
+        MaybeProcrepCoverage = error(Error),
+        MaybeVarUseInfo = error(Error)
+    ).
+
     % This type represents whether the first use of a variable has been found
     % or not. If it has then the call sequence counts since it was found is
     % stored in this type also.
@@ -118,14 +326,19 @@ cost_until_to_cost_before_end(cost_befor
 :- type var_first_use_static_info
     --->    var_first_use_static_info(
                 fui_deep                :: deep,
-                fui_call_site_map       :: map(goal_path, call_site_perf),
+                fui_clique              :: clique_ptr,
+                fui_call_site_map       :: map(goal_path, cost_and_callees),
+                fui_rec_call_site_map   :: map(goal_path, cs_cost_csq),
                 fui_var                 :: var_rep,
                 fui_var_use             :: var_use_type,
 
                 % A set of call sites whose analysis has started but not yet
                 % completed. We keep this set to prevent infinite recursion
                 % in the analysis itself.
-                fui_call_stack          :: set(proc_static_ptr)
+                fui_call_stack          :: set(proc_dynamic_ptr),
+
+                fui_recursion_type      :: recursion_type,
+                fui_maybe_cur_depth     :: maybe(float)
             ).
 
     % Find the first use of a variable in a goal.
@@ -142,7 +355,25 @@ cost_until_to_cost_before_end(cost_befor
     is det.
 
 goal_var_first_use(GoalPath, Goal, StaticInfo, !CostSoFar, FoundFirstUse) :-
-    Goal = goal_rep(GoalExpr, Detism, _Coverage),
+    Goal = goal_rep(GoalExpr, Detism, Coverage),
+    (
+        % Do not bother exploring this goal if it is never entered.  Or never
+        % finishes and we're looking for a production.
+        (
+            get_coverage_before(Coverage, 0)
+        ;
+            StaticInfo ^ fui_var_use = var_use_production, 
+            (
+                Detism = erroneous_rep
+            ;
+                Detism = failure_rep
+            ;
+                get_coverage_after(Coverage, 0)
+            )
+        )
+    ->
+        FoundFirstUse = have_not_found_first_use
+    ;
     (
         GoalExpr = conj_rep(Conjuncts),
         conj_var_first_use(GoalPath, 1, Conjuncts, StaticInfo, !CostSoFar,
@@ -157,15 +388,16 @@ goal_var_first_use(GoalPath, Goal, Stati
             StaticInfo, !CostSoFar, FoundFirstUse)
     ;
         GoalExpr = ite_rep(Cond, Then, Else),
-        ite_var_first_use(GoalPath, Cond, Then, Else, StaticInfo, !CostSoFar,
-            FoundFirstUse)
+            ite_var_first_use(GoalPath, Cond, Then, Else, StaticInfo,
+                !CostSoFar, FoundFirstUse)
     ;
         (
             GoalExpr = negation_rep(SubGoal),
             SubGoalPath = goal_path_add_at_end(GoalPath, step_neg)
         ;
             GoalExpr = scope_rep(SubGoal, ScopeIsCut),
-            SubGoalPath = goal_path_add_at_end(GoalPath, step_scope(ScopeIsCut))
+                SubGoalPath = goal_path_add_at_end(GoalPath,
+                    step_scope(ScopeIsCut))
         ),
         goal_var_first_use(SubGoalPath, SubGoal, StaticInfo, !CostSoFar,
             FoundFirstUse)
@@ -194,6 +426,7 @@ goal_var_first_use(GoalPath, Goal, Stati
             atomic_trivial_var_first_use(AtomicGoal, BoundVars, !.CostSoFar,
                 StaticInfo, FoundFirstUse)
         )
+        )
     ),
     trace [compile_time(flag("debug_first_var_use")), io(!IO)] (
         io.format("Trace: goal_var_first_use: %s\n",
@@ -211,121 +444,146 @@ goal_var_first_use(GoalPath, Goal, Stati
 
 call_var_first_use(AtomicGoal, BoundVars, GoalPath, StaticInfo,
         CostSoFar, NextCostSoFar, FoundFirstUse) :-
-    Var = StaticInfo ^ fui_var,
-    VarUseType = StaticInfo ^ fui_var_use,
-    map.lookup(StaticInfo ^ fui_call_site_map, GoalPath, CallSitePerf),
+    StaticInfo = var_first_use_static_info(_Deep, CliquePtr, CostMap,
+        RecCostMap, Var, VarUseType, _CallStack, _RecursionType,
+        _MaybeCurDepth),
+    map.lookup(CostMap, GoalPath, CostAndCallees),
 
-    % Calculate the cost of this call and add it to the cost so far.
-    CallSitePerf ^ csf_summary_perf ^ perf_row_maybe_total =
-        MaybeTotal,
-    (
-        MaybeTotal = yes(RowData)
-    ;
-        MaybeTotal = no,
-        CallSitePerf ^ csf_summary_perf ^ perf_row_self =
-            RowData
-    ),
-    ProcCost = RowData ^ perf_row_callseqs_percall,
-    % XXX: this doesn't work for (mutually-)recursive calls, the deep profiler
-    % sets their cost to 1.0. For now we just have to hope that the variables
-    % we're searching for are used in the recursive call so the trick below
-    % works.
-    NextCostSoFar = CostSoFar + ProcCost,
+    % Get the cost of the call.
+    (
+        cost_and_callees_is_recursive(CliquePtr, CostAndCallees),
+        map.search(RecCostMap, GoalPath, CostPrime)
+    ->
+        Cost0 = CostPrime
+    ;
+        Cost0 = CostAndCallees ^ cac_cost
+    ),
+    ( cs_cost_get_calls(Cost0) = 0.0 ->
+        Cost = 0.0
+    ;
+        Cost = cs_cost_get_percall(Cost0)
+    ),
+    NextCostSoFar = CostSoFar + Cost,
 
     % Determine if the variable we're searching for uses of is involved with
     % this call.
     (
         AtomicGoal = plain_call_rep(_, _, Args),
-        (
-            nth_member_search(Args, Var, ArgNum),
-            (
-                VarUseType = var_use_consumption
+        Vars = Args
             ;
-                % If we're looking for a production ensure that this call binds
-                % this variable.
-                VarUseType = var_use_production,
-                member(Var, BoundVars)
+        ( AtomicGoal = higher_order_call_rep(HOVar, Args)
+        ; AtomicGoal = method_call_rep(HOVar, _, Args)
+        ),
+        Vars = [HOVar | Args]
+    ),
+    ( member(Var, Vars) ->
+        
+        solutions((pred(TimeI::out) is nondet :-
+                (
+                    consume_ho_arg(AtomicGoal, Var, TimeI)
             ;
-                VarUseType = var_use_other
+                    call_args_first_use(Args, Cost, StaticInfo, 
+                        CostAndCallees, TimeI)
             )
-        ->
-            (
-                CallSitePerf ^ csf_kind =
-                    normal_call_and_info(Callee)
-            ->
-                PSPtr = Callee ^ nci_callee_desc ^ pdesc_ps_ptr,
-                CallStack0 = StaticInfo ^ fui_call_stack,
-                ( contains(CallStack0, PSPtr) ->
-                    % Don't follow recursive or mutually recursive calls.
-                    % XXX: I'd like to create the result that is the sum of the
-                    % recursive expression: Cost(i) = Cost + Cost(i - 1).
-                    % Note: this makes a pessimistic assumption instead. Note:
-                    % It doesn't matter what type of variable use we're
-                    % searching for either the cost before a consumer is 0.0 or
-                    % the cost after a producer is 0.0. So asserting
-                    % TimeBeforeUse = 0.0 works in both cases.
-                    ProcCostUntilUse = 0.0
-                ;
-                    CallStack = set.insert(CallStack0, PSPtr),
-                    proc_var_first_use(StaticInfo ^ fui_deep, PSPtr, ArgNum,
-                        VarUseType, CallStack, ProcVarUseInfo),
-                    ProcVarUseInfo = var_use_info(ProcCostUntilUse0, _),
+            ), Times),
                     (
-                        VarUseType = var_use_consumption,
+            Times = [],
+            error(this_file ++ ": No solutions for variable first use time")
+        ;
+            Times = [FirstTime | OtherTimes],
+            FoundFirstUse = found_first_use(FirstTime + CostSoFar),
                         (
-                            ProcCostUntilUse0 =
-                                cost_since_proc_start(ProcCostUntilUse)
+                VarUseType = var_use_production
+            =>
+                OtherTimes = []
+            ->
+                true
                         ;
-                            ProcCostUntilUse0 =
-                                cost_before_proc_end(ProcCostAfterUse),
-                            ProcCostUntilUse = ProcCost - ProcCostAfterUse
+                error(this_file ++ 
+                    ": Multiple solutions for variable production time")
                         )
+        ),
+        
+        % Assertion
+        (
+            VarUseType = var_use_production
+        =>
+            member(Var, BoundVars)
+        ->
+            true
                     ;
-                        ( VarUseType = var_use_production
-                        ; VarUseType = var_use_other
+            error(this_file ++ 
+                ": A bound var must be produced by a call if it's an argument.")
                         ),
                         (
-                            ProcCostUntilUse0 =
-                                cost_since_proc_start(ProcCostBeforeUse),
-                            ProcCostUntilUse = ProcCost - ProcCostBeforeUse
+            VarUseType = var_use_consumption
+        =>
+            not member(Var, BoundVars)
+        ->
+            true
                         ;
-                            ProcCostUntilUse0 =
-                                cost_before_proc_end(ProcCostUntilUse)
-                        )
+            error(this_file ++
+                ": A consumed var must not be mentioned in BoundVars.")
+        ),
+        (
+            VarUseType = var_use_production
+        =>
+            not (
+                ( AtomicGoal = higher_order_call_rep(Var, _)
+                ; AtomicGoal = method_call_rep(Var, _, _)
                     )
                 )
+        ->
+            true
             ;
-                % Some builtin calls show up as plain calls in the procedure
-                % representation. Namely builtin.compare. In these cases use
-                % a pessimistic default.
-                ProcCostUntilUse = 0.0
-            ),
-            FoundFirstUse = found_first_use(CostSoFar + ProcCostUntilUse),
-            trace [compile_time(flag("debug_first_var_use")), io(!IO)] (
-                io.format(
-                    "Trace: Set first use info for variable use in call: %s\n",
-                    [s(string(FoundFirstUse))], !IO)
+            error(this_file ++ 
+                ": A HO call site cannot produce it's own HO value.")
             )
         ;
             FoundFirstUse = have_not_found_first_use
-        )
-    ;
-        ( AtomicGoal = higher_order_call_rep(HOVar, Args)
-        % The first argument of this functor is really the type info variable,
-        % not a higher order term.
-        ; AtomicGoal = method_call_rep(HOVar, _, Args)
-        ),
+    ).
+
+:- pred consume_ho_arg(atomic_goal_rep::in(atomic_goal_rep_call),
+    var_rep::in, float::out) is semidet.
+
+consume_ho_arg(higher_order_call_rep(Var, _), Var, 0.0).
+consume_ho_arg(method_call_rep(Var, _, _), Var, 0.0).
+
+:- pred call_args_first_use(list(var_rep)::in, float::in, 
+    var_first_use_static_info::in, cost_and_callees::in, float::out) is nondet.
+
+call_args_first_use(Args, Cost, StaticInfo, CostAndCallees, Time) :-
+    StaticInfo = var_first_use_static_info(Deep, CliquePtr, _CostMap,
+        _RecCostMap, Var, VarUseType, CallStack, RecursionType, MaybeCurDepth),
+    HigherOrder = CostAndCallees ^ cac_call_site_is_ho,
+    Callees = CostAndCallees ^ cac_callees,
+    member_index0(Var, Args, ArgNum), 
+    (
+        HigherOrder = first_order_call,
+        ( empty(Callees) ->
+            % There are no callees, this code is never called.
+            pessimistic_var_use_time(VarUseType, Cost, Time)
+        ; singleton_set(Callees, SingletonCallee) ->
+            CSDPtr = SingletonCallee ^ c_csd,
+            call_site_dynamic_var_use_info(Deep, CliquePtr, CSDPtr,
+                ArgNum, RecursionType, MaybeCurDepth, Cost, CallStack,
+                VarUseType, MaybeVarUseInfo),
         (
-            ( HOVar = Var
-            ; member(Var, Args)
+                MaybeVarUseInfo = ok(VarUseInfo),
+                VarUseInfo = var_use_info(Time, _, _)
+            ;
+                MaybeVarUseInfo = error(_),
+                pessimistic_var_use_time(VarUseType, Cost, Time)
             )
-        ->
-            % XXX: Make a pessimistic default, since we don't bother to perform
-            % this analysis recursively for higher order or method calls.
-            FoundFirstUse = found_first_use(CostSoFar)
         ;
-            FoundFirstUse = have_not_found_first_use
+            error(this_file ++ 
+                "Wrong number of callees for normal call site")
         )
+    ;
+        HigherOrder = higher_order_call,
+        % The compiler cannot push signals or waits into higher order
+        % calls.  Therefore we assume a pessimistic default here.
+        pessimistic_var_use_time(VarUseType, Cost, Time)
     ).
 
 :- inst atomic_trivial_goal_rep
@@ -378,22 +636,8 @@ conj_var_first_use(_, _, [], _, !Cost, h
 conj_var_first_use(GoalPath, ConjNum, [Conj | Conjs], StaticInfo, !CostSoFar,
         FoundFirstUse) :-
     ConjGoalPath = goal_path_add_at_end(GoalPath, step_conj(ConjNum)),
-    VarUseType = StaticInfo ^ fui_var_use,
-    (
-        VarUseType = var_use_consumption,
         goal_var_first_use(ConjGoalPath, Conj, StaticInfo, !CostSoFar,
             HeadFoundFirstUse),
-        conj_var_first_use(GoalPath, ConjNum + 1, Conjs, StaticInfo,
-            !CostSoFar, TailFoundFirstUse)
-    ;
-        ( VarUseType = var_use_production
-        ; VarUseType = var_use_other
-        ),
-        conj_var_first_use(GoalPath, ConjNum + 1, Conjs, StaticInfo,
-            !CostSoFar, TailFoundFirstUse),
-        goal_var_first_use(ConjGoalPath, Conj, StaticInfo, !CostSoFar,
-            HeadFoundFirstUse)
-    ),
     (
         % XXX: if a variable is bound more than once, because it's used with
         % partial instantiation then we want to use the last time it is bound.
@@ -403,6 +647,8 @@ conj_var_first_use(GoalPath, ConjNum, [C
         FoundFirstUse = HeadFoundFirstUse
     ;
         HeadFoundFirstUse = have_not_found_first_use,
+        conj_var_first_use(GoalPath, ConjNum + 1, Conjs, StaticInfo,
+            !CostSoFar, TailFoundFirstUse),
         FoundFirstUse = TailFoundFirstUse
     ).
 
@@ -449,21 +695,10 @@ disj_var_first_use_2(GoalPath, DisjNum, 
         FoundFirstUse) :-
     DisjGoalPath = goal_path_add_at_end(GoalPath, step_disj(DisjNum)),
     VarUseType = StaticInfo ^ fui_var_use,
-    (
-        VarUseType = var_use_consumption,
         goal_var_first_use(DisjGoalPath, Disj, StaticInfo, !CostSoFar,
             HeadFoundFirstUse),
         disj_var_first_use_2(GoalPath, DisjNum + 1, Disjs, StaticInfo,
-            !CostSoFar, TailFoundFirstUse)
-    ;
-        ( VarUseType = var_use_production
-        ; VarUseType = var_use_other
-        ),
-        disj_var_first_use_2(GoalPath, DisjNum + 1, Disjs, StaticInfo,
             !CostSoFar, TailFoundFirstUse),
-        goal_var_first_use(DisjGoalPath, Disj, StaticInfo, !CostSoFar,
-            HeadFoundFirstUse)
-    ),
     (
         HeadFoundFirstUse = have_not_found_first_use,
         TailFoundFirstUse = have_not_found_first_use,
@@ -532,15 +767,14 @@ switch_var_first_use(GoalPath, SwitchedO
             FoundFirstUse = have_not_found_first_use
         ;
             VarUseType = StaticInfo ^ fui_var_use,
-            % XXX: These are the wrong way around.
             (
                 VarUseType = var_use_consumption,
-                DefaultCost = CostBeforeSwitch
+                DefaultCost = CostAfterSwitch
             ;
                 ( VarUseType = var_use_production
                 ; VarUseType = var_use_other
                 ),
-                DefaultCost = CostAfterSwitch
+                DefaultCost = CostBeforeSwitch
             ),
             list.map(ffu_to_float(DefaultCost), FoundFirstUseCases,
                 FirstUseTimes),
@@ -591,8 +825,6 @@ ite_var_first_use(GoalPath, Cond, Then, 
     ThenGoalPath = goal_path_add_at_end(GoalPath, step_ite_then),
     ElseGoalPath = goal_path_add_at_end(GoalPath, step_ite_else),
     VarUseType = StaticInfo ^ fui_var_use,
-    (
-        VarUseType = var_use_consumption,
         CostBeforeITE = !.CostSoFar,
         goal_var_first_use(CondGoalPath, Cond, StaticInfo,
             CostBeforeITE, CostAfterCond, CondFoundFirstUse),
@@ -602,22 +834,7 @@ ite_var_first_use(GoalPath, Cond, Then, 
             CostAfterCond, CostAfterElse, ElseFoundFirstUse),
         weighted_average(Weights, [CostAfterThen, CostAfterElse],
             CostAfterITE),
-        !:CostSoFar = CostAfterITE
-    ;
-        ( VarUseType = var_use_production
-        ; VarUseType = var_use_other
-        ),
-        CostAfterITE = !.CostSoFar,
-        goal_var_first_use(ThenGoalPath, Then, StaticInfo,
-            CostAfterITE, CostAfterThen, ThenFoundFirstUse),
-        goal_var_first_use(ElseGoalPath, Else, StaticInfo,
-            CostAfterITE, CostAfterElse, ElseFoundFirstUse),
-        weighted_average(Weights, [CostAfterThen, CostAfterElse],
-            CostAfterCond),
-        goal_var_first_use(CondGoalPath, Cond, StaticInfo,
-            CostAfterCond, CostBeforeITE, CondFoundFirstUse),
-        !:CostSoFar = CostBeforeITE
-    ),
+    !:CostSoFar = CostAfterITE,
     (
         CondFoundFirstUse = found_first_use(_),
         FoundFirstUse = CondFoundFirstUse
@@ -660,149 +877,42 @@ ffu_to_float(_, found_first_use(CostBefo
 
 %----------------------------------------------------------------------------%
 
-    % proc_var_first_use(Deep, PSPtr, N, VarUseType, CallStack, VarUseInfo).
-    %
-    % Find the first use of the Nth argument of the procedure given by PSPtr.
-    %
-:- pred proc_var_first_use(deep::in, proc_static_ptr::in, int::in,
-    var_use_type::in, set(proc_static_ptr)::in, var_use_info::out) is det.
-
-proc_var_first_use(Deep, PSPtr, ArgNum, VarUseType, CallStack, VarUseInfo) :-
-    generic_vars_first_use(head_var_by_pos(ArgNum), Deep, PSPtr, CallStack,
-        MaybeProcVarUseInfo),
-    (
-        MaybeProcVarUseInfo = ok(proc_var_use_dump_info(_, VarUseInfos)),
-        ( VarUseInfos = [VarUseInfoPrime] ->
-            VarUseInfo = VarUseInfoPrime
-        ;
-            error(this_file ++
-                "Expecting exactly one result in proc_var_first_use")
-        )
-    ;
-        MaybeProcVarUseInfo = error(_),
-        % Some errors can be caused by trying to look up procedures that can't
-        % be found. For example float.round_to_int is defined using foreign
-        % code, when it gets inlined into another predicate the proc static
-        % pointer points to the foreign code which can't be looked up even
-        % though it uses a 'plain_call' call site.
-        % Return a pessimistic default here.
-        pessimistic_var_use_info(VarUseType, VarUseInfo)
-    ),
-    trace [compile_time(flag("debug_first_var_use")), io(!IO)] (
-        io.format("Trace: prog_var_first_use: %s\n",
-            [s(string(PSPtr))], !IO)
-    ).
-
-:- pred head_var_by_pos(int::in, list(head_var_rep)::in,
-    list(var_rep)::out, list(var_use_type)::out) is det.
-
-head_var_by_pos(ArgPos, HeadVars, [Var], [VarUseType]) :-
-    list.index1_det(HeadVars, ArgPos, HeadVar),
-    HeadVar = head_var_rep(Var, VarMode),
-    var_mode_to_var_use(VarMode, VarUseType).
-
-head_vars_all(HeadVars, Vars, VarUseTypes) :-
-    list.map2((pred(HeadVar::in, Var::out, VarUseType::out) is det :-
-            HeadVar = head_var_rep(Var, VarMode),
-            var_mode_to_var_use(VarMode, VarUseType)
-        ), HeadVars, Vars, VarUseTypes).
-
-var_mode_to_var_use(var_mode_rep(InitialInst, FinalInst), VarUseType) :-
-    (
-        InitialInst = ir_ground_rep,
-        FinalInst = ir_ground_rep
-    ->
-        VarUseType = var_use_consumption
-    ;
-        InitialInst = ir_free_rep,
-        FinalInst = ir_ground_rep
-    ->
-        VarUseType = var_use_production
-    ;
-        VarUseType = var_use_other
-    ).
-
-pessimistic_var_use_info(VarUseType, VarUseInfo) :-
-    (
-        VarUseType = var_use_consumption,
-        CostUntilUse = cost_since_proc_start(0.0)
-    ;
-        ( VarUseType = var_use_production
-        ; VarUseType = var_use_other
-        ),
-        CostUntilUse = cost_before_proc_end(0.0)
-    ),
-    VarUseInfo = var_use_info(CostUntilUse, VarUseType).
-
-    % Perform the var_first_use for the vars returned by the closure.
-    %
-generic_vars_first_use(VarsPred, Deep, PSPtr, CallStack, MaybeResult) :-
-    create_proc_report(Deep, PSPtr, MaybeProcReport),
-    (
-        MaybeProcReport = ok(ProcReport),
-        create_static_procrep_coverage_report(Deep, PSPtr, MaybeProcRepCoverage),
-        (
-            MaybeProcRepCoverage = ok(ProcRepCoverageInfo),
-            ProcRepCoverageInfo = procrep_coverage_info(_, ProcRep),
-            ProcDefnRep = ProcRep ^ pr_defn,
-            HeadVars = ProcDefnRep ^ pdr_head_vars,
-            VarsPred(HeadVars, Vars, VarUseTypes),
-            ProcReport = proc_report(ProcSummary, CallSiteSummaries),
-            MaybeTotal = ProcSummary ^ perf_row_maybe_total,
-            (
-                MaybeTotal = yes(RowData)
-            ;
-                MaybeTotal = no,
-                RowData = ProcSummary ^ perf_row_self
-            ),
-            ProcAverageCost = RowData ^ perf_row_callseqs_percall,
-            GoalRep = ProcDefnRep ^ pdr_goal,
-            list.foldl((pred(CSS::in, Map0::in, Map::out) is det :-
-                    GoalPath = CSS ^ csf_summary_perf ^ perf_row_subject
-                        ^ csdesc_goal_path,
-                    map.det_insert(Map0, GoalPath, CSS, Map)
-                ), CallSiteSummaries, map.init, CallSiteMap),
-            list.map_corresponding(goal_var_first_use_wrapper(Deep, CallStack,
-                CallSiteMap, GoalRep), Vars, VarUseTypes, VarUseInfos),
-            MaybeResult =
-                ok(proc_var_use_dump_info(ProcAverageCost, VarUseInfos))
-        ;
-            MaybeProcRepCoverage = error(Error),
-            MaybeResult = error(Error)
-        )
-    ;
-        MaybeProcReport = error(Error),
-        MaybeResult = error(Error)
-    ).
-
-:- pred goal_var_first_use_wrapper(deep::in, set(proc_static_ptr)::in,
-    map(goal_path, call_site_perf)::in, goal_rep(coverage_info)::in,
-    var_rep::in, var_use_type::in, var_use_info::out) is det.
+:- pred goal_var_first_use_wrapper(deep::in, clique_ptr::in,
+    set(proc_dynamic_ptr)::in, 
+    map(goal_path, cost_and_callees)::in, map(goal_path, cs_cost_csq)::in, 
+    recursion_type::in, maybe(float)::in,
+    goal_rep(coverage_info)::in, float::in, var_rep::in, 
+    var_use_type::in, var_use_info::out) is det.
 
-goal_var_first_use_wrapper(Deep, CallStack, CallSiteMap, Goal, Var,
+goal_var_first_use_wrapper(Deep, CliquePtr, CallStack, CallSiteMap,
+        RecursiveCallSiteMap, RT, MaybeCurDepth, Goal, ProcCost, Var,
         VarUseType, VarUseInfo) :-
     goal_var_first_use(empty_goal_path, Goal,
-        var_first_use_static_info(Deep, CallSiteMap, Var, VarUseType,
-            CallStack),
+        var_first_use_static_info(Deep, CliquePtr, CallSiteMap,
+            RecursiveCallSiteMap, Var, VarUseType, CallStack, RT,
+            MaybeCurDepth),
         0.0, _Cost, FoundFirstUse),
     (
-        FoundFirstUse = found_first_use(CostUntilUseRaw),
+        FoundFirstUse = found_first_use(CostUntilUse),
+        VarUseInfo = var_use_info(CostUntilUse, ProcCost, VarUseType)
+    ;
+        FoundFirstUse = have_not_found_first_use,
+        % If the first use has not been found then:
+        %  a) for productions, they must have been produced, this is an error.
+        %  b) For consumptions. the compiler will insert a wait so any calls
+        %     following this one can assume that a wait has already been
+        %     performed.
         (
-            VarUseType = var_use_consumption,
-            CostUntilUse = cost_since_proc_start(CostUntilUseRaw)
+            VarUseType = var_use_production,
+            error(this_file ++ 
+                ": Goal did not produce a variable that it should have")
         ;
-            ( VarUseType = var_use_production
-            ; VarUseType = var_use_other
-            ),
-            CostUntilUse = cost_before_proc_end(CostUntilUseRaw)
-        ),
-        VarUseInfo = var_use_info(CostUntilUse, VarUseType)
+            VarUseType = var_use_consumption,
+            VarUseInfo = var_use_info(ProcCost, ProcCost, VarUseType)
     ;
-        % If the first use has not been found, then use the average cost of the
-        % procedure as the cost before the first use, since the variable is
-        % never used.
-        FoundFirstUse = have_not_found_first_use,
-        pessimistic_var_use_info(VarUseType, VarUseInfo)
+            VarUseType = var_use_other,
+            pessimistic_var_use_info(VarUseType, ProcCost, VarUseInfo)
+        )
     ).
 
 %-----------------------------------------------------------------------------%
Index: library/Mercury.options
===================================================================
RCS file: /home/mercury1/repository/mercury/library/Mercury.options,v
retrieving revision 1.33
diff -u -p -b -r1.33 Mercury.options
--- library/Mercury.options	12 Jun 2009 05:00:26 -0000	1.33
+++ library/Mercury.options	6 Oct 2010 23:43:20 -0000
@@ -89,3 +89,8 @@ MCFLAGS-thread.semaphore += --no-local-t
 
 # Work around a problem in the HiPE compiler (as of Erlang R11B5).
 MCFLAGS-bitmap += --no-erlang-native-code
+
+# Work around a warning for termination analysis of the user defined equality
+# and comparison code for lazy values.
+MCFLAGS-lazy += --no-halt-at-warn
+
Index: library/lazy.m
===================================================================
RCS file: library/lazy.m
diff -N library/lazy.m
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ library/lazy.m	7 Oct 2010 00:09:47 -0000
@@ -0,0 +1,160 @@
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+%-----------------------------------------------------------------------------%
+% Copyright (C) 1999, 2006, 2009-2010 The University of Melbourne.
+% This file may only be copied under the terms of the GNU Library General
+% Public License - see the file COPYING.LIB in the Mercury distribution.
+%-----------------------------------------------------------------------------%
+%
+% lazy.m - provides support for optional explicit lazy evaluation.
+%
+% Author: fjh, pbone.
+% Stability: medium.
+%
+% This module provides the data type `lazy(T)' and the functions `val',
+% `delay', and `force', which can be used to emulate lazy evaluation.
+%
+% A field within a data structure can be made lazy by wrapping it within a lazy
+% type.  Or a lazy data-structure can be implemented, for example:
+% 
+% :- type lazy_list(T)
+%     --->    lazy_list(
+%                 lazy(list_cell(T))
+%             ).
+% 
+% :- type list_cell(T)
+%     --->    cons(T, lazy_list(T))
+%     ;       nil.
+%
+% Note that this makes every list cell lazy, whereas:
+%
+%   lazy(list(T)) 
+%
+% uses only one thunk for the entire list. And:
+%
+%   list(lazy(T)) 
+%
+% uses one thunk for every element, but the list's structure is not lazy.
+%
+%-----------------------------------------------------------------------------%
+
+:- module lazy.
+:- interface.
+
+    % A lazy(T) is a value of type T which will only be evaluated on
+    % demand.
+    %
+:- type lazy(T).
+
+    % Convert a value from type T to lazy(T)
+    %
+:- func val(T) = lazy(T).
+
+    % Construct a lazily-evaluated lazy(T) from a closure
+    %
+:- func delay((func) = T) = lazy(T).
+
+    % Force the evaluation of a lazy(T), and return the result as type T.
+    % Note that if the type T may itself contains subterms of type lazy(T),
+    % as is the case when T is a recursive type like the lazy_list(T) type
+    % defined in lazy_list.m, those subterms will not be evaluated --
+    % force/1 only forces evaluation of the lazy/1 term at the top level.
+    %
+    % A second call to force will not re-evaluate the lazy expression, it will
+    % simply return T.
+    %
+:- func force(lazy(T)) = T.
+
+    % Test lazy values for equality.
+    %
+:- pred equal_values(lazy(T)::in, lazy(T)::in) is semidet.
+
+:- pred compare_values(comparison_result::uo, lazy(T)::in, lazy(T)::in) is det.
+
+%-----------------------------------------------------------------------------%
+%
+% The declarative semantics of the above constructs are given by the
+% following equations:
+%
+%   val(X) = delay((func) = X).
+%
+%   force(delay(F)) = apply(F).
+%
+% The operational semantics satisfy the following:
+%
+% - val/1 and delay/1 both take O(1) time and use O(1) additional space.
+%   In particular, delay/1 does not evaluate its argument using apply/1.
+%
+% - When force/1 is first called for a given term, it uses apply/1 to
+%   evaluate the term, and then saves the result computed by destructively
+%   modifying its argument; subsequent calls to force/1 on the same term
+%   will return the same result.  So the time to evaluate force(X), where
+%   X = delay(F), is O(the time to evaluate apply(F)) for the first call,
+%   and O(1) time for subsequent calls.
+%
+% - Equality on values of type lazy(T) is implemented by calling force/1
+%   on both arguments and comparing the results.  So if X and Y have type
+%   lazy(T), and both X and Y are ground, then the time to evaluate X = Y
+%   is O(the time to evaluate (X1 = force(X)) + the time to evaluate
+%   (Y1 = force(Y)) + the time to unify X1 and Y1).
+%
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module mutvar.
+
+:- type lazy(T)
+    --->    lazy(mutvar(lazy_state(T)))
+    where equality is equal_values,
+          comparison is compare_values.
+
+    % Note that we use a user-defined equality predicate to ensure
+    % that unifying two lazy(T) values will do the right thing.
+    %
+:- type lazy_state(T)
+    --->    value(T)
+    ;       closure((func) = T).
+
+%-----------------------------------------------------------------------------%
+
+equal_values(X, Y) :-
+    force(X) = force(Y).
+
+compare_values(R, X, Y) :-
+    compare(R, force(X), force(Y)).
+
+%-----------------------------------------------------------------------------%
+
+val(X) = lazy(Mutvar) :-
+    promise_pure (
+        impure new_mutvar(value(X), Mutvar)
+    ).
+
+delay(F) = lazy(Mutvar) :-
+    promise_pure (
+        impure new_mutvar(closure(F), Mutvar)
+    ).
+
+%-----------------------------------------------------------------------------%
+
+force(Lazy) = Value :-
+    % The promise_equivalent_solutions scope is needed to tell the compiler
+    % that force will return equal answers given arguments that are equal
+    % but that have different representations.
+    promise_equivalent_solutions [Mutvar] (
+        Lazy = lazy(Mutvar)
+    ),
+    promise_pure (
+        impure get_mutvar(Mutvar, State),
+        (
+            State = value(Value)
+        ;
+            State = closure(Thunk),
+            Value = apply(Thunk),
+            impure set_mutvar(Mutvar, value(Value))
+        )
+    ).
+
+%-----------------------------------------------------------------------------%
Index: library/library.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/library.m,v
retrieving revision 1.126
diff -u -p -b -r1.126 library.m
--- library/library.m	16 Sep 2010 00:39:10 -0000	1.126
+++ library/library.m	7 Oct 2010 01:56:32 -0000
@@ -79,6 +79,7 @@
 :- import_module int.
 :- import_module integer.
 :- import_module io.
+:- import_module lazy.
 :- import_module lexer.
 :- import_module list.
 :- import_module map.
@@ -251,6 +252,7 @@ mercury_std_library_module("injection").
 mercury_std_library_module("int").
 mercury_std_library_module("integer").
 mercury_std_library_module("io").
+mercury_std_library_module("lazy").
 mercury_std_library_module("lexer").
 mercury_std_library_module("library").
 mercury_std_library_module("list").
Index: library/list.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/list.m,v
retrieving revision 1.194
diff -u -p -b -r1.194 list.m
--- library/list.m	16 Sep 2010 00:39:10 -0000	1.194
+++ library/list.m	6 Oct 2010 23:29:36 -0000
@@ -162,6 +162,15 @@
     %
 :- pred list.member(T::out, list(T)::in, list(T)::out) is nondet.
 
+    % list.member_index0(Elem, List, Index).
+    %
+    % True iff `List' contains `Elem' at the zero-based index `Index'.
+    %
+:- pred list.member_index0(T, list(T), int).
+:- mode list.member_index0(in, in, in) is semidet.
+:- mode list.member_index0(in, in, out) is nondet.
+:- mode list.member_index0(out, in, out) is nondet.
+
     % list.contains(List, Elem) iff list.member(Elem, List).
     % Sometimes you need the arguments in this order, because you want to
     % construct a closure with only the list.
@@ -1708,6 +1717,10 @@ list.member(Element, List, SubList) :-
     SubList = [Element | _],
     list.append(_, SubList, List).
 
+list.member_index0(X, [X | _], 0).
+list.member_index0(X, [_ | Xs], Index + 1) :-
+    list.member_index0(X, Xs, Index).
+
 list.contains(List, Elem) :-
     list.member(Elem, List).
 
Index: library/maybe.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/maybe.m,v
retrieving revision 1.4
diff -u -p -b -r1.4 maybe.m
--- library/maybe.m	9 Jan 2010 05:49:41 -0000	1.4
+++ library/maybe.m	6 Oct 2010 23:29:36 -0000
@@ -28,6 +28,9 @@
     --->    no
     ;       yes(I).
 
+:- inst maybe_yes(I)
+    --->    yes(I).
+
 :- type maybe_error
     --->    ok
     ;       error(string).
@@ -40,6 +43,9 @@
     --->    ok(I)
     ;       error(ground).
 
+:- inst maybe_error_ok(I)
+    --->    ok(I).
+
     % map_maybe(P, yes(Value0), yes(Value)) :- P(Value, Value).
     % map_maybe(_, no, no).
     %
Index: mdbcomp/feedback.automatic_parallelism.m
===================================================================
RCS file: /home/mercury1/repository/mercury/mdbcomp/feedback.automatic_parallelism.m,v
retrieving revision 1.2
diff -u -p -b -r1.2 feedback.automatic_parallelism.m
--- mdbcomp/feedback.automatic_parallelism.m	24 Aug 2010 06:47:03 -0000	1.2
+++ mdbcomp/feedback.automatic_parallelism.m	6 Oct 2010 23:29:36 -0000
@@ -21,7 +21,6 @@
 
 :- import_module mdbcomp.program_representation.
 
-:- import_module int.
 :- import_module list.
 :- import_module set.
 :- import_module string.
@@ -281,16 +280,18 @@
     %
 :- func parallel_exec_metrics_get_time_saving(parallel_exec_metrics) = float.
 
+    % The amount of time spent 'on cpu', (seq time plus non-dead overheads).
+    %
+:- func parallel_exec_metrics_get_cpu_time(parallel_exec_metrics) = float.
+
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
 :- implementation.
 
-:- import_module bool.
 :- import_module exception.
 :- import_module float.
 :- import_module map.
-:- import_module maybe.
 :- import_module require.
 :- import_module svmap.
 :- import_module unit.
@@ -306,6 +307,10 @@ parallel_exec_metrics_get_time_saving(PE
     SeqTime = PEM ^ pem_seq_time,
     ParTime = PEM ^ pem_par_time.
 
+parallel_exec_metrics_get_cpu_time(PEM) = SeqTime + Overheads :-
+    SeqTime = PEM ^ pem_seq_time,
+    Overheads = PEM ^ pem_par_overheads.
+
 %-----------------------------------------------------------------------------%
 %
 % Helper predicates for the candidate parallel conjunctions type.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 490 bytes
Desc: Digital signature
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20101007/3078fee7/attachment.sig>


More information about the reviews mailing list