[m-rev.] for review: implicit_parallelism and distance_granularity (new modules)

Julien Fischer juliensf at csse.unimelb.edu.au
Tue Nov 28 16:11:01 AEDT 2006


Hi Jerome,

Here are some initial comments,


On Tue, 28 Nov 2006, Jerome Tannier wrote:

> Estimated hours taken: 60
> Branches: main
>
> Two new modules for review : implicit_parallelism (already submitted) and
> distance_granularity. I have corrected a few mistakes since 
> implicit_parallelism was
> submitted the first time.
>

The log message needs more detail on what this change actually adds,
e.g.

 	This change adds two new passes.  The first, implicit parallism,
 	uses deep profiling feedback information, generate by the
 	mdprof_feedback, to introduce parallel conjunction where ...


Since this is a fairly large change you should go into a bit of detail.



> compiler/dep_par_conj.m:
>        Moved find_shared_variables in the interface (needed for 
> implicit_parallelism.m).

There should be a space between the entry for each file.

> compiler/distance_granularity.m:
> 	New module which controls the granularity using the distance metric.
> compiler/hlds_pred.m:
> 	Add a predicate to set the arity of a predicate (needed for 
> distance_granularity).
> compiler/implicit_parallelism.m:
>        New module which reads the profiling feedback file and decides where
>        parallelism should be introduced.
> compiler/mercury_compile.m:
>        Add the calls to apply implicit parallelism and to control 
> granularity using
> 	distance metric.
> compiler/options:
>        Add implicit-parallelism, feedback-file and distance-granularity 
> options.
> compiler/pred_table.m:
> 	Add a predicate to get the next pred_id available (needed for
> 	distance_granularity).
> compiler/transform_hlds.m:
>        Add transform_paralellism and distance_granularity in the imported 
> modules.

Actually you're including them, not importing them.


> deep_profiler/dump.m:
>        Add "all" option to dump everything out of the Deep.data file.
> deep_profiler/mdprof_feedback.m:
>        Rename distribution to measure.
>        Add handling of dump_stages and dump_options.
>        Correct the way the list of CSS is built (elems were put in the wrong 
> order).
>
>
...

> Index: compiler/implicit_parallelism.m
> ===================================================================
> RCS file: compiler/implicit_parallelism.m
> diff -N compiler/implicit_parallelism.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ compiler/implicit_parallelism.m	24 Nov 2006 01:16:44 -0000
> @@ -0,0 +1,935 @@
> +%-----------------------------------------------------------------------------%
> +% vim: ft=mercury ts=4 sw=4 et
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2006 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% Author: tannier.
> +%
> +% This module reads the profiling feedback file generated by the 
> mdprof_feedback
> +% module and decides where parallelism should be used (implicit 
> parallelism).
> +%
> +%TODO - Once a call which has to be parallelized is found, search forward 
> AND
> +%       backward for the closet goal to be parallelized/parallel conjunction 
> and
> +%       determine which side is the best (on the basis of the number of 
> shared
> +%       variables).
> +%
> +%-----------------------------------------------------------------------------%
> +
> +:- module transform_hlds.implicit_parallelism.
> +:- interface.
> +
> +:- import_module hlds.hlds_goal.
> +:- import_module hlds.hlds_module.
> +
> +:- import_module io.
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- pred apply_implicit_parallelism_transformation(module_info::in,
> +    module_info::out, string::in, io::di, io::uo) is det.
> +
> +    % Create a conjunction.
> +    %
> +:- pred create_conj(hlds_goal::in, hlds_goal::in, conj_type::in,
> +    hlds_goal::out) is det.
> +
> +
> +%-----------------------------------------------------------------------------%

...

> +    % Represent a call site static which has to be parallelized.
> +    %


I suggest renaming that type to: candidate_call_site.

> +:- type css_to_be_parallelized
> +    --->    css_to_be_parallelized(
> +                caller      :: string,
> +                slot_number :: int,
> +                kind        :: string,
> +                callee      :: string
> +            ).

(Please document what each of the fields means.)

This type needs more structure.  The `kind' field should definitely not
be a string.  There should be a type:

 	:- type call_site_kind
 		--->	csk_normal
 		;	csk_special
 		;	csk_higher_order
 		;	csk_method
 		;	csk_callback.

The parser should convert the values in the feedback file to values of
this kind.

...

> +apply_implicit_parallelism_transformation(!ModuleInfo, FeedbackFile, !IO) :-
> +    parse_feedback_file(FeedbackFile, MaybeCSSListToBeParallelized, !IO),

I suggest: s/MaybeCSSListToBeParallelized/MaybeCandidateCallSites/

> +    (
> +        MaybeCSSListToBeParallelized = error(Err),
> +        io.stderr_stream(Stderr, !IO),
> +        io.write_string(Stderr, Err ++ "\n", !IO)
> +    ;
> +        MaybeCSSListToBeParallelized = ok(CSSListToBeParallelized),
> +        module_info_predids(!.ModuleInfo, PredIds),
> +        process_preds_for_implicit_parallelism(PredIds,
> +            CSSListToBeParallelized, !ModuleInfo)
> +    ).
> +
> +    % Process predicates for implicit parallelism.
> +    %
> +:- pred process_preds_for_implicit_parallelism(list(pred_id)::in,
> +    list(css_to_be_parallelized)::in, module_info::in, module_info::out)
> +    is det.
> +
> +process_preds_for_implicit_parallelism([], _CSSListToBeParallelized,
> +        !ModuleInfo).
> +process_preds_for_implicit_parallelism([PredId | PredIdList],
> +        CSSListToBeParallelized, !ModuleInfo) :-
> +    module_info_pred_info(!.ModuleInfo, PredId, PredInfo),
> +    ProcIds = pred_info_non_imported_procids(PredInfo),
> +    process_procs_for_implicit_parallelism(PredId, ProcIds,
> +        CSSListToBeParallelized, !ModuleInfo),
> +    process_preds_for_implicit_parallelism(PredIdList,
> +        CSSListToBeParallelized, !ModuleInfo).
> +
> +    % Process procedures for implicit parallelism.
> +    %
> +:- pred process_procs_for_implicit_parallelism(pred_id::in,
> +    list(proc_id)::in, list(css_to_be_parallelized)::in,
> +    module_info::in, module_info::out) is det.
> +
> +process_procs_for_implicit_parallelism(_PredId, [],
> +        _CSSListToBeParallelized, !ModuleInfo).
> +process_procs_for_implicit_parallelism(PredId, [ProcId | ProcIds],
> +        CSSListToBeParallelized, !ModuleInfo) :-
> +    module_info_pred_proc_info(!.ModuleInfo, PredId, ProcId,
> +        PredInfo0, ProcInfo0),
> +    % Initialize the counter for the slot number.
> +    Counter = counter.init(0),

I suggest: s/Counter/SiteNumCounter/
(and similarly throughout.)


> +    pred_proc_id_to_raw_id(PredInfo0, ProcId, CallerRawId),
> +    get_callees_feedback(CallerRawId, CSSListToBeParallelized, [],
> +        CalleesList),

I suggest: s/CalleesList/CallSites/

> +    list.length(CalleesList, CalleesListLength),

and s/CalleesListLength/NumCallSites/

In general I would change the variable naming to refer to CallSites
rather than Callees throughout.


> +    ( CalleesListLength = 0 ->
> +        % No calls to be parallelized in this procedure.
> +        process_procs_for_implicit_parallelism(PredId, ProcIds,
> +            CSSListToBeParallelized, !ModuleInfo)
> +    ;
> +        proc_info_get_goal(ProcInfo0, Body0),
> +        process_goal_for_implicit_parallelism(Body0, Body, ProcInfo0,
> +            !ModuleInfo, no, _, 0, _, CalleesList, _, Counter, _),
> +        proc_info_set_goal(Body, ProcInfo0, ProcInfo1),
> +        pred_info_get_markers(PredInfo0, Markers0),
> +        add_marker(marker_may_have_parallel_conj, Markers0, Markers),
> +        pred_info_set_markers(Markers, PredInfo0, PredInfo1),

See my comments about the changes to mercury_compile.m below.

> +        requantify_proc(ProcInfo1, ProcInfo2),
> +        RecomputeAtomic = no,
> +        recompute_instmap_delta_proc(RecomputeAtomic, ProcInfo2, ProcInfo,
> +            !ModuleInfo),
> +        pred_info_set_proc_info(ProcId, ProcInfo, PredInfo1, PredInfo),
> +        module_info_set_pred_info(PredId, PredInfo, !ModuleInfo),
> +        process_procs_for_implicit_parallelism(PredId, ProcIds,
> +            CSSListToBeParallelized, !ModuleInfo)
> +    ).
> +
> +    % By using the feedback file, build a list of css_to_be_parallelized 
> whose
> +    % caller is equal to the first parameter.
> +    %

The comment doesn't make the function of this procedure clear.  What the
procedure does is filter the list of call site information from the
feedback file so that the resulting list only contains those call sites
that belong to the first argument, e.g. the caller.

> +:- pred get_callees_feedback(string::in, list(css_to_be_parallelized)::in,
> +    list(css_to_be_parallelized)::in, list(css_to_be_parallelized)::out) is 
> det.
> +
> +get_callees_feedback(_Caller, [], !ResultAcc).
> +get_callees_feedback(Caller, [CSSToBeParallelized | 
> CSSListToBeParallelized],
> +        !ResultAcc) :-
> +    CSSToBeParallelized = css_to_be_parallelized(CSSCaller, _, _, _),
> +    ( Caller = CSSCaller ->
> +        !:ResultAcc = [ CSSToBeParallelized | !.ResultAcc ],
> +        get_callees_feedback(Caller, CSSListToBeParallelized, !ResultAcc)
> +    ;
> +        get_callees_feedback(Caller, CSSListToBeParallelized, !ResultAcc)
> +    ).
> +
> +    % Process a goal for implicit parallelism.
> +    % MaybeConj is the conjunction which contains HLDSGoal.
> +    %
> +:- pred process_goal_for_implicit_parallelism(hlds_goal::in, hlds_goal::out,
> +    proc_info::in, module_info::in, module_info::out,
> +    maybe(hlds_goal_expr)::in, maybe(hlds_goal_expr)::out, int ::in, 
> int::out,
> +    list(css_to_be_parallelized)::in, list(css_to_be_parallelized)::out,
> +    counter::in, counter::out) is det.
> +
> +process_goal_for_implicit_parallelism(!HLDSGoal, ProcInfo, !ModuleInfo,
> +    !MaybeConj, !IndexInConj, !CalleeListToBeParallelized, !Counter) :-
> +    !.HLDSGoal = HLDSGoalExpr0 - HLDSGoalInfo,

I suggest:

 	s/HLDSGoal/Goal/
 	s/HLDSGoalExpr/GoalExpr/
 	s/HLDSGoalInfo/GoalInfo/

since the latter are what is commonly used throughout the compiler.


> +    (
> +        HLDSGoalExpr0 = unify(_, _, _, _, _),
> +        increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +    ;
> +        HLDSGoalExpr0 = plain_call(_, _, _, _, _, _),
> +        process_call_for_implicit_parallelism(!.HLDSGoal, ProcInfo, 
> !ModuleInfo,
> +            !IndexInConj, !MaybeConj, !CalleeListToBeParallelized, !Counter)
> +        % We deal with the index in the conjunction in
> +        % process_call_for_implicit_parallelism.
> +    ;
> +        HLDSGoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _),
> +        process_call_for_implicit_parallelism(!.HLDSGoal, ProcInfo, 
> !ModuleInfo,
> +            !IndexInConj, !MaybeConj, !CalleeListToBeParallelized, !Counter)
> +    ;
> +        HLDSGoalExpr0 = generic_call(Details, _, _, _),
> +        (
> +            Details = higher_order(_, _, _, _),
> +            process_call_for_implicit_parallelism(!.HLDSGoal, ProcInfo,
> +                !ModuleInfo, !IndexInConj, !MaybeConj,
> +                !CalleeListToBeParallelized, !Counter)
> +        ;
> +            Details = class_method(_, _, _, _),
> +            process_call_for_implicit_parallelism(!.HLDSGoal, ProcInfo,
> +                !ModuleInfo, !IndexInConj, !MaybeConj,
> +                !CalleeListToBeParallelized, !Counter)
> +        ;
> +            Details = event_call(_),
> +            increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +        ;
> +            Details = cast(_),
> +            increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +        )
> +    ;
> +        % No distinction is made between plain conjunctions and parallel
> +        % conjunctions. We have to process the parallel conjunction for the
> +        % slot number.
> +        HLDSGoalExpr0 = conj(_, _),
> +        process_conj_for_implicit_parallelism(HLDSGoalExpr0, HLDSGoalExpr, 
> 1,
> +            ProcInfo, !ModuleInfo, !CalleeListToBeParallelized, !Counter),
> +        % A plain conjunction will never be contained in an other plain
> +        % conjunction. As for parallel conjunctions, they wont

s/wont/will not/

> +        % be modified. Therefore, incrementing the index suffices (no need 
> to
> +        % call update_conj_and_index).
> +        !:HLDSGoal = HLDSGoalExpr - HLDSGoalInfo,
> +        increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +    ;
> +        HLDSGoalExpr0 = disj(HLDSGoals0),
> +        process_disj_for_implicit_parallelism(HLDSGoals0, [], HLDSGoals,
> +            ProcInfo, !ModuleInfo, !CalleeListToBeParallelized, !Counter),
> +        GoalProcessed = disj(HLDSGoals) - HLDSGoalInfo,
> +        update_conj_and_index(!MaybeConj, GoalProcessed, !IndexInConj),
> +        % If we are not in a conjunction, then we need to return the 
> modified
> +        % value of HLDSGoal. In we are in a conjunction, that information is 
> not
> +        % read (see process_conj_for_implicit_parallelism).
> +        !:HLDSGoal = GoalProcessed
> +    ;
> +        HLDSGoalExpr0 = switch(Var, CanFail, Cases0),
> +        process_switch_cases_for_implicit_parallelism(Cases0, [], Cases,
> +            ProcInfo, !ModuleInfo, !CalleeListToBeParallelized, !Counter),
> +        GoalProcessed = switch(Var, CanFail, Cases) - HLDSGoalInfo,
> +        update_conj_and_index(!MaybeConj, GoalProcessed, !IndexInConj),
> +        !:HLDSGoal = GoalProcessed
> +    ;
> +        HLDSGoalExpr0 = negation(HLDSGoal0),
> +        process_goal_for_implicit_parallelism(HLDSGoal0, HLDSGoal, ProcInfo,
> +            !ModuleInfo, !MaybeConj, !IndexInConj, 
> !CalleeListToBeParallelized,
> +            !Counter),
> +        GoalProcessed = negation(HLDSGoal) - HLDSGoalInfo,
> +        update_conj_and_index(!MaybeConj, GoalProcessed, !IndexInConj),
> +        !:HLDSGoal = GoalProcessed
> +    ;
> +        HLDSGoalExpr0 = scope(_, _),
> +        increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +    ;
> +        HLDSGoalExpr0 = if_then_else(Vars, If0, Then0, Else0),
> +        % 0 is the default value when we are not in a conjunction (in this 
> case
> +        % an if then else).
> +        process_goal_for_implicit_parallelism(If0, If, ProcInfo, 
> !ModuleInfo,
> +            no, _, 0, _, !CalleeListToBeParallelized, !Counter),
> +        process_goal_for_implicit_parallelism(Then0, Then, ProcInfo, 
> !ModuleInfo
> +            , no, _, 0, _, !CalleeListToBeParallelized, !Counter),
> +        process_goal_for_implicit_parallelism(Else0, Else, ProcInfo, 
> !ModuleInfo
> +            , no, _, 0, _, !CalleeListToBeParallelized, !Counter),
> +        GoalProcessed = if_then_else(Vars, If, Then, Else) - HLDSGoalInfo,
> +        update_conj_and_index(!MaybeConj, GoalProcessed, !IndexInConj),
> +        !:HLDSGoal = GoalProcessed
> +    ;
> +        HLDSGoalExpr0 = shorthand(_),
> +        increment_index_if_in_conj(!.MaybeConj, !IndexInConj)
> +    ).
> +
> +    % Increment the index if we are in a conjunction.
> +    %
> +:- pred increment_index_if_in_conj(maybe(hlds_goal_expr)::in, int::in, 
> int::out)
> +    is det.
> +
> +increment_index_if_in_conj(MaybeConj, !IndexInConj) :-
> +    (
> +        MaybeConj = yes(_),
> +        !:IndexInConj = !.IndexInConj + 1
> +    ;
> +        MaybeConj = no
> +    ).
> +
> +    % Process a call for implicit parallelism.
> +    %
> +:- pred process_call_for_implicit_parallelism(hlds_goal::in, proc_info::in,
> +    module_info::in, module_info::out, int::in, int::out,
> +    maybe(hlds_goal_expr)::in, maybe(hlds_goal_expr)::out,
> +    list(css_to_be_parallelized)::in, list(css_to_be_parallelized)::out,
> +    counter::in, counter::out) is det.
> +
> +process_call_for_implicit_parallelism(Call, ProcInfo, !ModuleInfo, 
> !IndexInConj
> +    , !MaybeConj, !CalleeListToBeParallelized, !Counter) :-
> +    counter.allocate(SlotNumber, !Counter),
> +    get_call_kind_and_callee(!.ModuleInfo, Call, Kind, CalleeRawId),
> +    ( !.MaybeConj = yes(Conj0), Conj0 = conj(plain_conj, ConjGoals0)
> +    ->
> +        (is_in_css_list_to_be_parallelized(Kind, SlotNumber, CalleeRawId,
> +            !.CalleeListToBeParallelized, [], !:CalleeListToBeParallelized)
> +        ->
> +            ( build_goals_surrounded_by_calls_to_be_parallelized(ConjGoals0,
> +                !.ModuleInfo, [Call], Goals, !.IndexInConj + 1, End, 
> !Counter,
> +                !CalleeListToBeParallelized)
> +            ->
> +                parallelize_calls(Goals, !.IndexInConj, End, Conj0, Conj,
> +                    ProcInfo, !ModuleInfo),
> +                !:IndexInConj = End,
> +                !:MaybeConj = yes(Conj)
> +            ;
> +                % The next call is not in the feedback file or we've hit a
> +                % plain conjunction/disjunction/switch/if then else.
> +                !:IndexInConj = !.IndexInConj + 1
> +            )
> +        ;
> +            % Not to be parallelized.
> +            !:IndexInConj = !.IndexInConj + 1
> +        )
> +    ;
> +        % Call is not in a conjunction or the call is already in a parallel
> +        % conjunction.
> +        true
> +    ).
> +
> +    % Give the raw id (the same as in the deep profiler) of a callee 
> contained
> +    % in a call.
> +    %
> +:- pred get_call_kind_and_callee(module_info::in, hlds_goal::in, 
> string::out,
> +    string::out) is det.
> +
> +get_call_kind_and_callee(ModuleInfo, Call, Kind, CalleeRawId) :-
> +    GoalExpr = fst(Call),
> +    ( GoalExpr = plain_call(PredId, ProcId, _, _, _, _)
> +    ->
> +        module_info_pred_proc_info(ModuleInfo, PredId, ProcId,
> +            PredInfo, _),
> +        pred_proc_id_to_raw_id(PredInfo, ProcId, CalleeRawId),
> +        Kind = "normal_call"
> +    ;
> +        ( GoalExpr = call_foreign_proc(_, PredId, ProcId, _, _, _, _)
> +        ->
> +            module_info_pred_proc_info(ModuleInfo, PredId, ProcId,
> +                PredInfo, _),
> +            pred_proc_id_to_raw_id(PredInfo, ProcId, CalleeRawId),
> +            Kind = "special_call"
> +        ;
> +            ( GoalExpr = generic_call(Details, _, _, _)
> +            ->
> +                (
> +                    Details = higher_order(_, _, _, _),
> +                    CalleeRawId = "",
> +                    Kind = "higher_order_call"
> +                ;
> +                    Details = class_method(_, _, _, _),
> +                    CalleeRawId = "",
> +                    Kind = "method_call"
> +                ;
> +                    Details = event_call(_),
> +                    error("get_call_kind_and_callee:: the call is an event" 
> ++
> +                        " call")

Code in the compiler should called unexpected/2 rather than error/1.
(unexpected/2 is defined in the compiler_util module.)

> +                ;
> +                    Details = cast(_),
> +                    error("get_call_kind_and_callee: the call is a cast")
> +                )
> +            ;
> +                error("get_call_kind_and_callee: not a call")
> +            )
> +        )
> +    ).
> +
> +    % Convert a pred_info and a proc_id to the raw procedure id (the same 
> used
> +    % in the deep profiler).
> +    %
> +:- pred pred_proc_id_to_raw_id(pred_info::in, proc_id::in, string::out) is 
> det.
> +
> +pred_proc_id_to_raw_id(PredInfo, ProcId, RawId) :-
> +    Module = pred_info_module(PredInfo),
> +    Name = pred_info_name(PredInfo),
> +    OrigArity = pred_info_orig_arity(PredInfo),
> +    IsPredOrFunc = pred_info_is_pred_or_func(PredInfo),
> +    ModuleString = sym_name_to_string(Module),
> +    ProcIdInt = proc_id_to_int(ProcId),
> +    RawId = string.append_list([ModuleString, ".", Name, "/",
> +        string.int_to_string(OrigArity),
> +        ( IsPredOrFunc = function -> "+1" ; ""), "-",
> +        string.from_int(ProcIdInt)]).
> +
> +    % Succeed if the caller, slot number and callee correspond to a
> +    % css_to_be_parallelized in the list given as a parameter.
> +    % Fail otherwise.
> +    %
> +:- pred is_in_css_list_to_be_parallelized(string::in, int::in, string::in,
> +    list(css_to_be_parallelized)::in,
> +    list(css_to_be_parallelized)::in, list(css_to_be_parallelized)::out)
> +    is semidet.
> +
> +is_in_css_list_to_be_parallelized(Kind, SlotNumber, CalleeRawId,
> +    CSSListToBeParallelized, !ResultAcc) :-
> +    (
> +        CSSListToBeParallelized = [],
> +        fail
> +    ;
> +        CSSListToBeParallelized = [ CSSToBeParallelized |
> +            CSSListToBeParallelized0],
> +        CSSToBeParallelized = css_to_be_parallelized(_, CSSSlotNumber, 
> CSSKind,
> +            CSSCallee),
> +        % =< because there is not a one to one correspondance with the 
> source
> +        % code. New calls might have been added by the previous stages of 
> the
> +        % compiler.
> +        ( CSSSlotNumber =< SlotNumber, CSSKind = Kind, CSSCallee = 
> CalleeRawId
> +        ->
> +            list.append(!.ResultAcc, CSSListToBeParallelized0, !:ResultAcc)
> +        ;
> +            list.append(!.ResultAcc, [CSSToBeParallelized], !:ResultAcc),
> +            is_in_css_list_to_be_parallelized(Kind, SlotNumber, CalleeRawId,
> +                CSSListToBeParallelized0, !ResultAcc)
> +        )
> +    ).
> +
> +    % Build a list of goals surrounded by two calls which are in the 
> feedback
> +    % file or by a call which is in the feedback file and a parallel
> +    % conjunction.
> +    %
> +    % Succeed if we can build that list of goals.
> +    % Fail otherwise.
> +    %
> +:- pred 
> build_goals_surrounded_by_calls_to_be_parallelized(list(hlds_goal)::in,
> +    module_info::in, list(hlds_goal)::in, list(hlds_goal)::out,
> +    int::in, int::out, counter::in, counter::out,
> +    list(css_to_be_parallelized)::in, list(css_to_be_parallelized)::out)
> +    is semidet.
> +
> +build_goals_surrounded_by_calls_to_be_parallelized(ConjGoals, ModuleInfo,
> +    !ResultAcc, !Index, !Counter, !CalleeListToBeParallelized) :-
> +    list.length(ConjGoals, Length),
> +    ( !.Index > Length
> +    ->
> +        fail
> +    ;
> +        list.index1_det(ConjGoals, !.Index, Goal),
> +        GoalExpr = fst(Goal),
> +        ( ( GoalExpr = disj(_)
> +            ; GoalExpr = switch(_, _, _)
> +            ; GoalExpr = if_then_else(_, _, _, _)
> +            ; GoalExpr = conj(plain_conj, _)
> +        ) ->
> +            fail
> +        ;
> +            ( is_a_conjunct(Goal, parallel_conj) ->
> +                list.append(!.ResultAcc, [Goal], !:ResultAcc)
> +            ;
> +                ( is_a_call(Goal)
> +                ->
> +                    counter.allocate(SlotNumber, !Counter),
> +                    get_call_kind_and_callee(ModuleInfo, Goal, Kind,
> +                        CalleeRawId),
> +                    ( is_in_css_list_to_be_parallelized(Kind, SlotNumber,
> +                        CalleeRawId, !.CalleeListToBeParallelized, [],
> +                        !:CalleeListToBeParallelized)
> +                    ->
> +                        list.append(!.ResultAcc, [Goal], !:ResultAcc)
> +                    ;
> +                        list.append(!.ResultAcc, [Goal], !:ResultAcc),
> +                        !:Index = !.Index + 1,
> +                        build_goals_surrounded_by_calls_to_be_parallelized(
> +                            ConjGoals, ModuleInfo, !ResultAcc, !Index, 
> !Counter,
> +                            !CalleeListToBeParallelized)
> +                    )
> +                ;
> +                    list.append(!.ResultAcc, [Goal], !:ResultAcc),
> +                    !:Index = !.Index + 1,
> +                    build_goals_surrounded_by_calls_to_be_parallelized(
> +                        ConjGoals, ModuleInfo, !ResultAcc, !Index, !Counter,
> +                        !CalleeListToBeParallelized)
> +                )
> +            )
> +        )
> +    ).
> +
> +    % Succeed if the Goal is a conjunction.
> +    % Fail otherwise.
> +    %
> +:- pred is_a_conjunct(hlds_goal::in, conj_type::out) is semidet.

A conjunct and a conjunction are different things.
I suggest calling that predicate: goal_is_conjunction.

Also the comment should be:

 	Succeeds if Goal is a conjunction (of any type).
 	Fail otherwise.

> +
> +is_a_conjunct(Goal, Type) :-
> +    GoalExpr = fst(Goal),
> +    GoalExpr = conj(Type, _).
> +
> +    % Succeed if Goal is a call or a call inside a negation.
> +    % Fail otherwise.
> +    %

Likwise I suggest naming this:

 	goal_is_call_or_negated_call

> +:- pred is_a_call(hlds_goal::in) is semidet.
> +
> +is_a_call(Goal) :-
> +    GoalExpr = fst(Goal),
> +    (
> +        GoalExpr = plain_call(_, _, _, _, _, _)
> +    ;
> +        GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
> +    ;
> +        GoalExpr = generic_call(Details, _, _, _),
> +        (
> +            Details = class_method(_, _, _, _)
> +        ;
> +            Details = higher_order(_, _, _, _)
> +        )
> +    ;
> +        GoalExpr = negation(GoalNeg),
> +        GoalNegExpr = fst(GoalNeg),
> +        (
> +            GoalNegExpr = plain_call(_, _, _, _, _, _)
> +        ;
> +            GoalNegExpr = call_foreign_proc(_, _, _, _, _, _, _)
> +        ;
> +            GoalNegExpr = generic_call(Details, _, _, _),
> +            (
> +                Details = class_method(_, _, _, _)
> +            ;
> +                Details = higher_order(_, _, _, _)
> +            )
> +        )
> +    ).
> +


...

> +%-----------------------------------------------------------------------------%
> +
> +    % Parse the feedback file (header and body).
> +    %
> +:- pred parse_feedback_file(string::in,
> +    maybe_error(list(css_to_be_parallelized))::out, io::di, io::uo) is det.
> +
> +parse_feedback_file(InputFile, MaybeCSSListToBeParallelized, !IO) :-
> +    io.open_input(InputFile, Result, !IO),
> +    (
> +        Result = io.error(ErrInput),
> +        MaybeCSSListToBeParallelized = error(io.error_message(ErrInput))
> +    ;
> +        Result = ok(InStrm),

s/InStrm/Stream/

> +        io.read_file_as_string(InStrm, MaybeFileAsString, !IO),
> +        (
> +            MaybeFileAsString = ok(FileAsString),
> +            LineList = string.words_separator(is_carriage_return,
> +                FileAsString),
> +            process_header(LineList, MaybeBodyFileAsListString, !IO),
> +            (
> +                MaybeBodyFileAsListString = error(ErrProcessHeader),
> +                MaybeCSSListToBeParallelized = error(ErrProcessHeader)
> +            ;
> +                MaybeBodyFileAsListString = ok(BodyFileAsListString),
> +                process_body(BodyFileAsListString,
> +                    MaybeCSSListToBeParallelized0),
> +                (
> +                    MaybeCSSListToBeParallelized0 =
> +                        ok(CSSListToBeParallelized),
> +                    MaybeCSSListToBeParallelized = 
> ok(CSSListToBeParallelized)
> +                ;
> +                    MaybeCSSListToBeParallelized0 = error(Err),
> +                    MaybeCSSListToBeParallelized = error(Err)
> +                )
> +            )
> +        ;
> +            MaybeFileAsString = error(_, ErrReadFileAsString),
> +            MaybeCSSListToBeParallelized =
> +                error(io.error_message(ErrReadFileAsString))
> +        ),
> +        io.close_input(InStrm, !IO)
> +    ).
> +
> +:- pred is_carriage_return(char::in) is semidet.
> +
> +is_carriage_return(Char) :- Char = '\n'.
> +
> +    % Process the header of the feedback file.
> +    %
> +:- pred process_header(list(string)::in, maybe_error(list(string))::out,
> +    io::di, io::uo) is det.
> +
> +process_header(FileAsListString, MaybeFileAsListStringWithoutHeader, !IO) :-
> +    ( list.index0(FileAsListString, 0, Type) ->
> +        ( Type = "Profiling feedback file" ->
> +            (list.index0(FileAsListString, 1, Version) ->
> +                ( Version = "Version = 1.0" ->
> +                    list.det_split_list(4, FileAsListString, _,
> +                        FileAsListStringWithoutHeader),
> +                    MaybeFileAsListStringWithoutHeader =
> +                        ok(FileAsListStringWithoutHeader)
> +                ;
> +                    MaybeFileAsListStringWithoutHeader = error("Profiling" 
> ++
> +                    " feedback file version incorrect")
> +                )
> +            ;
> +                MaybeFileAsListStringWithoutHeader = error("Not a profiling"
> +                ++ " feedback file")
> +            )
> +        ;
> +            MaybeFileAsListStringWithoutHeader = error("Not a profiling" ++
> +                " feedback file")
> +        )
> +    ;
> +        MaybeFileAsListStringWithoutHeader = error("Not a profiling 
> feedback"
> +            ++ " file")
> +    ).
> +
> +    % Process the body of the feedback file.
> +    %
> +:- pred process_body(list(string)::in,
> +    maybe_error(list(css_to_be_parallelized))::out) is det.
> +
> +process_body(CoreFileAsListString, MaybeCSSListToBeParallelized) :-
> +    ( process_body2(CoreFileAsListString, [], CSSListToBeParallelized) ->
> +        MaybeCSSListToBeParallelized = ok(CSSListToBeParallelized)
> +    ;
> +        MaybeCSSListToBeParallelized = error("Profiling feedback file has 
> been"
> +            ++ " tampered with")
> +    ).
> +
> +:- pred process_body2(list(string)::in, list(css_to_be_parallelized)::in,
> +    list(css_to_be_parallelized)::out) is semidet.
> +
> +process_body2([], !CSSListToBeParallelizedAcc).
> +process_body2([Line | Lines], !CSSListToBeParallelizedAcc) :-
> +    Words = string.words_separator(is_whitespace, Line),
> +    list.index0_det(Words, 0, Caller),
> +    ( Caller = "Mercury" ->
> +        process_body2(Lines, !CSSListToBeParallelizedAcc)
> +    ;
> +        list.index0_det(Words, 1, SlotNumber),
> +        string.to_int(SlotNumber, IntSlotNumber),
> +        list.index0_det(Words, 2, Kind),
> +        ( Kind = "normal_call" ->
> +            list.index0_det(Words, 3, Callee),
> +            CSStoBeParallelized = css_to_be_parallelized(Caller, 
> IntSlotNumber,
> +                Kind, Callee)
> +        ;
> +            CSStoBeParallelized = css_to_be_parallelized(Caller, 
> IntSlotNumber,
> +                Kind, "")
> +        ),
> +        !:CSSListToBeParallelizedAcc = [ CSStoBeParallelized |
> +            !.CSSListToBeParallelizedAcc ],
> +        process_body2(Lines, !CSSListToBeParallelizedAcc)
> +    ).
> +

...


> Index: compiler/mercury_compile.m
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/compiler/mercury_compile.m,v
> retrieving revision 1.414
> diff -u -r1.414 mercury_compile.m
> --- compiler/mercury_compile.m	3 Nov 2006 08:31:09 -0000	1.414
> +++ compiler/mercury_compile.m	27 Nov 2006 05:13:50 -0000
> @@ -78,6 +78,7 @@
> :- import_module transform_hlds.trailing_analysis.
> :- import_module transform_hlds.tabling_analysis.
> :- import_module transform_hlds.higher_order.
> +:- import_module transform_hlds.implicit_parallelism.
> :- import_module transform_hlds.accumulator.
> :- import_module transform_hlds.tupling.
> :- import_module transform_hlds.untupling.
> @@ -89,6 +90,7 @@
> :- import_module transform_hlds.unused_args.
> :- import_module transform_hlds.unneeded_code.
> :- import_module transform_hlds.lco.
> +:- import_module transform_hlds.distance_granularity.
> :- import_module transform_hlds.ctgc.
> :- import_module transform_hlds.ctgc.structure_reuse.
> :- import_module transform_hlds.ctgc.structure_reuse.analysis.
> @@ -2414,7 +2416,7 @@
>     module_info_get_globals(!.HLDS, Globals),
>     globals.lookup_bool_option(Globals, verbose, Verbose),
>     globals.lookup_bool_option(Globals, statistics, Stats),
> -
> +
>     maybe_read_experimental_complexity_file(!HLDS, !IO),
>
>     tabling(Verbose, Stats, !HLDS, !IO),
> @@ -2523,9 +2525,15 @@
>     maybe_structure_reuse_analysis(Verbose, Stats, !HLDS, !IO),
>     maybe_dump_hlds(!.HLDS, 195, "structure_reuse", !DumpInfo, !IO),
>
> +    maybe_implicit_parallelism(Verbose, Stats, !HLDS, !IO),
> +    maybe_dump_hlds(!.HLDS, 199, "implicit_parallelism", !DumpInfo, !IO),
> +

I think that this stage occurs too late.  The problem is that lots of
optimizations that are (currently) not enabled in .profdeep grades,
e.g. inlining**, will be enabled during the compile run in which the
implicit parallism transformation is run.

We don't want to be inlining conjuncts that are candidates for
implicity parallism so at the least there should be a stage 
before optimization takes places that finds such conjuncts and
puts a marker on them so that other optimizations don't affect them.

A related problem is that we will not have feedback information for
procedures created by optimizations, e.g. the auxiliary procedures
created by loop invariant hoisting.

One solution to that is to distinguish between deep profiling runs
that are for purpose of gathering feedback data and runs for other
purposes.  In the former we would enable most optimizations, in so far
as those optimizations are compatible with deep profiling.  Extending
the --profile-optimized option below would be one way of doing this.

** inlining can be enabled in deep profiling grades with
--profile-optimized.

To be continued ...

Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list