[m-rev.] for review: procedure representations for the deep profiler

Julien Fischer juliensf at csse.unimelb.edu.au
Wed Sep 12 02:20:13 AEST 2007


On Tue, 11 Sep 2007, Zoltan Somogyi wrote:

> Make a representation of the program available to the deep profiler. We do
> this by letting the user request, via the option "--deep-procrep-file"
> in MERCURY_OPTIONS, that when the Deep.data file is written, a Deep.procrep
> file should be written alongside it.
>
> The intended use of this information is the discovery of profitable
> parallelism. When a conjunction contains two expensive calls, e.g. p(...) and
> q(...) connected by some shared variables, the potential gain from executing
> them in parallel is limited by how early p produces those variables and how
> late q consumes them, and knowing this requires access to the code of p and q.
>
> Since the debugger and the deep profiler both need access to program
> representations, put the relevant data structures and the operations on them
> in mdbcomp. The data structures are significantly expanded, since the deep
> profiler deals with the whole program, while the debugger was interested only
> in one procedure at a time.
>
> The layout structures have to changes as well. In a previous change, I changed

s/changes/change/

> proc layout structures to make room for the procedure representation even in
> non-debugging grades, but this isn't enough, since the procedure representation
> refers to the module's string table. This diff therefore makes some parts of
> the module layout structure, including of course the string table, also
> available in non-debugging grades.
>
> configure.in:
> 	Check whether the installed compiler can process switches on foreign
> 	enums correctly, since this diff depends on that.
>
> runtime/mercury_stack_layout.[ch]:
> runtime/mercury_types.h:
> 	Add a new structure, MR_ModuleCommonLayout, that holds the part of
> 	the module layout that is common to deep profiling and debugging.
>
> runtime/mercury_deep_profiling.[ch]:
> 	The old "deep profiling token" enum type was error prone, since at
> 	each point in the data file, only a subset was applicable. This diff
> 	breaks up the this enum into several enums, each consisting of the
> 	choice applicable at a given point.
>
> 	This also allows some of the resulting enums to be used in procrep
> 	files.
>
> 	Rename some enums and functions to avoid ambiguities, and in one case
> 	to conform to our naming scheme.
>
> 	Make write_out_proc_statics take a second argument. This is a FILE *
> 	that (if not NULL) asks write_out_proc_statics to write the
> 	representation of the current module to specified stream.
>
> 	These module representations go into the middle part of the program
> 	representation file. Add functions to write out the prologue and
> 	epilogue of this file.
>
> 	Write out procedure representations if this is requested.
>
> 	Factor out some code that is now used in more than one place.
>
> runtime/mercury_deep_profiling_hand.h:
> 	Conform to the changes to mercury_deep_profiling.h.
>
> runtime/mercury_builtin_types.c:
> 	Pass the extra argument in the argument lists of invocations of
> 	write_out_proc_statics.
>
> runtime/mercury_trace_base.[ch]:
> 	Conform to the name change from proc_rep to proc_defn_rep in mdbcomp.
>
> runtime/mercury_grade.h:
> 	Due to the change to layout structures, increment the binary
> 	compatibility version numbers for both debug and deep profiling grades.
>
> runtime/mercury_wrapper.[ch]:
> 	Provide two new MERCURY_OPTION options. The first --deep-procrep-file,
> 	allows the user to ask for the program representation to be generated.
> 	The second, --deep-random-write, allows tools/bootcheck to request that
> 	only a fraction of all program invocations should generate any deep
> 	profiling output.
>
> 	The first option will be documented once it is tested much more fully.
> 	The second option is deliberately not documented.
>
> 	Update the type of the variable that holds the address of the
> 	(mkinit-generated) write_out_proc_statics function to accept the second
> 	argument.
>
> util/mkinit.c:
> 	Pass the extra argument in the argument list of write_out_proc_statics.
>
> mdbcomp/program_representation.m:
> 	Extend the existing data structures for representing a procedure body
> 	to represent a procedure (complete with name), a module and a program.
> 	The name is implemented a string_proc_label, a form of proc_label that

s/a/as/

> 	can be written out to files. This replaces the old proc_id type the
> 	deep profiler.
>
> 	Extend the representation of switches to record the identity of the
> 	variable being switched on, and the cons_ids of the arms. Without the
> 	former, we cannot be sure when a variable is first used, and the latter
> 	is needed for meaningful prettyprinting of procedure bodies.
>
> 	Add code for reading in files of bytecodes, and for making sense of the
> 	bytecodes themselves. (It is this code that uses foreign enums.)
>
> mdbcomp/prim_data.m:
> 	Note the relationship of proc_label with string_proc_label.
>
> mdbcomp/rtti_access.m:
> 	Add the access operations needed to find module string tables with the
> 	new organization of layout structures.
>
> 	Provide operations on bytecodes and string tables generally.
>
> trace/mercury_trace_cmd_browsing.c:
> 	Conform to the change to mdbcomp/program_representation.m.
>
> compiler/layout.m:
> 	Add support for a MR_ModuleCommonLayout.
>
> 	Rename some function symbols to avoid ambiguities.
>
> compiler/layout_out.m:
> 	Handle the new structure.
>
> compiler/stack_layout.m:
> 	Generate the new structure and the procedure representation bytecode
> 	in deep profiling grades.
>
> compiler/llds_out.m:
> 	Generate the code required to write out the prologue and epilogue
> 	of program representation files.
>
> 	Pass the extra argument in the argument lists of invocations of
> 	write_out_proc_statics that tells those invocations to write out
> 	the module representations between the prologue and the epilogue.
>
> compiler/prog_rep.m:
> 	When generating bytecodes, include the new information for switches.
>
> compiler/continuation_info.m:
> 	Replace a bool with a more expressive type.
>
> compiler/proc_rep.m:
> 	Conform to the change to continuation_info.m.
>
> compiler/opt_debug.m:
> 	Conform to the change to layout.m.
>
> deep_profiler/mdprof_procrep.m:
> 	A new test program to test the reading of program representations.
>
> deep_profiler/DEEP_FLAGS.in:
> deep_profiler/Mmakefile:
> 	Copy the contents of the mdbcomp module to this directory on demand,
> 	instead of linking to it. This is necessary now that the deep profiler
> 	depends directly on mdbcomp even if it is compiled in a non-debugging
> 	grade.
>
> 	The arrangements for doing is were copied from the slice directory,

s/is/this/

> 	which has long done the same.
>
> 	Avoid a duplicate include of Mmake.deep.params.
>
> 	Add the new test program to the list of programs in this directory.
>
> Mmakefile:
> 	Go through deep_profiler/Mmakefile when deciding whether to do "mmake
> 	depend" in the deep_profiler directory. The old actions won't work
> 	correctly now that we need to copy some files from mdbcomp before we
> 	can run "mmake depend".
>
> deep_profiler/profile.m:
> 	Remove the code that was moved (in cleaned-up form) to mdbcomp.
>
> deep_profiler/dump.m:
> deep_profiler/profile.m:
> 	Conform to the changes above.
>
> browser/declarative_execution.m:
> browser/declarative_tree.m:
> 	Conform to the changes in mdbcomp.
>
> doc/user_guide.texi:
> 	Add commented out documentation of the two new options.
>
> slice/Mmakefile:
> 	Fix formatting, and a bug.
>
> library/exception.m:
> library/par_builtin.m:
> library/thread.m:
> library/thread.semaphore.m:
> 	Update all the handwritten modules to pass the extra argument now
> 	required by write_out_proc_statics.
>
> tests/debugger/declarative/dependency.exp:
> 	Conform to the change from proc_rep to proc_defn_rep.
>
> tools/bootcheck:
> 	Write out deep profiling data only from every 25th invocation, since
> 	otherwise the time for a bootcheck takes six times as long in deep
> 	profiling grades than in asm_fast.gc.

...

> Index: deep_profiler/mdprof_procrep.m
> ===================================================================
> RCS file: deep_profiler/mdprof_procrep.m
> diff -N deep_profiler/mdprof_procrep.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ deep_profiler/mdprof_procrep.m	10 Sep 2007 14:58:39 -0000
> @@ -0,0 +1,370 @@
> +%-----------------------------------------------------------------------------%
> +% vim: ft=mercury ts=4 sw=4 et
> +%-----------------------------------------------------------------------------%
> +% Copyright (C) 2007 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% File: mdprof_procrep.m.
> +% Author: zs.
> +%

Add a brief comment describing what this program is for, the contents
of log message for this file would suffice.

...

> Index: mdbcomp/prim_data.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/mdbcomp/prim_data.m,v
> retrieving revision 1.23
> diff -u -b -r1.23 prim_data.m
> --- mdbcomp/prim_data.m	23 Aug 2007 04:29:04 -0000	1.23
> +++ mdbcomp/prim_data.m	6 Sep 2007 01:39:43 -0000
> @@ -72,6 +72,10 @@
>     % from `.opt' files, the defining module's name may need to be added
>     % as a qualifier to the label.
>     %
> +    % The type string_proc_label in program_representation.m parallels this
> +    % type, but differs from it in being used not inside the compiler by

s/by/but/

> +    % outside, which means it needs to use different types for many fields.
> +    %
> :- type proc_label
>     --->    ordinary_proc_label(
>                 ord_defining_module     :: module_name,

...

> Index: mdbcomp/program_representation.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/mdbcomp/program_representation.m,v
> retrieving revision 1.23
> diff -u -b -r1.23 program_representation.m
> --- mdbcomp/program_representation.m	8 Aug 2007 05:06:06 -0000	1.23
> +++ mdbcomp/program_representation.m	10 Sep 2007 14:50:11 -0000
> @@ -9,8 +9,8 @@
> % File: program_representation.m
> % Authors: zs, dougl
> %
> -% This module defines the representation of procedure bodies
> -% used by the declarative debugger.
> +% This module defines the representation of procedure bodies used by the
> +% declarative debugger (and maybe the deep profiler).
> %
> % One of the things we want the declarative debugger to be able to do
> % is to let the user specify which part of which output argument of an
> @@ -18,16 +18,17 @@
> % that particular subterm came from, i.e. where it was bound. Doing this
> % requires knowing what the bodies of that procedure and its descendants are.
> %
> -% If the Mercury compiler is invoked with options requesting declarative
> -% debugging, it will include in each procedure layout a pointer to a simplified
> -% representation of the goal that is the body of the corresponding procedure.
> -% We use a simplified representation partly because we want to insulate the
> -% code of the declarative debugger from irrelevant changes in HLDS types,
> -% and partly because we want to minimize the space taken in up in executables
> -% by these representations.
> +% If the Mercury compiler is invoked with the right options, it will include
> +% in each procedure layout a pointer to a simplified representation of the goal
> +% that is the body of the corresponding procedure. We use a simplified
> +% representation partly because we want to insulate the code using procedure
> +% representations from irrelevant changes in HLDS types, and partly because
> +% we want to minimize the space taken in up in executables by these
> +% representations.
> %
> % The current representation is intended to contain all the information
> -% we are pretty sure can be usefully exploited by the declarative debugger.
> +% we are pretty sure can be usefully exploited by the declarative debugger
> +% and/or the deep profiler.
>
> %-----------------------------------------------------------------------------%
>
> @@ -39,22 +40,80 @@
>
> :- import_module bool.
> :- import_module char.
> +:- import_module io.
> :- import_module list.
> :- import_module maybe.
> :- import_module type_desc.
>
> -    % A representation of the goal we execute. These need to be generated
> -    % statically and stored inside the executable.
> +    % read_prog_rep_file(FileName, Result, !IO)
>     %
> -    % Each element of this structure will correspond one-to-one
> -    % to the original stage 90 HLDS.
> +:- pred read_prog_rep_file(string::in, io.res(prog_rep)::out, io::di, io::uo)
> +    is det.
> +
> +:- type prog_rep
> +    --->    prog_rep(
> +                list(module_rep)
> +            ).
> +
> +:- type module_rep
> +    --->    module_rep(
> +                mr_name         :: string,          % The module name.
> +                mr_string_table :: string_table,
> +                mr_procs        :: list(proc_rep)
> +            ).
>
> :- type proc_rep
>     --->    proc_rep(
> -                list(var_rep),      % The head variables, in order,
> -                                    % including the ones introduced
> +                pr_id       :: string_proc_label,
> +                pr_defn     :: proc_defn_rep
> +            ).
> +
> +    % A string_proc_label is a data structure that uniquely identifies a
> +    % procedure. It is a version of the proc_label type from prim_data.m
> +    % that can be used outside the compiler, e.g. in RTTI data structures
> +    % and in data filed generated by deep profiling.
> +    %
> +:- type string_proc_label
> +    --->    str_ordinary_proc_label(
> +                s_ord_pred_or_func      :: pred_or_func,
> +                s_ord_decl_module       :: string,
> +                s_ord_def_module        :: string,
> +                s_ord_name              :: string,
> +                s_ord_arity             :: int,
> +                s_ord_mode              :: int
> +            )
> +    ;       str_special_proc_label(
> +                s_spec_type_name        :: string,
> +                s_spec_type_module      :: string,
> +                s_spec_def_module       :: string,
> +                s_spec_pred_name        :: string,
> +                s_spec_arity            :: int,
> +                s_spec_mode             :: int
> +            ).
> +
> +:- type proclabel_kind_token
> +    --->    proclabel_user_predicate
> +    ;       proclabel_user_function
> +    ;       proclabel_special.
> +
> +:- pred is_proclabel_kind(int::in, proclabel_kind_token::out) is semidet.
> +
> +    % A representation of the procedure definitions (clause heads and bodies)
> +    % that we execute. These are generated by the compiler, which stores them
> +    % in the form of a bytecode representation in a field of proc_layout

... of the proc_layout ...

> @@ -435,7 +508,7 @@
>
> goal_generates_internal_event(conj_rep(_)) = no.
> goal_generates_internal_event(disj_rep(_)) = yes.
> -goal_generates_internal_event(switch_rep(_)) = yes.
> +goal_generates_internal_event(switch_rep(_, _)) = yes.
> goal_generates_internal_event(ite_rep(_, _, _)) = yes.
> goal_generates_internal_event(negation_rep(_)) = yes.
> goal_generates_internal_event(scope_rep(_, _)) = no.
> @@ -458,9 +531,9 @@
>     yes(atomic_goal_id(Module, Name, length(Args))).
> atomic_goal_identifiable(event_call_rep(_, _)) = no.
>
> -:- pragma export(proc_rep_type = out, "ML_proc_rep_type").
> +:- pragma export(proc_defn_rep_type = out, "ML_proc_defn_rep_type").

Replace that with pragma foreign_export - likewise it looks as though
there is still at least one other export pragma in this module.

> +proc_defn_rep_type = type_of(_ : proc_defn_rep).
>
> :- pragma export(goal_rep_type = out, "ML_goal_rep_type").
>
> @@ -493,7 +566,7 @@
> path_step_from_string_2('s', Str, step_switch(N, MaybeM)) :-
>     string.words_separator(unify('-'), Str) = [NStr, MStr],
>     string.to_int(NStr, N),
> -    % short for "not applicable"
> +    % "na" is short for "not applicable"
>     ( MStr = "na" ->
>         MaybeM = no
>     ;
> @@ -592,105 +665,298 @@
>
> %-----------------------------------------------------------------------------%
>
> -:- pragma foreign_export("C", read_proc_rep(in, in, out),
> -    "MR_MDBCOMP_trace_read_rep").
> +:- pred read_file_as_bytecode(string::in, io.res(bytecode)::out,
> +    io::di, io::uo) is det.
> +
> +read_file_as_bytecode(FileName, Result, !IO) :-
> +    read_file_as_bytecode_2(FileName, ByteCode, Size, Error, !IO),
> +    ( Size < 0 ->
> +        io.make_err_msg(Error, "opening " ++ FileName, Msg, !IO),
> +        Result = error(io.make_io_error(Msg))
> +    ;
> +        Result = ok(bytecode(ByteCode, Size))
> +    ).
> +
> +:- pred read_file_as_bytecode_2(string::in, bytecode_bytes::out, int::out,
> +    io.system_error::out, io::di, io::uo) is det.
> +
> +:- pragma foreign_proc("C",
> +    read_file_as_bytecode_2(FileName::in, Bytes::out, Size::out, Error::out,
> +        _IO0::di, _IO::uo),
> +    [will_not_call_mercury, thread_safe, promise_pure],
> +"
> +    struct  stat statbuf;
>
> -read_proc_rep(Bytecode, Label, ProcRep) :-
> +    if (stat(FileName, &statbuf) != 0) {
> +        Bytes = NULL;
> +        Size = -1;
> +        Error = errno;
> +    } else {
> +        int     fd;
> +        char    *buf;
> +
> +        Size = statbuf.st_size;
> +        MR_allocate_aligned_string_msg(buf, Size, MR_PROC_LABEL);
> +        fd = open(FileName, O_RDONLY, 0);
> +        if (fd < 0) {
> +            Bytes = NULL;
> +            Size = -1;
> +            Error = errno;
> +        } else {
> +            if (read(fd, buf, Size) != Size) {
> +                Bytes = NULL;
> +                Size = -1;
> +                Error = errno;
> +            } else {
> +                if (close(fd) != 0) {
> +                    Bytes = NULL;
> +                    Size = -1;
> +                    Error = errno;
> +                } else {
> +                    Bytes = (MR_uint_least8_t *) buf;
> +                    Error = 0;
> +                }
> +            }
> +        }
> +    }
> +").

The above code uses various posix functions that are not standard C.
Use the macros in the runtime, e.g. MR_HAVE_OPEN, MR_HAVE_STAT, to
protect the code appropriately with #ifdef(s) as appropriate.  It 
will be ok on most sensible systems as is but I suspect it will
probably break if compiled with MSVC.  (In that case can just
call MR_fatal_error, but the code itself still needs to compile.)

The diff looks okay otherwise.

Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list