[m-dev.] For review: implementation of collect for Opium-M

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Oct 28 00:42:58 AEST 1999


I'll start off with the easy review comments.
(I have one or two more complicated ones brewing... ;-)

On 27-Oct-1999, Erwan Jahier <Erwan.Jahier at irisa.fr> wrote:
collect_lib.m:
> +%		3) the predicate filter/3 which updates it at each execution 
> +%		   event. filter/3 should have the following declarations:
> +%			:- pred filter(event, collected_type, collected_type).
> +%			:- mode filter(in, di, uo) is det.

It would be better if this predicate also returned a bool
to say whether or not to keep on going.
The collection process would stop when this predicate returns `no'.

> +:- module collect_lib.
...
> +:- import_module int, list, std_util, io, char.
> +:- import_module name_mangle, dl.

Do you need to import `name_mangle'?

> +:- pragma export(link_collect(out, out, out, out, out, di, uo), 
> +	"ML_DI_link_collect").
> +
> +% We need Handle to be able to close the shared object (dlclose) later on.
> +% When the link failed, we output NULL pointers instead of maybe pointers
> +% for performance reasons; indeed, filter will be called at every events
> +% so we don't want to pay the price of the maybe variable de-construction
> +% at each event.
> +link_collect(Filter, Initialize, SendResult, HandlePtr, Result) -->
> +	%
> +	% Link in the object code for the module `collect' from
> +	% the file `collect.so'.
> +	%
> +	dl__open("./collect.so", lazy, local, MaybeHandle),

It would be better to avoid hard-coding the name "./collect.so".
On some platforms it might be something else, e.g. ".\\collect.dll".
It would be better to let the name be passed in as an argument here.

> +		%
> +		% Look up the address of the first mode (mode number 0)
> +		% of the predicates initialize/1 and filter/14  in the 
> +		% module collect.
> +		%
> +		dl__sym(Handle, "Collect_initialize", MaybeInitialize),
> +		dl__sym(Handle, "Collect_filter", MaybeFilter),
> +		dl__sym(Handle, "Collect_send_collect_result", MaybeSendResult),

The comment here is misleading.  You're actually looking up the
address of the corresponding C functions.

> +:- type io_pred == pred(io__state, io__state).
> +:- inst io_pred == (pred(di, uo) is det).
> +
> +:- func inst_cast(io_pred) = io_pred.
> +:- mode inst_cast(in) = out(io_pred) is det.
> +
> +:- pragma c_code(inst_cast(X::in) = (Y::out(io_pred)),
> +	[will_not_call_mercury, thread_safe], "Y = X").

That code is all unused.

> +			% dynamically link the collect module with the 
> +			% current execution
> +	;	link_collect

As discussed above, I think that should be
`link_collect(string)', where the string specifies
the name of the shared object to link in.

In fact a general `link' command to link in an arbitrary shared object
might be useful.  You might want to think about whether the
parts of this command's functionality which are related specifically
to collect could be put in the `collect' command rather than
in the `link_collect' command.

There should probably be a corresponding `unlink' command,
because on some platforms you will run into problems if you
try to link in the same library a second time without
having unlinked it first.

> Index: browser/dl.m
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/browser/dl.m,v
> retrieving revision 1.2
> diff -u -r1.2 dl.m
> --- dl.m	1999/04/16 06:04:08	1.2
> +++ dl.m	1999/10/27 13:29:03
> @@ -24,6 +24,7 @@
>  :- type handle.
>  :- type result(T) ---> ok(T) ; error(string).
>  :- type result ---> ok ; error(string).
> +:- type handle ---> handle(c_pointer).
>  
>  % interface to the C function dlopen()
>  :- pred dl__open(string::in, (mode)::in, scope::in, dl__result(handle)::out,
> @@ -62,8 +63,6 @@
>  	#include <dlfcn.h>
>  #endif
>  ").
> -
> -:- type handle ---> handle(c_pointer).

Hmm, why do you need to do this?

> +++ mercury_trace_external.c	1999/10/27 13:29:09
> @@ -30,6 +30,7 @@
>  #include "mercury_trace_vars.h"
>  
>  #include "debugger_interface.h"
> +#include "collect_lib.h"
>  #include "std_util.h"
>  
>  #include <stdio.h>
> @@ -42,6 +43,8 @@
>  #include <arpa/inet.h>
>  #include <netinet/in.h>
>  #include <netdb.h>
> +#include <dlfcn.h>

The use of <dlfcn.h> should be inside `#ifdef HAVE_DLFCN_H'.

>  /*
> +** Type of a local variable that indicates in which mode the external 
> +** debugger is. When the external debugger is in mode:
> +** - `searching', it tries to find an event that matches a forward move request,
> +** - `reading_request', it reads a new request on the socket,
> +** - `collecting', it is collecting information (after a `collect' request).
> +*/
> +typedef enum {
> +	searching, reading_request, collecting
> +} MR_external_debugger_mode_type;

I suggest you add `MR_' prefixes to those names.

>  Code *
>  MR_trace_event_external(MR_Trace_Cmd_Info *cmd, MR_Event_Info *event_info)
>  {
> -	static bool	searching = FALSE;
> +	static MR_external_debugger_mode_type 
> +			external_debugger_mode = reading_request;
>  	static Word	search_data;
> +	static Word	collecting_variable;
> +	static Word	(*initialize_ptr)(Word *) = NULL;
> +	static Word    	(*filter_ptr)(Integer, Integer, Integer, MR_Trace_Port,
> +				MR_PredFunc, String, String, String, Integer,
> +				Integer, Integer, String, Word, Word *) = NULL;
> +	static Word	(*send_collect_result_ptr)(Word, Word);
> +	static Word    	*handle = NULL;
> +	static bool    	collect_linked = FALSE;

The function pointer types look wrong.  For example, why do `initialize_ptr'
and `send_collect_result_ptr' point to functions that return values
of type `Word', rather than `void'?  Also, the use of MR_Trace_Port
and MR_PredFunc looks dangerous; the type here should exactly match
the type with which the function is defined, and I'll bet that it
will be defined using Integer or Word rather than MR_Trace_Port.

> +	switch((MR_external_debugger_mode_type) external_debugger_mode) {
> +		case searching:

The cast here is unnecessary and undesirable.
The indentation of the `case' should be the same as that of the `switch'.

> +		case collecting:
> +			if (seqno == 1 && port == MR_PORT_EXIT) {
> +				/* The end of the execution is reached */

That condition is wrong; it does not necessarily indicate the last
event.  For one thing, you should use MR_port_is_final(port) rather
than `port == MR_PORT_EXIT', because the final port could be an
exception or a fail port.  Even that is not enough, though:  if `main'
is compiled without tracing, but some other modules that it calls are
compiled with tracing, then you can reach the exit port for event #1
long before all the events have finished.

> +				(*filter_ptr)(
> +			MR_trace_event_number,
> +			seqno,
> +			depth,
> +			port,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_pred_or_func,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_decl_module,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_def_module,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_name,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_arity,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_mode,
> +			layout->MR_sll_entry->MR_sle_detism,
> +			(String) path,
> +			(Word) collecting_variable,
> +			&collecting_variable);

The cast to (Word) there is unnecessary.

You need to call MR_make_permanent() on the collecting_variable
here to ensure that the memory allocated for it won't be deallocated
on backtracking.

> +			case MR_REQUEST_LINK_COLLECT:
> +			  {
> +			        Char	result;
> +
> +				if (MR_debug_socket) {
> +					fprintf(stderr, "\nMercury runtime: "
> +						"REQUEST_LINK_COLLECT\n");
> +				}
> +				MR_TRACE_CALL_MERCURY(
> +					ML_DI_link_collect(
> +					    (Word *) &filter_ptr,
> +					    (Word *) &initialize_ptr,
> +					    (Word *) &send_collect_result_ptr,
> +					    (Word *) &handle,
> +					    (Char *) &result
> +					    ));

The cast to (Char *) here is unnecessary.

> +			case MR_REQUEST_COLLECT:
...
> +					/*
> +					** In order to perform the collect from
> +					** the current event, we need to call 
> +					** filter once here.
> +					*/
> +					(*filter_ptr)(
> +			MR_trace_event_number,
> +			seqno,
> +			depth,
> +			port,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_pred_or_func,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_decl_module,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_def_module,
> +			(String)
> +			layout->MR_sll_entry->MR_sle_user.MR_user_name,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_arity,
> +			layout->MR_sll_entry->MR_sle_user.MR_user_mode,
> +			layout->MR_sll_entry->MR_sle_detism,
> +			(String) path,
> +			(Word) collecting_variable,
> +			&collecting_variable);

The code here duplicates that above.
You should avoid the duplication,
e.g. by abstracting this out into a separate function.

Content-Description: collect.op
...
> % There are several things to do in order to be able to execute a 
> % collect/1 command:
> % 1) create a file that will that contain the definition of collected_type, 
> %	initialize/1 and filter/3,
> % 2) generate `collect.m' from this file (generate_collect/1),
> % 3) compile collect.m (compile_collect/0),
> % 4) dynamically link it with the current execution (dyn_link_collect/0).
> % 5) run the command (run_command/1).

It would be nice if there was a single command that did steps 2-5
all at once.

> opium_scenario(
> 	name		: collect,
> 	files		: [collect],
> 	scenarios	: [],
> 	message		:
> "Scenario that implements the collect/2 monitoring command that collects \
> runtime information from Mercury program executions. It is intended to let \
> users easily implement their own monitors with acceptable performances.\n\
> \n\
> To use it, users just need to define 4 things in a file, using the Mercury \
> syntax:\n\
> 	(1) a `collected_type' which is the type of the collecting  \n\
> 	   variable that will contain the result of the monitoring\n\
> 	   activity.\n\
> 	(2) The predicate initialize/1 which initializes this \n\
> 	   collecting variable. initialize/1 have the \n\
> 	   following declarations:\n\
> 		:- pred initialize(collected_type).\n\
> 		:- mode initialize(out) is det.\n\
> 	(3) the predicate filter/3 which updates it at each execution \n\
> 	   event. filter/3 should have the following declarations:\n\
> 		:- pred filter(event, collected_type, collected_type).\n\
> 		:- mode filter(in, acc_in, acc_out) is det.\n\
> 	   where `acc_in' and `acc_out' have `in' and `out' respectively\n\
> 	   as default values.\n\
> 	(4) and optionally the mode definition of `acc_in' and `acc_out'\n\
> 	   in one want to override their default values.\n\
> \n\
> Then, this file is used to generate the Mercury module `collect.m', \
> which is compiled and dynamically linked with the current execution. \
> When a `collect' request is made from the external debugger, a variable \
> of type collected_type is first initialized (with initialize/1) and \
> then updated (with filter/3) for all the events of the remaining \
> execution. When the end of the execution is reached, the last value of \
> the collecting variable is send to Opium-M.\n\
> \n\
> collect/2 can be specified by the following pseudo Opium-M code: \n\
> `	collect_spec(File, Result) :-\n\
> 		generate_collect(File),\n\
> 		compile_collect,\n\
> 		dyn_link_collect,\n\
> 		get_event_list(EventList),\n\
> 		initialize(Start),\n\
> 		foldl(filter, EventList, Start, Result).\n\
> '\n\
> where: \n\
>  *) generate_collect/1 generates from `File' the Mercury module \
> `collect.m'; \n\
>  *) compile_collect/0 compiles `collect.m' (generating `File.so'); \n\
>  *) dynamically_link_collect/0 links the corresponding object file \n\
> (`File.so') with the current program execution; \n\
>  *) get_event_list/1 collects in a list all the events from the current \
> event until the end of the execution; \n\
>  *) foldl/4 is the classical fold predicate operating left-to-right. \n\
>  \n\
> Hence, collect/2 does the same thing as collect_spec/2 except it operates \
> on the fly without creating any list of events."
> 	).

"Hence" is the wrong word to use here.  Perhaps just delete the "Hence, ".

That documentation is incomplete because it does not describe
the `event' type.

> Then the command `collect(count_call, Result)' will unify Result with the \
> number of calls that occurs during the program execution.\

s/occurs/occur/

> compile_collect_Op :-
> 	write("Compiling collect.m...\n"),
> 	sh("rm -f collect.so  collect.o"),
> 	current_grade(Grade),
> 	concat_string([
> 		"mmc --grade ", 
> 		Grade,
> 		" -O6",
> 		" -c --pic-reg collect.m"], Command1),
> 	sh(Command1), 
> 	print(Command1), nl,
> 	concat_string([
> 		"ml --grade ", 
> 		Grade, 
> 		" --make-shared-lib ",
> 		"--pic-reg -o collect.so collect.o"], Command2),
> 	sh(Command2), 
> 	print(Command2), nl,
> 	exists("collect.so"),
> 	!,
> 	opium_write_debug("collect.m has been compiled successfully.\n").

It's probably better to print the commands _before_ executing them.
Also, perhaps you should check `exists("collect.o")' after invoking mmc,
in case the compilation failed (e.g. due to a syntax error in the input file).

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list