[m-dev.] For review: implementation of collect for Opium-M

Erwan Jahier Erwan.Jahier at irisa.fr
Fri Oct 29 00:35:23 AEST 1999


| I'll start off with the easy review comments.
| (I have one or two more complicated ones brewing... ;-)
| 
| On 27-Oct-1999, Erwan Jahier <Erwan.Jahier at irisa.fr> wrote:
| collect_lib.m:
| > +%		3) the predicate filter/3 which updates it at each execution 
| > +%		   event. filter/3 should have the following declarations:
| > +%			:- pred filter(event, collected_type, collected_type).
| > +%			:- mode filter(in, di, uo) is det.
| 
| It would be better if this predicate also returned a bool
| to say whether or not to keep on going.
| The collection process would stop when this predicate returns `no'.

I will think about that.

| > +:- module collect_lib.
| ...
| > +:- import_module int, list, std_util, io, char.
| > +:- import_module name_mangle, dl.
| 
| Do you need to import `name_mangle'?

No. 
Removed.

| > +:- pragma export(link_collect(out, out, out, out, out, di, uo), 
| > +	"ML_DI_link_collect").
| > +
| > +% We need Handle to be able to close the shared object (dlclose) later on.
| > +% When the link failed, we output NULL pointers instead of maybe pointers
| > +% for performance reasons; indeed, filter will be called at every events
| > +% so we don't want to pay the price of the maybe variable de-construction
| > +% at each event.
| > +link_collect(Filter, Initialize, SendResult, HandlePtr, Result) -->
| > +	%
| > +	% Link in the object code for the module `collect' from
| > +	% the file `collect.so'.
| > +	%
| > +	dl__open("./collect.so", lazy, local, MaybeHandle),
| 
| It would be better to avoid hard-coding the name "./collect.so".
| On some platforms it might be something else, e.g. ".\\collect.dll".
| It would be better to let the name be passed in as an argument here.

Done.
 
| > +		%
| > +		% Look up the address of the first mode (mode number 0)
| > +		% of the predicates initialize/1 and filter/14  in the 
| > +		% module collect.
| > +		%
| > +		dl__sym(Handle, "Collect_initialize", MaybeInitialize),
| > +		dl__sym(Handle, "Collect_filter", MaybeFilter),
| > +		dl__sym(Handle, "Collect_send_collect_result", MaybeSendResult),
| 
| The comment here is misleading.  You're actually looking up the
| address of the corresponding C functions.

ok.
I have replaced that comment by:
		%
		% Look up the address of the initialize/1 and filter/14 
		% corresponding C functions in the module collect.
		%

| > +:- type io_pred == pred(io__state, io__state).
| > +:- inst io_pred == (pred(di, uo) is det).
| > +
| > +:- func inst_cast(io_pred) = io_pred.
| > +:- mode inst_cast(in) = out(io_pred) is det.
| > +
| > +:- pragma c_code(inst_cast(X::in) = (Y::out(io_pred)),
| > +	[will_not_call_mercury, thread_safe], "Y = X").
| 
| That code is all unused.

Indeed.
Removed.
 
| > +			% dynamically link the collect module with the 
| > +			% current execution
| > +	;	link_collect
| 
| As discussed above, I think that should be
| `link_collect(string)', where the string specifies
| the name of the shared object to link in.

Sure.
 
| In fact a general `link' command to link in an arbitrary shared object
| might be useful.  You might want to think about whether the
| parts of this command's functionality which are related specifically
| to collect could be put in the `collect' command rather than
| in the `link_collect' command.

Is it ok if I deal with that in a separate change later on?

| There should probably be a corresponding `unlink' command,
| because on some platforms you will run into problems if you
| try to link in the same library a second time without
| having unlinked it first.
| 
| > Index: browser/dl.m
| > ===================================================================
| > RCS file: /home/mercury1/repository/mercury/browser/dl.m,v
| > retrieving revision 1.2
| > diff -u -r1.2 dl.m
| > --- dl.m	1999/04/16 06:04:08	1.2
| > +++ dl.m	1999/10/27 13:29:03
| > @@ -24,6 +24,7 @@
| >  :- type handle.
| >  :- type result(T) ---> ok(T) ; error(string).
| >  :- type result ---> ok ; error(string).
| > +:- type handle ---> handle(c_pointer).
| >  
| >  % interface to the C function dlopen()
| >  :- pred dl__open(string::in, (mode)::in, scope::in, dl__result(handle)::out,
| > @@ -62,8 +63,6 @@
| >  	#include <dlfcn.h>
| >  #endif
| >  ").
| > -
| > -:- type handle ---> handle(c_pointer).
| 
| Hmm, why do you need to do this?

It is used in link_collect/6: open/4 outputs of variable of type maybe(handle) 
which is then used line 103:

		{ Handle = handle(HandlePtr) }

 
| > +++ mercury_trace_external.c	1999/10/27 13:29:09
| > @@ -30,6 +30,7 @@
| >  #include "mercury_trace_vars.h"
| >  
| >  #include "debugger_interface.h"
| > +#include "collect_lib.h"
| >  #include "std_util.h"
| >  
| >  #include <stdio.h>
| > @@ -42,6 +43,8 @@
| >  #include <arpa/inet.h>
| >  #include <netinet/in.h>
| >  #include <netdb.h>
| > +#include <dlfcn.h>
| 
| The use of <dlfcn.h> should be inside `#ifdef HAVE_DLFCN_H'.

Done.
 
| >  /*
| > +** Type of a local variable that indicates in which mode the external 
| > +** debugger is. When the external debugger is in mode:
| > +** - `searching', it tries to find an event that matches a forward move request,
| > +** - `reading_request', it reads a new request on the socket,
| > +** - `collecting', it is collecting information (after a `collect' request).
| > +*/
| > +typedef enum {
| > +	searching, reading_request, collecting
| > +} MR_external_debugger_mode_type;
| 
| I suggest you add `MR_' prefixes to those names.

Done.
 
| >  Code *
| >  MR_trace_event_external(MR_Trace_Cmd_Info *cmd, MR_Event_Info *event_info)
| >  {
| > -	static bool	searching = FALSE;
| > +	static MR_external_debugger_mode_type 
| > +			external_debugger_mode = reading_request;
| >  	static Word	search_data;
| > +	static Word	collecting_variable;
| > +	static Word	(*initialize_ptr)(Word *) = NULL;
| > +	static Word    	(*filter_ptr)(Integer, Integer, Integer, MR_Trace_Port,
| > +				MR_PredFunc, String, String, String, Integer,
| > +				Integer, Integer, String, Word, Word *) = NULL;
| > +	static Word	(*send_collect_result_ptr)(Word, Word);
| > +	static Word    	*handle = NULL;
| > +	static bool    	collect_linked = FALSE;
| 
| The function pointer types look wrong.  For example, why do `initialize_ptr'
| and `send_collect_result_ptr' point to functions that return values
| of type `Word', rather than `void'?

It was to avoid the following warning:

	mercury_trace_external.c: In function `MR_trace_event_external':
	mercury_trace_external.c:484: warning: passing arg 1 of pointer to function makes pointer 
	from integer without a cast
	mercury_trace_external.c:484: warning: passing arg 2 of pointer to function makes pointer 
	from integer without a cast

And I though it was the type of pointers on Mercury objets.
So, what should it be?

| Also, the use of MR_Trace_Port
| and MR_PredFunc looks dangerous; the type here should exactly match
| the type with which the function is defined, and I'll bet that it
| will be defined using Integer or Word rather than MR_Trace_Port.

Well, collect.m is generated with the help of the file collect.in (which is 
not typed by users) which contains:

	% The stuff defined below is similar to types goal_path and trace_port
	% defined in modules compiler/hlds_goal.m and compiler/trace.m.
	% This enumeration must be EXACTLY the same as the MR_trace_port enum in
	% runtime/mercury_trace_base.h, and in the same order, since the code
	% assumes the representation is the same.
	:- type trace_port_type
		--->	call
		;	exit
		;	redo
		;	fail
		...

In runtime/mercury_trace_base.h, there is the comment:

	/*
	** This enum should EXACTLY match the definition of the `trace_port_type'
	** type in browser/debugger_interface, and the port names list in the
	** C source file of this module.
	*/

	typedef	enum {
		MR_PORT_CALL,
		MR_PORT_EXIT,
		...}

Maybe I should replace it by:

	/*
	** This enum should EXACTLY match the definition of the `trace_port_type'
	** type in browser/debugger_interface and extras/opium_m/source/collect.in, 
	** and the port names list in the C source file of this module.
	*/

??

| > +	switch((MR_external_debugger_mode_type) external_debugger_mode) {
| > +		case searching:
| 
| The cast here is unnecessary  and undesirable.

Removed.

| The indentation of the `case' should be the same as that of the `switch'.

It was to be consistent with the indentation of the switch that accurs after 
in the code, namely:

	/* loop to process requests read from the debugger socket */
	for(;;) {
		MR_read_request_from_socket(
			&debugger_request, &debugger_request_type);
		switch((int) debugger_request_type) {
			case MR_REQUEST_ABORT_PROG:
				exit(EXIT_SUCCESS);
	...

Should I change that to?

| > +		case collecting:
| > +			if (seqno == 1 && port == MR_PORT_EXIT) {
| > +				/* The end of the execution is reached */
| 
| That condition is wrong; it does not necessarily indicate the last
| event.  For one thing, you should use MR_port_is_final(port) rather
| than `port == MR_PORT_EXIT', because the final port could be an
| exception or a fail port.  Even that is not enough, though:  if `main'
| is compiled without tracing, but some other modules that it calls are
| compiled with tracing, then you can reach the exit port for event #1
| long before all the events have finished.

Yes, you are right.

I'm now sending the result via MR_trace_final_external() (which is even more
efficient since I win a switch ;-); this implied to declare
external_debugger_mode, MR_collecting_variable, send_collect_result_ptr, handle
as global.

Do you think it ok?

| > +				(*filter_ptr)(
| > +			MR_trace_event_number,
| > +			seqno,
| > +			depth,
| > +			port,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_pred_or_func,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_decl_module,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_def_module,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_name,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_arity,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_mode,
| > +			layout->MR_sll_entry->MR_sle_detism,
| > +			(String) path,
| > +			(Word) collecting_variable,
| > +			&collecting_variable);
| 
| The cast to (Word) there is unnecessary.

Removed.

| You need to call MR_make_permanent() on the collecting_variable
| here to ensure that the memory allocated for it won't be deallocated
| on backtracking.

ok. I'll do that.

| > +			case MR_REQUEST_LINK_COLLECT:
| > +			  {
| > +			        Char	result;
| > +
| > +				if (MR_debug_socket) {
| > +					fprintf(stderr, "\nMercury runtime: "
| > +						"REQUEST_LINK_COLLECT\n");
| > +				}
| > +				MR_TRACE_CALL_MERCURY(
| > +					ML_DI_link_collect(
| > +					    (Word *) &filter_ptr,
| > +					    (Word *) &initialize_ptr,
| > +					    (Word *) &send_collect_result_ptr,
| > +					    (Word *) &handle,
| > +					    (Char *) &result
| > +					    ));
| 
| The cast to (Char *) here is unnecessary.

Removed.

| > +			case MR_REQUEST_COLLECT:
| ...
| > +					/*
| > +					** In order to perform the collect from
| > +					** the current event, we need to call 
| > +					** filter once here.
| > +					*/
| > +					(*filter_ptr)(
| > +			MR_trace_event_number,
| > +			seqno,
| > +			depth,
| > +			port,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_pred_or_func,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_decl_module,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_def_module,
| > +			(String)
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_name,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_arity,
| > +			layout->MR_sll_entry->MR_sle_user.MR_user_mode,
| > +			layout->MR_sll_entry->MR_sle_detism,
| > +			(String) path,
| > +			(Word) collecting_variable,
| > +			&collecting_variable);
| 
| The code here duplicates that above.
| You should avoid the duplication,
| e.g. by abstracting this out into a separate function.

ok.

| Content-Description: collect.op
| ...
| > % There are several things to do in order to be able to execute a 
| > % collect/1 command:
| > % 1) create a file that will that contain the definition of collected_type, 
| > %	initialize/1 and filter/3,
| > % 2) generate `collect.m' from this file (generate_collect/1),
| > % 3) compile collect.m (compile_collect/0),
| > % 4) dynamically link it with the current execution (dyn_link_collect/0).
| > % 5) run the command (run_command/1).
| 
| It would be nice if there was a single command that did steps 2-5
| all at once.

Why? 

Note that collect/2 (which does 1-5 all at once) generates and compiles
collect.m only if it is needed (i.e. it checks if the file that contains the
initialize and filter definitions has changed).
 
| > opium_scenario(
| > 	name		: collect,
| > 	files		: [collect],
| > 	scenarios	: [],
| > 	message		:
| > "..."
| > 	).
| 
| "Hence" is the wrong word to use here.  Perhaps just delete the "Hence, ".
| 
| That documentation is incomplete because it does not describe
| the `event' type.

Ok. I will change that.

| > Then the command `collect(count_call, Result)' will unify Result with the \
| > number of calls that occurs during the program execution.\
| 
| s/occurs/occur/

Yop.

| > compile_collect_Op :-
| > 	write("Compiling collect.m...\n"),
| > 	sh("rm -f collect.so  collect.o"),
| > 	current_grade(Grade),
| > 	concat_string([
| > 		"mmc --grade ", 
| > 		Grade,
| > 		" -O6",
| > 		" -c --pic-reg collect.m"], Command1),
| > 	sh(Command1), 
| > 	print(Command1), nl,
| > 	concat_string([
| > 		"ml --grade ", 
| > 		Grade, 
| > 		" --make-shared-lib ",
| > 		"--pic-reg -o collect.so collect.o"], Command2),
| > 	sh(Command2), 
| > 	print(Command2), nl,
| > 	exists("collect.so"),
| > 	!,
| > 	opium_write_debug("collect.m has been compiled successfully.\n").
| 
| It's probably better to print the commands _before_ executing them.

Done.

| Also, perhaps you should check `exists("collect.o")' after invoking mmc,
| in case the compilation failed (e.g. due to a syntax error in the input file).

Done.
----

Thanks a lot for this review Fergus.
 
Since I have left a few `?' in this mail, I'm waiting your answer to them to
post a relative diff.


-- 
R1.


--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list