[m-dev.] For review: Add implementation of reference types (global heap)

Fergus Henderson fjh at cs.mu.OZ.AU
Mon Jun 8 15:56:14 AEST 1998


On 08-Jun-1998, Warwick Harvey <wharvey at cs.monash.edu.au> wrote:
> Be gentle, this is my first time.  :-)

Thanks for your contribution... shall do!

> extras/references/nb_reference.m:
> 	Implements references which are not backtracked on failure.
> 
> extras/references/reference.m:
> 	Implements references which *are* backtracked on failure.

In other parts of the Mercury library/extras, the convention has
been to use `foo.m' for the non-trailed version and `tr_foo.m'
for the trailed version.  (Specifically library/array.m
and extras/trailed_update/tr_array.m, and library/store.m and 
extras/trailed_update/tr_store.m.) 
I think it would be good to adopt a consistent convention.

> Index: runtime/mercury_deep_copy.c
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/runtime/mercury_deep_copy.c,v
> retrieving revision 1.7
> diff -u -r1.7 mercury_deep_copy.c
> --- mercury_deep_copy.c	1998/06/02 05:34:24	1.7
> +++ mercury_deep_copy.c	1998/06/05 06:54:09
> @@ -345,3 +345,55 @@
>  		return type_info;
>  	}
>  }
> +
> +
> +
> +#include <mercury_memory.h>

Generally we use the convention that `#include <...>' is used for
system header files, while `#include "..."' is used for header files
that we provide ourselves.  So I suggest that you use <...> instead
of "..." here.

Also it's generally better to put the #includes at the top of the file
rather than in the middle.

> +Word
> +MR_make_partially_permanent(Word term, Word *type_info, Word *lower_limit)
> +{

Hmm, "partially permanent" sounds like a bit of a contraction.
How about calling it `MR_make_long_lived', or perhaps `MR_extend_lifetime'?

> +	MemoryZone *old_heap_zone;
> +	Word *old_hp;
...
> +	old_heap_zone = heap_zone;
> +	old_hp = MR_hp;
...
> +	heap_zone = global_heap_zone;
> +	global_heap_zone = old_heap_zone;
> +	MR_hp = global_hp;
> +	global_hp = old_hp;
> +
> +	save_transient_registers();
> +	result = deep_copy(term, type_info, lower_limit, old_heap_zone->top);
> +	restore_transient_registers();
> +
> +	old_heap_zone = global_heap_zone;
> +	global_heap_zone = heap_zone;
> +	heap_zone = old_heap_zone;
> +	old_hp = global_hp;
> +	global_hp = MR_hp;
> +	MR_hp = old_hp;

Some comments here about what the code is supposed to be doing would be
helpful.

I think the idea here is to swap the heap_zone (and hp) with the
global_heap_zone (and global_hp), do the copy, and then swap them
back.  But I don't understand why you need to save the old_heap_zone
and old_hp.

Oh, after staring at it a bit more I see that `old_heap_zone' and
`old_hp' are just being used as temporary variables so that you can
do the swap.  Oh, and you also use old_heap_zone->top in the call to
deep_copy().

It might be clearer to name those variables `tmp_heap_zone' and `tmp_hp'
and to use `global_heap_zone->top' instead of `old_heap_zone->top'
in the call to deep_copy(). 

So I suggest that you
	- s/old_heap_zone->top/global_heap_zone->top/
	- s/old_/tmp_/g
	- move the two lines which save the old_heap_zone and old_hp next
	  to the group of four below them which use those values,
	- add a comment next to those six lines 
          saying "/* temporarily swap the heap with the global heap */"
	- add a comment above the call to deep_copy() saying
          "/* copy values from the heap to the global heap */"
	- add a comment to the six lines below saying
	  "/* swap the heap and global heap back again */"

The code you have would work fine, but I think these suggestions would make
it a bit easier to read.

Something else that might improve readability would be to define
a swap() macro

	#define swap(val1, val2, type) 					\
		do {							\
			(type) swap_tmp;				\
			swap_tmp = (val1);				\
			(val1) = (val2);				\
			(val2) = swap_tmp;				\
		} while(0)

and to then use two calls to swap() instead of those six lines of code.

> Index: runtime/mercury_deep_copy.h
...
> +/* MR_make_permanent:
> +**
> +**	Returns a copy of term that can safely be stored in C's memory
> +**	somewhere even if it may be accessed after failure.

This is not quite right.  Or at least, I'm not sure how this is going
to interact with the accurate garbage collector.  I think the accurate
garbage collector should collect the global heap too, not just the
ordinary heap; otherwise any allocations on the global heap will be
memory leaks.  But the accurate garbage collector will not scan the C
stack or C global variables, so you won't be able to safely store
pointers to Mercury terms in those areas, at least not without
explicitly registering all such pointers (via some interface not yet
designed).

So I would suggest that this be reworded as follows:

	Returns a copy of term that can safely be accessed even after
	Mercury execution has backtracked past the point at which
	the term was allocated.

> NEW FILE: extras/references/nb_reference.m
...
> %  This type is not really used.  I'd rather define this as c_pointer, but
> %  there's a bug in the Mercury compiler that makes that not work.
> :- type nb_reference(T) ---> nb_reference(T).

I suggest you define this as

	:- type nb_reference(T) ---> nb_reference(c_pointer).

This has the same representation as `c_pointer',
but still avoids the aforementioned compiler bug.

(The same comment applies to reference.m as well as reference_nb.m.)

> NEW FILE: extras/references/nb_reference.m
...
> :- pragma inline(new_reference/2).
> :- pragma c_code(new_reference(X::in, Ref::out), will_not_call_mercury, "
> 	incr_hp(Ref, (sizeof(reference)+sizeof(Word)-1)/sizeof(Word));
> 	((refptr)Ref)->value = (void *)X;
> 	((refptr)Ref)->id = MR_current_choicepoint_id();
> ").

I suggest you put some whitespace around the arithmetic operations, i.e.
 	incr_hp(Ref, (sizeof(reference) + sizeof(Word) - 1) / sizeof(Word));

> NEW FILE: extras/references/scoped_update.m
...
> %  Occasionally one wants to use impose some scoping on non-backtrackable
> %  changes to some memory.  That is, one wants to implicitly give a given
> %  memory location two possibly different values:  the value it has inside a
> %  certain scope, and the value it has outside.

I suggest "... give a specified memory location ..."
rather than the somewhat awkward ".. give a given memory location ...".

Also, I'd add the following comment at the top:

	This module, together with `references.m', provide a way
	of implementing dynamically scoped global variables.

> :- pragma c_header_code("
> #include <mercury_trail.h>

s/<...>/"..."/

> /*
>  *  To handle the scoping, we use a scope_handle data structure, which
>  *  holds both the value inside and outside the scope.  Then we have
>  *  four functions to handle entering and leaving the scope both
>  *  forwards (on success) and backwards (on failure).  The user only
>  *  needs to think about the forwards versions; the backwards
>  *  functions are installed as function trail entries, and so are
>  *  automatically called at the right time.
>  */

Our standard convention is to use

	/*
	** ...
	*/

rather than

	/*
	 *  ...
	 */

> typedef struct {
>         Word *var;
>         Word insideval;
>         Word outsideval;
> } *scope_handle;

Our standard naming convention for type names in C is to use
"MixedCaseIdentifiers".  So this should be `ScopeHandle'
rather than `scope_handle'.

> void exit_scope(scope_handle handle);
> void enter_scope_failing(scope_handle handle, MR_untrail_reason reason);
> void exit_scope_failing(scope_handle handle, MR_untrail_reason reason);
> 
> 
> #include <stdio.h>
> 
> void show_handle(const char *msg, const scope_handle handle);

To avoid name clashes, these functions should either be declared static,
or their names should be prefixed with some unique prefix.

Declaring them static may break if the file is compiled with
`--split-c-functions' or `--intermodule-optimization',
so I suggest you go with the latter approach.

The unique prefixes that we have been using are

	MR_	Mercury Runtime
	ML_	Mercury Library

For things in `extras', we could either use `ME_' or we could just
(re)use `ML_'.  I suggest we go with `ME_'.

> :- pragma c_code(exit_scope(Handle::mdi), will_not_call_mercury, "
> 	exit_scope((scope_handle) Handle);
> ").

I suggest you use `pragma import' here:

	:- pragma import(exit_scope(mdi), will_not_call_mercury, "exit_scope").

> ===================================================================
> NEW FILE: extras/references/samples/Mmakefile
...
> #----------------------------------------------------------------------------
> -#
> 
> #interpreter.out: interpreter interpreter.inp
> #	./interpreter interpreter.m < interpreter.inp > interpreter.out
> 

Just delete that bit.

> ===================================================================
> NEW FILE: extras/references/samples/max_of.m
> %  This module defines a predicate max_of/2 that is equivalent to
> %
> %	max_of(Pred, Max) :-
> %		unsorted_solutions(Pred, List),
> %		List = [First|Rest],
> %		foldl((pred(X,Y,Z) is det :- max(X,Y,Z)), Rest, First, Max).
> %
> %   but more efficient, because it avoids building a list of solutions.

I suggest you say "but which is potentially more efficient"
rather than "but more efficient".

> :- pragma promise_pure(max_of/2).
> 
> max_of(Pred, Max) :-
> 	impure new_nb_reference(no, Someref),
> 	impure new_nb_reference(0, Maxref),
> 	(   Pred(Value),
> 	    semipure value(Someref, Some),
> 	    (	Some = no ->
> 		    impure update(Someref, yes),
> 		    impure update(Maxref, Value)
> 	    ;	semipure value(Maxref, Prev),
> 		Value > Prev ->
> 		    impure update(Maxref, Value)
> 	    ;
> 		    true
> 	    ),
> 	    fail
> 	;   
> 		impure min_solutions(Pred, MinSolutions),
> 		(   MinSolutions = 1
> 		;   semipure value(Someref, yes)
> 		),
> 		semipure value(Maxref, Max)
> 	).

The indentation here is a bit inconsistent, and differs from
our usual indentation convention.  Could you please modify
it to use consistent 8-space (one-tab) indentation,
with conditions of if-then-elses either on a single line `( ... ->'
or indented at the same level as the then and else parts

	(
		...
	->
		...
	;
		...
	)

?  Thanks.

> ===================================================================
> NEW FILE: extras/references/tests/ref_test.m
...
> %  Here is an example of using references to implement non-backtrackable
> %  global variables.  We implement global variables with a function that
> %  returns a reference.
> 
> :- pragma c_header_code(" int globalvar = 0; ").

You shouldn't put a definition in `c_header_code'.
Only declarations should go in `c_header_code'.
So you should split this into:

	:- pragma c_header_code("extern int globalvar;").
	:- pragma c_code("int globalvar = 0;").

However, that's still not quite right, because you
take the address of this variable and cast it to the Mercury type
`nb_reference(int)':

> :- func globalvar = nb_reference(int).
> :- pragma inline(globalvar/0).
> :- pragma c_code(globalvar = (Ref::out), will_not_call_mercury, "
> 	Ref = (Word) &globalvar;
> ").

That won't quite work, because &globalvar is a pointer to a C int, not
a Mercury int.  For this to work, `globalvar' should have type
`Integer' rather than `int'.  (`Integer', which is defined in
"runtime/mercury_types.h", is the C type that corresponds to
Mercury's `int' type.)

--------------------

Apart from the stuff mentioned above, that change looks great -- thanks.
Can you please send another diff when you've addressed those comments?

Cheers,
	Fergus.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.



More information about the developers mailing list