[m-dev.] For review: fork/4

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Jun 25 18:15:46 AEST 1998


On 25-Jun-1998, Thomas Charles CONWAY <conway at kryten.cs.mu.OZ.AU> wrote:
> Add a new HO construct to io.m: fork/4 which takes to closures that it
> executes concurrently wrt an io__state pair.

s/to closures/two closures/

> boehm_gc/linux_threads.c:
> 	bugfix: remove an errant '*' (dereference).

Did you check boehm_gc/irix_threads.c to see whether it has the same
problem?

> runtime/mercury_engine.{c,h}:
> 	add a new field to the MercuryEngine strucutre which is used to

s/strucutre/structure/

(Also it's generally good style to start sentences with a capital letter.)

> Index: boehm_gc/linux_threads.c
> ===================================================================
> RCS file: /home/staff/zs/imp/mercury/boehm_gc/linux_threads.c,v
> retrieving revision 1.2
> diff -u -r1.2 linux_threads.c
> --- 1.2	1998/06/11 02:40:48
> +++ linux_threads.c	1998/06/22 03:48:43
> @@ -543,7 +543,7 @@
> -    result = (*(si -> start_routine))(si -> arg);
> +    result = (si -> start_routine)(si -> arg);

That shouldn't make any difference. 
If it does, the C compiler is broken.
I would quite surprised if gcc is really broken in this respect.
If it really is, we should reproduce the bug in a small test
case and send them a bug report.

If anything the old code should be more portable --
the implicit dereference of function pointers, which the
new code uses, is something that I think was added by ANSI C,
and it may be that some old K&R C compilers don't support it
and would thus report a type error.  Since the boehm gc's
portability target includes non-ANSI C compilers, the change
is probably not a good idea.

> -pragma_c_gen__ordinary_pragma_c_code(CodeModel, MayCallMercury,
> +pragma_c_gen__ordinary_pragma_c_code(CodeModel, Attributes,
>  		PredId, ProcId, ArgVars, ArgDatas, OrigArgTypes,
>  		C_Code, Context, Code) -->
> +	
> +	%
> +	% Extract the attributes
> +	%
> +	{
> +		list__member(recursive(will_not_call_mercury), Attributes)
> +	->
> +		MayCallMercury = will_not_call_mercury
> +	;
> +		MayCallMercury = may_call_mercury
> +	},
> +	{
> +		list__member(threadsafe(threadsafe), Attributes)
> +	->
> +		ThreadSafe = threadsafe
> +	;
> +		ThreadSafe = not_threadsafe
> +	},
> +
...
> +	%
> +	% Extract the may_call_mercury attribute
> +	%
> +	{
> +		list__member(recursive(will_not_call_mercury), Attributes)
> +	->
> +		MayCallMercury = will_not_call_mercury
> +	;
> +		MayCallMercury = may_call_mercury
> +	},

I detect some duplicate code here.
You should abstract that out into subroutines.

> +++ prog_data.m	1998/06/23 23:23:01
> @@ -98,9 +98,11 @@
>  
>  	;	c_code(string)
>  
> -	;	c_code(may_call_mercury, sym_name, pred_or_func,
> +	;	c_code(list(pragma_c_code_flag), sym_name, pred_or_func,
>  			list(pragma_var), varset, pragma_c_code_impl)

In general, wouldn't it be better to have a set of attributes rather
than a list?

Best of all, make it an abstract data structure.

> -			% Whether or not the C code may call Mercury,
> +			% List of C code attributes:
> +			%	whether or not the C code may call Mercury,
> +			%	whether or not hte C code is thread-safe,

s/hte/the/

This documentation is likely to rot when we add new attributes,
so I suggest inserting "e.g.":
			
			% List of C code attributes, e.g.:
			...

> +:- type pragma_c_code_flag
> +	--->	recursive(may_call_mercury)
> +	;	threadsafe(threadsafe)
> +	.

What happens if the list contains both recursive(may_call_mercury)
and recursive(will_not_call_mercury)?

This seems like an inefficient data structure.  Once you get to the
HLDS, it might make more sense, I think, to use just a list or set
of

	:- type pragma_c_code_flag
		--->	will_not_call_mercury
		;	threadsafe
		.

rather than having the data structure correspond directly to the
source syntax.

Also I suggest you use the term `attribute' rather than `flag'
in the prog_data representation, since the next thing we will
want to add to this is `language'

		...
		;	language(language)
		.

	:- type language
		--->	c
		;	c_plus_plus
		;	fortran
		...

Although maybe it would be best to seperate out flags from
attributes once you get to the HLDS, since the set of flags
could be represented as a single integer bitmap.

> +	% For pragma c_code, if threadsafe execution is being used, then
> +	% we need to put a mutex around the code unless it's declared to
> +	% be threadsafe.
> +:- type threadsafe
> +	--->	not_threadsafe
> +	;	threadsafe.

It's not quite clear what "the code" refers to.
I suggest

	% If threadsafe execution is enabled, then
	% we need to put a mutex around the C code for each
	% `pragma c_code' declaration, unless it's declared to
	% be threadsafe.

> Index: compiler/prog_io_pragma.m
> ===================================================================
> RCS file: /home/staff/zs/imp/mercury/compiler/prog_io_pragma.m,v
> retrieving revision 1.14
> diff -u -r1.14 prog_io_pragma.m
> --- 1.14	1998/05/15 07:07:32
> +++ prog_io_pragma.m	1998/06/24 00:55:52
> @@ -105,34 +105,34 @@
>  	->
>  	    % XXX we should issue a warning; this syntax is deprecated.
>  	    % Result = error("pragma c_code doesn't say whether it can call mercury", PredAndVarsTerm)
> -	    MayCallMercury = will_not_call_mercury,
> +		    % may_call_mercury is a conservative default.
> +	    Flags = [recursive(may_call_mercury),
> +			threadsafe(not_threadsafe)],

You should not make this change, because it will break existing code
in the Mercury library.  `may_call_mercury' is unfortunately not a
conservative default for code that modifies MR_hp.

In addition it would cause a major efficiency hit to existing code.

So, either fix all the existing code in the Mercury library first,
or don't make this change.  I'd prefer the former, but the latter
would be easier ;-)

> diff -u -r1.97 reference_manual.texi
> --- 1.97	1998/06/12 07:05:35
> +++ reference_manual.texi	1998/06/24 00:05:01
> +
> +In general, you may use either a single attribute or a list of attributes
> +in place of @samp{may_call_mercury} or @samp{will_not_call_mercury}.
> +In the current release of Mercury there are two attributes:

I suggest you replace that with

	All Mercury implementations must support the attributes described
	below.  They may also support additional attributes.

	The attributes which must be supported by all implementations
	are as follows:

(and then delete the occurrences of "which" below):

> + at table @asis
> +
> + at item @samp{may_call_mercury}/@samp{will_not_call_mercury} which declares
> +whether or not execution inside this C code may call back into Mercury
> +or not.
> +
> + at item @samp{threadsafe}/@samp{not_threadsafe} which declares whether or not
> +it is safe for multiple threads to execute this C code concurrently.
> +C code that is not threadsafe has code inserted around it to obtain
> +and release a mutex.  All non-threadsafe C code shares a single mutex.

This only happens in multithreaded environments.
The wording should reflect that.  Also the wording should
describe the requirements on implementations, rather than just
describing what happens with the current implementation.

> +:- pred io__fork(pred(io__state, io__state), pred(io__state, io__state),
> +		io__state, io__state).
> +:- mode io__fork(pred(di, uo) is cc_multi, pred(di, uo) is cc_multi, di, uo)
> +		is cc_multi.
> +%	io__fork(Pred1, Pred2, State0, State).
> +%		`State' is the io__state that results from an [arbitary]
> +%		interleaving of the state modifications made by `Pred1' and
> +%		`Pred2'. That is, io__fork executes `Pred1' and `Pred2'
> +%		concurrently (though not necessarily in parallel).
> +%		This predicate is only available in thread-safe grades.

1.  What's a thread-safe grade?
    You should refer the reader to the "Compilation model"
    section of the Mercury users guide.
    (And you should make sure there's something for them to find there
    when the follow the cross-reference!  Currently the options for
    thread-safe execution are not yet documented.)

2.  Since this predicate isn't available in all grades, it should
    probably go in a different library rather than in the standard library.

3.  Some readers will probably think this does something like C's fork().
    You should explain that it does not fork off a new process.
    In fact, the relationship between Mercury threads and Posix threads
    will need to be explained, in case users want to mix C code using
    threads with Mercury code using fork/4.

    This documentation will need to be thought about carefully
    to avoid overly constraining future implementations.

> +io__fork(This, That) -->
> +	make_semaphore(0, Sem),
> +	io__fork2(do_then_signal(Sem, This), do_then_signal(Sem, That)),
> +	wait_semaphore(Sem),
> +	wait_semaphore(Sem).
>
> +:- pred io__fork2(pred(io__state, io__state), pred(io__state, io__state),
> +		io__state, io__state).
> +:- mode io__fork2(pred(di, uo) is cc_multi, pred(di, uo) is cc_multi, di, uo)
> +		is cc_multi.

s/fork2/fork_2/g

> +:- pragma no_inline(io__fork2/4).

Please document why not.

> +:- pragma c_code(io__fork2(Pred1::(pred(di, uo) is cc_multi),
> +		Pred2::(pred(di, uo) is cc_multi), IO0::di, IO::uo),
> +		may_call_mercury, "{
> +#ifdef	MR_THREAD_SAFE
> +	MR_ThreadGoal	*child;
> +
> +	incr_hp_atomic((Word *) child,
> +		round_up(sizeof(MR_ThreadGoal), sizeof(Word)));
> +
> +	child->func = (void (*)(void *)) ML_io_call_forked_goal;

I think it would be a good idea to avoid using a cast here.

The reason is that a compiler could conceivably pass arguments
of type `void *' in a different register to arguments of
type `Word' (e.g. on m68k, where there are different
registers for integers and pointers).

Instead, add a new C function called say ML_io_call_forked_goal_from_thread
that takes a `void *' and just calls ML_io_call_forked_call
after casting the argument to Word.

> +:- pragma c_header_code("
> +#ifdef	MR_THREAD_SAFE
> +	typedef struct {
> +		MercuryLock	lock;
> +		MercuryCond	cond;
> +		int		count;
> +	} ML_Semaphore;
> +#endif
> +").
> +
> +:- type semaphore == c_pointer.
> +
> +:- pred make_semaphore(int, semaphore, io__state, io__state).
> +:- mode make_semaphore(in, out, di, uo) is det.
> +
> +:- pragma no_inline(make_semaphore/4).

All this stuff should go in a separate library for
concurrency/multithreading, I think.  I suggest a new directory
`extras/threads', although I'm open to suggestions for better names.

It would make sense to document the semaphore stuff and include
it in the interface.  Indeed, semaphores could go in a different module
than io__fork.

You should document the reason for the `pragma no_inline' declarations.

The semaphore implementation using a mutex and a condition variable seems
a bit inefficient; you really ought to use Posix (1003.1b) semaphores
if they are available (of course you'd need to autoconf that).

> diff -u -r1.5 mercury_context.c
> --- 1.5	1998/06/15 07:00:03
> +++ mercury_context.c	1998/06/24 06:41:21
> @@ -48,6 +48,8 @@
>  	free_context_list_lock = make(MercuryLock);
>  	pthread_mutex_init(free_context_list_lock, MR_MUTEX_ATTR);
>  
> +	pthread_mutex_init(&MR_global_lock, MR_MUTEX_ATTR);

Why not just use a static initializer (PTHREAD_MUTEX_INITIALIZER)?

>  void 
>  call_engine_inner(Code *entry_point)
>  {
> -#ifdef	MR_THREAD_SAFE
> -	MercuryThread saved_owner_thread;
> -#endif

Please add a comment warning about the use of local variables
in this function.

> +create_thread_2(void *goal0)
> +{
> +	MR_ThreadGoal *goal;
> +
> +	goal = (MR_ThreadGoal *) goal0;
> +	if (goal != NULL)
> +	{

The curly brace should be on same line as the if.

> +void
> +init_thread(MR_when_to_use when_to_use)
...
> +	switch (when_to_use) {
> +		case MR_use_later :
> +			call_engine(ENTRY(do_runnext));
> +
> +			destroy_engine(eng);
> +			return;

This behaviour (particularly the destroy_engine() call)
should be documented.

> +		case MR_use_now :
> +			return;
> +		
> +		default:
> +			fatal_error("init_thread was passed a bad value");
> +			return;

Delete the `return'.  It's unreachable code.

...
> Index: runtime/mercury_thread.h
>  
> +  #define MR_OBTAIN_GLOBAL_C_LOCK()	MR_mutex_lock(&MR_global_lock, \
> +						"pragma c code");
> +
> +  #define MR_RELEASE_GLOBAL_C_LOCK()	MR_mutex_unlock(&MR_global_lock, \
> +						"pragma c code");
> +

Some documentation here pointing to the documentation at the definition
of MR_global_lock about its purpose would be helpful.

>    #if defined(MR_DIGITAL_UNIX_PTHREADS)
>      #define MR_GETSPECIFIC(key) 	({		\
>  		pthread_addr_t gstmp;			\
> @@ -64,10 +70,26 @@
>      #define	MR_KEY_CREATE		pthread_key_create
>    #endif
>  
> -  MercuryThread	*create_thread(int x);
> +  typedef struct {
> +	  void	(*func)(void *);
> +	  void	*arg;
> +  } MR_ThreadGoal;
> +  MercuryThread	*create_thread(MR_ThreadGoal *);

You should document create_thread().

Also it should be MR_create_thread() and MR_destroy_thread().
I suppose that change can be made later if you want,
but this seems like a good time.

> +	/*
> +	** MR_global_lock is a mutex for ensuring that only one non-threadsafe
> +	** piece of pragma c code executes at a time. If `not_threadsafe' is
> +	** given or `threadsafe' is not given in the attributes of a pragma
> +	** c code definition of a predicate, then the generated code will
> +	** obtain this lock before executing, then release it.

I suggest "...before executing the C code fragment, and then
release it afterwards.".

> +	** XXX we should emit a warning if may_call_mercury and not_threadsafe
> +	** (the defaults) are specified since if you obtain the lock then
> +	** call back into Mercury deadlock could result.

Ouch.

--------------------

I'd like to see another diff, of course.

Cheers,
	Fergus.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.



More information about the developers mailing list