[m-dev.] thread.spawn_native

Peter Wang novalazy at gmail.com
Fri Jun 6 17:12:32 AEST 2014


On Fri, 6 Jun 2014 15:39:00 +1000, Paul Bone <paul at bone.id.au> wrote:
> On Fri, Jun 06, 2014 at 02:03:32PM +1000, Peter Wang wrote:
> > Hi,
> > 
> > I propose to add a predicate that always spawns a "native" thread,
> > where "native" means OS thread or whatever is provided by the
> > environment we are running in.
> > 
> > :- pred thread.spawn_native(pred(io, io)::in(pred(di, uo) is cc_multi),
> >     maybe_error::out, io::di, io::uo) is cc_multi.
> > 
> > It would allow low-level C .par grades to spawn OS threads.
> > This is important for applications which use blocking calls, or have
> > resources bound to OS threads and not Mercury threads.  These
> > applications are currently restricted to hlc.par.gc for no good reason.
> > 
> > On low-level C non-.par grades, thread.spawn_native would return an
> > error.
> > 
> > For all other backends, thread.spawn is currently equivalent to
> > thread.spawn_native except that the former has no provision to fail.
> > If someone was so inclined, they could make thread.spawn create
> > user-mode threads in other grades, too.
> 
> I agree that blocking IO is a problem.  However, I don't like this solution.
> Unfortunately my alternative idea is probably more difficult.  I propose to
> add, and have wanted to add for a while, non-blocking IO by default,
> with the runtime waking contexts (Mercury's lightweight threads) when the
> IO they requested is finished.  In the case that the OS does not provide a
> non-blocking option, an IO thread should be created by the runtime, which
> should be transparent to the user.  The IO thread will not run Mercury code;
> it is not a Mercury Engine.

Yes, that model has some appeal (though simultaneous blocking calls
are possible, so you would need multiple IO threads).

However, some APIs carry state in OS threads.  To use those APIs you
must ensure that all calls into the API are performed on the same OS
thread, and furthermore you do not want any interference from other code
executing on that OS thread.  If you dedicate an OS thread exclusively
to execute the blocking foreign calls of a particular Mercury thread,
I think you may as well execute the Mercury thread in that OS thread.
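
To make the thread-affinity point concrete, here is a minimal sketch of a
session that keeps every foreign call on one OS thread.  It assumes that
thread.spawn_native exists with the handle-passing signature proposed at the
start of this thread; demo_api_call is a hypothetical stand-in for a foreign
API that carries per-thread state, and the mvar is only there so that main
waits for the session to finish.

```mercury
:- module native_session.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is cc_multi.

:- implementation.
:- import_module maybe.
:- import_module string.
:- import_module thread.
:- import_module thread.mvar.

:- pragma foreign_decl("C", "#include <stdio.h>").

main(!IO) :-
    mvar.init(Done, !IO),
    % ASSUMPTION: thread.spawn_native is the predicate proposed in this
    % thread, in its handle-passing form.
    thread.spawn_native(api_session(Done), Res, !IO),
    (
        Res = ok(_Thread),
        % Block until the session thread reports that it has finished.
        mvar.take(Done, Msg, !IO),
        io.write_string(Msg ++ "\n", !IO)
    ;
        Res = error(Err),
        % For example, a low-level C grade without .par.
        io.write_string("spawn_native failed: " ++ Err ++ "\n", !IO)
    ).

    % Every call in this predicate runs on the one native thread created
    % by spawn_native, so state the API keeps in that thread stays valid.
:- pred api_session(mvar(string)::in, thread::in, io::di, io::uo)
    is cc_multi.

api_session(Done, _Thread, !IO) :-
    demo_api_call("open", !IO),
    demo_api_call("work", !IO),
    demo_api_call("close", !IO),
    mvar.put(Done, "session finished", !IO).

    % Hypothetical binding; it only prints the name of the call.
:- pred demo_api_call(string::in, io::di, io::uo) is det.

:- pragma foreign_proc("C",
    demo_api_call(Name::in, _IO0::di, _IO::uo),
    [will_not_call_mercury, promise_pure, thread_safe, tabled_for_io],
"
    /* A real binding would call the thread-affine library here. */
    printf(""%s\\n"", Name);
").
```

With thread.spawn in a low-level C grade the same session could migrate
between engines, which is exactly what such an API cannot tolerate.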

> I would normally be okay with having the simpler idea implemented first
> (in this case thread.spawn_native) so that at least certain things are
> possible, even though they're not ideal.  However in this case adding
> spawn_native is also difficult.  Several parts of the parallel runtime
> system assume that there are a fixed number of threads (Mercury Engines).
> This is by design as it makes several things more efficient, including
> eliminating synchronisation when retrieving the array of engines.  The areas
> of the runtime system that depend on this include spark stealing, context
> scheduling, and notification of engines when new parallel work is available
> (sparks or contexts).  I don't mind breaking this assumption if there's a
> good reason, but a temporary solution isn't a good reason.

I'll accept that it would be non-trivial now.  I don't see conceptually why
threads for parallel execution should overlap with explicit concurrency
threads.

I also think it's not only a temporary solution (and there are degrees
of "temporary").  Even if we eventually move to your preferred model,
the API issue would still remain.

> We should talk about the relative merits and costs of each design to find
> the best one.  My guess is that my proposal will be more tedious to
> implement but may be better in the long run; it may also be less flexible.
> 
> 
> > As a related proposal, we may add a thread handle type and thread.join
> > so that the user can be certain when a thread is terminated and its
> > resources cleaned up.
> > 
> > :- type thread.
> > 
> > :- pred thread.spawn_native(
> >     pred(thread, io, io)::in(pred(in, di, uo) is cc_multi),
> >     maybe_error(thread)::out, io::di, io::uo) is cc_multi.
> > 
> > :- pred thread.join(thread::in, io::di, io::uo) is det.
> > 
> > I am not sure whether thread.join should be mandatory to call.
> > 
> > The thread handle would give us a convenient place to hang a thread
> > identifier, useful for logging and debugging.
> > 
> > The thread handle could have a slot to hold a return value for
> > thread.join.  Not strictly necessary.
> 
> I like this idea.
> 
> I also think that thread.join could be made mandatory per-thread.  For
> example, Java has the concept of normal threads and daemon threads.  To
> clean up a normal thread you must call join, but this is not necessary for a
> daemon thread.  Also the process will only exit once there are zero normal
> threads running, but any number of daemon threads may be active when the
> process ends (but we don't need that behaviour (I think)).

Sounds pretty ugly.

> 
> In the interim the new barrier library could be used to know when a thread
> exits.  If you want a thread to return a result as it exits I suggest using
> a channel (in the interim).
> 
> Hrm, if we introduce join, we should also consider its behaviour if a thread
> throws an exception to its top level.

True, it gives us a place to store an exception result.
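
As a rough sketch of what such a result slot could look like with what the
library already offers, the following emulates join using an mvar that holds
an exception_result.  The names run_into_slot and worker are made up for the
example, thread.join itself is not used because it is only a proposal here,
and the program assumes a grade in which thread.spawn is supported.

```mercury
:- module join_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is cc_multi.

:- implementation.
:- import_module exception.
:- import_module int.
:- import_module thread.
:- import_module thread.mvar.

main(!IO) :-
    mvar.init(Slot, !IO),
    thread.spawn(run_into_slot(worker, Slot), !IO),
    % Taking from the slot stands in for the proposed thread.join:
    % it blocks until the thread has stored its outcome.
    mvar.take(Slot, Result, !IO),
    (
        Result = succeeded(Value),
        io.write_string("thread returned ", !IO),
        io.write_int(Value, !IO),
        io.nl(!IO)
    ;
        Result = exception(Excp),
        io.write_string("thread threw ", !IO),
        io.print(Excp, !IO),
        io.nl(!IO)
    ;
        Result = failed,
        io.write_string("thread failed\n", !IO)
    ).

    % Run Body as the top level of a thread and store either its return
    % value or the exception it threw.
:- pred run_into_slot(pred(T, io, io)::in(pred(out, di, uo) is det),
    mvar(exception_result(T))::in, io::di, io::uo) is cc_multi.

run_into_slot(Body, Slot, !IO) :-
    exception.try_io(Body, Result, !IO),
    mvar.put(Slot, Result, !IO).

    % Hypothetical thread body that just computes a value.
:- pred worker(int::out, io::di, io::uo) is det.

worker(Answer, !IO) :-
    Answer = 6 * 7.
```

A thread handle with a built-in slot would let thread.join do the same thing
without the caller having to create and pass the mvar.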

Peter


