[m-dev.] thread.spawn_native

Peter Wang novalazy at gmail.com
Tue Jun 10 16:34:47 AEST 2014


On Fri, 6 Jun 2014 18:07:34 +1000, Paul Bone <paul at bone.id.au> wrote:
> On Fri, Jun 06, 2014 at 05:12:32PM +1000, Peter Wang wrote:
> > On Fri, 6 Jun 2014 15:39:00 +1000, Paul Bone <paul at bone.id.au> wrote:
> > > On Fri, Jun 06, 2014 at 02:03:32PM +1000, Peter Wang wrote:
> > > > Hi,
> > > > 
> > > > I propose to add a predicate that always spawns a "native" thread,
> > > > where "native" means OS thread or whatever is provided by the
> > > > environment we are running in.
> > > > 
> > > > :- pred thread.spawn_native(pred(io, io)::in(pred(di, uo) is cc_multi),
> > > >     maybe_error::out, io::di, io::uo) is cc_multi.
> > > > 
> > > > It would allow low-level C .par grades to spawn OS threads.
> > > > This is important for applications which use blocking calls, or have
> > > > resources bound to OS threads and not Mercury threads.  These
> > > > applications are currently restricted to hlc.par.gc for no good reason.
> > > > 
> > > > On low-level C non-.par grades, thread.spawn_native would return an
> > > > error.
> > > > 
> > > > For all other backends, thread.spawn is currently equivalent to
> > > > thread.spawn_native except that the former has no provision to fail.
> > > > If someone was so inclined, they could make thread.spawn create
> > > > user-mode threads in other grades, too.
> > > 
> > > I agree that blocking IO is a problem.  However I don't like this solution.
> > > Unfortunately my alternative idea is probably more difficult.  I propose to
> > > add, and have wanted to add this for a while, non blocking IO by default,
> > > with the runtime waking contexts (Mercury's light weight threads) when the
> > > IO they requested is finished.  In the case that the OS does not provide a
> > > nonblocking option, an IO thread should be created by the runtime, which
> > > should be transparent to the user.  The IO thread will not run Mercury code,
> > > it is not a Mercury Engine.
> > 
> > Yes, that model has some appeal.  (though simultaneous blocking calls
> > are possible so you need multiple IO threads)
> 
> Yes, I was thinking of creating them on demand to do blocking work (or
> caching them as a different type of 'worker' thread).
> 
> > However, some APIs carry state in OS threads.  To use those APIs you
> > must ensure that all calls into the API are performed on the same OS
> > thread, and furthermore you do not want any interference from other code
> > executing on that OS thread.  If you dedicate an OS thread exclusively
> > to execute the blocking foreign calls of a particular Mercury thread,
> > I think you may as well execute the Mercury thread in that OS thread.
> 
> Oh, sorry, I forgot that part of your original message.  My separate
> proposal for this (which I've talked to Zoltan about) is to add a new
> annotation to foreign calls to APIs that have this requirement.  When such a
> foreign call is made the runtime system will annotate the context to say
> that it must remain on that Mercury engine, and will not be migrated, even
> after the call returns.  This should be easy enough to implement.

The proposal is unclear to me.

Let's say the annotation is called `may_not_migrate'.  If a Mercury
context calls a `may_not_migrate' foreign proc, then it is permanently
tied to some Mercury engine (OS thread) from then on?  Presumably the
foreign proc is executed directly in that OS thread, and not rerouted to
an IO thread as in the case of blocking foreign procs.  Otherwise, it
would not solve the API issue.

Where does the Mercury engine come from?  If the number of Mercury
engines is fixed, doesn't it mean that a blocking `may_not_migrate'
foreign proc will prevent progress of other Mercury threads?  Or can
`may_not_migrate' foreign procs not block?

Can two Mercury contexts be bound to the same Mercury engine?
Presumably so, if the number of Mercury engines is fixed.
But then one context could stomp on the thread-local state expected
by another context.
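
To make the concern concrete, here is a minimal C sketch.  The `api_*'
functions are hypothetical stand-ins for a foreign API that keeps its
current handle in OS thread-local storage; they are not part of Mercury
or of any real library.  The first case interleaves two logical contexts
on one OS thread, the second gives the other context its own OS thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical foreign API: the "current handle" is per OS thread. */
static _Thread_local int api_current = 0;

static void api_make_current(int handle) { api_current = handle; }
static int api_get_current(void) { return api_current; }

/* Two logical contexts, A and B, multiplexed on the SAME OS thread. */
static int same_thread_case(void)
{
    api_make_current(1);        /* context A selects handle 1 */
    api_make_current(2);        /* context B is scheduled on this thread */
    return api_get_current();   /* A resumes and finds B's handle */
}

/* Context B running on its own OS thread. */
static void *context_b(void *arg)
{
    (void) arg;
    api_make_current(2);
    return NULL;
}

/* Context A keeps its OS thread; context B gets a separate one. */
static int separate_thread_case(void)
{
    pthread_t t;

    api_make_current(1);        /* context A selects handle 1 */
    if (pthread_create(&t, NULL, context_b, NULL) != 0) {
        return -1;
    }
    pthread_join(t, NULL);
    return api_get_current();   /* A's handle is untouched */
}
```

In the first case context A ends up looking at handle 2, in the second
it still sees handle 1.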

> > > I would normally be okay with having the simpler idea implemented first
> > > (in this case thread.spawn_native) so that at least certain things are
> > > possible, even though they're not ideal.  However in this case adding
> > > spawn_native is also difficult.  Several parts of the parallel runtime
> > > system assume that there are a fixed number of threads (Mercury Engines).
> > > This is by design as it makes several things more efficient including
> > > eliminating synchronisation when retrieving the array of engines.  The areas
> > > of the runtime system that depend on this include spark stealing, context
> > > scheduling, and notification of engines when new parallel work is available
> > > (sparks or contexts).  I don't mind breaking this assumption if there's a
> > > good reason, but a temporary solution isn't a good reason.
> > 
> > I'll accept that it would be non-trivial now.  I don't see conceptually why
> > threads for parallel execution should overlap with explicit concurrency
> > threads.
> 
> What if your explicit concurrency thread executes a parallel conjunction, or
> is executed by a parallel conjunction?  We at least have to accept the
> possibility.

Right, any explicit concurrency thread (of which the main thread is one)
should be able to make goals available for execution by the parallel
execution workers.  And the problem is that a parallel conjunct could
(indirectly) call a foreign proc which has a requirement to be executed
on a particular Mercury engine (OS thread).

It seems solvable.  When the parallel worker hits a foreign proc (say,
with the `may_not_migrate' annotation) it should be able to make the
work available to be picked up by the original Mercury engine.  I think
that would be as if the parallel conjunct never left the original
Mercury engine?
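
A rough C sketch of that hand-off, using plain pthreads rather than the
actual Mercury runtime structures.  All names here (`mailbox',
`run_on_owner', and so on) are invented for illustration: the owning OS
thread services a one-slot mailbox, and any other thread that hits a
thread-bound call posts it there and waits until the owner has run it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* One thread-bound call waiting to be run by the owning thread. */
typedef struct {
    void (*fn)(void *);
    void *arg;
    bool done;                  /* set by the owner after fn returns */
} request;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    request *pending;           /* single slot; NULL when empty */
    bool quit;
} mailbox;

static void mailbox_init(mailbox *mb)
{
    pthread_mutex_init(&mb->lock, NULL);
    pthread_cond_init(&mb->cond, NULL);
    mb->pending = NULL;
    mb->quit = false;
}

/* Body of the owning OS thread: run posted requests until told to quit. */
static void *owner_loop(void *p)
{
    mailbox *mb = p;

    pthread_mutex_lock(&mb->lock);
    for (;;) {
        while (mb->pending == NULL && !mb->quit) {
            pthread_cond_wait(&mb->cond, &mb->lock);
        }
        if (mb->pending == NULL) {
            break;              /* quit requested and nothing left to run */
        }
        request *req = mb->pending;
        mb->pending = NULL;
        pthread_mutex_unlock(&mb->lock);
        req->fn(req->arg);      /* executes on the owner's OS thread */
        pthread_mutex_lock(&mb->lock);
        req->done = true;
        pthread_cond_broadcast(&mb->cond);
    }
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Called from any other thread: run fn(arg) on the owner, then return. */
static void run_on_owner(mailbox *mb, void (*fn)(void *), void *arg)
{
    request req = { fn, arg, false };

    pthread_mutex_lock(&mb->lock);
    while (mb->pending != NULL) {       /* wait for the slot to be free */
        pthread_cond_wait(&mb->cond, &mb->lock);
    }
    mb->pending = &req;
    pthread_cond_broadcast(&mb->cond);
    while (!req.done) {                 /* wait for the owner to finish */
        pthread_cond_wait(&mb->cond, &mb->lock);
    }
    pthread_mutex_unlock(&mb->lock);
}

static void mailbox_quit(mailbox *mb)
{
    pthread_mutex_lock(&mb->lock);
    mb->quit = true;
    pthread_cond_broadcast(&mb->cond);
    pthread_mutex_unlock(&mb->lock);
}

/* Example thread-bound call: record which OS thread executed it. */
static void record_thread(void *arg)
{
    *(pthread_t *) arg = pthread_self();
}
```

The caller blocks for the duration of the call, so from the foreign
proc's point of view the work never left the owning thread.  A real
implementation would presumably suspend the context instead of blocking
the worker's OS thread.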

Blocking foreign procs should not be executed in a parallel worker
either.

> > I also think it's not only a temporary solution (and there are degrees
> > of "temporary").  Even if we eventually move to your preferred model,
> > the API issue would still remain.
> 
> Sorry, I forgot to mention the second proposal above.
> 
> > > 
> > > I also think that thread.join could be made mandatory per-thread.  For
> > > example, Java has the concept of normal threads and daemon threads.  To
> > > clean up a normal thread you must call join, but this is not necessary for a
> > > daemon thread.  Also the process will only exit once there are zero normal
> > > threads running, but any number of daemon threads may be active when the
> > > process ends (but we don't need that behaviour (I think)).
> > 
> > Sounds pretty ugly.
> 
> The benefit is that it allows developers to choose what behaviour they need.

Admittedly, everything else is going down at that point anyway, so it's
not as bad as a thread.kill procedure.

Peter
