[m-dev.] thread.spawn_native

Paul Bone paul at bone.id.au
Thu Jun 12 15:49:49 AEST 2014


On Wed, Jun 11, 2014 at 06:00:40PM +1000, Peter Wang wrote:
> On Wed, 11 Jun 2014 12:35:53 +1000, Paul Bone <paul at bone.id.au> wrote:
> > On Wed, Jun 11, 2014 at 12:20:43PM +1000, Peter Wang wrote:
> > > 
> > > The problem is when two Mercury contexts try to use the same library
> > > simultaneously.  They cannot rely on thread-local state to persist
> > > across calls:
> > > 
> > >     1. context-1 calls liba.set(1) -- performed on IO-thread-1
> > >     2. context-2 calls liba.set(2) -- also performed on IO-thread-1
> > >     3. context-1 calls liba.get -- performed on IO-thread-1
> > > 
> > 
> > Why doesn't each context use a handle or pointer to refer to its own set
> > of data in liba?  liba's API is very limited if it doesn't allow for this.
> 
> That's the problem.  If the API accepts context handles everywhere
> instead of using thread-local state then we wouldn't be having this
> discussion.
> 
> Here are a couple of examples:
> 
> libxml2: Although there are context handles, and it allows for parsing
> multiple documents simultaneously in different OS threads, it crashes if
> you try to parse multiple documents simultaneously (multiple Mercury
> contexts multiplexed onto the same OS thread).  The context handle
> probably does not encapsulate ALL the state.

To me that seems as if it's not thread safe, or at least doesn't meet my
expectations of "thread safety".  Specifically, all the state should be
captured in the handle to make it safe.

> OpenGL: you call glXMakeContextCurrent and from then on all GL calls
> from that OS thread work within that rendering context.
> Different threads may work with different contexts simultaneously.
> GL functions do not take a rendering context argument.

I had a feeling OpenGL might do this, but I didn't know there was a
glXMakeContextCurrent, so that's better than I was expecting.

I didn't realize OpenGL would crash if I switched OS threads - that's the
kind of example I'm interested in finding.  I haven't used OpenGL since 1.4,
and even then I didn't use it deeply.

> Allegro 5: sorry... I blame OpenGL
> 
> I'm sure I could find more examples easily.  Yes, the APIs are to blame.
> I guess the reason is that they start out with global state, then thread
> support is bolted on using thread-local state for source compatibility.
> 
> Only libxml2 is an *actual* problem right now.  We can work around it if
> we really care, but we'll just use hlc.par.gc.

So my main objection to spawn_native is that it requires creating and
destroying Mercury engines at runtime, which is difficult.  However,
spawning IO workers is much less of a problem.  So I propose that we add
support for IO workers to the runtime, plus library code so that the
programmer can manage IO workers for libraries with these requirements:

    :- type io_worker_thread.

        % Spawn a new IO worker thread.
        %
    :- pred spawn_io_worker(io_worker_thread::out, io::di, io::uo) is det.

        % Shut down the given IO worker thread.
        %
    :- pred end_io_worker(io_worker_thread::in, io::di, io::uo) is det.

This allows the programmer to manipulate IO workers like any other value;
for example, an IO worker can form part of the representation of a
library's state.
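
For example (a hypothetical binding; liba.new_handle is illustrative), a
library's state can bundle the worker that all of its foreign calls must
run on:

    :- type liba_state
        --->    liba_state(
                    ls_worker   :: io_worker_thread,
                    ls_handle   :: liba.handle
                ).

        % Spawn a worker dedicated to this library instance, then
        % create the library handle that will be used through it.
        %
    :- pred liba_init(liba_state::out, io::di, io::uo) is det.

    liba_init(liba_state(Worker, Handle), !IO) :-
        spawn_io_worker(Worker, !IO),
        liba.new_handle(Handle, !IO).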

Then we add something for foreign calls, maybe a pragma or a foreign call
attribute, so that the compiler and runtime system know that the foreign
call should be executed by a given IO worker.  The compiler and runtime
system are responsible for passing the data to the IO worker thread,
returning it from that thread, and waking the Mercury context when the
foreign call returns.
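
As a very rough sketch, a foreign call pinned to a worker might look
something like this (the io_worker/1 attribute is hypothetical; none of
this syntax is settled):

    :- pragma foreign_proc("C",
        liba_get(Worker::in, Value::out, IO0::di, IO::uo),
        [promise_pure, will_not_call_mercury,
            % Hypothetical attribute: the runtime system marshals this
            % call to Worker's thread and suspends the context until
            % the call returns.
            io_worker(Worker)],
    "
        Value = liba_get();
        MR_update_io(IO0, IO);
    ").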

This should be much easier than safely adding spawn_native.


> Anyway, this is a secondary issue.  The primary issue I want to solve is
> blocking functions.  Both spawn_native and IO workers can work.  I see
> spawn_native as much simpler, and also happens to solve the API issue.
> Your objection is that it may break assumptions in the implementation
> currently made for efficiency.  I'm willing to try it and see.

It sounds like you're saying that this shouldn't affect efficiency too
badly.  You are probably right, but my main concern is the effort required
to do this.


> On the other hand, I object to the complexity of IO workers and see no
> benefit.  Some form of helper threads may be required to augment async
> I/O in the future.  I don't think the two forms are necessarily the
> same: spawn_native can be an escape hatch when something is not possible
> in the helper thread, e.g. thread-local state, or blocking functions
> that call back into Mercury.

IO workers and asynchronous IO may be just as complex to implement as
spawn_native.  Still, I like that approach because asynchronous IO should
reduce the number of threads required to manage a large number of IO tasks.

However, allowing the number of Mercury engines to change at runtime
(spawn_native) has another benefit that I'd forgotten until now.  It allows
Mercury, given information from the OS, to adjust its demands on the
system's parallel processing resources.  When Apple announced Grand Central
Dispatch it caused Zoltan and me to think about how multiple parallel
programs, such as Mercury programs, interact when running on the same
hardware.  If you have four processors and two CPU-bound Mercury programs,
each Mercury program by default tries to use four processors, because
that's how many are detected.  However, it'd be a lot more efficient if the
programs didn't share the same four processors but instead used two
processors each (two native threads each), so that there are fewer context
switches.  These kinds of adjustments are best made at runtime as
situations change.  As far as we know, no OS provides this kind of
information, so this isn't a practical concern right now.  However, it is
another reason why allowing on-the-fly creation and destruction of Mercury
engines might be a good thing.

Given that, I think that removing this restriction and implementing
spawn_native may be easier than the combination of IO workers, asynchronous
IO, library support for IO workers and a foreign call annotation for context
pinning.

What are your goals WRT timeliness?  Depending on when you need it, I can
review the RTS code and remove this restriction.

Thanks.


-- 
Paul Bone


