[m-dev.] thread.spawn_native
Paul Bone
paul at bone.id.au
Fri Jun 6 15:39:00 AEST 2014
On Fri, Jun 06, 2014 at 02:03:32PM +1000, Peter Wang wrote:
> Hi,
>
> I propose to add a predicate that always spawns a "native" thread,
> where "native" means OS thread or whatever is provided by the
> environment we are running in.
>
> :- pred thread.spawn_native(pred(io, io)::in(pred(di, uo) is cc_multi),
> maybe_error::out, io::di, io::uo) is cc_multi.
>
> It would allow low-level C .par grades to spawn OS threads.
> This is important for applications which use blocking calls, or have
> resources bound to OS threads and not Mercury threads. These
> applications are currently restricted to hlc.par.gc for no good reason.
>
> On low-level C non-.par grades, thread.spawn_native would return an
> error.
>
> For all other backends, thread.spawn is currently equivalent to
> thread.spawn_native except that the former has no provision to fail.
> If someone was so inclined, they could make thread.spawn create
> user-mode threads in other grades, too.
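For concreteness, here is a rough sketch of how the proposed predicate might
be used, assuming the signature above is adopted (the module name and the
worker predicate are made up for illustration, and this is untested since
spawn_native does not exist yet):

:- module spawn_native_example.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is cc_multi.
:- implementation.
:- import_module maybe.
:- import_module string.
:- import_module thread.

main(!IO) :-
    % Ask for an OS thread; in a low-level C non-.par grade this would
    % give back error(_) rather than aborting.
    thread.spawn_native(worker, Result, !IO),
    (
        Result = ok,
        io.write_string("spawned a native thread\n", !IO)
    ;
        Result = error(Error),
        io.write_string("spawn_native failed: " ++ Error ++ "\n", !IO)
    ).

:- pred worker(io::di, io::uo) is cc_multi.

worker(!IO) :-
    % Any blocking calls made here tie up only this OS thread.
    io.write_string("hello from the native thread\n", !IO).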
I agree that blocking IO is a problem. However, I don't like this solution.
Unfortunately my alternative idea is probably more difficult. I propose to
add non-blocking IO by default (something I have wanted to add for a while),
with the runtime waking contexts (Mercury's lightweight threads) when the
IO they requested is finished. In the case that the OS does not provide a
non-blocking option, an IO thread should be created by the runtime, and this
should be transparent to the user. The IO thread will not run Mercury code;
it is not a Mercury Engine.
I would normally be okay with having the simpler idea implemented first
(in this case thread.spawn_native) so that at least certain things are
possible, even though they're not ideal. However, in this case adding
spawn_native is also difficult. Several parts of the parallel runtime
system assume that there is a fixed number of threads (Mercury Engines).
This is by design, as it makes several things more efficient, including
eliminating synchronisation when retrieving the array of engines. The areas
of the runtime system that depend on this include spark stealing, context
scheduling, and notification of engines when new parallel work is available
(sparks or contexts). I don't mind breaking this assumption if there's a
good reason, but a temporary solution isn't a good reason.
We should talk about the relative merits and costs of each design to find
the best one. My guess is that my proposal will be more tedious to
implement but may be better in the long run, though it may also be less
flexible.
> As a related proposal, we may add a thread handle type and thread.join
> so that the user can be certain when a thread is terminated and its
> resources cleaned up.
>
> :- type thread.
>
> :- pred thread.spawn_native(
> pred(thread, io, io)::in(pred(in, di, uo) is cc_multi),
> maybe_error(thread)::out, io::di, io::uo) is cc_multi.
>
> :- pred thread.join(thread::in, io::di, io::uo) is det.
>
> I am not sure whether thread.join should be mandatory to call.
>
> The thread handle would give us a convenient place to hang a thread
> identifier, useful for logging and debugging.
>
> The thread handle could have a slot to hold a return value for
> thread.join. Not strictly necessary.
I like this idea.
I also think that thread.join could be made mandatory per-thread. For
example, Java has the concept of normal threads and daemon threads. To
clean up a normal thread you must call join, but this is not necessary for a
daemon thread. Also, the process will only exit once there are zero normal
threads running, but any number of daemon threads may be active when the
process ends (though I don't think we need that behaviour).
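To make the handle idea concrete, here is a rough sketch using the quoted
signatures, with the same module scaffolding and imports as the earlier
sketch (again untested, as none of this exists yet):

main(!IO) :-
    thread.spawn_native(worker, Result, !IO),
    (
        Result = ok(Thread),
        % Wait until the worker has terminated and its resources have
        % been cleaned up.
        thread.join(Thread, !IO)
    ;
        Result = error(Error),
        io.write_string("spawn_native failed: " ++ Error ++ "\n", !IO)
    ).

:- pred worker(thread::in, io::di, io::uo) is cc_multi.

worker(_Self, !IO) :-
    io.write_string("running on a joinable native thread\n", !IO).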
In the interim, the new barrier library could be used to know when a thread
exits. If you want a thread to return a result as it exits, I suggest using
a channel.
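For example, something like the following should already be possible with
today's thread.spawn, if I remember the thread.channel interface correctly
(untested):

:- module channel_result_example.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is cc_multi.
:- implementation.
:- import_module int.
:- import_module thread.
:- import_module thread.channel.

main(!IO) :-
    channel.init(Chan, !IO),
    thread.spawn(worker(Chan), !IO),
    % Blocks until the worker sends its result just before it exits.
    channel.take(Chan, Result, !IO),
    io.write_int(Result, !IO),
    io.nl(!IO).

:- pred worker(channel(int)::in, io::di, io::uo) is cc_multi.

worker(Chan, !IO) :-
    channel.put(Chan, 6 * 7, !IO).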
Hrm, if we introduce join, we should also consider its behaviour when a
thread throws an exception that reaches its top level.
Thanks.
--
Paul Bone