[m-rev.] for review: Support dynamic creation of Mercury engines in low-level C parallel grades.
Peter Wang
novalazy at gmail.com
Wed Jun 25 11:47:37 AEST 2014
On Tue, 24 Jun 2014 19:42:50 +1000, Paul Bone <paul at bone.id.au> wrote:
> >
> > runtime/mercury_context.c:
> > runtime/mercury_context.h:
> > Rename MR_num_idle_engines to MR_num_ws_engines. It only counts
> > idle work-stealing engines.
>
> MR_num_ws_engines is the same name as above. I guess, and hope, that this
> comment is in error and these are seperate variables with distinct names.
That should be MR_num_idle_ws_engines.
> >
> > Rename MR_try_wake_an_engine to MR_try_wake_ws_engine
> > and restrict it to work-stealing engines.
>
> What if a context that has an exclusive engine becomes runnable, how do you
> wake it? I suggest instead parameterizing MR_try_wake_an_engine. Is it not
> valid to send MR_ENGINE_ACTION_CONTEXT to an exclusive engine? For example
> if the exclusive context blocked on a thread.channel or something?
try_wake_engine does that, which is different from MR_try_wake_ws_engine.
>
> > Rename MR_shutdown_all_engines to MR_shutdown_ws_engines
> > and restrict it to work-stealing engines.
>
> I was initially confused by this, i didn't realize that these engines would
> shutdown by themselves when they finnished their work. The changelog should
> reflect this.
Added a comment.
>
> > Make try_wake_engine and try_notify_engine decrement
> > MR_num_idle_ws_engines only for shared engines.
> >
> > In MR_do_idle, make exclusive engines bypass work-stealing
> > and skip to the sleep state.
> >
> > In MR_do_sleep, make exclusive engines ignore work-stealing advice
> > and abort the program if told to shut down.
> > Assert that a context with an exclusive_engine really is only loaded
> > by that engine.
> >
> > In MR_fork_new_child, make exclusive engines not attempt to wake
> > work-stealing engines. Its sparks cannot be stolen anyway.
> >
> > Make do_work_steal fail the attempt for exclusive engines.
> > There is one call where this might happen.
>
> Where? I couldn't find / didn't see it.
>
In MR_do_sleep, case ETIMEDOUT. Perhaps it can't happen?
I need to understand the states better.
I'll make MR_do_sleep test for a work-stealing engine before calling
do_work_steal.
> > Add notes to MR_attempt_steal_spark. Its behaviour is unchanged.
> >
> > Replace a call to MR_destroy_thread by MR_finalize_thread_engine.
> >
> > Delete MR_num_exited_engines. It was unused.
> >
> > runtime/mercury_thread.c:
> > runtime/mercury_thread.h:
> > Delete MR_next_engine_id and MR_next_engine_id_lock.
> > Engine ids can now be recycled so they are not enough.
>
> This didn't parse well.
More explanation?
Delete MR_next_engine_id and MR_next_engine_id_lock. Because
engines can be created and destroyed dynamically, and engine ids
index into fixed-sized arrays, we cannot allocate engine ids by
just incrementing a counter.
> >
> > runtime/mercury_memory_zones.c:
> > Replace MR_num_threads by appropriate counters (I hope).
>
> I never got around to testing and tuning those settings so I wouldn't worry.
> Your changes look reasonable.
Ok.
> > runtime/mercury_memory_handlers.c:
> > runtime/mercury_par_builtin.h:
> > Conform to changes.
> >
> > runtime/mercury_threadscope.c:
> > XXX don't know about this
>
> Leave it, I'm working on some large scale changes for parallel profiling.
Ok.
>
> > diff --git a/runtime/mercury_context.c b/runtime/mercury_context.c
> > index db79ee3..382aa7d 100644
> > --- a/runtime/mercury_context.c
> > +++ b/runtime/mercury_context.c
> > @@ -242,16 +249,15 @@ static MR_Context *free_small_context_list = NULL;
> > #endif
> >
> > #ifdef MR_LL_PARALLEL_CONJ
> > -MR_Integer volatile MR_num_idle_engines = 0;
> > -MR_Unsigned volatile MR_num_exited_engines = 0;
> > +MR_Integer volatile MR_num_idle_ws_engines = 0;
> > static MR_Integer volatile MR_num_outstanding_contexts = 0;
> > -static sem_t shutdown_semaphore;
> > +static sem_t shutdown_ws_semaphore;
> >
> > static MercuryLock MR_par_cond_stats_lock;
> > /*
> > ** The spark deques are kept in engine id order.
> > **
> > -** This array will contain MR_num_threads pointers to deques.
> > +** This array will contain MR_max_engines pointers to deques.
> > */
> > MR_SparkDeque **MR_spark_deques = NULL;
> > #endif
>
> If there arn't (currently) that many engines, then are some slots in the
> array NULL? This comment should say whether or not this is true.
Done.
> > @@ -1794,14 +1845,17 @@ try_wake_engine(MR_EngineId engine_id, int action,
> > */
> > MR_LOCK(&(esync->d.es_wake_lock), "try_wake_engine, wake_lock");
> > if (esync->d.es_state == ENGINE_STATE_SLEEPING) {
> > - MR_atomic_dec_int(&MR_num_idle_engines);
> > + /* XXX not sure about this */
> > + if (engine_id < MR_num_ws_engines) {
> > + MR_atomic_dec_int(&MR_num_idle_ws_engines);
> > #ifdef MR_DEBUG_THREADS
> > if (MR_debug_threads) {
> > - fprintf(stderr, "%ld Decrement MR_num_idle_engines %"
> > + fprintf(stderr, "%ld Decrement MR_num_idle_ws_engines %"
> > MR_INTEGER_LENGTH_MODIFIER "d\n",
> > - MR_SELF_THREAD_ID, MR_num_idle_engines);
> > + MR_SELF_THREAD_ID, MR_num_idle_ws_engines);
> > }
> > #endif
> > + }
>
> I think it's correct to decrement MR_num_idle_ws_engines for when waking
> work stealing engines, not any engine. So this code is correct.
Ok.
> > +static void
> > +MR_shutdown_engine_for_threads(MercuryEngine *eng)
> > +{
> > + #ifndef MR_HIGHLEVEL_CODE
> > + MR_EngineId id = eng->MR_eng_id;
> > +
> > + MR_LOCK(&MR_all_engine_bases_lock, "MR_shutdown_engine_for_threads");
> > +
> > + assert(MR_all_engine_bases[id] == eng);
> > + MR_all_engine_bases[id] = NULL;
> > +
> > + if (MR_highest_engine_id == id) {
> > + int i;
> > + for (i = id - 1; i >= 0; i--) {
> > + if (MR_all_engine_bases[i] != NULL) {
> > + MR_highest_engine_id = (MR_EngineId) i;
> > + break;
> > + }
> > + }
> > + }
> > +
> > + assert(MR_spark_deques[id] == eng->MR_eng_spark_deque);
> > + MR_spark_deques[id] = NULL;
> > +
> > + MR_UNLOCK(&MR_all_engine_bases_lock, "MR_shutdown_engine_for_threads");
> > + #endif
> > +}
> > +#endif /* MR_THREAD_SAFE */
>
> So we leave the engine sleep sync data slot allocated? Is that deliberate?
> Or maybe I've forgotten something since I wrote that code.
engine_sleep_sync_data slots are always initialised. When the engine
shuts down the slot should be left in an acceptable state for the next
engine to occupy that slot - or, that was the intention. I'll check it
and maybe add an assertion.
> > diff --git a/runtime/mercury_wrapper.c b/runtime/mercury_wrapper.c
> > index f9c09ca..82aa750 100644
> > --- a/runtime/mercury_wrapper.c
> > +++ b/runtime/mercury_wrapper.c
> > @@ -323,15 +323,10 @@ static char *MR_mem_usage_report_prefix = NULL;
> >
> > static int MR_num_output_args = 0;
> >
> > -/*
> > -** This is initialized to zero. If it is still zero after configuration of the
> > -** runtime but before threads are started, then we set it to the number of
> > -** processors on the system (if support is available to detect this).
> > -** Otherwise, we fall back to 1.
> > -*/
> > -MR_Unsigned MR_num_threads = 0;
> > +#ifdef MR_LL_PARALLEL_CONJ
> > +MR_Unsigned MR_num_ws_engines = 0;
> > +MR_Unsigned MR_max_engines = 8192;
>
> That's a lot of pthreads. However if you're not using them all then the
> only cost is memory for three arrays that hold null pointers. 3*8*8192 =
> 192KB. It doesn't seem like much but, and I choose these words
> specifically, 1024 pthreads ought to be enough for anybody.
Yeah, I don't know when you would need that many. Out of interest, I
can already comfortably create 8000 engines in asm_fast.par.gc.stseg
(doing practically nothing, of course). top says the process is taking
about 5 GB RES. With --detstack-size 1 --nondetstack-size 1 it's only
474 MB RES.
But I'll drop it down.
> > diff --git a/runtime/mercury_wrapper.h b/runtime/mercury_wrapper.h
> > index 4f7ce48..3a4a50c 100644
> > --- a/runtime/mercury_wrapper.h
> > +++ b/runtime/mercury_wrapper.h
> > @@ -258,7 +258,7 @@ extern MR_Unsigned MR_contexts_per_thread;
> >
> > /*
> > ** The number of outstanding contexts we can create
> > -** (MR_contexts_per_thread * MR_num_threads).
> > +** (MR_contexts_per_thread * MR_num_ws_engines).
> > */
> > extern MR_Unsigned MR_max_outstanding_contexts;
> >
>
> That raises a question. Are contexts created for thread.spawn_native
> counted towards this limit?
Yes, because it is a (crude) attempt to limit memory usage.
Peter
More information about the reviews
mailing list