[m-users.] uncaught Mercury exception using mdprof_create_feedback

Matthias Guedemann matthias.guedemann at googlemail.com
Sat Oct 11 19:18:40 AEDT 2014


Hi Paul,

>     + The known best sequential performance, this is usually
>       asm_fast.gc.
>     + The parallel performance, asm_fast.gc.par.stseg with
>       multiple threads,
>     + And the performance of enabling parallelism, but not using
>       it, asm_fast.gc.par.stseg with one thread.  This gives us an
>       idea of the costs/overheads of parallelism.

OK, on an i5 dual core with HT:

    asm_fast.gc                    is around 1:30m
    asm_fast.par.gc.stseg with -P1 is around 1:55m (104% CPU)
    asm_fast.par.gc.stseg with -P2 is around 1:17m (160% CPU)
    asm_fast.par.gc.stseg with -P3 is around 1:00m (292% CPU)

So, within the asm_fast.par.gc.stseg grade, I get almost 2x speedup
(1:55m with -P1 down to 1:00m with -P3), but on a single thread the
asm_fast.gc grade is faster (so no surprises here). My guess is that on
a quad core, I'd get almost 3x speedup; I'll see if I can verify this.
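
As a rough check on the arithmetic, here is a small Python sketch (not
part of the measurements; the helper names to_seconds and speedup are
made up for illustration) that converts the timings above to seconds and
computes the ratios:

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '1:55' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(baseline: int, measured: int) -> float:
    """How many times faster 'measured' is than 'baseline'."""
    return baseline / measured


# Wall-clock timings as reported above (approximate values).
timings = {
    "asm_fast.gc": to_seconds("1:30"),
    "asm_fast.par.gc.stseg -P1": to_seconds("1:55"),
    "asm_fast.par.gc.stseg -P2": to_seconds("1:17"),
    "asm_fast.par.gc.stseg -P3": to_seconds("1:00"),
}

sequential = timings["asm_fast.gc"]
one_thread = timings["asm_fast.par.gc.stseg -P1"]

# Cost of enabling parallelism without using it, relative to asm_fast.gc.
overhead = one_thread / sequential - 1

for name, secs in timings.items():
    print(
        f"{name:28s} {secs:4d}s  "
        f"vs -P1: {speedup(one_thread, secs):.2f}x  "
        f"vs asm_fast.gc: {speedup(sequential, secs):.2f}x"
    )
print(f"overhead of the parallel grade on one thread: {overhead:.1%}")
```

With these numbers, -P3 comes out about 1.92x faster than -P1 but only
1.5x faster than asm_fast.gc, and the parallel grade costs roughly 28%
when run on one thread.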

You're right of course: as with most benchmarks, the interpretation of
the results depends on what one tries to achieve. My principal interest
is learning more about efficient, declarative programming. Using
parallelism and different grades comes after profiling and algorithmic
optimization. I very much like the idea of automatic introduction of
parallelism as a kind of last optimization step.

Best regards,
Matthias


