[m-rev.] for post-commit review: Fix compatibility issues for the low-level C parallel grades.

Julien Fischer juliensf at csse.unimelb.edu.au
Sun Mar 21 00:50:50 AEDT 2010


Hi Paul,

The diff below does not correspond to the log message -- it appears to be
the diff for the change you made that documents the .par grades.

Julien.


On Sat, 20 Mar 2010, Paul Bone wrote:

>
> For post commit review by anyone.  I'm committing this to the main branch now.
> I'll let our release manager (Julien) review it or wait for it to be reviewed
> before pushing it onto the 10.04 branch.
>
> Thanks.
>
>
> Branches: main, 10.04
>
> Fix a number of errors and warnings in the runtime picked up by GCC 4.x in
> parallel and threadscope grades.
>
> We had been using types with the wrong signedness while calling atomic operations.
> GCC 4.x also picked up an error where #elif was used instead of #else.
>
> While testing these changes on a 32bit system, more bugs were found on the i386
> architecture and on AMD brand processors.
>
> runtime/mercury_atomic_ops.h:
> runtime/mercury_atomic_ops.c:
>     Add unsigned variants of the following atomic operations:
>         increment,
>         add,
>         add_and_fetch,
>         dec_and_is_zero,
>
>     Add a signed variant for compare and swap.
>
>     Rename the MR_atomic_dec_<type>_and_is_zero operation to move the type to
>     the end of the name.
>
>     Use volatile storage in the MR_Stats structure.
>
>     A 32bit machine cannot do atomic operations on 64bit values and MR_Stats
>     must use 64bit values.  Therefore 64bit values in the MR_Stats structure
>     are now protected by a lock on 32bit machines.
>
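
To make the unsigned dec_and_is_zero entry above concrete, here is a minimal
sketch.  It assumes C11 <stdatomic.h> instead of the runtime's own atomic
operations, and every name in it (sketch_atomic_dec_and_is_zero_uint,
SketchSyncTerm and so on) is hypothetical rather than taken from the change.
It only illustrates why an unsigned counter suits a count of outstanding
conjuncts: the decrement that takes it from 1 to 0 is the one that reports
completion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
** Hypothetical stand-in for an unsigned "decrement and test for zero"
** operation.  atomic_fetch_sub() returns the value held before the
** subtraction, so the counter reached zero exactly when that value was 1.
*/
static bool
sketch_atomic_dec_and_is_zero_uint(atomic_uint *addr)
{
    return atomic_fetch_sub(addr, 1u) == 1u;
}

/*
** Hypothetical synchronisation term: an unsigned count of the conjuncts
** that have not finished yet.
*/
typedef struct {
    atomic_uint remaining;
} SketchSyncTerm;

static void
sketch_init(SketchSyncTerm *st, unsigned num_conjuncts)
{
    atomic_init(&st->remaining, num_conjuncts);
}

/* Returns true only for the conjunct that finishes last. */
static bool
sketch_conjunct_done(SketchSyncTerm *st)
{
    return sketch_atomic_dec_and_is_zero_uint(&st->remaining);
}
```

A caller would initialise the counter with the number of conjuncts and treat
a true result from sketch_conjunct_done() as the signal that the whole
conjunction has finished.
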
> runtime/mercury_atomic_ops.h:
>     Fix a typo in the i386 version of MR_atomic_dec_and_is_zero_uint().
>
> runtime/mercury_atomic_ops.c:
>     AMD CPUs do not conform to Intel's specification for being able to
>     extract the CPU clock speed from the brand string.  When we cannot
>     determine the CPU's clock speed then we write out threadscope
>     timestamps in raw clock cycles rather than nanoseconds.
>
>     On i386 machines the ebx register is used to implement PIC code;
>     however, the CPUID instruction uses it to output information.  Save
>     this register on C's stack while we issue CPUID and retrieve the
>     result in ebx.
>
>     We now pass native machine sized values to the inline assembler code
>     that implements RDTSC and RDTSCP.
>
>     Fix commenting style in some places.
>
> runtime/mercury_atomic_ops.c:
>     Fix some incorrect C preprocessor code for conditional compilation.
>
> runtime/mercury_grade.h:
>     Increment binary compatibility number.  This should have been done in a
>     prior change when the MR_runnext macro changed which broke binary
>     compatibility in the parallel low-level C grades.
>
> runtime/mercury_context.h:
>     In MR_SyncTerm_Struct use an unsigned value for the number of conjuncts
>     remaining before the conjunction is complete.
>
> runtime/mercury_threadscope.c:
>     Record raw cpu clock ticks rather than milliseconds when we don't
>     know the processor's clock speed.
>
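
The fallback described in the mercury_atomic_ops.c and mercury_threadscope.c
entries above can be sketched as follows.  This is an illustration only: the
type and function names are hypothetical, a clock speed of 0 is assumed to
mean "unknown", and nanoseconds are assumed as the converted unit.

```c
#include <stdint.h>

/* Hypothetical timestamp that records which unit its value is in. */
typedef struct {
    uint64_t    value;
    int         is_nanoseconds;     /* 0 means raw clock cycles */
} SketchTimestamp;

/*
** Convert a cycle count to nanoseconds when the clock speed is known,
** otherwise pass the raw cycle count through unchanged.
*/
static SketchTimestamp
sketch_make_timestamp(uint64_t cycles, uint64_t clock_hz)
{
    SketchTimestamp ts;

    if (clock_hz == 0) {
        /* Clock speed unknown: keep raw cycles. */
        ts.value = cycles;
        ts.is_nanoseconds = 0;
    } else {
        /*
        ** Split into whole seconds and a remainder so that the
        ** multiplication by 10^9 does not overflow for realistic
        ** clock speeds (below roughly 18GHz).
        */
        uint64_t secs = cycles / clock_hz;
        uint64_t rem = cycles % clock_hz;

        ts.value = secs * 1000000000ULL
            + (rem * 1000000000ULL) / clock_hz;
        ts.is_nanoseconds = 1;
    }
    return ts;
}
```

A reader of the resulting log would need the unit flag (or an equivalent
convention) to know whether a given timestamp is in cycles or in time units.
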
> runtime/mercury_context.c:
> runtime/mercury_wsdeque.h:
> runtime/mercury_wsdeque.c:
>     Conform to changes in mercury_atomic_ops.h
>
> Index: compiler/options.m
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/compiler/options.m,v
> retrieving revision 1.668
> diff -u -p -b -r1.668 options.m
> --- compiler/options.m	11 Feb 2010 04:36:09 -0000	1.668
> +++ compiler/options.m	23 Feb 2010 03:10:27 -0000
> @@ -4172,7 +4172,12 @@ options_help_compilation_model -->
>          "\tEnable experimental complexity analysis for the predicates",
>          "\tlisted in the given file.",
>          "\tThis option is supported for the C back-end, with",
> -        "\t--no-highlevel-code."
> +        "\t--no-highlevel-code.",
> +
> +        "--threadscope\t\t(grade modifier: `.threadscope')",
> +        "\tEnable support for profiling parallel execution.",
> +        "\tThis option is supported in the low-level C back-end parallel",
> +        "\tgrades on x86 and x86_64 processors."
>      ]),
>
>      io.write_string("      Miscellaneous optional features\n"),
> @@ -4206,6 +4211,9 @@ options_help_compilation_model -->
>          "\tAs above, but use a dynamically sized trail that is composed",
>          "\tof small segments.  This can help to avoid trail exhaustion",
>          "\tat the cost of increased execution time.",
> +        "--parallel\t\t(grade modifier: `.par')",
> +        "\tEnable parallel execution support.",
> +        "\tThis option is only supported for the C back-ends.",
>          "--maybe-thread-safe {yes, no}",
>          "\tSpecify how to treat the `maybe_thread_safe' foreign code",
>          "\tattribute.  `yes' means that a foreign procedure with the",
> Index: doc/reference_manual.texi
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/doc/reference_manual.texi,v
> retrieving revision 1.438
> diff -u -p -b -r1.438 reference_manual.texi
> --- doc/reference_manual.texi	14 Jan 2010 02:27:58 -0000	1.438
> +++ doc/reference_manual.texi	23 Feb 2010 04:42:22 -0000
> @@ -672,6 +672,17 @@ This is an abbreviation for @samp{not (s
>  A conjunction.
>  @var{Goal1} and @var{Goal2} must be valid goals.
>
> +@item @code{@var{Goal1} & @var{Goal2}}
> +A parallel conjunction.
> +This has the same declarative semantics as the normal conjunction.
> +Operationally, implementations may execute @var{Goal1} & @var{Goal2}
> +in parallel with one-another.
> +Implementations may also start the parallel execution of these goals
> +in any order.
> +It is a compilation error for @var{Goal1} or @var{Goal2} to have a
> +determinism other than @samp{det} or @samp{cc_multi}.
> +@xref{Determinism categories}.
> +
>  @item @code{@var{Goal1} ; @var{Goal2}}
>  where @var{Goal1} is not of the form @samp{Goal1a -> Goal1b}:
>  a disjunction.
> Index: doc/user_guide.texi
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/doc/user_guide.texi,v
> retrieving revision 1.603
> diff -u -p -b -r1.603 user_guide.texi
> --- doc/user_guide.texi	4 Feb 2010 02:20:46 -0000	1.603
> +++ doc/user_guide.texi	23 Feb 2010 04:40:20 -0000
> @@ -5588,6 +5588,8 @@ then a progress message will be displaye
>                                      program with mprof.
>  * Using mdprof::                    How to analyze the time and/or memory
>                                      performance of a program with mdprof.
> +* Using threadscope::               How to analyse the parallel
> +                                    execution of a program with threadscope.
>  * Profiling and shared libraries::  Profiling dynamically linked executables.
>  @end menu
>
> @@ -5597,6 +5599,7 @@ then a progress message will be displaye
>  @cindex Measuring performance
>  @cindex Optimization
>  @cindex Efficiency
> +@cindex Parallel performance
>
>  To obtain the best trade-off between productivity and efficiency,
>  programmers should not spend too much time optimizing their code
> @@ -5616,19 +5619,34 @@ that associates a lot more context with
>  but not both at the same time;
>  @samp{mdprof} can profile both time and space at the same time.
>
> +The parallel execution of Mercury programs can be analyzed with a third
> +profiler called @samp{threadscope}.
> +@samp{threadscope} allows programmers to visualise CPU utilization,
> +as well as garbage collection, task granularity and the management of
> +parallel tasks.
> +The @samp{threadscope} tool is not included with the Melbourne Mercury
> +Compiler.
> +See @url{http://research.microsoft.com/en-us/projects/threadscope/,
> +Threadscope: Performance Tuning Parallel Haskell Programs}.
> +
>  @node Building profiled applications
>  @section Building profiled applications
>  @cindex Building profiled applications
>  @pindex mprof
>  @pindex mdprof
> +@pindex threadscope
>  @cindex Time profiling
>  @cindex Heap profiling
>  @cindex Memory profiling
>  @cindex Allocation profiling
>  @cindex Deep profiling
> +@cindex Threadscope profiling
> +@cindex Parallel runtime profiling
> +@findex --parallel
> +@findex --threadscope
>
>  To enable profiling, your program must be built with profiling enabled.
> -The two different profilers require different support,
> +The three different profilers require different support,
>  and thus you must choose which one to enable when you build your program.
>
>  @itemize @bullet
> @@ -5644,6 +5662,10 @@ pass the @samp{--memory-profiling} optio
>  To build your program with deep profiling enabled (for @samp{mdprof}),
>  pass the @samp{--deep-profiling} option to @samp{mmc},
>  @samp{mgnuc} and @samp{ml}.
> +@item
> +To build your program with threadscope profiling enabled (for @samp{threadscope}),
> +pass the @samp{--parallel --threadscope} options to @samp{mmc},
> +@samp{mgnuc} and @samp{ml}.
>  @end itemize
>
>  If you are using Mmake,
> @@ -5653,7 +5675,7 @@ e.g.@: by adding the line @samp{GRADEFLA
>  (For more information about the different grades,
>  see @ref{Compilation model options}.)
>
> -Enabling profiling has several effects.
> +Enabling @samp{mprof} or @samp{mdprof} profiling has several effects.
>  First, it causes the compiler to generate slightly modified code,
>  which counts the number of times each predicate or function is called,
>  and for every call, records the caller and callee.
> @@ -5667,6 +5689,13 @@ Third, if you enable graph profiling,
>  the compiler will generate for each source file
>  the static call graph for that file in @samp{@var{module}.prof}.
>
> +Enabling @samp{threadscope} profiling causes the compiler to build the project
> +against a different runtime system.
> +This runtime system logs events relevant to parallel execution.
> +@samp{threadscope} support uses special x86 and x86_64 instructions to access the
> +processor's time stamp counter.
> +Therefore it is not supported on other architectures.
> +
>  @node Creating profiles
>  @section Creating profiles
>  @cindex Profiling
> @@ -5701,6 +5730,10 @@ will use two of those files (@file{Prof.
>  and a two others: @file{Prof.MemoryWords} and @file{Prof.MemoryCells}.
>  Executables compiled with @samp{--deep-profiling}
>  save profiling data in a single file, @file{Deep.data}.
> +Executables compiled with @samp{--parallel --threadscope}
> +save profiling data in a single file with the same name as the program being
> +profiled and the extension @samp{.eventlog}, for example
> +@file{my_program.eventlog}.
>
>  It is also possible to combine @samp{mprof} profiling results
>  from multiple runs of your program.
> @@ -5715,7 +5748,7 @@ when running your program with @samp{mpr
>  If this happens, just run it again --- the problem occurs only very rarely.
>  The same vulnerability does not occur with @samp{mdprof} profiling.
>
> -With both profilers,
> +With the @samp{mprof} and @samp{mdprof} profilers,
>  you can control whether time profiling measures
>  real (elapsed) time, user time plus system time, or user time only,
>  by including the options @samp{-Tr}, @samp{-Tp}, or @samp{-Tv} respectively
> @@ -6092,6 +6125,36 @@ all		map
>  internal	set
>  @end example
>
> +@node Using threadscope
> +@section Using threadscope
> +
> +@pindex threadscope
> +@cindex Threadscope profiling
> +@cindex Parallel execution profiling
> +
> +The @samp{threadscope} tools are not distributed with Mercury.
> +The tools are written in Haskell and work with GHC 6.10.
> +@samp{threadscope} has a number of dependencies in the form of Haskell
> +libraries; many of these will be provided with GHC or packaged for/by
> +your operating system.
> +These are: @samp{array}, @samp{binary}, @samp{cairo},
> +@samp{containers}, @samp{filepath}, @samp{ghc-events}, @samp{glade},
> +@samp{gtk}, @samp{mtl}.
> +The @samp{cairo}, @samp{gtk} and @samp{glade} modules are provided by
> +the @samp{gtk2hs} package.
> +@samp{ghc-events} is not packaged by most operating systems at this stage.  It
> +can be retrieved from
> +@url{http://hackage.haskell.org/package/ghc-events, hackage}.
> +threadscope itself can also be retrieved from
> +@url{http://hackage.haskell.org/package/threadscope, hackage}.
> +Information about how to install Haskell packages can be found
> +@url{http://haskell.org/haskellwiki/Cabal/How_to_install_a_Cabal_package, here}.
> +
> +Once @samp{threadscope} is installed it can be used to view @file{*.eventlog}
> +profiles either by using the menu in @samp{threadscope}'s
> +user interface,
> +or by executing @samp{threadscope} and giving the filename on the command line.
> +
>  @node Profiling and shared libraries
>  @section Profiling and shared libraries
>  @pindex mprof
> @@ -7314,7 +7377,7 @@ The set of aspects and their alternative
>  @cindex .decldebug (grade modifier)
>  @c @cindex .ssdebug (grade modifier)
>  @cindex .par (grade modifier)
> -@c @cindex .threadscope (grade modifier)
> +@cindex .threadscope (grade modifier)
>  @cindex prof (grade modifier)
>  @cindex memprof (grade modifier)
>  @cindex profdeep (grade modifier)
> @@ -7327,7 +7390,7 @@ The set of aspects and their alternative
>  @cindex decldebug (grade modifier)
>  @c @cindex ssdebug (grade modifier)
>  @cindex par (grade modifier)
> -@c @cindex threadscope (grade modifier)
> +@cindex threadscope (grade modifier)
>  @table @asis
>  @item What target language to use, what data representation to use, and (for C) what combination of GNU C extensions to use:
>  @samp{none}, @samp{reg}, @samp{jump}, @samp{asm_jump},
> @@ -7360,10 +7423,10 @@ small segments: @samp{stseg} (the defaul
>  @item Whether to use a thread-safe version of the runtime environment:
>  @samp{par} (the default is a non-thread-safe environment).
>
> -@c @item Whether to include support for profile the execution of parallel
> -@c programs:
> -@c @samp{threadscope} (the default is no support for profiling parallel
> -@c execution).
> +@item Whether to include support for profiling the execution of parallel
> +programs:
> +@samp{threadscope} (the default is no support for profiling parallel
> +execution).
>  @c See also the @samp{--profile-parallel-execution} runtime option.
>
>  @end table
> @@ -7497,6 +7560,12 @@ and grade modifier; they are followed by
>  @c @item @samp{.ssdebug}
>  @c @code{--ss-debug}.
>
> +@item @samp{.par}
> +@code{--parallel}.
> +
> +@item @samp{.par.threadscope}
> +@code{--parallel --threadscope}.
> +
>  @end table
>
>  @end table
> @@ -7858,6 +7927,30 @@ or for backtrackable destructive update.
>  This option is only supported by the C back-ends.
>
>  @sp 1
> +@item @code{--parallel}
> +@findex --parallel
> +@cindex Parallel evaluation
> +Enable support for parallel evaluation.
> +This enables runtime and code generation options necessary for taking
> +advantage of a shared memory parallel computer.
> +Parallel evaluation can be achieved by using either the parallel conjunction
> +operator or the concurrency support provided in the @samp{thread} module of the
> +standard library.
> +@xref{Goals, parallel conjunction, Goals, mercury_ref, The Mercury
> +Language Reference Manual}, and
> +@xref{thread, the thread module, thread, mercury_library, The Mercury
> +Library Reference Manual}.
> +This option is only supported by the C back-ends.
> +
> +@sp 1
> +@item @code{--threadscope}
> +@findex --threadscope
> +@cindex Threadscope profiling
> +Enable support for threadscope profiling.
> +This enables runtime support for profiling the parallel evaluation of
> +programs.  @xref{Using threadscope}.
> +
> +@sp 1
>  @item @code{--maybe-thread-safe @{yes, no@}}
>  @findex --maybe-thread-safe
>  Specify how to treat the @samp{maybe_thread_safe} foreign code
>