[m-rev.] for possible post-commit review: draft of most of the grades chapter
Julien Fischer
jfischer at opturion.com
Wed Aug 6 17:41:55 AEST 2025
On Mon, 4 Aug 2025 at 20:40, Zoltan Somogyi <zoltan.somogyi at runbox.com> wrote:
>
> Reviews can probably wait until the chapter is finished.
I had a look anyway ;-)
> diff --git a/doc/user_guide.texi b/doc/user_guide.texi
> index 76166efb0..64bdd9678 100644
> --- a/doc/user_guide.texi
> +++ b/doc/user_guide.texi
> @@ -119,7 +119,7 @@ During that time, we advise you to look at
> @menu
> * Introduction:: General overview.
> * Introduction to mmc:: Introduction to compiling Mercury program
> -* Grades:: Mercury grades
> +* Mercury grades:: Mercury grades
> * Running:: Running Mercury programs
> * Compilation details:: The Mercury compilation process in detail
> * Filenames:: File naming conventions.
> @@ -751,89 +751,262 @@ please see ZZZ @c @ref
>
> @c ----------------------------------------------------------------------------
>
> -@node Grades
> +@node Mercury grades
> @chapter Mercury grades
>
> +@menu
> +* The Mercury backends::
> +* The importance of consistency::
> +* Base grades::
> +* Grade modifiers::
> +@end menu
> +
> +@node The Mercury backends
> +@section The Mercury backends
> +
> +The Mercury compiler was has two backends.
Delete "was".
> +The first backend implemented by the Mercury team
> +translates Mercury programs to
> +what is effectively assembly-level code in C syntax.
> +Most compilers generate either assembly level code
> +(either assembly code itself or binary machine code),
> +so the first part is quite conventional.
> +The second part, generating this assembly level code in C syntax,
> +was rare at the time, though it has become more common since.
> +We chose that approach because it allowed us to take advantage
> +of the huge amount of work put into C compilers,
> +both in terms of making C available on pretty much all commonly used platforms,
> +and generating for them not just code that works, but @emph{fast} code.
> +
> +Like most compilers, the Mercury compiler has representations
> +for both the source code and the target code.
> +In compilers for imperative languages,
> +these are usually called the abstract syntax tree (AST)
> +and the intermediate representation (IR) respectively.
> +The Mercury compiler's versions of the these are substantially different,
Delete "the"
> +so we use different names for them:
> +the high-level data structure (HLDS)
> +and the low-level data structure (LLDS).
> +@c (Revealing a complete lack of originality.)
> +
> +The low-level code generated by the original backend
> +is about as far from the code that a human C programmer would write
> +as it is possible to get.
> +Later on, as part of a research project,
> +we investigated translating Mercury code
> +into @emph{idiomatic} C code,
> +with a view towards making the approach general enough
> +to be able to produce idiomatic code
> +in other imperative programming languages as well.
> +We call the compiler's internal representation of this kind of the code
> +the medium-level data structure (MLDS),
> +since it is clearly lower level than Mercury code,
> +but higher level than assembly code.
> +
> +From the names of these internal representations,
> +Mercury calls the first backend the LLDS backend,
> +and the second backend the MLDS backend.
> +The first generates low (assembly) level C code,
> +while the second generates either high level C code, Java code, or C# code.
> +
> +@node The importance of consistency
> +@section The importance of consistency
> +
> +In general, a Mercury program consists of many modules.
> +These modules must be compiled in a manner
> +that allows the resulting files codes to be linked together.
> +For example if you compile e.g.@:
> +module @var{module_a} to @file{@var{module_a}.java}
> +and module @var{module_b} to @file{@var{module_b}.c},
> +which the Java and C implementations compile further
> +to @file{@var{module_a}.jar} and @file{@var{module_b}.o}.
The nearest analog to an object file for Java would be a .class file rather
than a .jar file.
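Something along these lines, perhaps (a sketch only; the later sentence about
"JAR files and object files" would need a matching change):

    +to @file{@var{module_a}.class} and @file{@var{module_b}.o}.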
> +This is not just because JAR files and object files have different formats;
> +the more fundamental reason is that the codes in them
> +make fundamentally different assumptions
> +about how the different modules of a program
> +are supposed to communicate and cooperate with each other.
> +This is why if you compile
> +both module @var{module_a} and module @var{module_b} to C,
> +but compile module @var{module_a} using the LLDS backend
> +and and@var{module_b} using the MLDS backend,
Double "and" there; also add a space between "and" and @var.
> +any attempt to link @file{@var{module_a}.o} and @file{@var{module_b}.o}
> +will still fail.
> +It will fail because the two backends
> +use different naming schemes for the C code they generate,
> +so that the name of the C function
> +that implements e.g.@: the predicate @var{module_b.q}
> +will be different from the name by which
> +a call from @var{module_a.p} will try to refer to @var{module_b.q}.
> +This difference is deliberate.
> +This is because any attempt to ``cure'' such link failures
> +by having the two backends use the same C naming scheme
> +would address only @emph{incidental} incompatibilities;
> +the @emph{fundamental} incompatibilities,
> +such as the LLDS backend managing its own stacks (two of them)
> +while the MLDS backend relies on the standard C stack,
> +would still remain.
I suggest referring to the stack as a call stack there.
> +The @command{mmc} option @command{--target} controls
I suggest using the Texinfo @option command instead of @command here and
below, where appropriate.
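For example (a sketch only, assuming mmc itself stays as @command and only the
option names change):

    +The @command{mmc} option @option{--target} controls
    +With @option{--target c}, Mercury will generate C code;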
> +which of its target languages the Mercury compiler will generate code for.
> +With @command{--target c}, Mercury will generate C code;
> +with @command{--target java}, Mercury will generate Java code; and
> +with @command{--target c#}, Mercury will generate C# code.
> +When generating C, the option @command{--high-level-code} controls
> +whether code generation will use the LLDS or the MLDS backend,
> +generating assembly code in C syntax or idiomatic C code respectively.
> +(Since you cannot write assembly-level code in either Java or C#,
> +the compiler will always use the MLDS backend
> +when generating Java or C# target code.)
> +What the previous paragraph is saying is that
> +if you compile all the modules of a program,
> +an attempt to link the resulting files together can succeed
> +@emph{only if} all the modules were compiled
> +using the exact same values of
> +@command{--target} and @command{--high-level-code} options.
> +
> +These are not the only options that have this property.
> +There are other options as well,
> +and in fact Mercury has more such options than most other languages.
> +This is why unlike most other languages,
> +Mercury has an explicit name for this concept:
> +it calls each set of compatible values of those options a @emph{grade}.
Use @dfn there instead of @emph.
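That is, something like the following (sketch only):

    +it calls each set of compatible values of those options a @dfn{grade}.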
> +A Mercury grade succinctly specifies the value of roughly twenty options
> +(depending on when you count them; we add new ones from time to time).
> +Each grade consists of a @emph{base grade} (which must be present)
> +and zero or more other @emph{grade modifiers} (which are therefore optional).
> +The names of the grade modifiers all start with period;
> +the names of the base grades do not.
> +The name of a grade is a concatenation of the selected base grade
> +and the selected grade modifiers.
> +For example, @command{hlc.tr.gc} specifies
> +the base grade @command{hlc} (meaning high level C code)
> +and the grade modifiers @command{.tr} and @command{.gc}
> +(which respectively call for the use of trailing
> +and of the Boehm-Demers-Weiser garbage collector).
> +
> +@c XXX Should require the base grade to come first?
Do we not do that?
Julien.