[m-rev.] for review: update compiler_design.html
Fergus Henderson
fjh at cs.mu.OZ.AU
Thu May 8 18:15:36 AEST 2003
Estimated hours taken: 1
Branches: main
compiler/notes/compiler_design.html:
Update to reflect the fact that we are now using sub-modules.
The updated full version is in
<http://www.cs.mu.oz.au/~fjh/tmp/compiler_design.html>;
reading that might be easier than reading the diff.
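
As a rough illustration of the layering principle the updated notes
describe (later passes may depend on earlier ones, but not vice versa),
here is a small Python sketch. It is not part of the compiler or of this
change: the numeric ranks, the placement of hlds and backend_libs, and the
helper names may_depend_on and violations are assumptions made for
illustration only. The make and recompilation packages are left out
because the notes do not place them in the ordering.

```python
# Hypothetical model of the package layering described in the notes.
# Package names come from the notes; the numeric ranks are assumed.
# A higher rank means "later in the pipeline / higher in the graph".
RANK = {
    "libs": 0,              # bottom of the dependency graph
    "parse_tree": 1,
    "hlds": 2,              # assumed position
    "check_hlds": 3,
    "transform_hlds": 4,
    "backend_libs": 5,      # assumed position
    "ll_backend": 6,
    "ml_backend": 6,
    "aditi_backend": 6,
    "bytecode_backend": 6,
    "top_level": 7,         # top of the dependency graph
}


def may_depend_on(importer: str, imported: str) -> bool:
    """Return True if `importer` sits strictly above `imported`."""
    return RANK[importer] > RANK[imported]


def violations(imports):
    """Return the (importer, imported) pairs that break the layering."""
    return [(a, b) for a, b in imports if not may_depend_on(a, b)]


if __name__ == "__main__":
    sample = [
        ("top_level", "ml_backend"),
        ("check_hlds", "parse_tree"),
        ("parse_tree", "check_hlds"),  # earlier pass importing a later one
    ]
    print(violations(sample))
```

Running the script prints [('parse_tree', 'check_hlds')], the one pair in
the sample that points from an earlier pass to a later one. The notes say
the rule is followed only "in general", so a real check would probably
need a list of permitted exceptions.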
Workspace: /home/ceres/fjh/mercury
Index: compiler/notes/compiler_design.html
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/notes/compiler_design.html,v
retrieving revision 1.83
diff -u -d -r1.83 compiler_design.html
--- compiler/notes/compiler_design.html 18 Mar 2003 02:43:47 -0000 1.83
+++ compiler/notes/compiler_design.html 8 May 2003 08:11:25 -0000
@@ -30,20 +30,23 @@
<p>
-The top-level of the compiler is in the file mercury_compile.m.
+The top-level of the compiler is in the file mercury_compile.m,
+which is a sub-module of the top_level.m package.
The basic design is that compilation is broken into the following
stages:
<ul>
-<li> 1. parsing (source files -> HLDS)
-<li> 2. semantic analysis and error checking (HLDS -> annotated HLDS)
-<li> 3. high-level transformations (annotated HLDS -> annotated HLDS)
-<li> 4. code generation (annotated HLDS -> target representation)
+<li> 1. parsing (source files -> HLDS) <br>
+<li> 2. semantic analysis and error checking (HLDS -> annotated HLDS) <br>
+<li> 3. high-level transformations (annotated HLDS -> annotated HLDS) <br>
+<li> 4. code generation (annotated HLDS -> target representation) <br>
<li> 5. low-level optimizations
(target representation -> target representation)
<li> 6. output code (target representation -> target code)
</ul>
+
+
Note that in reality the separation is not quite as simple as that.
Although parsing is listed as step 1 and semantic analysis is listed
as step 2, the last stage of parsing actually includes some semantic checks.
@@ -57,46 +60,87 @@
<p>
In addition, the compiler is actually a multi-targeted compiler
-with several different back-ends. When you take the different
-back-ends into account, the structure looks like this:
+with several different back-ends.
+
+<p>
+
+The modules in the compiler are grouped into "packages".
+A "package" is just a meta-module,
+i.e. a module that contains other modules as sub-modules.
+(The sub-modules are almost always stored in separate files,
+each named after only the final component of its module name.)
+We have a package for the top-level, a package for each main pass, and
+finally there are also some packages for library modules that are used
+by more than one pass.
+<p>
+
+Taking all this into account, the structure looks like this:
<ul type=disc>
-<li> front-end
- <ul type=disc>
- <li> 1. parsing (source files -> HLDS)
- <li> 2. semantic analysis and error checking (HLDS -> annotated HLDS)
- <li> 3. high-level transformations (annotated HLDS -> annotated HLDS)
- </ul>
-<li> back-ends
+<li> At the top of the dependency graph is the top_level.m package,
+ which currently contains only one module, mercury_compile.m,
+ which invokes all the different passes in the compiler.
+<li> The next level down consists of the different passes of the compiler.
+ In general, we try to stick to the principle that later passes can
+ depend on data structures defined in earlier passes, but not vice
+ versa.
<ul type=disc>
- <li> a. LLDS back-end
- <ul type=disc>
- <li> 4a. code generation (annotated HLDS -> LLDS)
- <li> 5a. low-level optimizations (LLDS -> LLDS)
- <li> 6a. output code (LLDS -> C)
- </ul>
- <li> b. MLDS back-end
- <ul type=disc>
- <li> 4b. code generation (annotated HLDS -> MLDS)
- <li> 5b. MLDS transformations (MLDS -> MLDS)
- <li> 6b. output code
- (MLDS -> C or MLDS -> MSIL
- or eventually MLDS -> Java, etc.)
- </ul>
- <li> c. RL back-end
+ <li> front-end
<ul type=disc>
- <li> 4c. code generation (annotated HLDS -> RL)
- <li> 5c. low-level optimizations (RL -> RL)
- <li> 6c. output code (RL -> RL-bytecode)
+ <li> 1. parsing (source files -> HLDS)
+ <br> Package: parse_tree.m
+ <li> 2. semantic analysis and error checking (HLDS -> annotated HLDS)
+ <br> Package: check_hlds.m
+ <li> 3. high-level transformations (annotated HLDS -> annotated HLDS)
+ <br> Package: transform_hlds.m
</ul>
- <li> d. bytecode back-end
+ <li> back-ends
<ul type=disc>
- <li> 4d. code generation (annotated HLDS -> bytecode)
+ <li> a. LLDS back-end
+ <br> Package: ll_backend.m
+ <ul type=disc>
+ <li> 4a. code generation (annotated HLDS -> LLDS)
+ <li> 5a. low-level optimizations (LLDS -> LLDS)
+ <li> 6a. output code (LLDS -> C)
+ </ul>
+ <li> b. MLDS back-end
+ <br> Package: ml_backend.m
+ <ul type=disc>
+ <li> 4b. code generation (annotated HLDS -> MLDS)
+ <li> 5b. MLDS transformations (MLDS -> MLDS)
+ <li> 6b. output code
+ (MLDS -> C or MLDS -> MSIL or MLDS -> Java, etc.)
+ </ul>
+ <li> c. RL back-end
+ <br> Package: aditi_backend.m
+ <ul type=disc>
+ <li> 4c. code generation (annotated HLDS -> RL)
+ <li> 5c. low-level optimizations (RL -> RL)
+ <li> 6c. output code (RL -> RL-bytecode)
+ </ul>
+ <li> d. bytecode back-end
+ <br> Package: bytecode_backend.m
+ <ul type=disc>
+ <li> 4d. code generation (annotated HLDS -> bytecode)
+ </ul>
+ <li> There's also a package backend_libs.m, which contains
+ modules that are shared between several different back-ends.
</ul>
</ul>
+<li> Finally, at the bottom of the dependency graph there is the package
+ libs.m, which contains the option handling code and also library
+ modules which are not sufficiently general or sufficiently useful to
+ go in the Mercury standard library.
</ul>
<p>
+
+In addition to the packages mentioned above, there are also packages
+for the build system: make.m contains the support for the `--make' option,
+and recompilation.m contains the support for the `--smart-recompilation'
+option.
+
+<p>
<hr>
<!---------------------------------------------------------------------------->
@@ -120,6 +164,10 @@
<p>
+Option handling is part of the libs.m package.
+
+<p>
+
The command-line options are defined in the module options.m.
mercury_compile.m calls library/getopt.m, passing the predicates
defined in options.m as arguments, to parse them. It then invokes
@@ -132,6 +180,9 @@
<p>
+Support for `--make' is in the make.m package,
+which contains the following modules:
+
<dl>
<dt> make.m
@@ -183,6 +234,9 @@
<h3> FRONT END </h3>
<h4> 1. Parsing </h4>
+Parsing is in the parse_tree.m package,
+which contains the modules listed below.
+
<p>
<ul>
@@ -299,7 +353,10 @@
<p>
The result at this stage is the High Level Data Structure,
-which is defined in four files:
+which is defined in the hlds.m package.
+
+<p>
+The HLDS data structure itself is spread over four modules:
<ol>
<li> hlds_data.m defines the parts of the HLDS concerned with
@@ -320,6 +377,8 @@
<h4> 2. Semantic analysis and error checking </h4>
+This is the check_hlds.m package.
+
<p>
Any pass which can report errors or warnings must be part of this stage,
@@ -546,6 +605,8 @@
<h4> 3. High-level transformations </h4>
+This is the transform_hlds.m package.
+
<p>
The first pass of this stage does tabling transformations (table_gen.m).
@@ -700,6 +761,8 @@
<h3> a. LLDS BACK-END </h3>
+This is the ll_backend.m package.
+
<h4> 4a. Code generation. </h4>
<p>
@@ -860,6 +923,7 @@
<dd> This could also be considered a part of code generation,
but for the LLDS back-end this is currently done as part
of the output phase (see below).
+
</dl>
<p>
@@ -896,13 +960,21 @@
<li> removal of redundant assignments, i.e. assignments that assign a value
that the target location already holds (reassign.m) <br>
+
</ul>
<p>
-Several of these optimizations (frameopt and use_local_vars)
+The module opt_debug.m contains utility routines used for debugging
+these LLDS-to-LLDS optimizations.
+
+<p>
+
+Several of these optimizations (frameopt and use_local_vars) also
use livemap.m, a module that finds the set of locations live at each label.
+<p>
+
Use_local_vars numbering also introduces
references to temporary variables in extended basic blocks
in the LLDS representation of the C code.
@@ -962,15 +1034,22 @@
output of RTTI structures to rtti_out.m.
</ul>
+
<p>
<hr>
<!---------------------------------------------------------------------------->
<h3> b. MLDS BACK-END </h3>
+<p>
+
+This is the ml_backend.m package.
+
+<p>
+
The original LLDS code generator generates very low-level code,
since the LLDS was designed to map easily to RISC architectures.
-We're currently developing a new back-end that generates much higher-level
+We have developed a new back-end that generates much higher-level
code, suitable for generating Java, high-level C, etc.
This back-end uses the Medium Level Data Structure (mlds.m) as its
intermediate representation.
@@ -1017,7 +1096,7 @@
<li> ml_tag_switch.m
<li> switch_util.m (also used by MLDS back-end)
</ul>
- <dl>
+ </dl>
The module ml_code_util.m provides utility routines for
MLDS code generation. The module ml_util.m provides some
general utility routines for the MLDS.
@@ -1040,14 +1119,29 @@
<h4> 6b. MLDS output </h4>
-There are currently two backends that generate code from MLDS, one
-generates C/C++ code, the other generates Microsoft's Intermediate
-Language (MSIL or IL).
+There are currently four backends that generate code from MLDS:
+one generates C/C++ code,
+one generates assembler (by interfacing with the GCC back-end),
+one generates Microsoft's Intermediate Language (MSIL or IL),
+and one generates Java.
+
<p>
+
<ul>
<li>mlds_to_c.m converts MLDS to C/C++ code.
</ul>
+
+<p>
+
+The MLDS->asm backend is logically part of the MLDS back-ends,
+but it is in a module of its own (mlds_to_gcc.m), rather than being
+part of the ml_backend package, so that we can distribute a version
+of the Mercury compiler which does not include it. There is a wrapper
+module called maybe_mlds_to_gcc.m which is generated at configuration time
+so that mlds_to_gcc.m will be linked in iff the GCC back-end is available.
+
<p>
+
The MLDS->IL backend is broken into several submodules.
<ul>
<li> mlds_to_ilasm.m converts MLDS to IL assembler and writes it to a .il file.
@@ -1058,12 +1152,26 @@
</ul>
After IL assembler has been emitted, ILASM in invoked to turn the .il
file into a .dll or .exe.
+
<p>
+
+The MLDS->Java backend is broken into two submodules.
+<ul>
+<li> mlds_to_java.m converts MLDS to Java and writes it to a .java file.
+<li> java_util.m contains some utility routines.
+</ul>
+After the Java code has been emitted, a Java compiler (normally javac)
+is invoked to turn the .java file into a .class file containing Java bytecodes.
+
<hr>
<!---------------------------------------------------------------------------->
<h3> c. Aditi-RL BACK-END </h3>
+<p>
+
+This is the aditi_backend.m package.
+
<h4> 4c. Aditi-RL generation </h4>
<ul>
@@ -1133,6 +1241,8 @@
<h3> d. BYTECODE BACK-END </h3>
+This is the bytecode_backend.m package.
+
<p>
The Mercury compiler can translate Mercury programs into bytecode for
@@ -1158,6 +1268,8 @@
<h3> SMART RECOMPILATION </h3>
+This is the recompilation.m package.
+
<p>
The Mercury compiler can record program dependency information
@@ -1187,6 +1299,22 @@
<h3> MISCELLANEOUS </h3>
+The following module is part of the transform_hlds.m package.
+
+ <dl>
+ <dt> dependency_graph.m:
+ <dd>
+ This contains predicates to compute the call graph for a
+ module, and to print it out to a file.
+ (The call graph file is used by the profiler.)
+ The call graph may eventually also be used by det_analysis.m,
+ inlining.m, and other parts of the compiler which could benefit
+ from traversing the predicates in a module in a bottom-up or
+ top-down fashion with respect to the call graph.
+ </dl>
+
+The following modules are part of the backend_libs.m package.
+
<dl>
<dt> builtin_ops:
<dd>
@@ -1212,29 +1340,18 @@
each type: unify/2, compare/3, and index/1 (used in the
implementation of compare/3).
- <dt> dependency_graph.m:
- <dd>
- This contains predicates to compute the call graph for a
- module, and to print it out to a file.
- (The call graph file is used by the profiler.)
- The call graph may eventually also be used by det_analysis.m,
- inlining.m, and other parts of the compiler which could benefit
- from traversing the predicates in a module in a bottom-up or
- top-down fashion with respect to the call graph.
-
<dt> passes_aux.m
<dd>
Contains code to write progress messages, and higher-order
code to traverse all the predicates defined in the current
module and do something with each one.
- <dt> opt_debug.m:
- <dd>
- Utility routines for debugging the LLDS-to-LLDS optimizations.
-
<dt> error_util.m:
<dd>
Utility routines for printing nicely formatted error messages.
+ </dl>
+
+The following modules are part of the libs.m package.
+
+ <dl>
<dt> process_util.m:
<dd>
@@ -1246,7 +1363,6 @@
Contains an ADT representing timestamps used by smart
recompilation and `mmc --make'.
</dl>
-
<p>
<hr>
--
Fergus Henderson <fjh at cs.mu.oz.au> | "I have always known that the pursuit
The University of Melbourne | of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh> | -- the last words of T. S. Garp.