[m-dev.] for review: first cut at MLDS
Fergus Henderson
fjh at cs.mu.OZ.AU
Thu Jul 1 16:41:14 AEST 1999
Hi all,
Tyson and I are going to work on a new back-end for the Mercury compiler.
The new back-end will generate higher-level code than the current back-end,
and so it will go through a new intermediate data structure called the "MLDS"
(medium level data structure).
This is nowhere near being ready to commit yet.
At this stage it is closer to a back-of-the-envelope design sketch.
But I think it is sufficiently fleshed out for me to get some useful
feedback, and I suspect Tyson and Zoltan may well want to comment on
this.
If this looks OK, then my next step will be to fill in all the details
so that I can get it to compile, and then modify mercury_to_c.m to
generate MLDS instead of C. This would be capable of handling variables
of type `int', and det and semidet code only.
Open issues:
- How to handle polymorphism -- for the high-level C back-end,
we should handle it exactly as we do in the current LLDS-based
backend, but for the JVM and similar back-ends we would need
to do something different.
- How to handle discriminated union types. Again, for the MLDS->C
back-end we could use the same data representation as our current
LLDS->C compiler uses, but for JVM and similar back-ends we
probably want to use inheritence.
%-----------------------------------------------------------------------------%
% Copyright (C) 1999 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%
% MLDS - The Medium-Level Data Structure.
% Main author: fjh.
% This module defines the MLDS data structure itself.
% The MLDS is an intermediate data structure used in compilation;
% we compile Mercury source -> parse tree -> HLDS -> MLDS -> target (e.g. C).
% See notes/compiler_design.html for more information about the MLDS & LLDS.
% [XXX Need to document MLDS in notes/compiler_design.html.]
%
% The MLDS is intended to be suitable for generating target code in
% languages such as Java, Java bytecode, high-level C, C++, or C--, etc.
% This is as opposed to the LLDS, which is better suited for generating
% assembler or the very low-level C or GNU C (lower level than C--) that
% the original Mercury compiler backend generates. In the LLDS, we are
% doing all our own register allocation and stack manipulation, but with
% MLDS, those tasks are the responsibility of the target.
% The one really important simplification that we do make relative to the
% HLDS is that MLDS does not have any support for non-determinism, so the
% HLDS->MLDS compiler must compile non-deterministic code into code that
% uses continuation passing.
% The MLDS data structure is quite full-featured, including for example
% support for multiple return values and tagged pointers. However, many
% of the intended target languages don't support all of those features.
% Therefore the HLDS->MLDS compiler must ensure that the final MLDS code that
% it eventually generates does not use features which the target does not
% support. This will (presumably) be accomplished by having handle_options.m
% set various flags according to the selected target, and having the
% HLDS->MLDS compiler take account of those flags and either generate simpler
% MLDS code in the first place or run some extra simplification passes over
% the MLDS code before invoking the MLDS->target compiler.
%
%-----------------------------------------------------------------------------%
:- module mlds.
:- interface.
:- import_module hlds_pred, prog_data.
:- import_module llds.
:- import_module bool, list, std_util.
%-----------------------------------------------------------------------------%
%
% The type `mlds' is the actual MLDS.
%
:- type mlds
---> mlds(
module_name, % The Mercury module name
mlds__foreign_code, % Code defined in some other language,
% e.g. for `pragma c_header_code', etc.
% The MLDS code itself
mlds__imports, % Packages/modules/classes to import
mlds__defns %
).
:- type mlds__imports == list(mlds__import).
:- type mlds__import.
:- type mlds__defns == list(mlds__defn).
:- type mlds__defn
---> mlds__defn(
mlds__entity_name, % the name of the entity being declared
mlds__context, % the source location
mlds__decl_flags, % these contain the following:
% mlds__access, % public/private/protected
% mlds__member_type, % static/per_instance
% mlds__virtuality, % virtual/non_virtual
% mlds__finality, % final/overridable (funcs only)
% mlds__constness, % const/modifiable (data only)
% mlds__is_abstract, % abstract/concrete
% etc.
mlds__entity_defn % the definition of the entity
).
:- type mlds__entity_name.
% This specifies information about some entity being defined
% The entity may be any of the following:
% constant or variable
% function
% class, including
% package (class with only static members)
% interface (abstract class, no data members)
% struct (value class)
% enum
:- type mlds__entity_defn
% constants or variables
---> mlds__data(
mlds__type,
maybe(mlds__initializer)
)
% functions
; mlds__function(
maybe(pred_proc_id), % identifies the original
% Mercury procedure, if any
mlds__func_signature, % the argument & return types
maybe(mlds__statement) % the function body, or `no'
% if the function is abstract
)
% packages, classes, interfaces, structs, enums
; mlds__class(
mlds__class
).
:- type mlds__type.
:- type mlds__initializer == list(mlds__rval).
:- type mlds__func_signature
---> mlds__func_signature(
mlds__type, % return type
list(mlds__type) % argument types
).
:- type mlds__class_kind
---> mlds__class % a generic class:
% can inherit other classes and
% interfaces
% (but most targets will only support
% single inheritence, so usually there
% will be at most one class)
; mlds__package % class with only static members
% (can only inherit other packages)
; mlds__interface % class with no variable data members
% (can only inherit other interfaces)
; mlds__struct % value class
% (can only inherit other structs)
; mlds__enum % class with one integer member and
% a bunch of static consts
% (cannot inherit anything)
.
:- type mlds__class
---> mlds__class(
mlds__class_kind,
mlds__imports, % imports these classes (or
% modules, packages, ...)
list(mlds__class_id), % inherits these base classes
list(mlds__interface_id), % implements these interfaces
mlds__defns % contains these members
).
:- type mlds__class_id.
:- type mlds__interface_id.
%-----------------------------------------------------------------------------%
:- type mlds__decl_flags.
:- type access
---> public
; private
; protected
; default. % Java "default" access: accessible to anything
% defined in the same package.
:- type per_instance
---> one_copy % i.e. "static" storage duration
% (but not necessarily static linkage)
% or static member function
; per_instance. % i.e. "auto" local variable in function,
% or non-static member of class.
:- type virtuality
---> virtual
; non_virtual.
:- type finality
---> final
; overridable.
:- type constness
---> const
; modifiable.
:- type abstractness
---> abstract
; concrete.
:- func access(mlds__decl_flags) = access.
:- func per_instance(mlds__decl_flags) = per_instance.
:- func virtuality(mlds__decl_flags) = virtuality.
:- func finality(mlds__decl_flags) = finality.
:- func constness(mlds__decl_flags) = constness.
:- func abstractness(mlds__decl_flags) = abstractness.
%-----------------------------------------------------------------------------%
%
% C code required for the C interface.
% When compiling to a language other than C,
% this part still needs to be generated as C code
% and compiled with a C compiler.
%
:- type mlds__foreign_code
---> mlds__foreign_code(
c_header_info,
list(user_c_code),
list(c_export) % XXX we will need to modify
% export.m to handle different
% target languages
).
%-----------------------------------------------------------------------------%
:- type mlds__statement
---> mlds__statement(
mlds__stmt,
mlds__context
).
% mlds__context is probably == prog_context,
% but might also contain goal_path or other information.
:- type mlds__context.
:- type mlds__stmt
--->
%
% sequence
%
block(mlds__defns, list(mlds__statement))
%
% iteration
%
; while(mlds__rval, mlds__statement, bool)
% the `bool' is true iff the loop is guaranteed
% to iterate at least once -- in that case,
% the compiler can generate a `do...while' loop
% rather than a `while...' loop.
%
% selection (see also computed_goto)
%
; if_then_else(mlds__rval, mlds__statement,
maybe(mlds__statement))
/******
% Is it worth including this? We already have `computed_goto'...
; switch(
mlds__rval,
% other representations might be better...
assoc_list(mlds__rval, mlds__statement),
mlds__statement % the default case
)
******/
%
% transfer of control
%
; label(label)
% Defines a label that can be used as the
% target of calls, gotos, etc.
; goto(label)
% goto(Target)
% Branch to the specified address.
; computed_goto(mlds__rval, list(label))
% Evaluate rval, which should be an integer,
% and jump to the (rval+1)th label in the list.
% e.g. computed_goto(2, [A, B, C, D])
% will branch to label C.
%
% function call/return
%
; call(
mlds__func_signature, % signature of the func
mlds__rval, % the function to call
maybe(mlds__rval), % for method calls, this field
% specifies the `this' object
list(mlds__rval), % ordinary function arguments
is_tail_call % indicates whether this
% call is a tail call
)
; return(list(mlds__rval)) % Some targets will not support
% returning more than one value
%
% exception handling
%
/* XXX not yet implemented */
%
% atomic statements
%
; atomic(mlds__atomic_statement)
.
:- type is_tail_call
---> tail_call % a tail call
; call % just an ordinary call
.
%
% atomic statements
%
:- type mlds__atomic_statement
---> comment(string)
% Insert a comment into the output code.
; assign(mlds__lval, mlds__rval)
% assign(Location, Value):
% Assign the value specified by rval to the location
% specified by lval.
%
% heap management
%
; alloc_mem(mlds__lval, maybe(tag), mlds__rval, mlds__type)
% Get a memory block of a size given by an rval
% and put its address in the given lval,
% possibly after tagging it with a given tag.
% (Some targets might not support tags.)
; mark_hp(mlds__lval)
% Tell the heap sub-system to store a marker
% (for later use in restore_hp/1 instructions)
% in the specified lval
%
% It's OK for the target to treat this as a no-op,
% and probably that is what most targets will do.
; restore_hp(mlds__rval)
% The rval must be a marker as returned by mark_hp/1.
% The effect is to deallocate all the memory which
% was allocated since that call to mark_hp.
%
% It's OK for the target to treat this as a no-op,
% and probably that is what most targets will do.
%
% trail management
%
; trail_op(trail_op)
%
% foreign language interfacing
%
; target_code(target_lang, string)
% Do whatever is specified by the string,
% which can be any piece of code in the specified
% target language (C, assembler, IL, or whatever)
% that does not have any non-local flow of control.
.
%
% This is just a random selection of possible languages
% that we might want to target...
%
:- type target_lang
---> lang_C
; lang_GNU_C
; lang_C_minus_minus
; lang_asm
; lang_IL_bytecode
; lang_IL_asm
; lang_java_asm
; lang_java_bytecode
.
%
% trail management
%
:- type trail_op
---> store_ticket(mlds__lval)
% Allocate a new "ticket" and store it in the lval.
; reset_ticket(mlds__rval, reset_trail_reason)
% The rval must specify a ticket allocated with
% `store_ticket' and not yet invalidated or
% deallocated.
% If undo_reason is `undo' or `exception', restore
% any mutable global state to the state it was in when
% the ticket was obtained with store_ticket();
% invalidates any tickets allocated after this one.
% If undo_reason is `commit' or `solve', leave the state
% unchanged, just check that it is safe to commit
% to this solution (i.e. that there are no outstanding
% delayed goals -- this is the "floundering" check).
% Note that we do not discard trail entries after
% commits, because that would in general be unsafe.
%
% Any invalidated ticket is useless and should
% be deallocated with either `discard_ticket'
% or `discard_tickets_to'.
; discard_ticket
% Deallocates the most-recently allocated ticket.
; mark_ticket_stack(mlds__lval)
% Tell the trail sub-system to store a ticket counter
% (for later use in discard_tickets_upto)
% in the specified lval.
; discard_tickets_to(mlds__rval)
% The rval must be a ticket counter obtained via
% `mark_ticket_stack' and not yet invalidated.
% Deallocates any trail tickets allocated after
% the corresponding call to mark_ticket_stack.
% Invalidates any later ticket counters.
.
%-----------------------------------------------------------------------------%
%
% An lval represents a data location or variable that can be used
% as the target of an assignment.
%
% XXX this probably needs work
%
:- type field_id.
:- type mlds__var.
:- type mlds__lval
%
% values on the heap
%
---> field(maybe(tag), mlds__rval, field_id)
% field(Tag, Address, FieldName)
% selects a field of a compound term.
% Address is a tagged pointer to a cell
% on the heap; the offset into the cell
% is FieldNum words. If Tag is yes, the
% arg gives the value of the tag; if it is
% no, the tag bits will have to be masked off.
% The value of the tag should be given if
% it is known, since this will lead to
% faster code.
%
% values somewhere in memory
%
; mem_ref(mlds__rval) % The rval should have
% originally come from a mem_addr rval.
%
% local variables
%
; lvar(mlds__var)
.
%-----------------------------------------------------------------------------%
% XXX this probably needs work
% An rval is an expression that represents a value.
:- type mlds__rval
---> lval(mlds__lval)
% The value of an `lval' rval is just the value stored in
% the specified lval.
; create(tag, list(maybe(mlds__rval)), create_arg_types,
static_or_dynamic, int, mlds__type)
% create(Tag, Arguments, MaybeArgTypes, StaticOrDynamic,
% LabelNumber, CellKind):
% [see comments in llds.m]
; mkword(tag, mlds__rval)
% Given a pointer and a tag, mkword returns a tagged pointer.
; const(rval_const)
; unop(unary_op, mlds__rval)
; binop(binary_op, mlds__rval, mlds__rval)
; mem_addr(mem_ref).
% The address of a word in the heap, etc.
--
Fergus Henderson <fjh at cs.mu.oz.au> | "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh> | of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3 | -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to: mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions: mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------
More information about the developers
mailing list