[m-rev.] for review: a pass for computing type representations

Peter Wang novalazy at gmail.com
Tue Jan 30 16:11:42 AEDT 2018


On Tue, 30 Jan 2018 02:06:31 +1100 (AEDT), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> Peter, will you please review this diff? You were the last person
> to work on type representations when you implemented argument packing,
> and you added Mantis feature request 432, which this diff prepares for,
> so you are the logical person.
> 

> Make "compute type representations" a separate pass.
> 
> The ultimate purpose of this diff is to prepare for future improvements
> in type representations, allowing values of some data types to be represented
> more compactly than up to now.
> 
> The main way this diff does that is by creating a separate pass for deciding
> how values of each type should be represented. We have traditionally decided
> data representations for each type as its type definition was processed
> during the make_hlds pass, but these decisions were always tentative,
> and could be overridden later, e.g. when we processed foreign_type or
> foreign_enum pragmas for the type. This dispersed decision making algorithm
> is hard to understand, and therefore to change.
> 
> This diff centralizes decisions about type representations in a separate
> pass that does nothing else. It leaves the algorithm distributed among
> several files (du_type_layout.m, make_tags.m, and add_foreign_enum.m) for now,
> to make reviewing this diff easier, but soon after it is committed I intend
> to move all the relevant code to du_type_layout.m, to centralize the decision
> code in "space" as well as in time.
> 
> For the reason why this pass runs before any of the semantic analysis
> passes, instead of after all of them as I originally intended and as we
> discussed on m-dev in late october 2017, see the big comment at the start of
> du_type_layout.m.
> 
> As per another part of that same discussion on m-dev, this diff
> makes a start on implementing a new type of item, the type_repn item,
> which is intended *only* to be used in compiler-generated interface faces,

files



> diff --git a/compiler/add_type.m b/compiler/add_type.m
> index 57bd5a3..9b5e1d9 100644
> --- a/compiler/add_type.m
> +++ b/compiler/add_type.m

> @@ -357,33 +361,50 @@ module_add_type_defn_foreign(TypeStatus0, TypeStatus1, TypeCtor,
>  % Predicates that help the top level predicates do their jobs.
>  %
>  
> -:- pred convert_type_defn_to_hlds(type_defn::in, type_ctor::in, globals::in,
> -    hlds_type_body::out) is det.
> +:- pred convert_type_defn_to_hlds(type_defn::in, type_ctor::in,
> +    hlds_type_body::out, module_info::in, module_info::out) is det.
>  
> -convert_type_defn_to_hlds(TypeDefn, TypeCtor, Globals, HLDSBody) :-
> +convert_type_defn_to_hlds(TypeDefn, TypeCtor, HLDSBody, !ModuleInfo) :-
>      (
>          TypeDefn = parse_tree_du_type(DetailsDu),
>          DetailsDu =
> -            type_details_du(Body, MaybeUserEqComp, MaybeDirectArgCtors),
> -        % Initially, when we first see the `:- type' definition,
> -        % we assign the constructor tags assuming that there is no
> -        % `:- pragma reserve_tag' declaration for this type.
> -        % (If it turns out that there was one, then we will recompute the
> -        % constructor tags by calling assign_constructor_tags again,
> -        % with ReservedTagPragma = uses_reserved_tag, when processing
> -        % the pragma.)
> -        ReservedTagPragma = does_not_use_reserved_tag,
> -        assign_constructor_tags(Body, MaybeUserEqComp, TypeCtor,
> -            ReservedTagPragma, Globals, CtorTagMap, ReservedAddr, IsEnum),
> -        IsForeign = no,
> -        ( if ReservedAddr = does_not_use_reserved_address then
> -            compute_cheaper_tag_test(CtorTagMap, CheaperTagTest)
> -        else
> -            CheaperTagTest = no_cheaper_tag_test
> -        ),
> -        HLDSBody = hlds_du_type(Body, CtorTagMap, CheaperTagTest, IsEnum,
> -            MaybeUserEqComp, MaybeDirectArgCtors,
> -            ReservedTagPragma, ReservedAddr, IsForeign)
> +            type_details_du(Ctors, MaybeUserEqComp, MaybeDirectArgCtors),
> +        MaybeRepn = no,
> +        MaybeForeign = no,
> +        HLDSBody = hlds_du_type(Ctors, MaybeUserEqComp, MaybeRepn,
> +            MaybeForeign),
> +        (
> +            MaybeDirectArgCtors = no
> +        ;
> +            MaybeDirectArgCtors = yes(DirectArgCtors),
> +            % In one test case (submodules/direct_arg_cycle1.m), we insert
> +            % the same value of DirectArgCtors into DirectArgMap0 *twice*.
> +            %
> +            % I (zs) don't know whether this is something that we should allow,
> +            % since one of those is from writing a "where direct_arg is"
> +            % clause in the *source* code of the program, even though
> +            % that syntax was intended to be used only in automatically
> +            % generated interface files.
> +            %
> +            % For now, I left the old behavior.
> +            % XXX TYPE_REPN We should think again about this decision.

Judging by the commented-out documentation in reference_manual.texi and
my mail archives, that test case came from an earlier development where
the direct arg optimisation would not be applied across module boundaries
without programmer help.

There was also an idea that direct_arg assertions could serve as checked
documentation. It doesn't really fit in with the rest of the language so
I think you can disallow "where direct_arg" clauses in source code
later.

> diff --git a/compiler/du_type_layout.m b/compiler/du_type_layout.m
> index aa48ca1..28972f5 100644
> --- a/compiler/du_type_layout.m
> +++ b/compiler/du_type_layout.m
> @@ -53,11 +350,21 @@ decide_du_type_layout(ModuleInfo, TypeCtor, TypeDefn, !TypeTable) :-
>          % Leave these types alone.
>      ).
>  
> -:- pred layout_du_ctor_args(module_info::in, du_type_kind::in,
> -    constructor::in, constructor::out) is det.
> +:- pred layout_du_ctor_args(module_info::in, du_type_kind::in, int::in,
> +    constructor_repn::in, constructor_repn::out,
> +    ctor_name_to_repn_map::in, ctor_name_to_repn_map::out) is det.
>  
> -layout_du_ctor_args(ModuleInfo, DuKind, Ctor0, Ctor) :-
> -    Ctor0 = ctor(ExistTVars, Constraints, Name, Args0, Arity, Context),
> +layout_du_ctor_args(ModuleInfo, DuKind, ArgPackBits, CtorRepn0, CtorRepn,
> +        !CtorRepnMap) :-
> +    % Args1 is Args0 with any float arg marked as needing a double word
> +    % if the representation of floats is a double word.
> +    %
> +    % Args2 is Args1 with consecutive sub-word-sized arguments packed
> +    % into single words as much as possible.
> +    % XXX TYPE_REPN For now, we only recognize enums as sub-word-sized.
> +    % We should extend this to also cover whichever of int8, int16, int32
> +    % (and their unsigned versions) are smaller than a word.
> +    ArgRepns0 = CtorRepn0 ^ cr_args,

Update the variable names.

> diff --git a/compiler/hlds_data.m b/compiler/hlds_data.m
> index e09aa94..c97d1b0 100644
> --- a/compiler/hlds_data.m
> +++ b/compiler/hlds_data.m
> @@ -94,6 +93,13 @@
>                  cons_context        :: prog_context
>              ).
>  
> +:- type cons_arg
> +    --->    cons_arg(
> +                ca_field_name       :: maybe(ctor_field_name),
> +                ca_type             :: mer_type,
> +                ca_context          :: prog_context
> +            ).

Unused type?

> diff --git a/compiler/make_hlds_passes.m b/compiler/make_hlds_passes.m
> index 8832ee4..1d6d82c 100644
> --- a/compiler/make_hlds_passes.m
> +++ b/compiler/make_hlds_passes.m
> @@ -430,20 +431,22 @@ do_parse_tree_to_hlds(AugCompUnit, Globals, DumpBaseFileName, MQInfo0,
>  
>  add_builtin_type_ctor_special_preds_in_builtin_module(TypeCtor, !ModuleInfo) :-
>      varset.init(TVarSet),
> -    Body = hlds_abstract_type(abstract_type_general),
>      term.context_init(Context),
>      % These predicates are local only in the public builtin module,
>      % but we *get here* only if we are compiling the public builtin module.
>      TypeStatus = type_status(status_local),
>      construct_type(TypeCtor, [], Type),
> +    % You cannot construct clauses to unify or compare values of an abstract
> +    % type. The abstract body is code for "generate code for a builtin type".
> +    Body = hlds_abstract_type(abstract_type_general),

s/is code for/is a signal to/

> diff --git a/compiler/parse_type_defn.m b/compiler/parse_type_defn.m
> index 61c24d5..cab871d 100644
> --- a/compiler/parse_type_defn.m
> +++ b/compiler/parse_type_defn.m
...
> -:- pred parse_where_type_is_abstract_enum(module_name::in, varset::in,
> +:- pred parse_where_type_is_abstract(module_name::in, varset::in,
>      term::in, term::in, prog_context::in, int::in,
>      maybe1(item_or_marker)::out) is det.
>  
> -parse_where_type_is_abstract_enum(ModuleName, VarSet, HeadTerm, BodyTerm,
> +parse_where_type_is_abstract(ModuleName, VarSet, HeadTerm, BodyTerm,
>          Context, SeqNum, MaybeIOM) :-
> +    ContextPieces =
> +        cord.from_list([words("On the left hand side of type definition:")]),
>      varset.coerce(VarSet, TypeVarSet),
> -    parse_type_defn_head(tdhpc_type_defn, ModuleName, VarSet, HeadTerm,
> +    parse_type_defn_head(ContextPieces, ModuleName, VarSet, HeadTerm,
>          MaybeNameParams),
>      ( if
> -        BodyTerm = term.functor(term.atom("type_is_abstract_enum"), Args, _)
> +        BodyTerm = term.functor(term.atom(AttrName), Args, _),
> +        ( AttrName = "type_is_abstract_enum"
> +        ; AttrName = "type_is_representable_in_n_bits"
> +        )
>      then
> -        ( if
> -            Args = [Arg],
> -            decimal_term_to_int(Arg, NumBits)
> -        then
> -            TypeDefn0 = parse_tree_abstract_type(abstract_enum_type(NumBits)),
> -            MaybeTypeDefn = ok1(TypeDefn0)
> +        ( if Args = [Arg] then
> +            ( if decimal_term_to_int(Arg, NumBits) then
> +                TypeDefn0 = parse_tree_abstract_type(
> +                    abstract_type_fits_in_n_bits(NumBits)),
> +                MaybeTypeDefn = ok1(TypeDefn0)
> +            else
> +                Pieces = [words("Error: the argument of"), quote(AttrName),
> +                    words("is not an positive integer."), nl],

a positive

> diff --git a/compiler/parse_type_repn.m b/compiler/parse_type_repn.m
> index e69de29..f542702 100644
> --- a/compiler/parse_type_repn.m
> +++ b/compiler/parse_type_repn.m
> @@ -0,0 +1,280 @@
> +%-----------------------------------------------------------------------------e
> +% vim: ft=mercury ts=4 sw=4 et
> +%-----------------------------------------------------------------------------e
> +% Copyright (C) 2017 The Mercury team.
> +% This file may only be copied under the terms of the GNU General
> +% Public License - see the file COPYING in the Mercury distribution.
> +%---------------------------------------------------------------------------%
> +%
> +% File: parse_type_repn.m.
> +%
> +% This module parses XXX

...

> +
> +:- module parse_tree.parse_type_repn.
> +
> +:- interface.
> +
> +:- import_module mdbcomp.sym_name.
> +:- import_module parse_tree.maybe_error.
> +:- import_module parse_tree.parse_types.
> +:- import_module parse_tree.prog_data.
> +
> +:- import_module list.
> +:- import_module term.
> +:- import_module varset.
> +
> +    % Parse ":- type_representation(..., ..., ...)" items.
> +    %

Too many args.

> diff --git a/compiler/unify_proc.m b/compiler/unify_proc.m
> index 4b0b8fc..8efdca8 100644
> --- a/compiler/unify_proc.m
> +++ b/compiler/unify_proc.m
> @@ -9,9 +9,24 @@
>  %
>  % File: unify_proc.m.
>  %
> -% This module generates the bodies of the automatically generated
> -% unify, index and compare predicates.
> +% This module generates the bodies of the unify, index and compare predicates
> +% that the compiler automatically creates for each type definition.
>  %
> +% We can sometimes do this without knowing the representation of the type.
> +% For example, if the type has user has specified the predicates by which
> +% two values of the type should be unified or compared, then the automatically
> +% generated clauses need only call the specified predicates.
> +%
> +% However, in many cases, we *do* need to know the representation of the type.
> +% For example, we need that information
> +%
> +% - to decide whether an eqv type is equivalent to a dummy type;
> +% - to decide whether an arguments of a functor of a du type are dummies; and

s/an arguments/arguments

> diff --git a/compiler/write_module_interface_files.m b/compiler/write_module_interface_files.m
> index 1965b54..e7afaea 100644
> --- a/compiler/write_module_interface_files.m
> +++ b/compiler/write_module_interface_files.m
> @@ -939,8 +950,11 @@ make_impl_type_abstract(TypeDefnMap, !TypeDefnPairs) :-
>              % Leave dummy types alone.
>              true
>          else
> -            ( if du_type_is_enum(Ctors, NumBits) then
> -                Details = abstract_enum_type(NumBits)
> +            ( if du_type_is_enum(DetailsDu, NumBits) then
> +                % XXX TYPE_REPN We should also generate fitns_in_n_bits
> +                % if the original type is a less-than-word-sized builtin,
> +                % such as int8.
> +                Details = abstract_type_fits_in_n_bits(NumBits)
>              else
>                  Details = abstract_type_general
>              ),

s/fitns/fits

I can't give more than a superficial review but deciding type
representations in a single pass is obviously a good idea.

Peter


More information about the reviews mailing list