[m-dev.] discussion about the implementation of compact type representations
Peter Wang
novalazy at gmail.com
Mon Oct 30 18:16:34 AEDT 2017
On Mon, 30 Oct 2017 09:25:27 +1100 (AEDT), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> Attached is a set of proposals that I would like your opinions on.
> Which parts do you agree with and which parts do you disagree with?
> Do you have a better idea for any component?
> Some principles for the implementation of compact type representations
>
> At the YesLogic meeting on friday 2017-10-27 we discussed several ways
> in which the Mercury compiler could compact the representation of
> several kinds of types. I mentioned that the key challenge in
> the implementation of such compact representations was ensuring
> that all modules agree on the compact representation of e.g. a type t2,
> even when they differ in whether they see the definition of a type, e.g. t1,
> that is a component of t2. If t1 is an argument in a term of type t2,
> then all code accessing that term needs to know where the storage of
> the *next* argument after the one of type t1 starts, which requires knowing
> how big the representation of t1 is.
>
> Here are some principles that I think can help solve that problem.
> They are not yet fully worked out.
>
> principle 1:
> If a module m1 defines and abtract-exports a type t1, whose
> representation, though semantically hidden, could affect decisions
> about the representations of other types (t2, t3 etc) defined
> *either* in m1 or in other modules, then a description of its
> representation SHOULD be included in module m1's interface files.
> (I am not sure yet about *which* interface files exactly.)
>
> The abstract-exported type definitions affected this way are
>
> - type constructors whose definitions show that they are dummy
> types (if their arity is nonzero, this means all their instances
> are also dummy types)
>
> - type constructors whose definitions show that they are notag types
>
> - type constructors whose definitions show that they fit in a given
> number of bits that is less than a word in size
>
> - type constructors that are defined to be equivalent to another type t4
> that either
>
> - *are* in one of these four categories (if t4 is defined in m1), or
> - *may be* in one of these four categories (if t4 is defined outside m1).
I think we require one further condition for types defined to be
equivalent to an atomic type with a 2-word representation:
currently only float but may soon include int64 and uint64.
> -----------------------------------------
>
> An implementation of compact term representation along the lines above
> will need answers to two questions:
>
> - what exactly is a dummy type, and
> - what exactly is a notag type,
>
> beyond the fact that they both have a single function symbol of
> arity 0 and arity 1 respectively.
>
> - Can they have user defined comparison and unification predicates?
>
> - Can there be a reserve_tag pragma for them?
>
> - Can there be a foreign_type pragma for them?
>
> - Can the argument of a notag type have an existential type?
>
> - For each of the above questions, if the answer is "no, there can't be",
> should it be an error if "there is"? For example, if a type with
> a single function symbol of arity 1 has a user defined comparison
> predicate, is that an error, or is that ok, with the type simply
> not being a notag type?
>
> I propose that for types with a single function symbol of arity 0,
>
> - Specifying a user-defined unification or comparison predicate
> should be an error.
Sounds sensible.
>
> - Specifying a reserve tag pragma for them should be an error.
Never used reserved tags myself.
>
> - Specifying a foreign_type tag pragma for them should be an error.
A small number of types in the standard library will need to be
trivially changed (io.file_id, dir.stream, etc.).
User code should be less likely to be broken by this change.
>
> I propose that for types with a single function symbol of arity 1,
>
> - Specifying a user-defined unification or comparison predicate
> should not be an error, but should make the type NOT a notag type.
> - Specifying a reserve tag pragma for them should be an error.
>
> - Specifying a foreign_type tag pragma for them should not be an error.
> In grades where the foreign type pragma is applicable, that will be the type;
> in grades where no such foreign type pragma is applicable, the type
> should be a notag type, unless one of the other considerations prevents that.
> - If the argument of the function symbol has an existential type,
> that should not be an error, but should make the type NOT a notag type.
All seems fine.
> -----------------------------------------
>
> An implementation detail of principle 3 is: how should we choose the extensions
> of "interface Mercury" over standard Mercury that record information about
> type representations?
>
> At the moment, if an abstract exported type is an enum, we record that fact
> in interface files using syntax such as
>
> :- type t6 where type_is_abstract_enum(N).
>
> where N is the number of bits required to represent the enum. In this case,
> the extension is a kind of where clause that we wouldn't want to permit
> in user-written Mercury code.
>
> I would prefer a syntax along the lines of
>
> :- type_representation(t6, abstract_enum(N)).
> :- type_representation(t7, dummy_type).
> :- type_representation(t8, notag_type).
> :- type_representation(t9, equivalent_to(t10)).
>
> Distinguishing between interface Mercury and standard Mercury at the top level
> function symbol of the declaration seems to me to be a cleaner separation.
>
> While we currently represent "where type_is_abstract_enum(N)" declarations
> using the usual kind of type_defn items, it would be natural to represent
> type_representation declarations using a different kind of item, and
> principle 4a could be implemented as a list of this new kind of item.
>
> The type_representation syntax should also be slightly easier to parse;
> since it does not use any operators, we won't have to worry about their
> precendences.
That's fine. Is the lack of 'pragma' to distinguish it from code
that can be written by the user?
We also have 'where type_is_abstract_noncanonical' and 'where
direct_arg' that may be replaced by this new mechanism.
Peter
More information about the developers
mailing list