[m-dev.] subtypes (was Re: [m-rev.] for review: don't allow nondefault mode functions in terms)

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Dec 11 23:10:52 AEDT 2015



On Fri, 11 Dec 2015 21:09:47 +1100, Mark Brown <mark at mercurylang.org> wrote:
> > The univ contains nothing but the value and its typeinfo. We can
> > implement the needed mechanism in one of two logically equivalent ways:
> > we can add a third field containing the mode information to the univ
> > when the value it contains is higher order, or we can extend the
> > type_infos for higher order values themselves with mode information.
> > The latter seems preferable.
> 
> Definitely. Since the code that actually calls the univ module may
> itself be generic, the modeinfo would have to get there either inside
> or alongside the typeinfo. It's not needed for most types, so inside
> is much better than alongside.

I agree with that, but I think a better reason is that this info
would probably also be useful for other uses, so we shouldn't
make the solution univ-specific if we don't have to.

> > I see three ways to add the modes. All would require the invention
> > of a new type, say MR_ModeInfo, for containing mode information.
> > In the following, the procedure has N args.
> >
> > - Add a new field containing a pointer to an array of N MR_ModeInfos.
> 
> I think this is the best way. It would save space to represent
> commonly used modes, such as default functions, with a small constant,
> and there could be a lot more sharing for modes than for types, for
> example, a bunch of predicates with mode (pred(in, di, uo) is det)
> which differ in the type of the first argument.

I don't think that is a strong argument at all. First, the amount
of space occupied by RTTI for higher order values is not all that
significant. Second, in most cases it is statically allocated,
so it doesn't have to be filled in. Third, the commonalities
would need to save at least one word of space on average
to compensate for the word occupied by the pointer to this
new array, which the other two schemes do not have.

My order of preference is actually the third option first,
and then the second; the first option is last. This is mostly
the order of how close to each other the mode and type info
for the same argument are kept: next to each other,
in the same array, in different arrays. I think keeping
them close is aesthetically cleaner and less error prone.
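
To make the three degrees of closeness concrete, here is a rough C sketch.
Only the first layout (a separate array of N mode infos behind a new
pointer) is spelled out in this message; the other two are my guess at what
"in the same array" and "next to each other" would look like. Every name
below is hypothetical, and none of this is the runtime's actual RTTI layout.

```c
#include <stddef.h>

/* Hypothetical stand-ins, NOT real Mercury runtime types. */
typedef const void *TypeInfoSketch;  /* stands in for an argument's typeinfo */
typedef unsigned ModeInfoSketch;     /* stands in for the proposed MR_ModeInfo */

/* Different arrays: a new field points to a separate array of N modes. */
struct HoInfoSeparate {
    size_t arity;
    const TypeInfoSketch *arg_types;   /* N entries */
    const ModeInfoSketch *arg_modes;   /* N entries, costs one extra pointer */
};

/* Same array (assumed layout): N types followed by N modes. */
union TypeOrMode {
    TypeInfoSketch type;
    ModeInfoSketch mode;
};
struct HoInfoSameArray {
    size_t arity;
    const union TypeOrMode *args;      /* types at [0..N-1], modes at [N..2N-1] */
};

/* Next to each other (assumed layout): one array of (type, mode) pairs. */
struct ArgTypeAndMode {
    TypeInfoSketch type;
    ModeInfoSketch mode;
};
struct HoInfoAdjacent {
    size_t arity;
    const struct ArgTypeAndMode *args; /* N entries */
};

/* Looking up the mode of argument i under each layout. */
static inline ModeInfoSketch mode_separate(const struct HoInfoSeparate *h, size_t i)
{
    return h->arg_modes[i];
}

static inline ModeInfoSketch mode_same_array(const struct HoInfoSameArray *h, size_t i)
{
    return h->args[h->arity + i].mode;
}

static inline ModeInfoSketch mode_adjacent(const struct HoInfoAdjacent *h, size_t i)
{
    return h->args[i].mode;
}
```

In the last layout, code that walks the arguments cannot pick up a type
without having the matching mode in hand, which is the sense in which
I would call it less error prone.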

> A tagged pointer for
> this field could distinguish between simple modes containing just
> in/out/di/uo, which can be represented cheaply, and more complex
> modes.

If we eventually do decide to go with the first option, then
for the most common cases, we don't even need a pointer.
The most common modes can be encoded in 2 or 3 bits;
we can distinguish a word containing five 2-bit or three 3-bit fields
from a pointer. (We already assume that there is no
valid pointer whose value is smaller than 1024.)
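
A minimal sketch of the 2-bit variant, in C. Five 2-bit fields occupy
10 bits, so the largest packed value is 1023, which stays below the
1024 threshold; three 3-bit fields occupy 9 bits and fit as well. The
names and the particular numbering of in/out/di/uo are made up for
illustration, and the sketch assumes the arity is known from elsewhere
(e.g. from the type info), since the packed word does not record it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding of the four simple modes; NOT an existing runtime type. */
typedef enum {
    SIMPLE_MODE_IN  = 0,
    SIMPLE_MODE_OUT = 1,
    SIMPLE_MODE_DI  = 2,
    SIMPLE_MODE_UO  = 3
} SimpleMode;

/* Either a small packed value or a real pointer to fuller mode information. */
typedef uintptr_t ModeWord;

#define MODE_BITS_PER_ARG   2u
#define MODE_MAX_PACKED     5u      /* 5 args * 2 bits = 10 bits */
#define MODE_MIN_POINTER    1024u   /* assumption: no valid pointer is below this */

/* Pack up to five simple modes into one word; argument i goes in bits 2i..2i+1. */
static inline ModeWord pack_simple_modes(const SimpleMode *modes, size_t n)
{
    ModeWord word = 0;
    assert(n <= MODE_MAX_PACKED);
    for (size_t i = 0; i < n; i++) {
        word |= (ModeWord) modes[i] << (MODE_BITS_PER_ARG * i);
    }
    return word;
}

/* A word below the pointer threshold must be a packed value, not a pointer. */
static inline int mode_word_is_packed(ModeWord word)
{
    return word < MODE_MIN_POINTER;
}

/* Extract the mode of argument i from a packed word. */
static inline SimpleMode packed_mode_of_arg(ModeWord word, size_t i)
{
    return (SimpleMode) ((word >> (MODE_BITS_PER_ARG * i)) & 3u);
}
```

For example, (pred(in, di, uo) is det) would pack its three argument
modes into a single small integer, with no array and no pointer at all;
the determinism would still need to be recorded somewhere else.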

> Yes, and in fact I need to keep this task as small and simple as
> possible, or risk it not being completed at all. That's why it doesn't
> deal with anything other than pred and func subtypes.

Agreed, which is why I wrote what I did.

> At minimum I want to:
> - allow all existing programs to run as before
> - ensure new programs never crash

Agreed to this as well.

> For construct/3, it would be sufficient to store one bit of
> information with the type constructor, or perhaps with each du
> functor, that indicates whether there are any constraints on the
> arguments at all.

Exactly what constraints are you thinking of? Type class constraints?
The presence of higher order values?  The presence of higher order
values other than standard mode functions? Some combination
of the above?

Zoltan.


