[m-dev.] subtypes discussion

Mark Brown dougl at cs.mu.OZ.AU
Mon Dec 23 15:25:57 AEDT 2002
Previous message: [m-dev.] 0.11 release
Next message: [m-dev.] subtypes discussion
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
In the last thread discussing the design and documentation of subtypes that
I posted, a couple of issues were raised about some aspects that were a bit
counterintuitive, and that may in fact be fundamental problems.  The purpose
of this note is to explain some of my own intuitions behind the design, and
hopefully to convince everyone that the basic design is correct.  For those
who remain unconvinced, this note will at least provide something more
tangible to argue against.

I'll first present a motivating example, written in Mercury without subtype
declarations.  Then I'll show how subtype declarations can be usefully
applied in this case.  Following that I'll discuss some of the issues that
have been raised, and present my conclusions.

Feedback is sought.


1) Programming with existing subtypes
-------------------------------------

Sometimes it is desirable to avoid the common in/out/di/uo modes, and instead
use modes that give subtype information about arguments in addition to the
usual uniqueness and boundness information.  For example, it is necessary to
use a higher-order inst to allow a higher-order term to be called safely.
Subtype information can also help avoid the need for run-time checks that
should never fail, without compromising safety.

Consider the following example.

	:- type fruit ---> apple ; orange ; lemon.

	:- inst disliked_fruit ---> apple ; lemon.

	:- func fruit_problem(fruit) = string.
	:- mode fruit_problem(in(disliked_fruit)) = out is det.

	fruit_problem(lemon) = "too sour".
	fruit_problem(apple) = "too often shared by a worm".

The inst 'disliked_fruit' represents a subtype of 'fruit', and since the
mode of fruit_problem/1 asserts that the first parameter will always be in
this subtype, there is no need for a clause for orange.  This is fortunate,
because there doesn't exist any problem with oranges that we could reasonably
return -- oranges are not one of our disliked fruits.

Continuing the example, we may wish to have some container type in which we
intend to put our disliked fruit, and perhaps a means of describing the
contents of a container of disliked fruit.

	:- type rubbish_bag(T).

	:- func rubbish_problem(rubbish_bag(fruit)) = string.
	:- mode rubbish_problem(in(rubbish_bag(disliked_fruit))) = out is det.

and we need to define the inst rubbish_bag/1 in the "obvious" way.

If we want a predicate that takes a function with the same subtype as
fruit_problem/1, we may write something like:

	:- pred p(func(fruit) = string, ...).
	:- mode p(in(func(in(disliked_fruit)) = out is det), ...).

The modes are quite complex for a fairly simple problem, and yet they don't
seem to add much more information than the type declarations themselves.


2) Programming with subtype declarations
----------------------------------------

In general, the subtype component of a parameter can differ between different
modes, and indeed between the initial and final insts of one mode.  This is
unlike the type of a parameter, which is unique -- each parameter has exactly
one type.  For each parameter in the above examples, the subtype component
is the same in all of its insts; therefore the subtype information is in a
sense "type-like", because there is exactly one subtype for each parameter.

One problem with existing subtypes in Mercury, as demonstrated by the above
example, is that the modes required are too verbose.  Granted, there will be
cases where the extra verbosity is required, but when the modes are really
just in/out/di/uo with a "type-like" subtype component, the verbosity is a
waste.  This is particularly so when the structure of each inst mostly
duplicates the structure of the type.  It is in these type-like cases (which,
IMHO, would be the most common way of using subtypes) that subtype
declarations are designed to help.

Consider the above example, this time written with subtype declarations
(the type declarations are the same as before).

	:- subtype disliked_fruit < fruit ---> apple ; lemon.

	:- func fruit_problem(disliked_fruit) = string.
	:- func rubbish_problem(rubbish_bag(disliked_fruit)) = string.

	:- pred p(func(disliked_fruit) = string, ...).
	:- mode p(in, ...).

The declarations are much simpler, and as a bonus the default mode for each
function is now applicable, which simplifies things even further.  We no
longer need to define the inst rubbish_bag/1, because the propagation of
subtype information achieves exactly the "obvious" effect that this inst was
meant to, without the need for any explicit definition.  The subtype
declarations therefore achieve what they are designed to in this case.


3) Terminology
--------------

Just to be clear on some (slightly misused) terminology:

	An argument is "covariant" if the ordering is preserved by the
	functor.

	An argument is "contravariant" if the ordering is reversed by the
	functor.

	If an argument is "invariant", then two values in this argument
	position must be equal for the terms to be comparable.

In the programming languages vernacular, this terminology is applied to
functions, methods, etc.  Since outputs are nearly always thought of as
covariant, the terms only refer to the input arguments; hence,
"contravariant" means contravariant on the inputs and covariant on the
outputs, "covariant" means covariant on both inputs and outputs, etc.

We'll adopt a similar use of terminology, but more specific to Mercury.
By "contravariant" we shall mean contravariant on initial insts and covariant
on final insts, and by "covariant" we shall mean covariant on initial and
final insts.  For constrained polymorphic types, universally quantified
types will be considered inputs and existentially quantified types will be
considered outputs.


4) Subtypes: subsumption vs specialization
------------------------------------------

One reason our proposed design may seem counterintuitive is that our subtypes
do not support subsumption.  That is, we don't support the notion that if S
is a subtype of T, then values of type S can be safely used wherever values
of type T are expected.  This may be counterintuitive because it differs from
the way the term is used by the OO community, where the subtype relation is
usually synonymous with subsumption.  Note, though, that it should be of no
concern that we use the term "subtype" differently to OO, since we also use
the term "type" differently.

Of more concern is the fact that the logic programming community also seems
to use the term "subtype" to mean subsumption.  This usage, however, is a
lot less common than the OO case.

Two forms of subsumption are supported by Mercury.  At one level, the
entailment ordering of typeclass constraints supports subsumption: you can
always use a type whose constraint entails the expected constraint.  At
another level, the "matches" partial order supports subsumption.  In both
cases, functions and predicates are contravariant with respect to the
ordering, and this is necessary for the subsumption to be sound (cf. Eiffel).

In our example, the use of subtypes (with or without subtype declarations)
can be thought of as a form of specialization rather than subsumption.  We
pay the price of limiting the generality of fruit_problem and rubbish_problem,
and in return we get some benefits: we don't need run-time checks in
rubbish_problem, and we don't need a spurious case for oranges in
fruit_problem.  Since we don't support subsumption, there is no need for the
subtype partial order to be contravariant; indeed, it seems that covariance
is the right way to support specialization, just as contravariance is the
right way to support subsumption.  This point has long been observed in the
OO community [1], and the intuitions there, if not the details, seem
applicable to us.

[1] Giuseppe Castagna.  Covariance and contravariance: conflict without a
cause.  TOPLAS, 17(3), 1995.


5) Subtypes vs subsets
----------------------

For first order types, the subtype partial order corresponds to the subset
relation on values of the given types, as expected.  For higher-order types,
however, it is not obvious to me how to interpret a given subtype as a
subset.

Consider the following type and inst (which is a subtype, although it isn't
"type-like" so we couldn't define it with our subtype declarations):

	pred(fruit :: ground >> disliked_fruit) is semidet.

Possibly we can think of this as a subset of the values described by inst
'pred(fruit::in) is semidet', where P is in the subset iff P(orange) cannot
succeed.

On the other hand, consider the case where the initial inst is modified:

	pred(fruit :: disliked_fruit >> ground) is semidet.

Values described by this inst are unary relations on the set {apple, lemon},
and none of these values are in the set of values described by
'pred(fruit::in) is semidet', since the latter are all defined on oranges.
Therefore the subtype order does not in general correspond to the subset
order.  Note that even if the subtype order was contravariant instead of
covariant, it would still not correspond to the subset order.  It appears
to me that the subtype order must be invariant if it is to directly
correspond to the subset order.

Given the above, calling our declarations "subtype" declarations might be
a misnomer (Tyson suggested they were badly named on the developer's
mailing list a while ago).  We could perhaps think of some other name that
reflects the fact that the declarations declare both a type and inst, and
use that instead.

On the other hand, maybe we should just avoid trying to use sets to model
higher-order subtypes.  Perhaps it is better to think of types as structures
(e.g. and-or trees), and subtypes as substructures -- structures which have
had some parts removed from them.


6) Conclusions
--------------

There are two main questions I want to consider: a) whether the subtype
ordering is the right design, and b) whether "subtype" is the best name
to use.

The subtype declarations did simplify our motivating example, and in my
judgement there would be a significant number of similar cases that
would also be simplified.  These "type-like" cases we are interested in,
which can be thought of as a form of specialization, can usually only be
expressed when the subtype order is covariant.  So on the issue of whether
the design is correct, I'm strongly of the view that the proposed
propagation algorithm (and hence the proposed subtype order) is the right
design.

On the issue of naming the new feature, I don't have any strong commitment.
I think the name "subtype" is fine; I don't know any reason why it is
important that subtypes correspond to subsets.  However, I'm open to other
views on this point.


Cheers,
Mark.

--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------
Previous message: [m-dev.] 0.11 release
Next message: [m-dev.] subtypes discussion
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the developers mailing list