[m-rev.] for discussion: subtypes documentation

Mark Brown dougl at cs.mu.OZ.AU
Sun Nov 24 22:36:55 AEDT 2002
Previous message: [m-rev.] diff: fix compilation error with --intermodule-optimization
Next message: [m-rev.] for discussion: subtypes documentation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

The following diff is a draft of the subtypes documentation.  I'm not
intending to commit this yet; I'm posting this now to allow any early
review comments on the design.

A couple of notes about the diff:

	- The given syntax cannot express some subtypes that we may wish
	  to use.  For example, there is no way for the supertype to be
	  a pred or func type.  We may therefore need one or more
	  additional variants on the syntax, such as the first variant
	  in Fergus' proposal.

	- I've expounded on subtypes a little bit more in the modes
	  section, describing subtype information as being attached
	  to the and-nodes of a type tree.  That was the simplest way I
	  could find of describing the semantics of subtypes.

Index: reference_manual.texi
===================================================================
RCS file: /home/mercury1/repository/mercury/doc/reference_manual.texi,v
retrieving revision 1.263
diff -u -r1.263 reference_manual.texi
--- reference_manual.texi	7 Nov 2002 06:14:09 -0000	1.263
+++ reference_manual.texi	24 Nov 2002 11:08:04 -0000
@@ -97,6 +97,7 @@
 * Type classes::      Constrained polymorphism.
 * Existential types:: Support for data abstraction and heterogeneous
                       collections.
+* Subtypes::          Subtypes provide a convenient way to express some modes.
 * Semantics::         Declarative and operational semantics of Mercury
                       programs.
 * Foreign language interface:: Calling code written in other programming
@@ -498,6 +499,7 @@
 pragma            fx        1199
 promise           fx        1199
 rule              fx        1199
+subtype           fx        1199
 typeclass         fx        1199
 use_module        fx        1199
 -->               xfx       1200
@@ -534,6 +536,7 @@
 :- func
 :- inst
 :- mode
+:- subtype
 :- typeclass
 :- instance
 :- pragma
@@ -550,6 +553,7 @@
 The @samp{type}, @samp{pred} and @samp{func} declarations are used for the
 type system,
 the @samp{inst} and @samp{mode} declarations are for the mode system,
+the @samp{subtype} declarations are for the type and mode systems,
 the @samp{pragma} declarations are for the C interface, and for
 compiler hints about inlining, and the remainder are for the module system. 
 They are described in more detail in their respective chapters.
@@ -1536,8 +1540,8 @@
 @chapter Types
 
 The type system is based on many-sorted logic, and supports polymorphism,
-type classes (@pxref{Type classes}), and existentially quantified types
-(@pxref{Existential types}).
+type classes (@pxref{Type classes}), existentially quantified types
+(@pxref{Existential types}), and subtypes (@pxref{Subtypes}).
 
 @menu
 * Builtin types::
@@ -2087,15 +2091,17 @@
 that can be used to construct terms of that type;
 the children of an and-node are the types
 of the arguments of the constructors.
-We attach mode information to the or-nodes of type trees.
+We attach mode information to the and-nodes and the or-nodes of type trees.
 
 An @dfn{instantiatedness tree} is an assignment
 of an @dfn{instantiatedness} --- either @dfn{free} or @dfn{bound} ---
-to each or-node of a type tree,
+to each node of a type tree,
 with the constraint that all descendants of a free node must be free.
 
 A term is @dfn{approximated by} an instantiatedness tree
-if for every node in the instantiatedness tree,
+if for every function symbol in the term
+the corresponding and-node is ``bound'',
+and for every or-node in the instantiatedness tree,
 
 @itemize @bullet
 @item
@@ -2138,10 +2144,13 @@
 The shape of these trees is determined by
 the type of the variable to which they apply.
 
-As execution proceeds, variables may become more instantiated.
+As execution proceeds, variables may become more instantiated,
+which means that the or-nodes in the type tree become more instantiated.
+An or-node that is bound is considered more instantiated than
+one that is free.
 A @dfn{mode mapping} is a mapping
 from an initial instantiatedness tree to a final instantiatedness tree,
-with the constraint that no node of the type tree
+with the constraint that no or-node of the type tree
 is transformed from bound to free.
 Mercury allows the user to specify mode mappings directly
 by expressions such as @code{inst1 >> inst2},
@@ -2174,7 +2183,8 @@
 These two modes are enough for most functions and predicates.
 Nevertheless, Mercury's mode system is sufficiently
 expressive to handle more complex data-flow patterns,
-including those involving partially instantiated data structures.  
+including those involving subtypes (@pxref{Subtypes}),
+and those involving partially instantiated data structures.  
 (The current implementation does not handle
 partially instantiated data structures yet.)
 
@@ -2472,9 +2482,13 @@
 
 
 Constrained polymorphic modes are particularly useful when passing
-objects with higher-order types to polymorphic predicates
+values with higher-order types to polymorphic predicates
 since they allow the higher-order mode information to be retained
 (@pxref{Higher-order}).
+Similarly, constrained polymorphic modes are useful when passing
+values that belong to a subtype to predicates that take a supertype,
+since they allow the subtype information to be retained
+(@pxref{Subtypes}).
 
 @node Different clauses for different modes
 @section Different clauses for different modes
@@ -5084,6 +5098,274 @@
 good_example3_univ("foo"::in(bound("foo")), univ(42)::out).
 good_example3_univ("bar"::in(bound("bar")), univ("blah")::out).
 @end example
+
+ at node Subtypes
+ at chapter Subtypes
+
+Subtype information about variables is expressed by Mercury's mode system.
+For a given or-node in an instantiatedness tree,
+the set of function symbols which may occur at that location in a term
+is specified by which of the children of the or-node are bound.
+This set does not necessarily include every function symbol
+in the corresponding type,
+since some and-nodes may be free.
+A variable's instantiatedness tree therefore describes (for first order terms)
+a subset of the values of the variable's type.
+
+To make subtypes easier to use,
+Mercury allows users to declare subtypes such as
+
+ at example
+:- subtype non_empty_list(T) < list(T) ---> [T | list(T)].
+ at end example
+
+This subtype describes a list whose top function symbol is @code{[|]/2},
+whose first argument has a subtype given by the parameter @code{T},
+and whose second argument is any list of @code{T}'s.
+If this subtype is used in an argument of a predicate or function
+declaration, the subtype information is propagated into the
+corresponding modes in a way we describe below.
+
+ at menu
+* Subtype declarations::
+* Semantics of Mercury programs with subtypes::
+* An example using subtypes::
+ at c * Abstract subtypes::
+* Higher-order subtypes::
+ at end menu
+
+ at node Subtype declarations
+ at section Subtype declarations
+
+Subtype declarations have the form
+
+ at example
+:- subtype @var{Sub} < @var{Super} ---> @var{Body}.
+ at end example
+
+ at noindent
+ at var{Sub} is a non-variable type
+whose arguments (if any) are distinct variables.
+ at var{Super} is a type or subtype.
+ at var{Body} is a sequence of constructors separated by semi-colons,
+whose arguments are types or subtypes.
+Types or subtypes in the body of the definition may be inst qualified
+using the @code{::} operator,
+provided that the specified inst is compatible with @code{ground}.
+
+Any type variable appearing in @var{Super} must also appear in @var{Sub}.
+Any type variable appearing in @var{Body} must also appear in @var{Super},
+except that anonymous variables (i.e., variables beginning with @code{_})
+are allowed to appear in the body.
+
+Circularities are not allowed, so it is an error if some part of @var{Super}
+is ultimately defined to be a subtype of @var{Sub}.
+
+The values described by a subtype are meant (for first order types)
+to be a subset of those described by the supertype.  Hence for each
+function symbol in the body of the subtype declaration there must be a
+matching function symbol in the definition of the supertype, and each
+argument of the former should in turn be a subtype of
+the corresponding argument of the latter.
+
+The only subtype of a universally quantified type variable
+is the variable itself,
+so where the supertype uses such a variable the subtype must use the
+same variable in the corresponding location.
+Similarly, where the supertype uses an existentially quantified type variable
+the subtype must use an anonymous type variable in the corresponding location.
+Existential quantifiers and class constraints are always implicitly
+determined from the supertype, and should not be explicitly given.
+
+ at node Semantics of Mercury programs with subtypes
+ at section Semantics of Mercury programs with subtypes
+
+For every subtype we define a type, the @dfn{base type},
+which is the type used internally to represent values of the subtype,
+and an inst, the @dfn{base inst},
+which encompasses the subtype information about values of the subtype.
+
+The base type is determined by treating each subtype declaration
+
+ at example
+:- subtype @var{Sub} < @var{Super} ---> @var{Body}.
+ at end example
+
+ at noindent
+as if it is the equivalence type declaration
+
+ at example
+:- type @var{Sub} == @var{Super}.
+ at end example
+
+ at noindent
+and expanding out all of these equivalences in the usual way.
+
+The base inst is defined on the type tree of the base type.
+Consider the subtype at the root or-node of the type tree:
+
+ at itemize @bullet
+ at item
+If it is an anonymous variable (which is allowed in subtype definitions),
+the inst is ground.  That is, every node in the type tree maps to bound.
+
+ at item
+If it is inst qualified (which is allowed in subtype definitions),
+then the subtype information on the left hand side
+is propagated into the inst on the right hand side
+to form the base inst of the qualified type.
+The propagation of a subtype into an inst is described below.
+
+ at item
+If it is defined by a subtype declaration or a discriminated union type
+(modulo equivalence) then the root or-node maps to bound,
+and for each function symbol in the type or subtype definition
+the corresponding and-node maps to bound,
+and the children of this and-node are
+the base insts of the arguments of the function symbol.
+All other and-nodes (if any) map to free.
+
+ at item
+If it is defined by an abstract type ...
+
+XXX I suppose we will have to treat this as ground, which will ignore
+any subtype information in arguments of the type.
+
+ at item
+If it is a pred or func type, the base inst is just the subtype itself.
+This is not a valid Mercury inst, so attempting to use it directly
+will result in an error.
+However, if a base inst like this is propagated into another inst,
+as described below, then a valid inst may result
+and using this resulting inst would therefore not be a problem.
+
+ at item
+If it is a tuple type @code{@{@}/N}, it is treated as
+a discriminated union type with the single function symbol @code{@{@}/N}.
+
+ at item
+Other built-in types map to ground.
+
+ at end itemize
+
+A subtype constructor can be used in some, but not all,
+of the places where it is legal to use a type constructor.
+Subtypes are allowed in predicate and function declarations,
+in method declarations in the ``where'' part of typeclass declarations,
+and in the bodies and supertypes of subtype declarations.
+Subtypes are not allowed anywhere else;
+in particular, they are not allowed to be used in the definition
+of ordinary types (i.e., those defined with a @code{type} declaration),
+or on the right hand side of a @code{with_type} qualifier in a goal.
+
+Using a subtype in a predicate or function declaration has two effects,
+one on the type of the argument and one on each mode of the argument.
+In the type, an occurrence of a subtype constructor
+is replaced by the base type,
+and each argument of the constructor is substituted into the base type
+in the appropriate places.
+In each mode, the base inst is propagated into the initial and final insts,
+so that the new mode contains the information from both the existing mode
+as well as the subtype.
+The propagation of a base inst into an existing inst can be described by
+showing the effect on each node of the type tree.
+
+For or-nodes, the new inst is usually the same as the existing inst.
+There are two exceptions.
+First, if the existing inst is a (constrained) variable
+and the subtype is a non-variable,
+then the base inst is propagated into the variable's constraint.
+Second, if the existing inst is a higher-order pred or func inst
+and the subtype is a higher-order pred or func type,
+then the base inst for each argument (and possibly return value)
+is propagated into the argument's mode in the existing inst,
+and the existing determinism remains unchanged.
+See @ref{Higher-order subtypes} for further details.
+
+For and-nodes, the new inst is given by the following table:
+
+ at example
+        base inst         existing inst       new inst
+
+        free              free                free
+        bound             free                free
+        free              bound               free
+        bound             bound               bound
+ at end example
+
+ at node An example using subtypes
+ at section An example using subtypes
+
+ at c XXX
+
+ at c @node Abstract subtypes
+ at c @section Abstract subtypes
+ at c
+ at c We don't support abstract subtypes yet.  They won't be supported
+ at c unless/until abstract insts are supported.
+ at c
+
+ at node Higher-order subtypes
+ at section Higher-order subtypes
+
+As stated earlier, the subtype relation on first-order types
+is the same as the subset relation.
+For higher-order types, the subtype relation is slightly more complex.
+We extend the relation to higher-order types with the following two cases.
+
+ at itemize @bullet
+ at item
+A type @code{func(@var{S1}, @dots{}, @var{SN}) = @var{S}}
+is a subtype of @code{func(@var{T1}, @dots{}, @var{TN}) = @var{T}}
+iff @var{S} is a subtype of @var{T},
+and for each argument @var{i}, @var{Si} is a subtype of @var{Ti}.
+
+ at item
+A type @code{pred(@var{S1}, @dots{}, @var{SN})}
+is a subtype of @code{pred(@var{T1}, @dots{}, @var{TN})}
+iff for each argument @var{i}, @var{Si} is a subtype of @var{Ti}.
+
+ at end itemize
+
+ at noindent
+In other words the subtype relation on predicate and function types
+is @emph{covariant} in the argument values
+and (for function types) the return values.
+ at c This is in contrast to the usual object-oriented view of subtypes,
+ at c where inputs are contravariant and outputs are covariant.  We use
+ at c covariant subtypes because they work much better with parametric
+ at c polymorphism.
+Mercury's mode system provides enough information for the compiler
+to statically check that the use of covariant subtypes is safe.
+
+An inst qualifier can be used in a subtype definition
+to give more information about a predicate or function type
+than just the subtype of each argument.
+For example, the qualifier may specify the determinism of a function
+or predicate, or the uniqueness of its arguments.
+If a predicate or function type @var{S} is to be a subtype of @var{T},
+then any extra information must be identical in both @var{S} and @var{T}.
+Hence
+
+ at example
+pred(S) :: pred(out) is det
+ at end example
+
+cannot be a subtype of
+
+ at example
+pred(T) :: pred(out) is multi
+ at end example
+
+even if @var{S} is a subtype of @var{T},
+because the determinisms are not identical.
+Similarly, if a pred or func type is propagated into a pred or func inst,
+the determinism, uniqueness and freeness information must match exactly;
+if they don't match it is a mode error.
+
+Here is an example using higher-order subtypes ...
+
+ at c XXX
 
 @node Semantics
 @chapter Semantics
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
Previous message: [m-rev.] diff: fix compilation error with --intermodule-optimization
Next message: [m-rev.] for discussion: subtypes documentation
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list