[m-rev.] Types chapter of book for review

Ralph Becket rafe at cs.mu.OZ.AU
Tue Mar 25 17:00:17 AEDT 2003


I'm printing out several copies of the types chapter and will leave them
on the filing cabinet next to the door in the Mercury office.  Please
feel free to take one - all feedback gratefully recieved.

- Ralph



books/tutorial/types.tex:

% vim: ft=tex ff=unix ts=4 sw=4 et tw=76

\chapter{Types}

Mercury is both strongly and statically typed.  Strong typing ensures
that a program does not inadvertently compare apples with oranges, while
the purpose of static typing is to inform the programmer \emph{at
compile time} if a given program may attempt such a thing.  Types also
exist to provide a systematic way to structure data.

\XXX{Are the following paragraphs too pejorative?  Relevant this
early?  Needed at all?  Examples required?}

By way of comparison with other languages, C is not
strongly typed, since C allows unchecked casts between values of
different types (a \emph{cast} is a way of telling the compiler to
treat a particular value as if it had a different type.)

Python is strongly typed, but not statically type checked: every time a
Python program performs an addition, for instance, the arguments have to
be tested to verify that they are indeed numbers.

Java, on the other hand, \emph{is} both strongly and statically typed.
Unfortunately, though Java's type system is weak: \emph{checked} run time
casts are frequently necessary, meaning that many type errors will only
be spotted when a running program aborts with an exception.

Mercury does have these problems.  All type errors in
a Mercury program are identified at compile time, making it harder to
ship buggy programs.  An expressive type system makes Mercury programs more
efficient, since no type checks are necessary at run time.  The extra
information also allows the compiler to perform optimizations that would
otherwise be impossible.

There are three different sorts of type in Mercury: discriminated
unions, equivalence types and abstract types.  A \emph{discriminated
union} is a collection of values.  An
\emph{equivalence type} is simply a different name for another type.  An
\emph{abstract type} is one whose definition is hidden from its users.

Because Mercury's type system is rich there is a danger of confusing the
reader with too much, too soon.  Explanation of the more esoteric aspects of
the type system, namely existentially quantified types and type classes, is
deferred to chapter \XXX{}.

\section{The Primitive Types}

Mercury's built-in types are @int@, @float@, @char@, @string@, tuples and
the predicate and function types.

\subsection{Integers}

The @int@ type represents the integers -- @123@, @-9@, @42@ and so forth -- that
will fit into a machine word on the target computer (the range is from
about $-2$ billion to $+2$ billion on a 32-bit machine.)

All the core operations one can perform on @int at s, including the basic
arithmetic functions, are defined in the @int@ module in the Mercury
standard library.

Like most languages, Mercury has syntax for hexadecimal, octal, and binary
numbers, as well as for the integer codes for characters.  The interested
reader can find the details in the Mercury Reference Manual \XXX{}.

\subsection{Floating Point Numbers}

Floating point values such as @2.718@ and @3.0e8@ ($3.0\times10^8$) are
encoded by the @float@ type.  (The decimal point must always
appear in a floating point number and be preceded and followed by at
least one digit.)

The core @float@ operations are defined in the @float@ module in the
Mercury standard library.  Mathematical constants such as $\pi$ and $e$, and
things like trignometric functions, are defined in the @math@ module in the
Mercury standard library.

\subsection{Characters}

The @char@ type represents single characters.  Unfortunately, due to the
Mercury's heritage, there is no special syntax for single character
values.  Some characters, @a@, @b@, @c@, @1@, @2@, @3@ and so forth,
need no special quoting.
For others (space, tab and newline, for instance), one has to use
single quotes and write @' '@, @'\t'@ and
@'\n'@ -- single quote and backslash are @'\''@ and
@'\\'@ respectively.
Finally, some characters need to be enclosed in parentheses since
otherwise they might be confused with infix operators: @(+)@ and
@(*)@, for example.
The safest method is to use both parentheses \emph{and} quoting, as in
@('/')@.  We apologise for the inconvenience.

The core @char@ operations are defined in the @char@ module in the
Mercury standard libarary.

\subsection{Strings}

Mercury strings are immutable sequences of characters.  Examples include
@"bodacious"@ and @"new\nline"@.  To embed certain characters in a
string -- double quotes, backslashes, tabs and newlines
et cetera -- one has to \emph{escape} them with a preceding backslash:
@\"@, @\\@, @\t@ and @\n at .

If a string contains an explicit newline, as in
\begin{myverbatim}
    "there's a newline in here
there it is"
\end{myverbatim}
then the newline character is actually part of the string, just as
if we had written
\begin{myverbatim}
    "there's a newline in here\nthere it is"
\end{myverbatim}
However, if the \emph{last} character on a line is a non-escaped backslash
then the trailing newline is not taken to be part of the string, hence
\begin{myverbatim}
    "I see \
no newline"
\end{myverbatim}
is the same as
\begin{myverbatim}
    "I see no newline"
\end{myverbatim}

\XXX{I don't really want to mention the double-double quotes in a string
thing.}

It is even possible to include characters in a string by giving the
appropriate character code, although the reader is referred to the Mercury
Reference Manual \XXX{} for details.

The core @string@ operations are defined in the @string@ module in the
Mercury standard library.

\subsection{Tuples}

Tuples are the only primitive compound type in Mercury.  A tuple is a
vector of values between braces:
\begin{myverbatim}
    {"the base of natural logarithms", 'e', 2.718}
\end{myverbatim}
Tuples are useful for parcelling up data in situations where it would be
annoying to have to create a special-purpose type to do the job.
Tuples can have any number of arguments.

There is no tuple-specific library module: one can construct and
deconstruct them, and that's it.

\subsection{Function and Predicate Types}

Functions and predicates are first class citizens in the Mercury type
system: they can be passed as arguments, returned as results
and stored in other data structures.  This aspect of Mercury is
discussed in detail in chapter \XXX{} on higher order programming.

Function and predicate types look just like the corresponding @func@ and
@pred@ declarations with the name is omitted.
The @func@ declaration for @math.sqrt/1@, for example, is given as
\begin{myverbatim}
:- func sqrt(float) = float.
\end{myverbatim}
The type of @math.sqrt/1@ is therefore @func(float) = float at .

The @pred@ declaration for @set.contains/2@, which tests whether a
particular value is a member of a set, is declared as
\begin{myverbatim}
:- pred contains(set(T), T).
\end{myverbatim}
so @set.contains/2@ has the type @pred(set(T), T)@.
The @T@ is a \emph{type parameter} (about which we will say more
in a little while) meaning we can substitute any type at all for @T@ and
the predicate will still work.

Although we haven't yet covered higher order programming (see chapter
\XXX{}), we should still be able to understand higher order type
signatures.  @string.words/2@, for instance, has the type
\begin{myverbatim}
    func(pred(char), string) = list(string)
\end{myverbatim}
indicating that its first argument is a predicate over characters.
(Note that the \emph{type} of a function or predicate says nothing about
its \emph{mode} -- which arguments are inputs and its determinism and so
forth.  Modes are the subject of chapter \XXX{}.)

\section{Discriminated Union Types}

One can think of a discriminated union as being a set of values, each of
which is distinguishable from the others.  At their core, virtually all
Mercury types are distributed unions.
(Equivalence types are simply a means of giving convenient names to
complicated types and abstract types are used to hide the
definition of a type.)
\XXX{Should I qualify this by saying that types implemented in foreign
code aren't strictly DUs?}

We can define a new type to represent the suits in a deck of playing
cards like this:
\begin{myverbatim}
:- type suit ---> spades ; hearts ; diamonds ; clubs.
\end{myverbatim}
The new type is called @suit/0@ and it has four 
\emph{data constructors}, or possible values: @spades@, @hearts@,
@diamonds@ and @clubs at .  The goal
\begin{myverbatim}
    X = spades
\end{myverbatim}
will, assuming @X@ is uninstantiated, bind @X@ to the value @spades@ which
is of type @suit/0 at .

Data constructors can take arguments, as illustrated by the following type
for binary trees of @int at s:
\begin{myverbatim}
:- type int_tree ---> empty ; branch(int, int_tree, int_tree).
\end{myverbatim}
The code for insertion and the membership test can be written like
this:
\begin{myverbatim}
:- func insert(int, int_tree) = int_tree.

insert(X, empty          ) = branch(X, empty, empty).

insert(X, branch(Y, L, R)) =
    (      if X =< Y then branch(Y, insert(X, L), R)
      else /* X  > Y */   branch(Y, L, insert(X, R))
    ).

:- pred int_tree `contains` int.
:- mode in       `contains` in is semidet.

branch(X, L, R) `contains` Y :-
    (   Y = X
    ;   Y < X,  L `contains` Y
    ;   Y > X,  R `contains` Y
    ).
\end{myverbatim}
(Observe the C-style @/*@ comment @*/@ in the definition of @insert/2 at .)

As we've seen before, a data constructor in a program denotes either a
construction or a deconstruction -- exactly which is determined by the
modes and the context.  In @insert/2@, occurrences of @branch/3@ and
@empty/0@ in the head are deconstructions, while occurrences in the body are
constructions.

The definition for @contains/2@ uses a plain disjunction, not a switch:
it reads ``a @branch(X, L, R)@ contains @Y@ \emph{iff} @Y = X@
\emph{or} @Y < X@ and @L@ contains @Y@ \emph{or} @Y > X@ and @R@
contains @Y at .''
If any arm of the disjunction succeeds then the predicate as a
whole succeeds.  A clause for the @empty@ case is not necessary since it
would always fail.

It is quite common for a type to have a single constructor of the same
name:
\begin{myverbatim}
:- type date ---> date(int, int, int).  % Year, month, day.
\end{myverbatim}
Mercury will never get the type and the data constructor confused since
the type name can only appear in type declarations and the constructor
name can only appear in clauses.

\subsection{Data Constructors with Named Fields}

Like most languages, Mercury supports data structures with named fields.
Consider a type used to keep track of the tally of votes:
\begin{myverbatim}
:- type tally ---> tally(ayes :: int, nayes :: int).
\end{myverbatim}
naming the fields of the @tally/2@ data constructor @ayes/0@ and
@nayes/0@ respectively.  We can still construct and deconstruct @tally/2@
values as if it had been defined without named fields.  However, having
named fields also allows us to refer to them \emph{without} the need for an
explicit deconstruction.

\subsubsection{Accessing Fields}

If we have @X = tally(12, 7)@ then we can refer to the values of the fields
of the value bound to @X@ using @X ^ ayes@ and @X ^ nayes@:
\begin{myverbatim}
    X ^ ayes  = 12,
    X ^ nayes =  7
\end{myverbatim}
The expression @X ^ ayes@ is just syntactic sugar for @ayes(X)@.  The
function @ayes/1@ is automatically constructed from the field name in the
type definition and will be defined as
\begin{myverbatim}
ayes(tally(A, _)) = A.
\end{myverbatim}

\subsubsection{Updating Fields}

We can ``update'' a particular field like this:
\begin{myverbatim}
    Y = ( X ^ ayes := 13 )
\end{myverbatim}
giving @Y = tally(13, 7)@.  This, of course, does not affect the value
of @X@: we still have @X = tally(12, 7)@.  
@X ^ ayes := 13@ should be read as ``the value of @X@ with @13@
substituted in the @ayes@ field.''

The expression @X ^ ayes := 13@ is just syntactic sugar for
@'ayes :='(X, 13)@.  The function @'ayes :='/2@ is also automatically
constructed from the type definition.  (Anything at all can be used as a
functor name if it is enclosed in @'@single quotes@'@, which is why the
extra characters in the name do not lead to trouble.)  The definition of
the automatically created @'ayes :='/2@ would be
\begin{myverbatim}
'ayes :='(tally(_, B), A) = tally(A, B).
\end{myverbatim}

Updating several fields in one go is straightforward:
\begin{myverbatim}
    Z = (( Y ^ ayes  := Y ^ ayes  + 2 )
             ^ nayes := Y ^ nayes + 3 )
\end{myverbatim}
giving @Z = tally(15, 10)@.

\subsubsection{Fields Within Fields}

Consider the following:
\begin{myverbatim}
:- type car
    --->    car(
                make            :: string,
                registration    :: string,
                owner           :: person
            ).

:- type person
    --->    person(
                name            :: string,
                date_of_birth   :: date,
                address         :: string
            ).
\end{myverbatim}
and the bindings
\begin{myverbatim}
    Fred = person(
               "Fred Bloggs",
               date(1965, 7, 4),
               "11 Strangetrousers Terrace"
           ),
    Car  = car( 
               "Ford Escort",
               "PQR 123",
               Fred
           )
\end{myverbatim}
We can obtain the @name@ of the @owner@ of @Car@ with
\begin{myverbatim}
    Car ^ owner ^ name = "Fred Bloggs"
\end{myverbatim}
@Car ^ owner ^ name@ is the same as writing @(Car ^ owner) ^ name at .

If Fred moves house and we need to update his address then we can write
\begin{myverbatim}
    Car1 = (Car ^ owner ^ address := "42 Strawberry Fields")
\end{myverbatim}
which is the same as if we'd written
\begin{myverbatim}
    Owner  = Car ^ owner,
    Owner1 = (Owner ^ address := "42 Strawberry Fields"),
    Car1   = (Car   ^ owner   := Owner1                )
\end{myverbatim}
So the parentheses matter.  If we just wanted an updated version of the
@owner@ field of @Car@, rather than an updated version of @Car@ itself,
we'd write
\begin{myverbatim}
    Owner1 = ((Car ^ owner) ^ address := "42 Strawberry Fields")
\end{myverbatim}
which this time is the same as having written
\begin{myverbatim}
    Owner  = Car ^ owner,
    Owner1 = (Owner ^ address := "42 Strawberry Fields")
\end{myverbatim}

\subsubsection{Mixing Named and Unnamed Fields}

It is not necessary to name every field in a data constructor:
\begin{myverbatim}
:- type person
    --->    person(
                name            :: string,
                date_of_birth   :: date,
                address         :: string,
                children        :: list(child),
                /* secrets */      list(secret)
            ).
\end{myverbatim}
The @list(secret)@ field is not named and, unlike the named fields, can
therefore only be accessed by deconstruction and ``updated'' by
construction.

\subsubsection{User-Defined Field Access Functions}

It can be useful to explicitly define functions to be used for field access,
either to provide ``virtual'' fields that are computed rather than stored
(C$\sharp$ calls such things \emph{properties}), or to provide sanity checks
on accesses and updates, or to define ``indexed fields''.

Say we want to add a ``virtual'' field to the @person/0@ type, computing the
number of children a particular individual has.
(Virtual field access functions do not represent real fields, of course, and
so don't appear as field names in the type definition.)
To do this we simply write
\begin{verbatim}
:- func person ^ num_children = int.

Person ^ num_children = list.length(Person ^ children).
\end{verbatim}

We can add a sanity check to the @children@ field's update function
ensuring that a person never \emph{loses} children when the list is updated:
(here we are just providing our own definition for the @children/2@ field
access function; Mercury will not construct a definition for a field access
or update function if we have already provided one)
\begin{verbatim}
:- func (person ^ children := list(child)) = person.

(person(A, B, C, D0, E) ^ children := D) = Person :-
    ( if all [Child] (
            list.member(Child, D0) => list.member(Child, D)
      ) 
      then Person = person(A, B, C, D, E)
      else exception.throw("you can't unhave children!")
    ).
\end{verbatim}
(The condition of the conditional goal reads ``for all @Child@, \emph{if}
@Child@ is a member of @D0@ \emph{then} @Child@ is a member of @D at .'')

We can also add ``virtual, indexed'' fields.  Say, for example, we wish to
access a @person@'s @children@ by number.  Then we might write
\begin{verbatim}
:- func person ^ child(int) = child.

Person ^ child(N) = list.index0_det(Person ^ children, N).
\end{verbatim}
where @list.index0_det(Children, N)@ returns the @N at th member of @Children@,
(counting the first as zero, in traditional computer science style) or
throws an exception if @N@ is out of range.

(The expression @Person ^ child(N)@ is syntactic sugar for
@child(N, Person)@.)

\XXX{Should I mention @map.elem/2@ etc. here as examples?}

Virtul indexed fields may have any number of arguments.  For example, the
two-dimensional array module, @array2d@ in the Mercury standard library,
defines @Array2D ^ elem(Row, Column)@ to access elements of the array.

\subsubsection{Restrictions on Field Names}

At the time of writing, Mercury requires that no field name be defined twice
in the same module in order to avoid problems with ambiguity.  The following
are therefore all errors.

This example is illegal because it uses the same field name twice in the
same data constructor:
\begin{verbatim}
:- type foo ---> foo(a :: int, a :: int).
\end{verbatim}
The next example uses the same field name in two different data
constructors and is therefore illegal, despite the fact that they belong to
the same type:
\begin{verbatim}
:- type foo ---> foo1(a :: int)
            ;    foo2(a :: int).
\end{verbatim}
Finally, one cannot use the same field name twice, even in different types:
\begin{verbatim}
:- type foo ---> foo(a :: int).
:- type bar ---> bar(a :: int).
\end{verbatim}
There is no problem with different \emph{modules} using the same field name
since the module qualified forms of the name will be different.

(The names @foo@, @bar@, @baz@ and @quux@ are conventionally used by
computer scientists when they can't think of anything better to use in an
example.)

\subsubsection{Semideterministic Field Access}

Consider the following type:
\begin{myverbatim}
:- type data ---> empty ; datum(payload :: int, rest :: data).
\end{myverbatim}
Given an @X@ of type @data/0@, the expression @X ^ payload@, may
\emph{fail}.  The field access @X ^ payload@ can only succeed if @X@ is
bound to a @datum/2@ value -- if @X@ is bound to @empty@ then there is no
@payload@ field.

Field accesses for field names that do not occur in all the data
constructors of a type are therefore \emph{semideterministic}.  Be aware of
his point, as it can from time to time lead to confusing error messages from
the compiler.

\XXX{Do I need to say anything about using pattern matching to resolve the
issue?}

\subsection{Parameterised Discriminated Union Types}

So far we've only described ``concrete'' types.  It turns out to be very
useful to generalise over whole families of types.  The binary
tree \emph{structure} used in @int_tree/0@, for instance, would also work
fine for strings, floating
point numbers and pretty much anything else.  Rather than defining
nearly identical types (\eg @string_tree/0@ and @float_tree/0@) every time
we need them, it is far better to \emph{parameterise} our definition, thus:
\begin{verbatim}
:- type tree(T) ---> empty ; branch(T, tree(T), tree(T)).
\end{verbatim}
A value of type @tree(T)@, then, is either @empty@ or a @branch(X, Y, Z)@
where @X@ is of type @T@ and @Y@ and @Z@ are of type @tree(T)@.

Since we are now working over arbitrary types rather than plain @int at s, we
need to use \emph{generic} comparison predicates for the insert and look-up
operations:
\begin{verbatim}
:- func insert(T, tree(T)) = tree(T).

insert(X, empty          ) = branch(X, empty, empty).

insert(X, branch(Y, L, R)) =
    (      if X @=< Y then branch(Y, insert(X, L), R)
      else /* X @>  Y */   branch(Y, L, insert(X, R))
    ).

:- pred tree(T) `contains` T.
:- mode in      `contains` in is semidet.

branch(X, L, R) `contains` Y :-
    O = ordering(X, Y),
    (   O = (=)
    ;   O = (<),    R `contains` Y
    ;   O = (>),    L `contains` Y
    ).
\end{verbatim}
The predicates, \verb!@<!, \verb!@=<!, \verb!@>=!, and \verb!@>!, 
compare two terms in Mercury's \emph{standard ordering}.
It doesn't matter what that ordering is for our purposes here, although the
curious reader can find out more from the Mercury Reference Manual
\XXX{Put this documentation in the Refence Manual!}.
The function
@ordering(X, Y)@ returns @(=)@ if @X = Y@, @(<)@ if \verb!X @< Y!, and
@(>)@ if \verb!X @> Y!.

We can substitute any type we like for the parameter @T@ in @tree(T)@:
@tree(int)@ is a binary tree of integers, @tree(string)@ is a binary tree of
strings, and so on.  The implementations for @insert/2@ and @contains/2@
will work regardless of which type is substituted for @T@, something we
could not have managed if we'd stuck to using @int_tree/0@, @string_tree/0@
and so forth.
Moreover, Mercury will tell us at compile time if we ever attempt to insert
a @string@, say, into a @tree(int)@ (compare this with Java, say, where such
a problem would only become apparent when the running program threw an
exception.)

A type may have more than one parameter.  We might want to
generalise our @tree/1@ type to store arbitrary key-value mappings to form a
dictionary structure:
\begin{verbatim}
:- type dict(K, V)
    --->    empty
    ;       branch(K, V, dict(K, V), dict(K, V)).
\end{verbatim}
The insert and lookup operations are similar to their @tree/1@ counterparts:
\begin{verbatim}
:- func insert(K, V, dict(K, V)) = dict(K, V).

insert(Ka, Va, empty) =
    branch(Ka, Va, empty, empty).

insert(Ka, Va, branch(Kb, Vb, L, R)) = Tree :-
    O = ordering(Ka, Kb),
    (   O = (=),    Tree = branch(Ka, Va, L, R)
    ;   O = (<),    Tree = branch(Kb, Vb, insert(Ka, Va, L), R)
    ;   O = (>),    Tree = branch(Kb, Vb, L, insert(Ka, Va, R))
    ).

:- pred lookup(dict(K, V), K,  V).
:- mode lookup(in,         in, out) is semidet.

lookup(branch(Ka, Va, L, R), Kb, Vb) :-
    O = ordering(Ka, Kb),
    (   O = (=),    Vb = Va
    ;   O = (<),    lookup(R, Kb, Vb)
    ;   O = (>),    lookup(L, Kb, Vb)
    ).
\end{verbatim}
It is a requirement that any type variables appearing in a data constructor
in a type definition must be parameters of the type.  The following
violates the rule and would be reported as an error:
\begin{verbatim}
:- type awful ---> mistake(T).
\end{verbatim}
The justification is that Mercury has to be able to work out what type @X@
has in @mistake(X)@ in every deconstruction.
This definition,
were it legal, says that @X@ could be anything at all
-- there is, in general, simply no way for the compiler to work out the
actual type ahead of time.
(Chapter \XXX{} explains existentially quantified types which \emph{can} be
used to construct heterogeneous collections.)

\XXX{Should I mention the shadow-types ``pattern''?}

\section{Equivalence Types}

Imagine we are writing a personnel database relating people to lists of
various attributes and that we have already defined the types @person/0@ and
@attribute/0 at .
The @dict/2@ type defined above is a likely candidate for the job.
Specifically, we'd want to use
\begin{verbatim}
    dict(person, list(attribute))
\end{verbatim}
This is something of a mouthful.  Typing this everywhere we needed it would be
tiring and, worse, a maintenance problem if we later decide that, say,
@set(attribute)@ is a better idea than @list(attribute)@.

The Mercury solution to the problem is to use an equivalence type:
\begin{verbatim}
:- type persattrs == dict(person, list(attribute)).
\end{verbatim}
and hereafter we can write @persattrs@ as shorthand for the type on the
right hand side of the @==@ symbol.  It makes no difference to Mercury
whether you use @persattrs@ or the full type name.

A common idiom is to use equivalence types to give use-specific names
to primitive types.  For instance
\begin{verbatim}
:- type name         == string. % Must be non-empty.
:- type age          == int.    % Must be non-negative.
:- type num_children == int.    % Must be non-negative.
\end{verbatim}
@type@, @pred@ and @func@ declarations that use these names
appropriately become much more readable.

Equivalence types may also be parameterised.  For example, the following
describes a suitable representation for graph structures with labelled
vertices:
\begin{verbatim}
:- type graph(T) == map(vertex(T), list(vertex(T))).
\end{verbatim}
(We assume the type @vertex/1@ has been defined elsewhere.)
\XXX{Do we need more explanation of this type?  Or a simpler example?}

Note that, as with discriminated union types, any type variable appearing on
the right hand side of the @==@ must also appear as a parameter on the left
hand side.  Thus the following is in error:
\begin{verbatim}
:- type woeful == list(T).
\end{verbatim}

\section{Abstract Types}

It's not always a good to reveal the \emph{definition} of a type to its
users.  For one thing, doing so means one cannot later change the
definition of the type without the risk of breaking other programs that use
it.  It is a good engineering principle to hide 
implementation detail from users wherever possible.

The @dict/2@ dictionary data type we defined earlier is a good candidate
for a library module that we can re-use over and over again.  The
implementation details of @dict/2@ are not important to the users of the
module: they will only be interested in the functionality on offer.
Furthermore, we want the freedom to change the definition of @dict/2@ at
some later point if we find a more efficient representation -- and we want
to be able to do so without requiring existing users to make changes to
their code.

The solution to the problem is to make @dict/2@ an \emph{abstract type}.
\begin{myverbatim}
:- module dict.
:- interface.


:- type dict(K, V).     % Abstract type.

:- func new_dict = dict(K, V).

:- func insert(K, V, dict(K, V)) = dict(K, V).

:- pred lookup(dict(K, V), K,  V  ).
:- mode lookup(in,         in, out) is semidet.


:- implementation.


:- type dict(K, V)      % Definition of abstract type.
    --->    empty
    ;       branch(K, V, dict(K, V), dict(K, V)).

new_dict = empty.

...
\end{myverbatim}
The line @:- type dict(K, V).@ declares @dict/2@ to be an abstract type:
an abstract type is just one whose type declaration omits the definition.
The full type definition instead appears in the implementation section.

Since the users of an abstract type have no idea how it is defined, they can
neither create values of that type directly via construction nor examine
them with pattern matching (deconstruction).  The only way to manipulate
such values is to use the operations exported by the module defining the
abstract type.

This module includes the function @new_dict/0@ so that users can create new,
empty @dict/2@ values in the first place (functions of arity zero are just
fixed values.)
Other than that, users can only extend @dict/2@ values with @insert/3@ and
examine them with @lookup/3 at .
The following is therefore wrong:
\begin{myverbatim}
:- import_module dict.

this_wont_compile :-
    Dict0 = empty,
    Dict1 = branch("Gateau", 'x', Dict0, Dict0),
    ...
\end{myverbatim}
because the implementation section of the @dict@ module is the \emph{only}
place that knows what the @dict/2@ data constructors are.  This, however, 
is fine:
\begin{myverbatim}
:- import_module dict.

this_will_compile :-
    Dict0 = new_dict,
    Dict1 = dict.insert("Gateau", 'x', Dict0),
    ...
\end{myverbatim}

\section{Explicit Type Qualification}

The types of the \emph{head variables} in a clause (\ie those appearing in
the clause head) are given in the corresponding @pred@ or @func@
declaration.  The compiler deduces the types of the \emph{local} variables
via a process known as \emph{type inference}.  Very occasionally, type
inference alone is not sufficient to unambiguously decide the type of a
local variable.

\subsection{Name Ambiguity}

Sometimes these problems arise due to overloading.  Say we have a module
@a@ exporting a predicate called @num/1@,
\begin{myverbatim}
:- module a.
:- interface.

:- pred num(int).
:- mode num(out) is multi.

:- implementation.

num(1).
num(2).
num(3).
\end{myverbatim}
and a module @b@ also exporting a predicate called @num/1@,
\begin{myverbatim}
:- module b.
:- interface.

:- pred num(float).
:- mode num(out) is multi.

:- implementation.

num(1.414).
num(2.718).
num(3.141).
\end{myverbatim}
and a module that imports both @a@ and @b@, defining a predicate @p/0@
testing whether there are solutions for @num/1@ whose sum is also a
solution:
\begin{myverbatim}
:- import int, float, a, b.

:- pred p.
:- mode p is semidet.

p :-
    num(X),
    num(Y),
    num(X + Y).
\end{myverbatim}
Since the code here does not module qualify @num/1@, the compiler cannot
decide if the programmer means @a.num/1@ and @int.(+)/2@ or @b.num/1@ and
@float.(+)/2 at .

There are two ways to solve the problem: either one can explicitly module
qualify the names (not always convenient) or one can add explicit type
qualifiers to local variables and/or expressions to provide the compiler
with sufficient information to solve the problem.  Explicit type
qualification connects an expression to a type via the @with_type@ keyword.
The type ambiguity in @p/0@ above can be fixed with
\begin{myverbatim}
p :-
    num(X `with_type` int),
    num(Y `with_type` int),
    num((X + Y) `with_type` int).
\end{myverbatim}
although that would be overkill.  In fact only one type qualification is
necessary, for instance
\begin{myverbatim}
p :-
    num(X),
    num(Y),
    num((X + Y) `with_type` int).
\end{myverbatim}
The compiler will reason that since the result of @X + Y@ must be an
@int@, then what is meant here is @int:(+)/2 at .  This means that @X@ and @Y@
must also have type @int@ (since the type of @int:(+)/2@ is
@func(int, int) = int@) and therefore that the calls to @num/1@ must refer
to @a.num/1@ (and not @b.num/1@ which has type @pred(float)@.)

\subsection{Type Ambiguity}

Here is an example of type ambiguity that arises for different reasons:
\begin{myverbatim}
main(!IO) :-
    io.read(Result, !IO),
    (
        Result = ok(X),
        io.print(X, !IO)
    ;
        Result = eof
    ;
        Result = error(_, _),
        throw(Result)
    ).
\end{myverbatim}
The predicates @io.read/3@ and @io.print/3@ read and write values of
virtually \emph{any} type.  However, the compiler needs to be able to work
out the type of @X@ before it can compile @main/2@ (chapter \XXX{} on
run-time type information explains why this is the case.)  Our program,
however, gives no clue as to what type @X@ is supposed to have and the
compiler will issue an error message of the form
\begin{myverbatim}
oops.m:017: In predicate `oops.main/2':
oops.m:017:   warning: unresolved polymorphism.
oops.m:017:   The variables with unbound types were:
oops.m:017:       X :: T
oops.m:017:       Result :: (io.read_result(T))
\end{myverbatim}
\XXX{I think the compiler should use @`with\_type`@ rather than @::@.}

Let's say we intended @X@ to be an @int at .
To resolve this problem we need to add an explicit type qualifier to @X@ --
it doesn't matter where, since @X@ can never change its type -- and it's
usually a good idea to pick the first occurrence:
\begin{myverbatim}
main(!IO) :-
    io.read(Result, !IO),
    (
        Result = ok(X `with_type` int),
        io.print(X, !IO)
    ;
        ...
    ).
\end{myverbatim}

\section{Types with User Defined Equality}

\XXX{Should almost certainly appear later in the book.}

\section{Types with User Defined Comparison}

\XXX{Should almost certainly appear later in the book.}

\section{Conclusion}

\XXX{Some kind of summary.}
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list