# [mercury-users] literature polymorphic recursive calls

Fergus Henderson fjh at cs.mu.OZ.AU
Fri Jul 22 20:46:58 AEST 2005

On 20-Jun-2005, Tom Schrijvers <Tom.Schrijvers at cs.kuleuven.ac.be> wrote:
> > > I am looking for literature about a particular aspect of Mercury's type
> > > inference, namely the way it deals with polymorphic recursive calls.

Enclosed below is the LaTeX source (sorry) of the "Types" chapter from
my draft PhD thesis.  The section on "Complexity and decidability of
type inference" discusses that issue.

Cheers,
Fergus.

----------

\section{The Mercury type system}

The benefits of static type checking have been widely recognized.
We considered static type checking to be an essential means of
achieving our goals of efficiency and reliability.  So the important
question regarding types in Mercury was not ``should we use a static
type system?'' but rather ``what sort of static type system should we
use?''.

We wanted something that was reasonably simple and easy to implement,
but that was as expressive as possible.  We also wanted the type system
to provide information to the compiler that the compiler could
use to optimize data representations.

In the end, we settled on a fairly simple type system
based on a polymorphic many-sorted higher-order logic.
Type systems of this nature have been the subject of much research
(see e.g. \cite{myc-rok,types-book}).
Mercury's type system is very similar to
the Mycroft-O'Keefe type system \cite{myc-rok},
and to the type system of G\"odel \cite{goedel}.
The major difference from G\"odel is that Mercury supports higher-order
predicates (see Chapter~\ref{chap:higher_order}).
The syntax used in the Mercury type system is the same as the syntax
that was used by the NU-Prolog \cite{nuprolog} type checkers.

The remainder of this section gives a general description of
Mercury's type system, as it was originally envisaged.
The Mercury type system has since been extended to support type classes.
However, that work goes beyond the scope of this thesis, so
in what follows, we shall describe the original Mercury
type system, which did not include support for type classes.

Mercury's type system supports parametric polymorphism:
types, predicates, and functions may be parameterized with
respect to a list of type parameters.
A \dfn{type constructor} is a name with an associated arity
that can be used to construct types.
The arity of a type constructor is the number of type
parameters that it takes.
(Type constructors can be overloaded on arity, i.e.
it is possible to define two type constructors in one
module that have the same name but different arities.)

Syntactically, a \dfn{type} is normally a term whose top-level functor
is the name of a type constructor, and whose arguments (if any) are a
list of types specifying the values of the type parameters.
(For example, the Mercury standard library module \samp{list} defines a
type constructor \samp{list} of arity one, denoted \code{list/1}, and
there is also a built-in type constructor \code{int/0}.
Hence \samp{int} and \samp{list(int)} are types.)
A type may also be a type variable (e.g. \samp{T}); in that case,
it denotes the value of a particular type parameter.

In Mercury, types fall into four categories:
basic types, discriminated union types, equivalence (synonym) types,
and higher-order types.  The basic types and higher-order types
are intrinsic features of the language, whereas discriminated union
types and equivalence types may be defined by the user.
In addition to these four categories, a type which is declared
in the interface of an imported module but which is defined only in the
implementation part of that module, not in its interface, is an
abstract type.

There are only a small number of basic types.
The basic types are either built-in,
or are provided as abstract types in the
Mercury standard library.  The built-in types are
\samp{int}, \samp{char}, \samp{float}, \samp{string};
the type system has special support for literals of these types.
Other basic types defined as abstract types in the standard
library include \samp{io\_\_state}, \samp{array(T)}, and \samp{univ}.
% XXX we will discuss these in more detail later.

Discriminated union types in Mercury are equivalent to
``algebraic types'' in Haskell, and are similar to ``variant records''
in Pascal and Ada.  They encompass the functionality of enumeration
types in other languages, and much of the functionality of record
or struct types in other languages, as special cases.

A discriminated union type definition defines a type constructor
and a set of (data) constructors for values of the type.
It is a declaration of the form

\begin{alltt}
:- type \var{Type} ---> \var{Alternatives}.
\end{alltt}

\noindent
where \var{Type} is a term whose top-level functor is the name of the
type constructor being defined, and whose arguments (if any) are a list
of distinct variables which are the type parameters of this type,
and where \var{Alternatives} is a semicolon-separated list of alternatives.
Each alternative defines a (data) constructor for the values of this type; it
consists of a term whose top-level functor is the name of the constructor,
and whose arguments (if any) are the argument types for the constructor.

Here are some examples:

\begin{verbatim}
:- type bool ---> yes ; no.
:- type list(T) ---> [] ; [T | list(T)].
:- type employee
        --->    employee(
                        string,         % name
                        department_id,  % department
                        int             % salary
                ).
\end{verbatim}

The above definitions define the type constructors \code{bool/0},
\code{list/1}, and \code{employee/0}, and hence the types \code{bool},
\code{employee}, \code{list(bool)}, \code{list(employee)},
\code{list(list(bool))}, and so on (ad infinitum).
They also define the constructors \code{yes/0}, \code{no/0},
\code{'[]'/0}, \code{'.'/2}, and \code{employee/3}, which
can be used to construct values of those types.

Equivalence (synonym) types are types which are defined to be
equivalent to some other type.

An equivalence type definition defines a type constructor
which can be used to construct types which are synonyms
of existing types.  It is a declaration of the form

\begin{alltt}
:- type \var{EquivalenceType} == \var{Type}.
\end{alltt}

\noindent
where \var{EquivalenceType} is a term whose top-level functor is the name of the
type constructor being defined, and whose arguments (if any) are a list
of distinct variables which are the type parameters of this equivalence type.

Equivalence types are similar to ``typedefs'' in C or C++,
but they're a bit more powerful, because they can be polymorphic
(C++ doesn't have ``template typedefs'').   Another difference from
C/C++ typedefs is that in Mercury an abstract type can be implemented
as an equivalence type.

% \noindent
Types can be defined in any order within a module;
Mercury allows forward references as well as (mutually) recursive types.

Mercury requires that all type variables
occurring on the right hand side of a type definition
also occur as type parameters on the left hand side.
This requirement, known as the ``transparency'' condition,
is the same as in G\"odel.
It would be possible --- and it might well be desirable ---
to relax this requirement, to support types which are locally universally
and/or existentially quantified.
Existentially quantified types have been proposed for Haskell
\cite{existential_types}, % konstantin laufer
and both existentially quantified and universally quantified
types in type definitions have been implemented in at least one
experimental system.
However, in the interests of simplicity, we decided to keep this
requirement, at least for the time being.

The types of the arguments for a predicate can be declared
using a \samp{:- pred} declaration such as the following:

\begin{verbatim}
:- pred member(T, list(T)).
\end{verbatim}

Similarly, the types of the arguments and the type of the return value
for a function can be declared using a \samp{:- func} declaration:

\begin{verbatim}
:- func atan2(float, float) = float.
\end{verbatim}

Type declarations are mandatory for predicates or functions that
are exported from a module, but are optional for predicates or
functions that are local to a module.
The compiler will automatically infer the types of all variables
and the types of all local predicates and functions for which
there is no type declaration.
As will be explained more formally in the following section,
it is a compile-time error if there is not a unique most general
assignment of such types that makes the program type-correct.

This part of the design was influenced by G\"odel, which also
allows overloading of functions and predicates across modules.
In Mercury, you can have more than one function or predicate
with the same name and arity, so long as they are defined in different
modules.  Furthermore, you can also have constructors with the same
name and arity, and (unlike G\"odel) even in the same module,
so long as they are not defined in the same type.

The Mercury compiler uses types to resolve such ambiguities wherever possible.
For each use of an overloaded name, the type inference process will
figure out which one it could be, based on the types.
If there is more than one possibility, then (but only then) the
compiler will report an ambiguity error.

I believe that one of the aims of the designers of Haskell and ML was that
users should never have to include type declarations in their programs,
and that one of the reasons for the prohibition on overloading was
to ensure this.
At the time that these languages were designed, type inference was a
relatively new thing, and the idea of \stress{never} having to declare
the types yet still getting type safety was seen as a very good thing.
However, allowing overloading makes type inference
more difficult; sometimes the compiler will not be able to infer
an unambiguous type, and in those circumstances it must report an error;
to avoid the error, the user must insert an explicit type declaration
somewhere.

In practice, however, the prohibition on overloading can have the opposite
effect to the one that I believe was intended ---
it can force programmers to include \stress{more}
type information in their programs, rather than less.
Because these languages forbid any name clashes, good programmers
will often adopt practices to avoid name clashes,
such as prefixing all constructor names with the name of the type.
In other words, they will effectively type-qualify
every occurrence of every constructor.  With a language like Mercury
that allows overloading, programmers need only insert explicit
type declarations in places where the use of a particular constructor
would otherwise be ambiguous.

Support for overloading also
allows a programmer to import two different modules that both
happen to define one or more symbols, without any need to explicitly
rename apart the clashing symbol names at the point of import.
It's better for a compiler to report name clashes like this
only when the names involved are actually used, because often
the names will not be used, and in that case there is no ambiguity.
Even if they are used, if the compiler uses types to resolve
ambiguities, then most such uses will not result in any ambiguity.
Only in rare cases will the programmer need to insert an explicit
type declaration.  This greatly reduces
the amount of effort that programmers need to expend on dealing
with name clashes.

Note that Haskell's type class system was originally seen as
an alternative to this kind of overloading.  However,
Haskell's type class system doesn't solve the problems discussed
above.  Type classes and overloading are largely orthogonal
features, and we prefer a language which has both.

Allowing overloading does make it possible to write
programs for which the compiler can easily resolve the ambiguities
using the types, but which human readers may nevertheless find
confusing.  However, we don't consider that to be a major problem.
It is of course possible to write confusing or obfuscated programs
in any language.  We prefer to give programmers the freedom to use
overloading where it is appropriate.

% \section{Dynamic Types}
%
% \xxx{Should I include a section here describing the support for dynamic types?}

\section{Formal definition of type correctness}
\input{formal.tex}

\section{Complexity and decidability of type inference}
\label{sect:decidability}

Having given a formal definition of type correctness,
it is normal practice to proceed by giving an algorithm
for type checking and/or type inference, and to prove
certain positive properties of this algorithm, such as
soundness and completeness (and hence decidability), and
perhaps some result about its asymptotic complexity.
Ideally we might even hope to give some upper bound on
the complexity of our algorithm.

Since we want to maintain a clear separation between the language
design and the language implementation, we shall defer
a detailed discussion of the algorithm that the Mercury compiler
uses for type inference and type checking until
Part~\ref{part:implementation} (Implementation).
However, at this point we should still at least try to satisfy
the reader that our design is feasible.
So in this section we will investigate a couple of issues
relating to decidability and complexity of type checking
and type inference.  In order to convince the reader of
the feasibility of our design, it turns out that we in fact
will need to divulge a couple of important aspects of our
implementation and of our experiences using it.

The basic type system that we have used is really quite similar to the
Hindley/Milner-style type systems used for ML, Haskell, etc.
By now the programming community has had plenty of experience
with real systems using those type systems, and so it is safe
to conclude that they are quite feasible in practice.

However, two relatively novel aspects of our type system might well lead
alert readers to doubt the feasibility of type checking for this system.
The first one is the presence of overloading.  One way to handle
overloading is for the type inference process to keep track of a
set of type assignments rather than just a single one.  However,
in the worst case, the size of this set of type assignments
may increase exponentially, since each new occurrence of an overloaded
symbol can double (or worse) the size of this set.
Other implementations are possible, but the same kind of issues arise.
Indeed, even if we restrict the type system by requiring a
type declaration for every function and every predicate,
so that the type checker only has to infer the types of variables,
then it still turns out that the type checking problem is NP-hard.
The proof of this is in Appendix~\ref{chap:np_complete}.
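The cost of tracking a set of type assignments can be sketched as follows. This is a toy illustration, not the Mercury compiler's actual algorithm; the candidate type names are made up.

```python
from functools import reduce

def extend(assignments, candidates):
    """Account for one more occurrence of an overloaded symbol: cross the
    current set of candidate type assignments with its candidate types."""
    return [a + [c] for a in assignments for c in candidates]

def num_assignments(n):
    """n occurrences of a two-way-overloaded symbol yield 2**n candidate
    type assignments in the worst case."""
    occurrences = [["int", "float"]] * n
    return len(reduce(extend, occurrences, [[]]))
```

Each two-way-overloaded occurrence doubles the set, so ten occurrences already give 1024 candidate assignments.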

The other problematic aspect of our type system is polymorphic recursion
\cite{polymorphic_rec1,polymorphic_rec2}.
The original type systems of ML and Haskell prohibited
polymorphic recursion; that is, if a function $f$ has type $F$,
and there is a recursive call to $f$, then the type in the
recursive call must be exactly $F$,
rather than merely being an instance of $F$.
For example, an ML definition equivalent to the following
Mercury code would not be allowed,

\begin{verbatim}
:- type t(T) ---> leaf(T) ; node(t(t(T)), t(t(T))).
:- func foo(t(T)) = int.
foo(leaf(_)) = 1.
foo(node(X, Y)) = foo(X) + foo(Y).
\end{verbatim}

\noindent
because in the recursive calls, \samp{foo}
has type \samp{func(t(t(T))) = int} rather
than \samp{func(t(T)) = int}.
In contrast, Mercury does allow this.

Haskell was later changed to allow polymorphic recursion,
but only if there was an explicit type declaration for the
function in question.  Polymorphic recursion was still
forbidden for functions whose type was to be inferred,
because of a fundamental problem:
it has been proven that in the presence of polymorphic recursion
type inference for systems like this
is only semi-decidable \xxx{cite references}.
Since Mercury allows polymorphic recursion, we believe the same
applies to Mercury (although we have not bothered to prove it,
for reasons that will become clear below).
Indeed, for simple examples such as

\begin{verbatim}
bar(bar).
\end{verbatim}

\noindent
the original implementation of type inference in the Mercury
compiler would go into a loop, with each successive iteration
of type inference inferring the type of \samp{bar} as
\samp{pred(T)}, \samp{pred(pred(T))}, \samp{pred(pred(pred(T)))}, \ldots

So, the theoretical results that we have are both negative ones:
type inference is undecidable, we believe, and even mere type checking
(inferring only the types of variables, not the types of
functions or predicates) is definitely NP-hard.
These two issues both sound like serious problems.
So what can we do?

Well, it turns out that for the most part,
we can simply ignore them!  These theoretical results are
based on worst-case scenarios, but in practice it turns
out that these kinds of worst-case scenarios simply don't
occur often enough to be a significant problem, and even
when they do occur, there are simple work-arounds.

In practice, programmers do not often write
code containing many occurrences of overloaded symbols which
cannot be easily resolved, because such code is usually just
as difficult for human readers to understand as it is for
a compiler to type-check.

The Mercury compiler's treatment of overloading
is straightforward; it does not make any particularly strenuous
efforts to avoid performance problems when type-checking such code.
If you do write code for which the compiler has trouble resolving
the overloading, it will issue a warning such as the following:

\begin{verbatim}
foo.m:123: In clause for predicate `foo/2':
foo.m:123: This may cause type-checking to be very slow.
foo.m:123: It may also make your code difficult to understand.
\end{verbatim}

\noindent
However, we have never heard any reports of anyone actually encountering
this warning in real code.

With regard to polymorphic recursion,
we simply set a limit on the number of iterations of
type inference, and if that limit is exceeded, then
the compiler will stop, printing out an error message,
together with the types that it has inferred so far,
and instructions on how to increase the iteration limit:

\begin{verbatim}
Type inference iteration limit exceeded.
This probably indicates that your program has a type error.
You should declare the types explicitly.
(The current limit is 60 iterations.  You can use the
`--type-inference-iteration-limit' option to increase the limit).
silly_example.m:004: Inferred :- pred bar((pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred (pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred (pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred (pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred (pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred (pred (pred (pred
(pred (pred (pred (pred (pred (pred (pred T1)))))))))))))))
)))))))))))))))))))))))))))))))))))))))))))))).
\end{verbatim}

The iteration limit is set high enough that the vast majority
of type-correct programs will never encounter it.
So the message does, as it states, usually indicate a type error.
Because the error message contains the types inferred so far,
the programmer can normally see at a glance what the problem is.

In theory, the error message could also sometimes occur
even for type-correct programs.  In that case, the programmer
can simply recompile with an increased iteration limit
(for any type-correct program, the number of iterations required
will be bounded).
% since the types inferred at each new iteration step
% must be an instance of the types inferred at the previous step,
% can only increase
% the size of the types inferred.
But again, in practice we have never heard any reports of
anyone actually having to do this.
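The iteration-limit safeguard can be sketched as follows. This is a toy model of the shape of the safeguard described above, not the actual compiler code: iterate a type-refinement step until either a fixpoint (the inferred types stop changing) or the iteration limit.

```python
def infer_with_limit(step, initial, limit=60):
    """Iterate `step` from `initial`, stopping at a fixpoint or at `limit`."""
    t = initial
    for _ in range(limit):
        t_next = step(t)
        if t_next == t:
            return ("converged", t)   # fixpoint: inference succeeded
        t = t_next
    return ("limit exceeded", t)      # report the types inferred so far

# For a predicate like bar(bar), each iteration wraps another pred(...)
# around the previous guess, so the limit is eventually hit:
result = infer_with_limit(lambda t: "pred(" + t + ")", "T")
```

Any type-correct program converges within some finite number of iterations, so raising the limit is always a sufficient remedy in the (apparently hypothetical) case where a correct program trips it.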

% \xxx{Should compare with the above-cited papers}
% This approach is similar to, but simpler than, the idea of using
% semi-unification; neither \stress{eliminates} the decidability problem,
% but in both cases it seems that in practice it is not a significant issue.
% Compared to the semi-unification approach, our suggestion requires far
% less modification of the basic algorithm -- all that we need to do is
% to keep track of the iteration count.

To summarize, you need to be very careful about drawing conclusions
from theoretical results about worst-case complexity and decidability,
because it is often possible to arrange things so that the worst-case
results won't occur frequently enough (or so that the consequences if
they do occur won't be severe enough) for them to pose any significant
problems in practice.  Our confidence in the feasibility of our
type system comes from empirical observations, not from theoretical
considerations.

Having written large amounts of code ourselves, and
having received feedback from scores of other users of our system
over a period of about five years, we're confident that our
empirical observations are reliable.
We have certainly run into some (minor) limitations of our type system,
but they relate to expressiveness issues such as the lack of existential
types, rather than to decidability or speed of type checking.

\section{Records}

One limitation of the Mercury type system as it currently
stands is that the support for discriminated union types
as records is a bit limited; specifically, there is no
direct way to name the fields, and the syntax for extracting
fields from a record or (non-destructively) updating
a field of a record is a bit cumbersome, and requires
hard-coding the number of fields that the record has.
This is a limitation that Mercury has inherited from its Prolog background.
It is possible to use equivalence types to name the fields,
and to write access functions and predicates for field extraction
and update, as in the following example.

\begin{verbatim}
:- type employee
        --->    employee(
                        name,
                        department_id,
                        salary
                ).
:- type name == string.
:- type salary == int.

:- func name(employee) = name.
name(employee(Name, _, _)) = Name.

:- func salary(employee) = salary.
salary(employee(_, _, Salary)) = Salary.

:- pred set_salary(employee::in, salary::in, employee::out) is det.
set_salary(employee(N, D, _), Salary, employee(N, D, Salary)).
\end{verbatim}

However, doing this can be quite tedious.  We plan to extend the language
to support named fields with field extraction and (non-destructive)
field update, similar to the support for that kind of thing in Haskell 1.4,
in some future version of Mercury.
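For readers unfamiliar with that style of record support, the following Python sketch shows the same three ingredients --- named fields, field extraction, and non-destructive update --- using dataclasses; Mercury's eventual syntax will of course differ, and the field values here are illustrative.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Employee:
    name: str
    department_id: int
    salary: int

def set_salary(e: Employee, new_salary: int) -> Employee:
    # non-destructive update: builds a new record, leaving `e` unchanged
    return replace(e, salary=new_salary)
```

Note that neither the accessors nor the update function needed to hard-code the full list of fields, which is exactly the tedium the planned extension removes.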

\section{Directed type systems}
\label{sect:subtypes}

One limitation of the Mercury type system is that it does not allow
subtypes and supertypes, or undiscriminated unions.  Instead,
in Mercury these features are handled using the mode system.

An undiscriminated union type, if Mercury were to allow such types, might look
like this:

\begin{verbatim}
:- type number == (int ; float).
\end{verbatim}

This would mean that the type \samp{number} was the undiscriminated
union of the types \samp{int} and \samp{float}, i.e. that
\samp{int} and \samp{float} were subtypes of \samp{number},
and conversely that \samp{number} was a supertype of both \samp{int}
and \samp{float}.

There is an immediate difficulty with supporting subtypes in a logic
programming language.  If a predicate declared to take an argument of type
\samp{int} is called with an argument variable whose type is
\samp{number}, should that be a type error?  The answer is that it
depends on the direction of data-flow, i.e. on the mode in which the
predicate is called.  If the argument is an output argument, then
there is no possibility that this could lead to any problem at runtime,
since \samp{int} is a subtype of \samp{number}, and so the \samp{int} value
output will always be a valid value for the type \samp{number}.
However, if the argument is an input argument, then this might lead
to a predicate declared to take an argument of type \samp{int} being
called with an argument of type \samp{float}, which should be reported
as a type error, if the type system is to provide useful guarantees at
compile time.
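The directional rule just described can be stated compactly in code. This is a toy sketch of the reasoning, with a hypothetical two-element subtype table; it is not drawn from any actual type checker.

```python
# proper supertypes of each type (hypothetical lattice for illustration)
SUBTYPES = {"int": {"number"}, "float": {"number"}}

def is_subtype(s, t):
    return s == t or t in SUBTYPES.get(s, set())

def call_ok(declared, actual, direction):
    """Is it safe to pass a variable of type `actual` to a parameter
    declared with type `declared`, given the direction of data-flow?"""
    if direction == "out":
        # the declared-type value produced always fits the wider variable
        return is_subtype(declared, actual)
    # an input declared as int could otherwise receive a float at runtime
    return is_subtype(actual, declared)
```

So passing a \samp{number} variable to an \samp{int} output argument is fine, while passing it to an \samp{int} input argument must be rejected.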

This has led to a lot of research on the topic of ``directed type systems'',
which (from the Mercury point of view) could also be called combined
type/mode systems.
\xxx{insert citations}

Unfortunately, however, the obvious ways of incorporating all the desired
features into a directed type system all seem to require exponential-time
algorithms to infer the types of variables within a clause.
At the time we designed the type system,
finding a feasible (i.e. polynomial time, at least in the usual case)
algorithm to infer the
types of variables within a clause in a type/mode system that includes
parametric polymorphism, that allows subtypes and undiscriminated
unions, and that allows more than one mode per predicate, was
an as yet unsolved research problem.

Gert Smolka's PhD thesis \cite{smolka-phd} described a language called TEL
whose type system combined parametric polymorphism with support for subtypes.
Unfortunately in the search for a feasible algorithm for type inference
of variables within a clause, he was forced to choose one that actually
inferred the wrong type in certain situations.
For a goal such as \samp{X = [1|Xs]}, it would infer the type
\samp{list(int)} for \samp{X}, not \samp{list(number)}.  As a result,
the unification would fail at runtime if \samp{Xs} contained any elements of
type \samp{float}.

Karl Stroetmann and some other people working at Siemens on a system
called PAN have recently published work \cite{pan} on a new approach to the
problem which allows undiscriminated unions and infers types of
variables in a reasonable amount of time.  Their typechecker builds up
a set of subtype constraints and then solves them.  The constraint-solving
algorithm is not complete, so the typechecker rejects some programs
which one would like it to accept, but it is sound, and they claim
(with reasonable justification) that the cases where it is incomplete
would in practice occur only very rarely.

However, I believe their approach only allows a single mode per
predicate.  Disallowing multiple modes in order to allow
undiscriminated unions would be throwing the baby out with the bath
water.

Even more recently, Peter Stuckey described a different algorithm,
also based on viewing type checking/inference as constraint solving
\cite{pjs_subtypes1,pjs_subtypes2}.
He shows that type inference in the presence of undiscriminated unions
and subtypes can be viewed as disjunctive constraint solving, and
suggests solving the resulting disjunctive constraint system
using finite domain methods and generalized propagation.
This approach should lead to good performance in more cases than our
rather naive implementation, which is based on a simple breadth-first search.
It might thus make undiscriminated unions feasible.

However, undiscriminated unions are not really needed,
because it is almost always possible to use Mercury's discriminated unions
instead:

\begin{verbatim}
:- type number ---> i(int) ; f(float).
\end{verbatim}

Furthermore, as will be explained below, you can use Mercury's mode system
to represent subtypes.

Undiscriminated unions are an extremely useful feature if you are trying to
retrofit a typechecker on existing code from an untyped language such as
Prolog, without modifying that code.
But if you are willing to modify your code slightly, or if you
are writing code from scratch, they are not needed.  Since we didn't
intend Mercury to be backwards compatible with Prolog, the additional
complexity of undiscriminated unions would have been unwarranted.

One interesting point is that in a Prolog context, people are often
unwilling to modify their code in this way because it would make it
less efficient.  However, because Mercury optimizes the representation
of each data type (which is overall a big win), the trade-offs are
different in Mercury; indeed, the code above using discriminated unions
has exactly the same efficiency that the original code would have
if Mercury supported subtypes.

% The difficulty is partly related to the fact that Mercury specializes the
% representation for each type.  For example, the functors `foo/1' of type
% `foo_a' and `bar/1' of type `foo_b' have different types, so they may both
% be represented by the tag zero.  Conversion of these functors from type
% `foo_a' or `foo_b' to the hypothetical union type `foo' must necessarily
% involve a representation change, because otherwise both functors have the
% same representation.  This means that when the Mercury compiler generates
% code for `X = bar(_)', it would need to know the type of `X' so that it
% can construct the appropriate representation of the functor.

If you are using discriminated unions, then you can in fact use
Mercury's mode system to express subtyping.  For example, if you
have a type

\begin{verbatim}
:- type suit ---> spades ; hearts ; clubs ; diamonds.
\end{verbatim}

then you can declare sub-types'' (actually insts) of this type
as follows,

\begin{verbatim}
:- inst black_suit ---> spades ; clubs.
:- inst red_suit ---> hearts ; diamonds.
\end{verbatim}

\noindent
and you can then use these insts in mode declarations to express
subtype relationships, e.g.

\begin{verbatim}
:- pred matching_suit(suit, suit).
:- mode matching_suit(in(black_suit), out(red_suit)) is det.
matching_suit(spades, hearts).
matching_suit(clubs, diamonds).
\end{verbatim}

\noindent
% Here the mode \samp{in(Inst)} means \samp{Inst -> Inst}
% and \samp{out(Inst)} means \samp{free -> Inst}.
(The exact semantics of these inst and mode declarations will be
explained in the next chapter.)

This works because Mercury's mode system is in fact expressive
enough that you can use the mode system as a directed type system.
One disadvantage of that approach is that (at least in the current
version of Mercury) the Mercury mode system is not fully polymorphic:
although you can have polymorphic insts and mode definitions, the
mode declarations for functions or predicates must be monomorphic.
The net result of this is that if you pass a variable whose inst
represents a subtype through a polymorphic predicate, and then attempt
to use an output parameter from that predicate, expecting it to be
of the subtype rather than the supertype, then you must insert
an explicit runtime check to ensure that the result has the right subtype.

% XXX mention `:- subtype' proposal?

It is also possible to use Mercury's overloading of functors to achieve a
similar effect to subtyping (e.g. by declaring \samp{black\_suit}
and \samp{red\_suit} as types rather than as insts).
However, this is likely to be less efficient and less convenient,
because each type will have its own representation, and because
all conversions from a subtype to its supertype must be explicit --
you would need to write conversion predicates to convert between
functors of the same name but different types, since they may have
different representations.

Another thing to be careful of with this approach is
performance problems during type checking, as previously discussed.

--
Fergus Henderson                    |  "I have always known that the pursuit
|  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au