[m-rev.] More tutorial text

Ralph Becket rafe at cs.mu.OZ.AU
Fri May 31 13:10:32 AEST 2002
Previous message: [m-rev.] diff: fix MLDS nondet pragma c_code bug
Next message: [m-rev.] diff: fix `.mih' installation for default grade
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Comments welcome.


sketch.txt
Ralph Becket <rafe at cs.mu.oz.au>
Mon Oct 15 13:05:39 EST 2001
 vim: ft=txt ts=4 sw=4 et wm=10



HELLO, WORLD!

    Because it's traditional...  Type the following into a file called
    "hello.m".

    [begin code]

        :- module hello.

        :- interface.
        :- import_module io.

        :- pred main(io, io).
        :- mode main(di, uo) is det.

        :- implementation.
        :- import_module string.

            % The show starts here.
            %
        main(IO0, IO) :-
            io__print("Hello, world!\n", IO0, IO).

    [end code]

    Compile and run the program with

    [begin code]

        $ mmake hello.depend
        $ mmake hello
        $ ./hello
        Hello, world!

    [end code]

    We'll start by just listing some of the salient points illustrated
    by the "Hello, world!" program.

    - Modules live in files with the same name (with a ".m" suffix).
    - Every module starts with a declaration giving its name.
    - Non-code directives and declarations are introduced with ":-".
    - Every declaration ends with a full stop.
    - Modules are divided into interface and implementation sections.
    - The interface section lists the things that are exported by the
      module.  The top-level module in a Mercury program must export a
      predicate main/2 (functors are conventionally referred to with
      name/arity).
    - The implementation section provides the code for the functions
      and predicates exported in interface section.
    - A module has to be imported before the things it defines can be
      used.  Hence io__print refers to the print predicate defined in
      the io module.
    - The basic computational device in Mercury is the predicate.
      Predicates have type signatures and mode signatures - the latter
      specifying which arguments are inputs and which are outputs.
    - Comments start witha "%" sign and extend to the end of the line.

    (IO is handled in Mercury by explicitly passing around the "state
    of the world" - each IO operation takes the current state and
    produces a new one; the old state becoming unavailable for use
    thereafter.  This might sound odd, but is a necessary part of
    keeping Mercury a pure language.  XXX This will be explained in
    more detail in the section on declaration IO.)

    Special syntax exists to simplify passing state variable pairs
    around; for "Hello, world!" this becomes

    [begin code]

        main(!IO) :-
            io__print("Hello, world!\n", !IO).

    [end code]



DECLARATIVE VS IMPERATIVE PROGRAMMING

    DEFINITIONS

        Imperative: a list of instructions for transforming state.

        Declarative: a specification of /what/ is to be computed.

        Declarative languages typically have a very simple translation
        into conventional mathematical logic, which is how the meaning
        of a program is defined.

        Pure declarative programming languages exhibit referential
        transparency.  In a nutshell, this means that anywhere you see
        a reference to a definition, you can replace it with the
        definition itself and it will make no difference.

        In more formal language,

            let x = e in M  ==  M[e/x]

        This is clearly not true of imperative languages.  Consider
        the following C example:

        [begin code]

            int g = 0;

            int f(int x)
            {
                g = g + 1;
                return x + g;
            }

            void main(int argc, char **argv)
            {
                int tmp = f(1);
                int a   = tmp  + tmp;
                int b   = f(1) + f(1);

                if(a == b)
                    printf("equivalent\n");
                else
                    printf("not equivalent\n");
            }

        [end code]

        According to C semantics, this program should print out "not
        equivalent", proving that the expression f(1) is not equal
        to itself!  This sort of thing is great for writing buggy,
        hard to maintain code.

        [Footnote: one can give a reasonably simple operational
        semantics to C whereby we repeatedly substitute the body of
        f(1) into a string of /instructions/, but as we see here, this
        does not necessarily support the more intuitive, declarative
        reading one might hope for.]

        Since referential transparency means no side-effects, you
        can't have variables that change their state as the program
        evolves.  Instead, a "variable" in a declarative programming
        language is a label given to a value or the result of a
        computation.

        The nearest Mercury equivalent to the function f() in the C
        program above is 

        [begin code]

            :- pred f(int, int, int, int).
            :- mode f(in,  out, in,  out) is det.

            f(X, Result, Old_Value_of_G, New_Value_of_G) :-
                New_Value_of_G = Old_Value_of_G + 1,
                Result         = New_Value_of_G + X.

        [end code]

        Since there are no variables in Mercury (and hence no global
        variables), any "global" state has to be explicitly passed
        around wherever it is needed.  In this case f/4 takes the old
        "state" as its third argument and returns the new "state" in
        its fourth argument.

    BENEFITS OF DECLARATIVE STYLE

        While the lack of mutable state might seem like a serious
        drawback to programmers raised on imperative languages, in
        practice it turns out to be something of a boon.  Experienced
        imperative programmers acknowledge that fewer globals and less
        state means clearer, more maintainable, more reusable code
        with fewer bugs.

        The philosophy underlying declarative languages is to simplify the
        /writing/ of bug free programs and to leave the tedious
        business of identifying programming errors, run-time book
        keeping such as memory management, and non-algorithmic
        optimization to the compiler.  Referential transparency means
        that more optimizations can be applied in more places in a
        program than is the case with imperative programs, simply
        because proving that it is safe to apply a particular
        optimization is so much easier in the absence of side effects.

        Mercury was designed as a purely declarative, industrial
        strength programming language aimed at the rapid development
        of medium and large scale systems with the emphasis on
        producing fast, correct programs.

        To this end, Mercury doesn't support a corner-cutting
        programming style: you /have/ to get the types right, check
        return codes and so forth - the quick-and-dirty fix is rarely
        an option [footnote: Mercury does have support for impure code
        via calls to another language, but one has to label each such
        call explicitly; in the end it is often easier just to do the
        right thing.].  Very often, Mercury programmers find that once
        a program is accepted by the compiler, it also does exactly
        what was intended; in the author's long experience this almost
        never happens with imperative languages, even for small
        programs.

        [XXX Place to mention types, garbage collection, polymorphism,
        pattern matching, etc etc?]

    PRAGMATISM

        "A foolish consistency is the hobgoblin of small minds."
            -- Ralph Waldo Emmerson

        Purity is all very well, but the fact remains that
        occasionally one has to interoperate with code written in
        impure languages and that (very rarely and usually when that
        last drop of speed is required) some low-level algorithms may
        be best expressed as impure constructs.

        For the former, Mercury has a simple and well developed
        foreign language interface allowing the programmer to write
        foreign code in-line with the Mercury program (provided the
        Mercury compiler has an appropriate back-end for the foreign
        language in question - the alternative is to use C as the
        lingua franca in the traditional style.)  It is the
        programmer's responsiblility to supply the appropriate purity
        declarations for predicates defined in terms of foreign code.

        The latter is handled using purity annotations.  Impure code
        (written in a foreign language) must be labelled as such.
        These labels also have to be applied to all predicates that
        use the impure code.  At some point we hope that a pure
        interface can be presented to the programmer, in which case
        the program must include a /promise/ to the compiler that 
        impurity annotations are not required for users of the
        top-level predicate with an impure definition.

    MERCURY PHILOSOPHY

        [XXX I think I've already covered this one.  Tyson suggests an
        implementation philosophy section (e.g. no distributed fat) in
        a much later section.]

TYPES

    Mercury is strongly typed.  Every value has a type and every
    predicate associates a particular type with each argument.

    PRIMITIVE

    The primitive Mercury types are int, float, char and string -
    examples are 123, 3.141, ('x'), and "foo", respectively.
    [Footnote: Mercury also includes tuples, such as {"Route", 66},
    and higher order values, namely predicates and functions, among
    its primitive types; however, we can ignore these for now.]

    While primitive types are recognised by the compiler, no
    operations are.  Operations on the primitive types are defined in
    modules of the same name in the Mercury standard library.

    ALGEBRAIC TYPES

        You can define new types in Mercury:

        [begin code]

            :- type primary_colour
                --->    red
                ;       green
                ;       blue.

            :- type rgb_colour
                --->    rgb(float, float, float).   % red, green, blue.

        [end code]

        Here primary_colour is the name of the new type and its
        /data constructors/ (possible values) are red, green and blue.

        Similarly rgb_colour is a type with only one constructor,
        rgb/3, which in this case takes three float arguments.
        [Footnote: some languages require you to add empty parentheses
        after function symbols, such as red, green, and blue above,
        that have no arguments; Mercury, on the other hand, requires
        that the parentheses be omitted alltogether if the functor has
        no arguments.]

        You can define parametric types

        [begin code]

            :- type binary_tree(T)
                --->    leaf
                ;       branch(T, binary_tree(T), binary_tree(T)).

        [end code]

        T is a /type variable/ that can stand for any type at all.
        The type binary_tree(T) has two constructors: leaf/0 and
        branch/3 which takes a value of type T (for any choice of
        type), a binary_tree(T) and a binary_tree(T) as arguments
        respectively.

        We can have instances of binary_tree(T) for any type T - for
        example, binary_tree(int), binary_tree(string) or even
        binary_tree(binary_tree(...)).

        - T here is said to be a type variable.
        - Variables of any kind in Mercury are distinguished by
          starting with a capital letter.
        - A type may have several type parameters.
        - Any variables that appear in the constructor definitions
          must also appear in the head of the type declaration.
        - All type variables in the head must be distinct.

        You can use other types in your definitions.  For example, we
        could define a more general tree type with

        [begin code]

            :- type tree(T)
                --->    tree(T, list(tree(T))).

        [end code]

        So the type tree(T) has a single constructor tree/2 which
        takes a T and a list(tree(T)) as arguments, respectively
        (list(T) is defined in the list module in the standard
        library.)

        (Note that Mercury doesn't get confused if you use the same
        name for both type and data constructor.)

        [XXX Tyson suggests that diagrams would make this section
        easier to understand.]

    PATTERN MATCHING

        Pattern matching is the primary means of identifying which
        constructor is used for a particular type value.  The
        following function definition illustrates:
        [Footnote: functions are a useful shorthand for predicates
        that have a single output and can therefore appear as
        sub-expressions.]

        [begin code]

            :- func insert(binary_tree(T), T) = tree(T).
            :- mode insert(in, in) = out is det.

            insert(leaf, X) =
                branch(X, leaf, leaf).

            insert(branch(A, L, R), X) =
                ( if      X < A then branch(A, insert(L, X), R)
                  else if A < X then branch(A, L, insert(R, X))
                  else               branch(X, L, R)
                ).

        [end code]

        The first clause matches when the first argument is a leaf/0
        and the second clause matches when the first argument is a
        branch/3, unifying A, L, and R with the arguments of the
        branch/3 value.

        (Here we using if-then-else expressions where what follows the
        "then" or "else" is the result of the whole if-then-else
        expression.)

        Note that all function clauses must compute results of the
        same type.  Similarly, the results of the "then" and "else"
        arms of an if-then-else expression must have the same type.

        Another way of writing insert/2 is to use explicit unification
        to find out what data constructor was used for a particular
        argument:

        [begin code]

            :- func insert(binary_tree(T), T) = tree(T).
            :- mode insert(in, in) = out is det.

            insert(T0, X) = T :-
                (
                    T0 = leaf,           T  = branch(X, leaf, leaf)
                ;
                    T0 = branch(A, L, R),
                    ( if      X < A then T = branch(A, insert(L, X), R)
                      else if A < X then T = branch(A, L, insert(R, X))
                      else               T = branch(X, L, R)
                    )
                ).

        [end code]

        Here we follow the function head with a ":-" and a /goal/ that
        fills in the result.  (In fact, this is exactly how the
        compiler would see the original implementation of insert/2.)

        The goal in this case is a special kind of disjunction
        ("choice") called a /switch/.  In this case, the argument T0
        is unified with leaf/0 in the first disjunct and branch/3 in
        the second.  Since T0 is an "in" mode argument, the compiler
        knows that only one of these disjuncts will succeed (if two or
        more disjuncts could succeed at the same time then this
        wouldn't be a switch.)  Switches are also very common choice
        structures in Mercury.
        [Footnote: the convention is to name a sequence of related
        values as X0, X1, X2, ..., X.]

        (Note that in this version we are using if-then-else /goals/,
        as opposed to if-then-else expressions.)

        The third way we could implement insert/2 is to use an
        if-then-else to decide what to do:

        [begin code]

            :- func insert(binary_tree(T), T) = tree(T).
            :- mode insert(in, in) = out is det.

            insert(T0, X) = T :-
                ( if T0 = branch(A, L, R) then
                    ( if      X < A then T = branch(A, insert(L, X), R)
                      else if A < X then T = branch(A, L, insert(R, X))
                      else               T = branch(X, L, R)
                    )
                  else                   T = branch(X, leaf, leaf)
                ).

        [end code]

        This is also fine, although where possible it is often a
        better idea to use pattern matching or switches, since then
        the compiler generally has better scope for optimisation.

    ALGEBRAIC TYPES WITH FIELDS

        Just as with many languages, it is possible to give names to
        the fields in data constructors, which can be used to access
        just those fields.

        [begin code]

            :- type rgb_colour
                --->    rgb(
                            red     :: float,
                            green   :: float,
                            blue    :: float
                        ).

        [end code]

        [Footnote: not all fields have to be named, although one has
        to use pattern matching or unification to access unnamed
        fields.]

        Now, say X is a value of type rgb_colour.  Then X ^ red,
        X ^ green and X ^ blue are expressions that return the value
        in the corresponding fields.  If we have

        [begin code]

            X = rgb(0.1, 0.2, 0.3)

        [end code]

        then

        [begin code]

            X ^ red   = 0.1,
            X ^ green = 0.2,
            X ^ blue  = 0.3

        [end code]

        A field can be "updated" in the following way:

        [begin code]

            Y = ( X ^ red := 0.4 )

        [end code]

        results in

        [begin code]

            Y = rgb(0.4, 0.2, 0.3)

        [end code]

        What we're actually doing here is applying a substitution to
        a field and obtaining a new value - X is unaffected by the
        substitution.

        We can chain several substitutions together like this:

        [begin code]

            Y = ((( X ^ red   := 0.4 )
                      ^ green := 0.5 )
                      ^ blue  := 0.6 )

        [end code]

        Fields can also nest:

        [begin code]

            :- type rgb_point
                --->    rgb_point(
                            x   :: float,
                            y   :: float,
                            rgb :: rgb_colour
                        ).

        [end code]

        Now, if we have

        [begin code]

            X = rgb_point(3.0, 2.0, rgb_colour(0.1, 0.2, 0.3))

        [end code]

        then

        [begin code]

            X ^ rgb ^ red = 0.1

        [end code]

        and

        [begin code]

            Y = ( X ^ rgb ^ red := 0.4 )

        [end code]

        results in

        [begin code]

            Y = rgb_point(3.0, 2.0, rgb_colour(0.4, 0.2, 0.3))

        [end code]

        Note that it is an error for the same field name to be used in
        data constructors for different /types/ defined in the same
        module.  However, different /constructors/ of a type may have
        fields with the same name.  For example:

        [begin code]

            :- type foo ---> foo(a :: int).

            :- type bar ---> bar(a :: int).

        [end code]
        [Footnote: foo, bar, baz and quux are traditional programming
        names used in much the same way as Tom, Dick or Harry might be
        used to name arbitrary individuals.]

        would lead to the compiler flagging an error since the types
        foo and bar are both defined in the same scope and both have
        constructors with a common field name.  The following is
        acceptable, however, since all the data constructors with the
        same field name belong to the same type:

        [begin code]

            :- type baz
                --->    constructor1(a :: int, ...)
                ;       constructor2(a :: int, ...)
                ;       constructor3(a :: int, ...).

        [end code]

        Field access syntax is just syntactic sugar for functions that
        are automatically created by the compiler.  It is also
        possible to write your own functions that can be used as field
        names (for example, if you wanted to compute a value rather
        than storing it in the data constructor, but make it look like
        a field.)  We will go into this in more detail later on XXX.
        [Footnote: /syntactic sugar/: syntax that makes life easier,
        without otherwise adding anything new to the language.]

    EQUIVALENCE TYPES

        For documentation and brevity it is also useful to be able to
        declare type synonyms:

        [begin code]

            :- type int_tree == tree(int).

        [end code]

        Mercury would make no distinction between int_tree and
        tree(int).

        Equivalence types can also take parameters:

        [begin code]

            :- type list_tree(T) == tree(list(T)).

        [end code]

    ABSTRACT TYPES

        Often one wants to export a type from a module, but hide its
        definition so that users of the module can only manipulate
        values of the type in question via the predicates exported by
        the module interface.

        Abstract types are used for this purpose:

    [begin code]

        :- module complex.
        :- interface.
        :- import_module float.

        :- type complex.                        % Abstract type.

            % Constructor functions for complex numbers.
            %
        :- func rect(float, float) = complex.   % re, im.
        :- func polar(float, float) = complex.  % r, theta.

            % Operations on complex numbers.
            %
        :- func complex + complex = complex.
        :- func complex * complex = complex.
        ...

        :- implementation.

            % This implementation uses a single, rectangular
            % representation.
            %
        :- type complex                         % Concrete type.
            --->    re_im(float, float).

        ... implementation of interface using private representation

    [end code]

        [Footnote: Mercury supports overloaded names, where the same
        name can be used for several different things; the compiler
        uses type information to work out which is really meant.  This
        is how we can use the functor (+)/2 for addition for ints,
        floats, complex numbers etc., despite the fact that these
        operations have different implementations.]

        The abstract type declaration in the module interface must be
        matched by a concrete type definition (an algebraic or
        equivalence type) in the implementation section.

        The definition of the abstract type is hidden from users of
        the complex module.

        Finally, abstract types may also take parameters:

        [begin code]

            :- module set.
            :- interface.

            :- type set(T).                     % Abstract type.

            :- func empty = set(T).
            :- func union(set(T), set(T)) = set(T).
            ...

            :- implementation.
            :- import_module list.

                % This implementation uses lists to represent sets.
                %
            :- type set(T) == list(T).          % Concrete type.

            empty       = [].
            union(A, B) = A ++ B.
            ...

        [end code]

    * TYPES WITH USER-DEFINED EQUALITY

        [XXX Defer explanation until later.]

PREDICATES

    The key computational unit in Mercury is the predicate.  A
    predicate describes a relationship between its arguments; it is
    also possible to provide a procedural view of Mercury predicates,
    which is how they can be used in a programming language.

    Predicates cover not only tests and functions, as found in other
    languages, but also procedures with multiple return values and
    even non-deterministic relationships, in which there may be
    several different solutions for a given set of input arguments.

    INTRODUCTION

        A predicate is defined by a set of /clauses/, where each
        clause takes the form

            Head :- Body.

        This should be read as saying "Head is true if Body is true" with
        the set of clauses forming an exhaustive specification of the
        predicate.

        If the body of a predicate is empty, one can just write

            Head.

        These sorts of clauses are called /facts/.

        The body of a predicate clause is a /goal/ - a logical formula
        that must be satisfied in order for the head to hold.

        A very simple example of a predicate is

        [begin code]

            :- pred max(int, int, int).         % A, B, Max.
            :- mode max(in,  in,  out) is det.

            max(A, B, Max) :-
                ( if B < A then Max = A else Max = B ).

        [end code]

        This predicate takes two integers, A and B, and succeeds
        binding Max to A if A is greater than B or binding Max to B
        otherwise.  The part of the mode declaration "is det" means
        that this is a deterministic predicate: it will always succeed
        and has just one solution for any pair of inputs.

        (This sort of deterministic predicate with a single output is
        called a function.  Functions are so common that Mercury
        includes special syntax and conventions to simplify working
        with them.  More of this in a later section XXX.)

        For a more interesting example, consider the following small
        genealogical database:

        [begin code]

            :- type person
                --->    arthur  ; bill      ; carl
                ;       alice   ; betty     ; cissy.

        [end code]

        We start by defining predicates that we can use to decide if a
        particular person is male or female.  These predicates are
        labelled semidet because they have at most one solution
        (success) for a given argument, but might also fail.

        [begin code]

            :- pred male(person).
            :- mode male(in) is semidet.

            male(arthur).
            male(bill).
            male(carl).

            :- pred female(person).
            :- mode female(in) is semidet.

            female(Person) :- not male(Person).

        [end code]

        The definition for female/1 states that female(Person)
        succeeds iff male(Person) fails.  (Recall that variables in
        Mercury are distinguished by starting with a capital letter.)
        [Footnote: /iff/: if and only if.]

        [begin code]

            :- pred father(person, person).     % Child, Father.
            :- mode father(in,     out) is semidet.

            father(betty, arthur).
            father(carl,  bill).
            father(cissy, bill).

            :- pred mother(person, person).     % Child, Mother.
            :- mode mother(in,     out) is semidet.

            mother(bill,  alice).
            mother(carl,  betty).
            mother(cissy, betty).

        [end code]

        The predicates father/2 and mother/2 take their first argument
        as an input and return a result in the second.  The
        determinism is semidet again because they are not exhaustive:
        some inputs can cause them to fail (e.g. there is no solution
        for father(arthur, X).)

        [begin code]

            :- pred parent(person, person).     % Child, Parent.
            :- mode parent(in,     out) is nondet.

            parent(Child, Parent) :- father(Child, Parent).
            parent(Child, Parent) :- mother(Child, Parent).

        [end code]

        This simply says that the parent of a child is either the
        father or the mother.  The determinism is nondet because for a
        given Child argument this predicate may fail or may have more
        than one solution:
        - both father/2 and mother/2 can fail
          (e.g. if Child = arthur);
        - just one of father/2 or mother/2 could succeed
          (e.g. if Child = bill);
        - both father/2 and mother/2 could succeed
          (e.g. if Child = cissy).

        (How failure and multiple solutions affect the execution of a
        Mercury program will be explained below.)

        [begin code]

            :- pred ancestor(person, person).   % Person, Ancestor.
            :- mode ancestor(in,     out) is nondet.

            ancestor(Person, Ancestor) :-
                parent(Person, Ancestor).

            ancestor(Person, Ancestor) :-
                parent(Person, Parent),
                ancestor(Parent, Ancestor).

        [end code]

        We can now define an ancestor of a Person as being either
        - a parent of that Person or
        - an ancestor of a parent of that Person.

        Since parent/2 is nondet, so is ancestor/2.

        This table explains the difference between the number of
        solutions a predicate can have for a given determinism:

            Determinism             Number of Solutions
                                    Min             Max

            semidet                 0               1
            nondet                  0               at least one
            det                     1               1
            multi                   1               at least one

        (There are a few other determinisms, but we don't need to
        consider them just yet XXX.)

        The compiler will check that your determinism declarations are
        correct.

        One interesting thing about this database is that there's no
        reason why it shouldn't be run in reverse.  That is, for any
        father or mother, we should be able to deduce who their
        children are and for any ancestor we should be able to work
        out who their descendants are.

        To gain this extra functionality we have only to add the
        required extra mode declarations; there is no need to touch
        the definitions themselves!  The extra mode declarations are

        [begin code]

            :- mode father(out, in) is nondet.   % Infer Child from Father.
            :- mode mother(out, in) is nondet.   % Infer Child from Mother.
            :- mode parent(out, in) is nondet.   % Infer Child from Parent.
            :- mode ancestor(out, in) is nondet. % Infer Person from Ancestor.

        [end code]

        Notice that father/2 and mother/2 are nondet in this mode
        rather than semidet: looking at the definitions we see that
        father(Child, bill) has multiple solutions Child = carl and
        Child = sissy, while mother(Child, betty) also has solutions
        Child = carl and Child = sissy.

        There is no reason why a predicate cannot have several output
        arguments or even no input arguments.  For example, we
        might go on to add

        [begin code]

            :- pred parents(person, person, person).    % Child, Mother, Father.
            :- mode parents(in,     out,    out   ) is semidet.
            :- mode parents(out,    in,     in    ) is nondet.

            parents(Child, Mother, Father) :-
                mother(Child, Mother),
                father(Child, Father).

        [end code]

        The first mode of parents/3 is semidet because both mother/2
        and father/2 are semidet when called with Child as an input
        and both must succeed for parents/3 to succeed.

        The second mode is nondet for similar reasons, but exhibits an
        interesting property.  At first glance you might think that
        something will go wrong here: when parents/3 is called in the
        second mode, initially Mother and Father are inputs and Child
        is an output.  However, if the call to mother/2 succeeds, then
        Child will end up being bound to the identifier for some
        person.  This appears that we would then be trying to call
        father/2 with /both/ arguments as inputs, but there is no such
        mode declaration for father/2.  There is no need to worry,
        however - the compiler will recognise the situation and treat
        the definition of parents/3 like this as far as the second
        mode is concerned:
        [Footnote: the compiler isn't doing any special analysis here;
        this is just a natural consequence of conversion to
        super-homogeneous normal form, which will be explained later
        on XXX.]

        [begin code]

            :- mode parents(out, in, in) is nondet.

            parents(Child, Mother, Father) :-
                mother(Child, Mother),
                father(X,     Father),
                X = Child.

        [end code]

        So here father/2 is being called with (out, in) arguments and
        parents/3 succeeds if the result X is bound to the same value
        as Child.

        In effect the compiler has deduced that father/2 has the
        following implied mode:

        [begin code]

            :- mode father(in, in) is semidet.

        [end code]

        In general, Mercury will reorder each clause body for each
        mode declaration so that results are generated before they are
        needed - each mode of a predicate is referred to as a
        /procedure/.

    EXECUTION

        This section explains Mercury's execution model in more
        detail.  In particular, the notions of failure, backtracking
        and non-determinism are addressed.

        Let's start by looking at some of the code for our
        genealogical database again - this time we're going to label
        the clauses to help us see how programs execute in Mercury.

        [begin code]

            :- pred father(person, person).     % Child, Father.
            :- mode father(in,     out   ) is semidet.
            :- mode father(out,    in    ) is nondet.

        f1  father(betty, arthur).
        f2  father(carl,  bill).
        f3  father(cissy, bill).

            :- pred mother(person, person).     % Child, Mother.
            :- mode mother(in,     out   ) is semidet.
            :- mode mother(out,    in    ) is nondet.

        m1  mother(bill,  alice).
        m2  mother(carl,  betty).
        m3  mother(cissy, betty).

            :- pred parent(person, person).     % Child, Parent.
            :- mode parent(in,     out   ) is nondet.
            :- mode parent(out,    in    ) is nondet.

        p1  parent(Child, Parent) :- father(Child, Parent).
        p2  parent(Child, Parent) :- mother(Child, Parent).

            :- pred ancestor(person, person).   % Person, Ancestor.
            :- mode ancestor(in,     out   ) is nondet.

        a1  ancestor(Person, Ancestor) :-
                parent(Person, Ancestor).

        a2  ancestor(Person, Ancestor) :-
                parent(Person, Parent),
                ancestor(Parent, Ancestor).

            :- pred parents(person, person, person).    % Child, Mother, Father.
            :- mode parents(in,     out,    out   ) is semidet.
            :- mode parents(out,    in,     in    ) is nondet.

        s1  parents(Child, Mother, Father) :-
                mother(Child, Mother),
                father(Child, Father).

        [end code]

        First off, we'll consider the goal parent(cissy, Parent).
        Every time we see a goal we try expanding it according to each
        clause definition in turn (this is what being referentially
        transparent is all about).  If we get to a dead end we have to
        /backtrack/ to the last point where we had a choice and try a
        different clause (when we backtrack we also forget any
        variable bindings we might have made one the way from the
        /choice point/.)

        [begin code]

            parent(cissy, Parent)
            p1 =>
                father(cissy, Parent)
                f1 => FAIL because betty \= cissy
                f2 => FAIL because carl  \= cissy
                f3 =>
                    Parent = bill

        [end code]

        So one solution to parent(cissy, Parent) is Parent = bill.  We
        obtained this by first expanding the goal according to p1 and
        then expanding the resulting goal father(cissy, Parent)
        according to each of f1, f2 and f3 until we found one that
        succeeded.

        If the program later has to backtrack over this goal then we
        have to forget the binding Parent = bill and try the next
        clause for parent/2 (since there are no more clauses for
        father/2):

        [begin code]

            parent(cissy, Parent)
            p2 =>
                mother(cissy, Parent)
                m1 => FAIL because bill \= cissy
                m2 => FAIL because carl \= cissy
                m3 => Parent = betty

        [end code]

        Now, say we wanted to ask the question "which ancestor of
        carl, if any, is also a parent of bill?"  Here's how things
        would proceed:

        [begin code]

            ancestor(carl, X), parent(bill, Y), X = Y
            a1 =>
                parent(carl, X), parent(bill, Y), X = Y
                p1 =>
                    father(carl, X), parent(bill, Y), X = Y
                    f1 => FAIL because betty \= carl
                    f2 =>
                        X = bill, parent(bill, Y), X = Y
                       =>
                        parent(bill, Y), bill = Y
                        p1 =>
                            father(bill, Y), bill = Y
                            f1 => FAIL because betty \= bill
                            f2 => FAIL because carl  \= bill
                            f3 => FAIL because cissy \= bill
                        p2 =>
                            mother(bill, Y), bill = Y
                            m1 =>
                                Y = alice, bill = Y
                               =>
                                bill = alice
                               => FAIL because bill  \= alice
                            m2 => FAIL because carl  \= bill
                            m3 => FAIL because cissy \= bill
                p2 =>
                    mother(carl, X), parent(bill, Y), X = Y
                    m1 => FAIL because bill \= carl
                    m2 =>
                        X = betty, parent(bill, Y), X = Y
                       =>
                        parent(bill, Y), betty = Y
                        p1 =>
                            father(bill, Y), betty = Y
                            f1 => FAIL because betty \= bill
                            f2 => FAIL because carl  \= bill
                            f3 => FAIL because cissy \= bill
                        p2 =>
                            mother(bill, Y), betty = Y
                            m1 =>
                                Y = alice, betty = Y
                               =>
                                betty = alice
                               => FAIL because betty \= alice
                            m2 => FAIL because carl  \= bill
                            m3 => FAIL because cissy \= bill
            a2 =>
                [footnote: we have to supply fresh local vars in each
                 expansion - here Ancestor in the definition of a2 has
                 been replaced with a new variable Z.]
                parent(carl, Z), ancestor(Z, X), parent(bill, Y), X = Y
                p1 =>
                    father(carl, Z), ancestor(Z, X), parent(bill, Y), X = Y
                    f1 => FAIL because betty \= carl
                    f2 =>
                        Z = bill, ancestor(Z, X), parent(bill, Y), X = Y
                       =>
                        ancestor(bill, X), parent(bill, Y), X = Y
                        a1 =>
                            parent(bill, X), parent(bill, Y), X = Y
                            p1 =>
                                father(bill, X), parent(bill, Y), X = Y
                                f1 => FAIL because betty \= bill
                                f2 => FAIL because carl  \= bill
                                f3 => FAIL because cissy \= bill
                            p2 =>
                                mother(bill, X), parent(bill, Y), X = Y
                                m1 =>
                                    X = alice, parent(bill, Y), X = Y
                                   =>
                                    parent(bill, Y), alice = Y
                                    p1 =>
                                        father(bill, Y), alice = Y
                                        f1 => FAIL because betty \= bill
                                        f2 => FAIL because carl  \= bill
                                        f3 => FAIL because cissy \= bill
                                    p2 =>
                                        mother(bill, Y), alice = Y
                                        m1 =>
                                            Y = alice, alice = Y
                                           =>
                                            alice = alice
                                           => TRUE

        [end code]

        So our program concludes that a solution to 

        [begin code]

            ancestor(carl, X), parent(bill, Y), X = Y

        [end code]

        is

        [begin code]

            X = alice, Y = alice

        [end code]

        As we indicated earlier in the definition of parents/3, in
        practice we'd be more likely to write the goal as just

        [begin code]

            ancestor(carl, X), parent(bill, X)

        [end code]

        and let Mercury work out where the implicit unification should
        go.

        [The above example might give the impression that Mercury is a term
        rewriting system.  This is not true (although conceivably a very
        slow Mercury implementation might use such a technique...)  What
        happens behind the scenes is much more efficient, albeit
        computationally equivalent.]

        In practice, ninety per cent of the code one writes is
        deterministic.  However, there are times (as in the example
        above) when non-determinism can be used to write very clear,
        concise programs.  Support for backtracking and unification is
        what distinguishes Mercury from purely functional languages.

    RECURSION

        Imperative languages like C and Java provide a whole slew of
        mechanisms for supporting iterative (looping) control flow -
        while loops, repeat-until loops, for loops and so forth.

        Declarative languages typically provide but one mechanism for
        such things: recursion.  A recursive predicate is simply one
        defined in terms of itself.  (Later on we will discover
        /higher order/ predicates and observe that the standard
        library supplies predicates that stand in for the common
        iterative mechanisms found in imperative languages.)

        Some simple examples:

        [begin code]

            :- func factorial(int) = int.

            factorial(N) =
                ( if N = 0 then 1 else N * factorial(N - 1) ).

            :- func fibonacci(int) = int.

            fibonacci(N) =
                ( if N =< 2 then 1
                            else fibonacci(N - 1) + fibonacci(N - 2) ).

        [end code]

        These are examples of what is sometimes called "middle
        recursion" where the recursive call is /not/ the last thing
        that the predicate (or function) does when looping.  Here,
        calls to factorial/1 finishes with a multiplication and calls
        to fibonacci/1 finish with an addition.

        Although middle recursion is easy to read, it incurrs some
        performance penalty in that each iteration has to record
        something extra on the stack in order to finish up the
        computation.
        [Footnote: that said, the compiler can in fact turn some
        middle recursive predicates into equivalent tail recursive
        predicates that do not incurr a stack overhead.]

        [XXX Include a side-bar or something explaining stack frames.]

        /Tail recursion/ describes the situation where the last thing a
        predicate does is call itself.  Since there is nothing left to
        do after the call, the compiler can reuse the current call's
        stack frame for the recursive call, allowing the predicate to
        execute using only a fixed amount of stack space.  For
        example, tail recursive implementations of the above
        functions are

        [begin code]

            factorial(N) = fac(N, 1).

            fac(N, X) = ( if N = 0 then X else fac(N - 1, N * X) ).



            fibonacci(N) = fib(1, 1, N).

            fib(FN_2, FN_1, N) =
                ( if N =< 2 then FN_1
                            else fib(FN_1, FN_1 + FN_2, N - 1) ).

        [end code]

        Tail recursive code like this should execute just as quickly
        and efficiently as a for or while loop in an imperative
        language.

    UNIFICATIONS

        Mercury has but two basic atomic (i.e. indivisible) types of
        goal: unifications and calls.

        A unification is written X = Y.  A unification can fail if X
        and Y are not unifiable.

        Two values are unifiable if they are "structurally similar" -
        that is, where you see a data constructor in one, you either
        see the same data constructor in the other (and the
        corresponding arguments are also structurally similar) or a
        variable, which will end up being bound to the corresponding
        term on the other side if the unification is successful.

        [Footnote: unlike Prolog, Mercury forbids the aliasing of
        variables whereby a partially instantiated data structure
        may contain the same unbound variable in two different places.
        This will be explained fully in a later chapter. XXX]

        Unifications, therefore, can be used to bind variables to
        values, test to see if a variable is bound to a particular
        constructor, unpack the arguments of a constructor or all of
        the above.

        In the following examples we assume that variables are
        initially unbound:

                 123 = 123          - Succeeds.

                   X = 123          - Succeeds binding X to 123.

                 123 = 234          - Fails.

                   X = foo(1, 2, 3) - Succeeds, binding X to foo(1, 2, 3).

        foo(X, Y, Z) = foo(1, 2, 3) - Succeeds, binding A = 1, B = 2, Z = 3.

        foo(X, Y, 4) = foo(1, 2, 3) - Fails, because 4 \= 3.

        foo(X, 2, 3) = foo(1, 2, Z) - Succeeds, binding X = 1, Z = 3.

        The complex unifications are most easily understood by first
        converting them into /super homogeneous normal form/ (this is
        what the compiler does.)  In SHNF, the left hand side of each
        unification is a variable and the right hand side is either
        another variable or a functor, all of whose arguments are
        variables.  For example

        foo(X, 2, 3) = foo(1, 2, Z)

        becomes

            V = foo(A, B, C), A = X, B = 2, C = 3,
            V = foo(P, Q, R), P = 1, Q = 2, R = Z,
            A = P,
            B = Q,
            C = R

        where V, A, B, C, P, Q, and R are all temporary variables
        introduced in the conversion to SHNF.

    CALLS

        The other kind of primitive goal supported by Mercury is the
        predicate call, p(X1, ..., Xn), which we have already seen in
        the examples above.

        One way to picture the evaluation of a call is to think of it
        as expanding into the different clause bodies for the
        predicate definition while looking for solutions.

        Two relatively important built-in predicates are true and
        false XXX.  true always succeeds and false always fails.
        [Footnote: false is often written as fail, which is a hangover
        from Mercury's Prolog roots where it was sometimes more useful
        to think procedurally than declaratively.]

    CONJUNCTION

        A goal of the form G1, G2, ..., Gn is called a conjunction
        with the separating commas read as "and".  A conjunction
        succeeds iff a consistent solution to each of the sub-goals G1,
        G2, ..., Gn can be found by the program.

        (The compiler may have to reorder the sequence of goals in a
        clause in order to satisfy the mode constraints.  Although one
        can set a flag to force the compiler to do no more reordering
        than is necessary, in general this will mean that certain
        optimizations will not be possible.  The upshot of this is
        that one should avoid writing code that assumes a particular
        evaluation order other than that dictated by the mode
        constraints.)

        A conjunction executes by trying each of the goals in order.
        If a goal fails then the program backtracks to the nearest
        preceding choice-point (i.e. non-deterministic goal that may
        have other solutions).

    NEGATION

        A goal of the form not G succeeds iff G has no solutions.  G
        may be a compound goal, in which case it should be enclosed in
        parentheses to avoid syntactic precedence problems.

        The sub-goal G is said to be in a /negated context/ and as
        such cannot bind any variables visible outside the negation
        (since the only way not G can succeed is if G fails, in which
        case it will not produce any variable bindings.)

        Note that not not G is equivalent to G and hence may bind
        variables if it succeeds (while not not G would be an odd
        thing to write, some of the code transformations the compiler
        performs can generate such things; getting the modes right for
        such things requires that the compiler observe this
        simplification rule.)

    IF-THEN-ELSE GOALS

        Mercury's if-then-else construct looks like this:

        [begin code]

            ( if ConditionGoal then YesGoal else NoGoal )

        [end code]

        Note that the else part is /not/ optional.

        You may also see if-then-elses written as

        [begin code]

            ( ConditionGoal -> YesGoal ; NoGoal )

        [end code]

        although the author finds this style less appealing.

        While the parentheses are not always required, it is a very
        good idea to include them in order to avoid confusing
        syntactic precedence errors.  One common exception is a chain
        of if-then-elses where only the top level of parentheses are
        necessary:

        [begin code]

            ( if      ConditionGoal1 then YesGoal1
              else if ConditionGoal2 then YesGoal2
              else if ConditionGoal3 then YesGoal3
              ...
              else                        NoGoal
            )

        [end code]

        Extra parentheses are not required if any of ConditionGoals,
        YesGoals or NoGoal are compound goals.

        The if-then-else goal

        [begin code]

            ( if ConditionGoal then YesGoal else NoGoal )

        [end code]

        is equivalent to the disjunction

        [begin code]

            ( ConditionGoal, YesGoal ; not ConditionGoal, NoGoal )

        [end code]

        but will be implemented more efficiently by the compiler (if
        ConditionGoal produces no solutions in the first disjunct then
        there's no point in checking again that it has none in the
        second.)

        It's worth looking at a few examples to really understand how
        if-then-else goals work.  Again, we assume that all variables
        are initially unbound:

        [begin code]

            ( if ( X = 1 ; X = 2 ) then ( X = 2 ; X = 4 ) else X = 5 )

        [end code]

        The above goal is non-deterministic (the condition is multi
        and the then-goal is non-deterministic), but has the single
        solution X = 2.

        [begin code]

            ( if ( X = 1 ; X = 2 ) then ( X = 3 ; X = 4 ) else X = 5 )

        [end code]

        The above goal is also non-deterministic for the same reason,
        but has no solutions.

        [begin code]

            ( if 1 = 2 then ( X = 3 ; X = 4 ) else ( X = 5 ; X = 6 ) )

        [end code]

        The above goal is non-deterministic because the else-goal is
        non-deterministic.  Since the condition must fail, the
        else-goal is evaluated, with solutions X = 5 and X = 6.

        [begin code]

            ( if 1 = 1 then ( X = 3 ; X = 4 ) else ( X = 5 ; X = 6 ) )

        [end code]

        The above goal is non-deterministic because the then-goal is
        non-deterministic.  The condition has precisely one solution,
        so the then-goal is evaluated, with solutions X = 3 and X = 4.

        That said, in the vast majority of cases where the
        condition-goal is semidet and the then- and else-goals are
        deterministic, if-then-else goals will act in very much the
        same way as similar structures in other programming languages.

        Since the condition-goal is in a negated context in the "else"
        arm of the disjunctive form of an if-then-else, it cannot
        produce any outputs that would be used in NoGoal or anything
        outside the if-then-else as a whole.  It can, however, produce
        outputs that are only used in the YesGoal.  (The reason for
        this restriction is slightly subtle and will be explained in
        more detail later.  [XXX It's to do with having mode
        independent semantics.])

        For example, the following somewhat contrived code violates
        the constraint because S is bound inside the condition and is
        also visible outside the if-then-else goal.

        [begin code]

            :- pred prime_divisor(int, int).
            :- mode prime_divisor(in,  out) is nondet.
            ...

            :- pred prime_divisor_or_zero(int, int).
            :- mode prime_divisor_or_zero(in,  out) is multi.

            prime_divisor_or_zero(N, S) :-
                ( if   prime_divisor(N, S)
                  then true
                  else S = 0
                ).

        [end code]

        The correct way to write prime_divisor_or_zero/2 is

        [begin code]

            prime_divisor_or_zero(N, S) :-
                ( if   prime_divisor(N, D)
                  then S = D
                  else S = 0
                ).

        [end code]

        Note that if-then-else /expressions/ are slightly different;
        the following is perfectly legal:

        [begin code]

            prime_divisor_or_zero(N, S) :-
                S = ( if prime_divisor(N, D) then D else 0 ).

        [end code]

    DISJUNCTION

        Just as conjunction lets you use "and" to construct goals,
        disjunction lets you use "or".  A disjunctive goal takes the
        form (G1 ; G2 ; ... ; Gn) with the separating semicolons being
        read as "or".

        A disjunction succeeds iff any of its disjunct sub-goals
        succeeds.  A disjunction has as many solutions as all of its
        disjuncts put together: if one disjunct fails or backtracking
        exhausts all the solutions for one disjunct then execution
        proceeds with another disjunct.  Again, the compiler is
        generally free to reorder disjuncts, although this should not
        have a visible impact on programs.  Disjunctions are typically
        non-deterministic, although switches, mentioned earlier, are a
        special case.

        [XXX Need examples?]

        The clausal notation we have been using in the examples above
        is in fact convenient syntactic sugar for writing top-level
        disjunctions.  For example, the ancestor/2 predicate we
        defined earlier

        [begin code]

            ancestor(Person, Ancestor) :-
                parent(Person, Ancestor).

            ancestor(Person, Ancestor) :-
                parent(Person, Parent),
                ancestor(Parent, Ancestor).

        [end code]

        could equivalently be written as

        [begin code]

            ancestor(Person, Ancestor) :-
                (
                    parent(Person, Ancestor)
                ;
                    parent(Person, Parent),
                    ancestor(Parent, Ancestor)
                ).

        [end code]

        Indeed, this is how the compiler sees multi-clause
        definitions.

        In general it is better style to use clausal form for
        top-level disjunctions.

    SWITCHES

        Mercury recognises particular forms of (semi-)deterministic
        disjunction which it can compile very efficiently.

        The string library module defines the following type:

        [begin code]

            :- type poly_type
                --->    f(float)
                ;       i(int)
                ;       s(string)
                ;       c(char).

        [end code]

        which can be used to form heterogeneous collections of the
        primitive types (this is useful, amongst other things, for
        supplying argument lists for formatted output.)

        Say we wanted to write a predicate that would convert any
        poly_type value into a string.  Here's how we might write the
        code (in practice we would use a function, here we use a
        predicate for the purposes of illustration):

        [begin code]

            :- pred poly_type_to_string(poly_type, string).
            :- mode poly_type_to_string(in, out) is det.

            poly_type_to_string(f(F), S) :- float_to_string(F, S)
            poly_type_to_string(i(I), S) :- int_to_string(I, S)
            poly_type_to_string(s(S), S).
            poly_type_to_string(c(C), S) :- char_to_string(C, S)

        [end code]

        this is equivalent to the single clause definition

        [begin code]

            poly_type_to_string(Poly, S) :-
                (   Poly = f(F), float_to_string(F, S)
                ;   Poly = i(I), int_to_string(I, S)
                ;   Poly = s(S)
                ;   Poly = c(C), char_to_string(C, S)
                ).

        [end code]

        The compiler knows that since Poly is an input variable it
        must be bound when the disjunction is evaluated.  The compiler
        also sees that each arm of the disjunction unifies Poly
        against a different data constructor.  The compiler therefore
        deduces that at most one disjunct can succeed on a particular
        call (and, since all poly_type data constructors are tested
        for, /exactly/ one must succeed.)  The compiler generates very
        efficient code for so-called /switch/ constructs such as this.
        [Footnote: the name /switch/ is used because of its similarity
        to the C language construct of the same name.]

        Switches are an elegant way of describing conditions based on
        unification tests and are typically more efficient than the
        equivalent if-then-else chains.

    EXISTENTIAL QUANTIFICATION

        Sometimes we only need to know whether a solution exists, but
        are not interested in the result.  For this we use existential
        quantification, which looks like this:

            (some [X, Y, Z] G)

        A goal of this form will succeed iff there is a solution to G,
        but any bindings for X, Y and Z will not be visible outside
        the quantification - it's rather like saying that X, Y and Z
        are local variables for the goal G.

        Mercury has an rule that any variables in a clause that do not
        also appear in the head are implicitly existentially
        quantified, which means you never actually need to use
        explicit existential quantification in your programs.

    UNIVERSAL QUANTIFICATION

        On the other hand, we may wish to know whether a particular
        property holds for all solutions to a particular goal.  This
        is where universal quantification is useful.

        The goal

            (all [X, Y, Z] G)

        is equivalent to writing

            not (some [X, Y, Z] not G)

    IMPLICATION

        Mercury has three types of goal for describing implicative
        relationships between goals.

            - (G1 => G2) is shorthand for (not G1 ; G2);
            - (G1 <= G2) is shorthand for (G2 => G1); and
            - (G1 <=> G2) is shorthand for ((G1 => G2), (G1 <= G2)).

        [Footnote: the translations are given by de Morgan's laws.]

        Note that parentheses are required around G1 and G2 if they
        are not atomic goals; it is a good idea to also put
        parentheses around the implication as a whole to avoid
        ambiguity in the scope of the implication.)

        Implication is most often used with universal quantification
        to test for some general property.

        EXAMPLES

        This example uses the predicate member(X, Xs) to
        non-deterministically project members X from the list Xs and
        the semidet predicate even(X) which succeeds iff X is even.

        [begin code]

            :- pred all_even(list(int)).
            :- mode all_even(in) is semidet.

            all_even(Xs) :-
                all [X] ( member(X, Xs) => even(X) ).

        [end code]

        [Footnote: the convention is to name a list of items, X, as
        Xs.]

        [begin code]

                % Two list can be interpreted as equivalent sets if
                % each contains the same members as the other.
                %
            :- pred equivalent_sets(list(T), list(T)).
            :- mode equivalent_sets(in, in) is semidet.

            equivalent_sets(Xs, Ys) :-
                all [Z] ( member(Z, Xs) <=> member(Z, Ys) ).

        [end code]

        The auxiliary predicates member/2 and even/1 are defined as

        [begin code]

            :- pred member(T, list(T)).
            :- mode member(out, in) is nondet.

                % X is a member of a list if it is either the head of
                % that list or a member of the tail.
                %
            member(X, [X | _ ]).
            member(X, [_ | Xs]) :- member(X, Xs).


            :- pred even(int).
            :- mode even(in) is semidet.

            even(X) :- X `mod` 2 = 0.

        [end code]

        [Footnote: the Mercury parser views anything in `backquotes`
        as an infix operator.  This is the same rule as used by
        Haskell.]

    HIGHER ORDER APPLICATION

        [XXX I'll talk about this later.]

    ANONYMOUS AND SINGLETON VARIABLES

        Often one is not interested in a particular output variable
        from a call or unification.  In these cases you can use the
        special variable named "_" (a single underscore) which stands
        for a different anonymous or "don't care" variable every time
        it appears.

        Sometimes, however, it makes programs easier to read if you do
        name don't care variables.  Since variables that only appear
        once in a clause are usually the result of typographical
        error, the compiler will issue a warning when it sees such
        things.  To get around this problem, /prefixing/ a variable
        name with an underscore (e.g. "_X") tells the compiler that
        you know this is a named don't care variable and that there's
        no need to issue a warning.

FUNCTIONS

    Functions are deterministic (or semi-deterministic) relations with
    (at least) one output.

    Essentially, a function is any relation with a single solution for
    a given set of inputs.

    Functions are so common that Mercury provides special syntax to
    make working with them easier.  One of the key advantages of
    functions is that they can be used as parts of expressions, rather
    than having to have a separate goal computing each subexpression
    in turn.  That is, one can use an expression as in

    [begin code]

        X = (B + sqrt(B*B - 4*A*C)) / (2*A)

    [end code]

    rather than the somewhat opaque goal

    [begin code]

        square(B, BSquared),
        multiply(4, A, FourA),
        multiply(FourA, C, FourAC),
        subtract(BSquared, FourAC, Diff),
        sqrt(Diff, Sqrt),
        add(B, Sqrt, Numerator),
        multiply(2, A, Denominator),
        divide(Numerator, Denominator, X)

    [end code]

    DEFINITION

        This example illustrates how functions are defined:

        [begin code]

            :- func length(list(T)) = int.
            :- mode length(in) = out is det.

                % The length of an empty list is 0.
                % The length of a non-empty list is 1 for the head
                % plus the length of the tail.
                %
            length([]      ) = 0.
            length([_ | Xs]) = 1 + length(Xs).

        [end code]

        Like predicates, functions may be defined using several clauses
        and make use of pattern matching.

        Functions can be computed from goals, where the head and goal
        are separated by ":-" in the definition:

        [begin code]

                % take(N, Xs) is the length min(N, length(Xs))
                % prefix of Xs.
                %
            :- func take(int, list(T)) = list(T).
            :- func take(in, in) = out is det.

            take(N, Xs) = Ys :-
                split(N, Xs, Ys, _).

                % drop(N, Xs) is the length max(0, length(Xs) - N)
                % suffix of Xs.
                %
            :- func drop(int, list(T)) = list(T).
            :- func drop(in, in) = out is det.

            drop(N, Xs) = Zs :-
                split(N, Xs, _, Zs).

                % split(N, Xs, Prefix, Suffix)
                %
            :- pred split(int, list(T), list(T), list(T)).
            :- mode split(in,  in,      out,     out    ) is det.

            split(N, Xs, Ys, Zs) :-
                ( if N > 0, Xs = [X | Xs0] then
                    Ys = [X | Ys0],
                    split(N - 1, Xs0, Ys0, Zs)
                  else
                    Ys = [],
                    Zs = Xs
                ).

        [end code]

        By far the most common mode for a function is

        [begin code]

            :- mode f(in, in, ..., in) = out is det.

        [end code]

        If a function has (just) this sort of mode, then the mode
        declaration can be ommitted and the compiler will simply assume
        this mode is what is intended.  Hereafter we will omit unnecessary
        mode declarations for functions.

        [XXX What exactly are the constraints on function
        determinisms?  Remember to point out (somewhere) that
        functions may also have multiple procedures.  See the ref.
        manual section on Determinism.]

    PATTERN MATCHING

        [XXX Dealt with above.]

    RECURSION

        [XXX Dealt with above.]

    CONDITIONAL EXPRESSIONS

        Conditional expressions look like this:

        [begin code]

            ( if ConditionGoal then YesExpr else NoExpr )

        [end code]

        where the usual rules for if-then-elses apply (in particular,
        ConditionGoal may bind output variables that are used in
        YesExpr, but not elsewhere), except that the then and else
        branches are /expressions/, rather than goals.

        Note that as with if-then-else goals, the else clause is /not/
        optional in a conditional expression (it would make no sense
        not to have one.)

    * PARTIAL (SEMIDET) FUNCTIONS

        [XXX Leave for later.]

    OVERVIEW OF SEMIDET PREDICATES

        [XXX Deal with this in a later section.  It's sort-of advanced
        stuff.]

    POLYMORPHISM

        [XXX Dealt with in section on types.]

    INFIX NOTATION

        Mercury syntax supports a number of prefix, infix and postfix
        operators, including all the usual arithmetic operators.  This
        is just syntactic sugar, however, and there is no difference
        between X + Y and +(X, Y) as far as the compiler is concerned.

        If you want to use another name as an infix operator, you can
        simply place it in backquotes:

        [begin code]

            X `union` Y `union` Z

        [end code]

        is seen by the compiler as

        [begin code]

            union(X, union(Y, Z))

        [end code]

        Backquoted symbols bind more tightly than anything else and
        associate to the right.

INPUT AND OUTPUT

    One unfortunate consequence of being a pure declarative language
    is that IO becomes somewhat more complex than is the case for
    imperative languages.

    IO *IS* A SIDE EFFECT

    One problem is that performing IO necessarily has an effect on the
    outside world that cannot be backtracked over or undone - there is
    no returning to an earlier state of the world!  This is in
    contrast to the mathematically pure world that Mercury inhabits,
    where there is no concept of a value (such as the state of the
    world) "changing", only one of new such values being constructed.

    ORDER IS IMPORTANT

    Another problem is that since logically there is no difference
    between the goal (G1, G2) and the goal (G2, G1), we also need to
    find some mechanism for ensuring that IO operations happen in the
    intended order and are not mixed up as a consequence of the
    compiler reordering goals for optimization purposes and so forth.

    UNIQUENESS

        A number of solutions to the IO problem have been adopted by
        the pure, declarative languages, the main contenders being the
        monadic approach (as exemplified by Haskell) and the
        uniqueness approach (as exemplified by Clean and Mercury.)

        The uniqueness approach works like this: we view a Mercury
        program as a function from world states to world states.  The
        top-level main/2 predicate of a Mercury program takes the
        current world state as an input and computes a new world state
        as its result.  Every IO operation does the same thing: takes
        a world state as input and produces a new world state as
        output - notionally updated with the effects of the IO
        operation (and the actions of the world at large between IO
        operations).  The so-called IO state is opaque to the Mercury
        program; it can only be queried via the operations defined in
        the io module.
        [Footnote: of course, the Mercury program doesn't actually
        pass the state of the world around in fact - the IO state
        abstraction serves both to ensure the properties we require
        and to give a semantics to IO in Mercury.]

            Example:

            [begin code]

                :- pred main(io, io).
                :- mode main(di, uo) is det.

                main(IO0, IO) :-
                    io__write_string("pi = ", IO0, IO1),
                    io__write_float(math__pi, IO1, IO2),
                    io__nl(                   IO2, IO ).

            [end code]

            The type of the IO state is called just io and is defined
            as an abstract type in the io library module.  The
            top-level predicate main/2 takes the initial IO state as a
            /destructive input/ (di), and produces another as a
            /unique output/ (uo).  The three IO operations in the body
            show how the initial IO state, IO0, is transformed into
            the final IO state, IO.  So, io__write_string/3 destroys
            IO0 and produces IO1 as a unique output.  Next,
            io__write_float/3 destroys IO1 and produces IO2 as a
            unique output.  Finally, io__nl (which writes out a
            newline) destroys IO2 and produces IO as a unique output.

        In order to ensure that old versions of the world state cannot
        be accessed after an IO operation, the IO state is /unique/ -
        this means that there can only ever be one live reference to
        it.  [Footnote: a live reference is one that will be used
        later on in computation.]  The old version of the IO state
        is said to be clobbered by an IO operation - the compiler will
        report an error if the old version is still live after the IO
        operation in question.  Similarly, it is impossible to make a
        copy the IO state.  [Footnote: it is possible to "fork" the IO
        state, this is necessary to support concurrency.  Concurrency
        is dealt with in a later chapter. XXX]

            Example:

            [begin code]

                :- pred main(io, io).
                :- mode main(di, uo) is det.

                main(IO0, IO) :-
                    io__write_string("Hello, ",  IO0, IO1),
                    io__write_string("world!\n", IO0, IO ).

            [end code]

            Here IO0 is used twice; the compiler spots the bug and
            rejects the program with

            [begin code]

                In clause for `main(di, uo)':
                  in argument 2 of call to predicate `io:write_string/3':
                  unique-mode error: the called procedure would clobber
                  its argument, but variable `IO0' is still live.

            [end code]

        The uniqueness constraint is sufficient to ensure that IO
        operations happen in a strict order, specified by the
        programmer, and that it is impossible to backtrack over IO or
        refer to a dead IO state.

        Note that uniqueness is not a property reserved for IO states:
        it is used to implement destructively updated arrays, the
        store data type which allows the construction of arbitrary
        pointer graphs, hash tables and so forth.  Uniqueness allows
        the compiler to use a safe form of destructive update of data
        structures: there is no reason why a dead unique object cannot
        be reused to create a new live unique object (since the old
        value can never be accessed), which is exactly what happens
        for the data types just mentioned.

    STYLISTIC CONSIDERATIONS

        Since passing the IO state around everywhere is a little
        cumbersome and also quite restrictive (it can only be passed
        into det predicates), Mercury programmers naturally find
        themselves separating applications into two parts: the part
        that handles all the IO and the part that handles all the
        interesting processing.  This is good style in general, and
        although one might find it slightly annoying not to be able to
        insert print statements willy-nilly as is the case with impure
        languages, one soon finds that the discipline imposed pays
        real dividends in terms of reusability, maintainability, ease
        of debugging and so forth.

    * DETERMINISM RESTRICTIONS

        Since IO operations cannot be backtracked across, the IO state
        (and, indeed, unique objects in general) cannot be passed to
        non-deterministic predicates - that is, only deterministic
        predicates can take unique IO states as arguments.  [Footnote:
        this is not strictly true; there is another determinism,
        cc_multi, which is compatible with uniqueness.]

    * STATE VARIABLE SYNTAX

        Having to name and pass two variables around for every IO
        operation quickly becomes tiresome.  Mercury has a special
        /state variable/ syntax for just this purpose.  The idea is to
        write code that looks a little more like what one would write
        in an imperative language, but which is transformed by the
        compiler into pure Mercury.  A state variable argument !X
        stands for two real arguments, !.X and !:X, which in turn
        stand for the "current" and "next" values of the state variable
        X, respectively.  Occurrences of !.X and !:X are converted by
        the compiler into appropriately numbered variables.

        For example, the following code

        [begin code]

                % Writes out a list of strings, separated by
                % commas.
                %
            :- pred write_strings(list(string), io, io).
            :- mode write_strings(in,           di, uo) is det.

            write_strings([],            !IO).

            write_strings([S1],          !IO) :-
                io__write_string(S1,     !IO).

            write_strings([S1, S2 | Ss], !IO) :-
                io__write_string(S1,     !IO),
                io__write_string(", ",   !IO),
                write_strings(Ss,        !IO).

        [end code]

        is transformed by the compiler into
        [Footnote: note that the pred and mode declarations reflect
        the fact that !IO is actually two arguments, not one.]

        [begin code]

                % Writes out a list of strings, separated by
                % commas.
                %
            :- pred write_strings(list(string), io, io).
            :- mode write_strings(in,           di, uo) is det.

            write_strings([],            IO0, IO) :-
                IO = IO0.

            write_strings([S1],          IO0, IO) :-
                io__write_string(S1,     IO0, IO).

            write_strings([S1, S2 | Ss], IO0, IO) :-
                io__write_string(S1,     IO0, IO1),
                io__write_string(", ",   IO1, IO2),
                write_strings(Ss,        IO2, IO ),

        [end code]

        Henceforth we will use state variable syntax rather than
        explicitly IO states.

    COMMON IO OPERATIONS

        The io library module defines a plethora of useful IO
        operations and as usual with libraries, the reader is
        encouraged to take some time to peruse the interface section.
        Here we will present some basic IO operations to help get the
        ball rolling.

        OUTPUT

            Output is generally simpler to deal with than input,
            because, by and large, there are no error codes to deal
            with.

            The io library module includes predicates for the output
            of the basic types:

            [begin code]

                :- pred io__write_string(string, io, io).
                :- mode io__write_string(in,     di, uo) is det.

                :- pred io__write_char(char, io, io).
                :- mode io__write_char(in,   di, uo) is det.

                :- pred io__write_int(int, io, io).
                :- mode io__write_int(in,  di, uo) is det.

                :- pred io__write_float(float, io, io).
                :- mode io__write_float(in,    di, uo) is det.

            [end code]

            However, it's typically easier to use the more general
            formatted output predicate:

            [begin code]

                :- pred io__format(string, list(poly_type), io, io).
                :- mode io__format(in,     in,              di, uo) is det.

            [end code]

            The string argument is a format string, very similar in
            spirit to what one would pass to C's printf().  The list
            argument is a type safe means of passing the parameters to
            be formatted.

            Using io__format/4 one might write

            [begin code]

                :- pred write_record(string, int, float, io, io).
                :- mode write_record(in,     in,  in,    di, uo) is det.

                write_record(Name, Age, Children, !IO) :-
                    io__format("%s is %d years old and has %f children.\n",
                               [s(Name), i(Age), f(Children)], !IO).

            [end code]

            and then we could call

            [begin code]

                write_record("Joe Bloggs", 32, 2.4, !IO)

            [end code]

            and the program would write out "Joe Bloggs is 32 years
            old and has 2.4 children.", followed by a newline.  The
            implementation of write_record/5 is much simpler than the
            functionally equivalent

            [begin code]

                :- pred write_record(string, int, float, io, io).
                :- mode write_record(in,     in,  in,    di, uo) is det.

                write_record(Name, Age, Children,           !IO) :-
                    io__write_string(Name,                  !IO),
                    io__write_string(" is ",                !IO),
                    io__write_int(Age,                      !IO),
                    io__write_string(" years old and has ", !IO),
                    io__write_float(Children,               !IO),
                    io__write_string(" children.\n",        !IO).

            [end code]

            In fact, io__format/4 is quite a bit more powerful, in
            that the "%..." formatting specifications can include
            details as to the style of formatting, precision,
            justification and so forth.

            [XXX I'm not sure I should mention io__print/3 this early.]

            Another useful predicate the io library module provides is

            [begin code]

                :- pred io__print(T,  io, io).
                :- mode io__print(in, di, uo) is det.

            [end code]

            io__print/3 is used to print a representation of arbitrary
            Mercury values.  Be aware, though, that if you try to
            print the results of expressions, the compiler may ask you
            to supply more type information to help resolve what
            exactly it is you are printing (i.e. the expression or the
            value of the expression.)  This is subtle stuff and will
            be dealt with in a later chapter. XXX

        INPUT

            Input is marginally more complex than output since
            typically on of three things can happen:
            1. we successfully read a value of the required type from
            the input stream;
            2. we hit the end-of-file;
            3. an error of some kind occurs (e.g. the input is
            malformed or the input source has gone away unexpectedly.)

            To this end the io library module defines the following
            type to report the results of input operations:

            [begin code]

                :- type io__result(T) ---> ok(T)
                                      ;    eof
                                      ;    error(io__error).

            [end code]

            In order, ok(X) is returned if the input operation
            succeeded, reading X; eof is returned if the end of file
            has been reached; and error(ErrorCode) is returned if
            something went wrong (the function io__error_message/1 can
            be used to turn ErrorCode into something printable.)

            This arrangement forces the programmer to handle the error
            cases.  There is still lively debate over whether error
            codes or throwing exceptions is the best way to handle
            errors for things like this.  A genuine advantage of the
            error code approach is that you have to consider the error
            cases from the outset, which, while requiring a little
            more initial thought from the programmer, usually pays
            real dividends later on.
            [XXX is this the right place to say this?  Should I
            enlarge on the debate?  Probably no and yes
            respectively...]

            Two very basic input predicates are

            [begin code]

                :- pred io__read_char(io__result(char), io, io).
                :- mode io__read_char(in,               di, uo) is det.

                :- pred io__read_line_as_string(io__result(string), io, io).
                :- mode io__read_line_as_string(in,                 di, uo)
                            is det.

            [end code]

            The input predicates are also less comprehensive in the
            sense that there are no predicates io__read_int/3,
            io__read_float/3 or io__read_string/3.  The problem is
            that it's not clear exactly what should be allowed to
            terminate the input stream for an int, float or string.
            Instead, the library leaves issues of parsing up to
            applications (there are several programs in the Mercury
            extras distribution to help, including a lexer and parser
            generators.)

            Here's a simple program that XXX Here !!!

MODES
    DIRECTION OF DATA FLOW
    DETERMINISM CATEGORIES
    COMMITTED CHOICE
    * SUBTYPING
    * UNIQUE MODES
        DETERMINISM RESTRICTIONS
        MOSTLY UNIQUE MODES
    HIGHER ORDER MODES
        THE STANDARD FUNC MODE
        STANDARD FUNCS AND THE GROUND INST

MODULES
    NAMESPACES
    OVERLOADING AND NAME RESOLUTION
    SUBMODULES
        NESTED
        SEPARATE

HIGHER ORDER PROGRAMMING
    HIGHER ORDER APPLICATION
    PRED AND FUNC MODES
    * MONOMORPHISM RESTRICTION

* TYPE CLASSES
    OO PROGRAMMING
    TYPE CLASS DECLARATIONS
        METHOD SIGNATURES
        TYPE CLASS CONSTRAINTS
    INSTANCE DECLARATIONS
        METHOD IMPLEMENTATIONS
        TYPE CLASS CONSTRAINTS
    EXISTENTIALLY QUANTIFIED TYPES
        USE
        WHY OUTPUT ONLY
    ...CONSTRUCTOR CLASSES
    ...FUNCTIONAL DEPENDENCIES
    RESTRICTIONS AND EXPLANATIONS THEREOF
        ON TYPE CLASS DEFINITIONS
        ON INSTANCE DEFINITIONS

LISTS
    DESCRIPTION
    MAP, FOLD & MEMBER

MAPS

ARRAYS

COMPILING PROGRAMS
    MMAKE
    COMPILER FLAGS
    COMPILATION GRADES

* STORES

* EXCEPTIONS
    THROWING
    CATCHING
        EFFECT ON DETERMINISM
        RESTORING (PLAIN) DETERMINISM (PROMISE_ONLY_SOLUTION)

* FOREIGN LANGUAGE INTERFACE
    DECLARATIONS
    DATA TYPES

* IMPURE CODE
    LEVELS OF PURITY
    EFFECT OF IMPURITY ANNOTATIONS
    PROMISING PURITY (PRAGMA PROMISE_PURE)

* PRAGMAS
    INLINING
    TYPE SPECIALIZATION
    OBSOLESCENCE
    MEMOING
    ...PROMISES

* DEBUGGING
    COMPILING FOR DEBUGGING
    BASIC TOUR OF THE DEBUGGER
    DECLARATIVE DEBUGGING

* OPTIMIZATION
    WHEN TO DO IT AND WHEN TO AVOID IT
    PROFILING
    VARIOUS CONSIDERATIONS
    AN OVERVIEW OF CONTEMPORARY OPTIMIZER TECHNOLOGY

--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
Previous message: [m-rev.] diff: fix MLDS nondet pragma c_code bug
Next message: [m-rev.] diff: fix `.mih' installation for default grade
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list