[m-dev.] Request for comments on book sketch
Ralph Becket
rafe at cs.mu.OZ.AU
Fri Jan 11 15:06:51 AEDT 2002
I've been working on a sketch of the tutorial book. This is intended as
a rough guide for the real thing and to help resolve issues of order,
depth, et cetera. I would be very grateful to receive any comments.
Cheers,
- Ralph
| sketch.txt
| Ralph Becket <rafe at cs.mu.oz.au>
| Mon Oct 15 13:05:39 EST 2001
| vim: ft=txt ts=4 sw=4 et wm=10
|
| [This is intended as a brief sketch of the book, to be used as a guide
| for writing the real text.]
|
|
|
| HELLO, WORLD!
|
| Because it's traditional... Type the following into a file called
| "hello.m".
|
| [begin code]
|
| :- module hello.
|
| :- interface.
| :- import_module io.
|
| :- pred main(io, io).
| :- mode main(di, uo) is det.
|
| :- implementation.
| :- import_module string.
|
| % The show starts here.
| %
| main --> io__print("Hello, world!\n").
|
| [end code]
|
| Compile and run the program with
|
| [begin code]
|
| $ mmake hello.depend
| $ mmake hello
| $ ./hello
| Hello, world!
|
| [end code]
|
| We'll start by just listing some of the salient points illustrated
| by the "Hello, world!" program.
|
| - Modules live in files with the same name (with a ".m" suffix).
| - Every module starts with a declaration giving its name.
| - Non-code directives and declarations are introduced with ":-".
| - Every declaration ends with a full stop.
| - Modules are divided into interface and implementation sections.
| - The interface section lists the things that are exported by the
| module. The top-level module in a Mercury program must export a
| predicate main/2 (functors are conventionally referred to with
| name/arity).
| - The implementation section provides the code for the functions
| and predicates exported in interface section.
| - A module has to be imported before the things it defines can be
| used.
| - The basic computational device in Mercury is the predicate.
| Predicates have type signatures and mode signatures - the latter
| specifying which arguments are inputs and which are outputs.
| - Comments start witha "%" sign and extend to the end of the line.
|
|
|
| DECLARATIVE VS IMPERATIVE PROGRAMMING
|
| DEFINITIONS
|
| Imperative: a list of instructions for transforming state.
|
| Declarative: a specification of /what/ is to be computed.
|
| Pure declarative programming languages exhibit referential
| transparency. In a nutshell, this means that anywhere you see
| a reference to a definition, you can replace it with the
| definition itself and it will make no difference.
|
| In more formal language,
|
| let x = e in M == M[e/x]
|
| This is clearly not true of imperative languages. Consider
| the following C example:
|
| [begin code]
|
| int g = 0;
|
| int f(int x)
| {
| g = g + 1;
| return x + g;
| }
|
| void main(int argc, char **argv)
| {
| int tmp = f(1);
| int a = tmp + tmp;
| int b = f(1) + f(1);
|
| if(a == b)
| printf("equivalent\n");
| else
| printf("not equivalent\n");
| }
|
| [end code]
|
| According to C semantics, this program should print out "not
| equivalent", proving that the expression f(1) is not equal
| to itself! This sort of thing is great for writing buggy,
| hard to maintain code.
|
| Since referential transparency means no side-effects, you
| can't have variables that change their state as the program
| evolves. Instead, a "variable" in a declarative programming
| language is a label given to a value or the result of a
| computation.
|
| The nearest Mercury equivalent to the function f() in the C
| program above is
|
| [begin code]
|
| :- pred f(int, int, int, int).
| :- mode f(in, out, in, out) is det.
|
| f(X, Result, GIn, GOut) :-
| GOut = GIn + 1,
| Result = GOut + X.
|
| [end code]
|
| Since there are no variables in Mercury (and hence no global
| variables), any "global" state has to be explicitly passed
| around wherever it is needed. In this case f/4 takes the old
| "state" as its third argument and returns the new "state" in
| its fourth argument.
|
| BENEFITS OF DECLARATIVE STYLE
|
| While the lack of mutable state might seem like a serious
| drawback to a programmer used only to imperative languages, in
| practice it turns out to be something of a boon. Experienced
| imperative programmers know that fewer globals means clearer,
| more maintainable, more reusable code with fewer bugs.
|
| The philosophy behind declarative languages is to simplify the
| /writing/ of bug free programs and to leave the tedious
| business of book keeping and optimization to the compiler.
| Referential transparency means that more optimizations can be
| applied in more places in a program than is the case with
| imperative programs, simply because proving that it is safe to
| apply a particular optimization is so much easier in the
| absence of side effects.
|
| Mercury was designed as an industrial strength purely
| declarative programming language aimed at the rapid
| development of medium and large scale programs with the
| emphasis on producing fast, correct programs.
|
| To this end, Mercury doesn't support a corner-cutting
| programming style: you /have/ to get the types right, check
| return codes and so forth - quick lashups are rarely an
| option. The payoff is that nine times out of ten Mercury
| programmers find that a program that compiles also does
| exactly what was intended; few imperative languages, if any,
| can make such a claim.
|
| [XXX Place to mention types, garbage collection, polymorphism,
| pattern matching, etc etc?]
|
| PRAGMATISM
|
| "A foolish consistency is the hobgoblin of small minds."
| -- Ralph Waldo Emmerson
|
| Purity is all very well, but the fact remains that
| occasionally one has to interoperate with code written in
| impure code and that (very rarely and usually when that last
| drop of speed is required) some low-level algorithms may be
| best expressed as impure constructs.
|
| For the former, Mercury has a simple and well developed
| foreign language interface allowing the programmer to write
| foreign code in-line with the Mercury program (provided the
| Mercury compiler has an appropriate back-end for the foreign
| language in question - the alternative is to use C as the
| lingua franca in the traditional style.) It is the
| programmer's responsiblility to supply the appropriate purity
| declarations for predicates defined using foreign code.
|
| The latter is handled using purity annotations. Impure code
| (written in a foreign language) must be labelled as such.
| These labels also have to be applied to all predicates that
| use the impure code. At some point (hopefully) a pure
| interface can be presented to the programmer, in which case
| the program must include a "promise" to the compiler that
| impurity annotations are not required for users of the
| top-level predicate with an impure definition.
|
| MERCURY PHILOSOPHY
|
| [XXX Haven't I already done this?]
|
| TYPES
|
| Mercury is strongly typed. Every value has a type and every
| predicate associates a particular type with each argument.
|
| PRIMITIVE
|
| The primitive Mercury types are int, float, char and string -
| examples are 123, 3.141, ('x'), and "foo", respectively.
|
| While primitive types are recognised by the compiler, no
| operations are. Operations on the primitive types are defined in
| modules of the same name in the standard library.
|
| ALGEBRAIC TYPES
|
| You can define new types in Mercury as the following examples
| illustrate:
|
| [begin code]
|
| :- type primary_colour
| ---> red
| ; green
| ; blue.
|
| :- type rgb_colour
| ---> rgb(float, float, float). % red, green, blue.
|
| [end code]
|
| Here primary_colour is the name of the new type and its
| /constructors/ (possible values) are red, green and blue.
|
| Similarly rgb_colour is a type with only one constructor,
| rgb/3, which takes three float arguments (this is rather like
| a C struct).
|
| You can define parametric types
|
| [begin code]
|
| :- type binary_tree(T)
| ---> leaf
| ; branch(T, binary_tree(T), binary_tree(T)).
|
| [end code]
|
| So the type binary_tree(T) has two constructors: leaf/0 and
| branch/3 which takes a T, a binary_tree(T) and a
| binary_tree(T) as arguments respectively.
|
| We can have instances of binary_tree(T) for any type T - for
| example, binary_tree(int), binary_tree(string) or even
| binary_tree(binary_tree(...)).
|
| - T here is said to be a type variable.
| - Variables of any kind in Mercury are distinguished by
| starting with a capital letter.
| - A type may have several type parameters.
| - Any variables that appear in the constructor definitions
| must also appear in the head of the type declaration.
| - All type variables in the head must be distinct.
|
| You can use other types in your definitions. For example, we
| could define a more general tree type with
|
| [begin code]
|
| :- type tree(T)
| ---> tree(T, list(tree(T))).
|
| [end code]
|
| So the type tree(T) has a single constructor tree/2 which
| takes a T and a list(tree(T)) as arguments, respectively
| (list(T) is defined in the list module in the standard
| library.)
|
| (Note that Mercury doesn't get confused if you use the same
| name for both type and constructor.)
|
| PATTERN MATCHING
|
| Pattern matching is the primary means of identifying which
| constructor is used for a particular type value. For example:
|
| [begin code]
|
| :- func insert(binary_tree(T), T) = tree(T).
|
| insert(leaf, X) =
| branch(X, leaf, leaf).
|
| insert(branch(A, L, R), X) =
| ( if X < A then branch(A, insert(L, X), R)
| else if A < X then branch(A, L, insert(R, X))
| else branch(X, L, R)
| ).
|
| [end code]
|
| The first clause matches when the first argument is a leaf/0
| and the second clause matches when the first argument is a
| branch/3.
|
| Alternatively one can use explicit unification as in
|
| [begin code]
|
| :- func insert(binary_tree(T), T) = tree(T).
|
| insert(T0, X) = T :-
| (
| T0 = leaf,
| T = branch(X, leaf, leaf)
| ;
| T0 = branch(A, L, R),
| ( if X < A then
| T = branch(A, insert(L, X), R)
| else if A < X then
| T = branch(A, L, insert(R, X))
| else
| T = branch(X, L, R)
| )
| ).
|
| [end code]
|
| or unification in an if-then-else:
|
| [begin code]
|
| :- func insert(binary_tree(T), T) = tree(T).
|
| insert(T0, X) = T :-
| ( if T0 = leaf then
| T = branch(X, leaf, leaf)
| else
| T0 = branch(A, L, R),
| ( if X < A then
| T = branch(A, insert(L, X), R)
| else if A < X then
| T = branch(A, L, insert(R, X))
| else
| T = branch(X, L, R)
| )
| ).
|
| [end code]
|
| EQUIVALENCE TYPES
|
| It is also useful to be able to declare type synonyms. For
| example:
|
| [begin code]
|
| :- type int_tree == tree(int).
|
| [end code]
|
| Mercury would then make no distinction between a reference to
| int_tree and a reference to tree(int).
|
| Equivalence types can also be parametric:
|
| [begin code]
|
| :- type list_tree(T) == tree(list(T)).
|
| [end code]
|
| ABSTRACT TYPES
|
| Often one needs to export a type, but hide it's definition.
| Abstract types are used for this purpose:
|
| [begin code]
|
| :- module complex.
| :- interface.
| :- import_module float.
|
| :- type complex.
|
| :- func rectangular(float, float) = complex. % re, im.
| :- func polar(float, float) = complex. % r, theta.
| :- func complex + complex = complex.
| :- func complex * complex = complex.
| ...
|
| :- implementation.
|
| :- type complex
| ---> re_im(float, float).
|
| ... definition of funcs using private rectangular
| representation
|
| [end code]
|
| The abstract type declaration in the module interface must be
| matched by a concrete type definition (which may be an
| equivalence) in the implementation section.
|
| The definition of the abstract type is hidden from users of
| the complex module.
|
| Abstract types may also be parametric.
|
| ALGEBRAIC TYPES WITH FIELDS
|
| Just as with C, it is possible to give names to constructor
| fields:
|
| [begin code]
|
| :- type rgb_colour
| ---> rgb(
| red :: float,
| green :: float,
| blue :: float
| ).
|
| [end code]
|
| Now, if X is a value of type rgb_colour then X ^ red,
| X ^ green and X ^ blue are expressions that return the value
| in the corresponding fields. For example, if we have
|
| X = rgb(0.1, 0.2, 0.3)
|
| then
|
| X ^ red = 0.1,
| X ^ green = 0.2,
| X ^ blue = 0.3
|
| A field can be updated in the following way:
|
| Y = ( X ^ red := 0.4 )
|
| results in
|
| Y = rgb(0.4, 0.2, 0.3)
|
| So updating a field is the same as applying a substitution to
| a field and obtaining the new value. We can chain several
| substitutions together:
|
| Y = ((( X ^ red := 0.4 )
| ^ green := 0.5 )
| ^ blue := 0.6 )
|
| * Note that it is currently an error for the same field name to
| be used in type different /types/ in the same scope. However,
| different constructors of a type may have fields with the same
| name. For example:
|
| [begin code]
|
| :- type foo ---> foo(a :: int).
|
| :- type bar ---> bar(a :: int).
|
| [end code]
|
| would lead to an error since types foo and bar are both
| visible in the same scope and both have constructors with a
| common field name. The following is acceptable, however:
|
| [begin code]
|
| :- type baz
| ---> constructor1(a :: int, ...)
| ; constructor2(a :: int, ...)
| ; constructor3(a :: int, ...).
|
| [end code]
|
| * TYPES WITH USER-DEFINED EQUALITY
|
| [XXX Defer explanation until later.]
|
| PREDICATES
|
| In logical terms, a predicate describes a relationship between its
| arguments.
|
| Predicates cover not only tests and functions, as found in other
| languages, but also procedures with multiple return values and
| even non-deterministic relationships.
|
| INTRODUCTION
|
| A predicate is defined by a set of /clauses/, where each
| clause takes the form
|
| Head :- Body.
|
| This should be read as saying "Head is true if Body is true" with
| the set of clauses forming an exhaustive specification of the
| predicate.
|
| If the body of a predicate is empty, one can just write
|
| Head.
|
| These sorts of clauses are called facts.
|
| The body of a predicate clause is a /goal/ - a logical formula
| that must be satisfied in order for the head to hold.
|
| A very simple example of a predicate is
|
| [begin code]
|
| :- pred max(int, int, int). % A, B, Max.
| :- mode max(in, in, out) is det.
|
| max(A, B, Max) :-
| ( if B < A then Max = A else Max = B ).
|
| [end code]
|
| This predicate takes two integers, A and B, and succeeds
| binding Max to A if A is greater than B or binding Max to B
| otherwise. The part of the mode declaration "is det" means
| that this is a deterministic predicate: it will always succeed
| and has just one solution for any pair of inputs.
|
| (This sort of deterministic relation with a single output is
| called a function. Functions are so common that Mercury
| includes special syntax and conventions to simplify working
| with them. More of this in a later section.)
|
| For a more interesting example, consider the following small
| genealogical database:
|
| [begin code]
|
| :- type person
| ---> arthur ; bill ; carl
| ; alice ; betty ; cissy.
|
| [end code]
|
| We start by defining predicates that we can use to decide if a
| particular person is male or female. These predicates are
| labelled semidet because they have at most one solution
| (success) for a given argument.
|
| [begin code]
|
| :- pred male(person).
| :- mode male(in) is semidet.
|
| male(arthur).
| male(bill).
| male(carl).
|
| :- pred female(person).
| :- mode female(in) is semidet.
|
| female(Person) :- not male(Person).
|
| [end code]
|
| The definition for female/1 states that female(Person)
| succeeds iff male(Person) fails. (Recall that variables in
| Mercury are distinguished by starting with a capital letter.)
|
| [begin code]
|
| :- pred father(person, person). % Child, Father.
| :- mode father(in, out) is semidet.
|
| father(betty, arthur).
| father(carl, bill).
| father(cissy, bill).
|
| :- pred mother(person, person). % Child, Mother.
| :- mode mother(in, out) is semidet.
|
| mother(bill, alice).
| mother(carl, betty).
| mother(cissy, betty).
|
| [end code]
|
| The predicates father/2 and mother/2 take their first argument
| as an input and return a result in the second. The
| determinism is semidet again because they are not exhaustive:
| some inputs can cause them to fail (e.g. there is no solution
| for father(
|
| [begin code]
|
| :- pred parent(person, person). % Child, Parent.
| :- mode parent(in, out) is nondet.
|
| parent(Child, Parent) :- father(Child, Parent).
| parent(Child, Parent) :- mother(Child, Parent).
|
| [end code]
|
| This simply says that the parent of a child is either the
| father or the mother. The determinism is nondet because for a
| given Child argument this predicate may fail or may have more
| than one solution:
| - both father/2 and mother/2 can fail
| (e.g. if Child = arthur);
| - just one of father/2 or mother/2 could succeed
| (e.g. if Child = bill);
| - both father/2 and mother/2 could succeed
| (e.g. if Child = cissy).
|
| (How failure and multiple solutions affect the execution of a
| Mercury program is explained below.)
|
| [begin code]
|
| :- pred ancestor(person, person). % Person, Ancestor.
| :- mode ancestor(in, out) is nondet.
|
| ancestor(Person, Ancestor) :-
| parent(Person, Ancestor).
|
| ancestor(Person, Ancestor) :-
| parent(Person, Parent),
| ancestor(Parent, Ancestor).
|
| [end code]
|
| We can now define an ancestor of a Person as being either
| - a parent of that Person or
| - an ancestor of a parent of that Person.
|
| Since parent/2 is nondet, so is ancestor/2.
|
| This table explains the difference between the number of
| solutions a predicate can have for a given determinism:
|
| Determinism Number of Solutions
| Min Max
|
| semidet 0 1
| nondet 0 at least one
| det 1 1
| multi 1 at least one
|
| (There are a few other determinisms, but we shan't cover them
| at this point.)
|
| The compiler will check that your determinism declarations are
| correct.
|
| One interesting thing about this database is that there's no
| reason why it shouldn't be run in reverse. That is, for any
| father or mother, we should be able to deduce who their
| children are and for any ancestor we should be able to work
| out who their descendants are.
|
| To gain this extra functionality we have only to add the
| required extra mode declarations; there is no need to touch
| the definitions themselves! The extra mode declarations are
|
| [begin code]
|
| :- mode father(out, in) is nondet. % Infer Child from Father.
| :- mode mother(out, in) is nondet. % Infer Child from Mother.
| :- mode parent(out, in) is nondet. % Infer Child from Parent.
| :- mode ancestor(out, in) is nondet. % Infer Person from Ancestor.
|
| [end code]
|
| Notice that father/2 and mother/2 are nondet in this mode
| rather than semidet: looking at the definitions we see that
| father(Child, bill) has multiple solutions Child = carl and
| Child = sissy, while mother(Child, betty) also has solutions
| Child = carl and Child = sissy.
|
| There is no reason why a predicate cannot have several output
| arguments (or even no input arguments...) For example, we
| might go on to add
|
| [begin code]
|
| :- pred parents(person, person, person). % Child, Mother, Father.
| :- mode parents(in, out, out) is semidet.
| :- mode parents(out, in, in) is nondet.
|
| parents(Child, Mother, Father) :-
| mother(Child, Mother),
| father(Child, Father).
|
| [end code]
|
| The first mode of parents/3 is semidet because both mother/2
| and father/2 are semidet when called with Child as an input.
|
| The second mode is nondet for similar reasons, but exhibits an
| interesting property. At first glance you might think that
| something will go wrong here: when parents/3 is called in the
| second mode, initially Mother and Father are inputs and Child
| is an output. However, if the call to mother/2 succeeds, then
| Child will end up being bound to the identifier for some
| person. This appears that we would then be trying to call
| father/2 with /both/ arguments as inputs, but there is no such
| mode declaration for father/2. There is no need to worry,
| however - the compiler will recognise the situation and treat
| the definition of parents/3 like this as far as the second
| mode is concerned:
|
| [begin code]
|
| :- mode parents(out, in, in) is nondet.
|
| parents(Child, Mother, Father) :-
| mother(Child, Mother),
| father(X, Father),
| X = Child.
|
| [end code]
|
| So here father/2 is being called with (out, in) arguments and
| the parents/3 succeeds if the result X is bound to the same
| value as Child.
|
| In effect the compiler has deduced that father/2 has the
| following implied mode:
|
| [begin code]
|
| :- mode father(in, in) is semidet.
|
| [end code]
|
| In general, Mercury will reorder each clause body for each
| mode declaration so that results are generated before they are
| needed - each mode of a predicate is referred to as a
| /procedure/.
|
| EXECUTION
|
| This section explains Mercury's execution model in more
| detail. In particular, the notions of failure, backtracking
| and non-determinism are addressed.
|
| Let's start by looking at some of the code for our
| genealogical database again - this time we're going to label
| the clauses to help us see how programs execute in Mercury.
|
| [begin code]
|
| :- pred father(person, person). % Child, Father.
| :- mode father(in, out) is semidet.
| :- mode father(out, in) is nondet.
|
| f1 father(betty, arthur).
| f2 father(carl, bill).
| f3 father(cissy, bill).
|
| :- pred mother(person, person). % Child, Mother.
| :- mode mother(in, out) is semidet.
| :- mode mother(out, in) is nondet.
|
| m1 mother(bill, alice).
| m2 mother(carl, betty).
| m3 mother(cissy, betty).
|
| :- pred parent(person, person). % Child, Parent.
| :- mode parent(in, out) is nondet.
| :- mode parent(out, in) is nondet.
|
| p1 parent(Child, Parent) :- father(Child, Parent).
| p2 parent(Child, Parent) :- mother(Child, Parent).
|
| :- pred ancestor(person, person). % Person, Ancestor.
| :- mode ancestor(in, out) is nondet.
|
| a1 ancestor(Person, Ancestor) :-
| parent(Person, Ancestor).
|
| a2 ancestor(Person, Ancestor) :-
| parent(Person, Parent),
| ancestor(Parent, Ancestor).
|
| :- pred parents(person, person, person). % Child, Mother, Father.
| :- mode parents(in, out, out) is semidet.
| :- mode parents(out, in, in) is nondet.
|
| s1 parents(Child, Mother, Father) :-
| mother(Child, Mother),
| father(Child, Father).
|
| [end code]
|
| First off, we'll consider the goal parent(cissy, Parent).
| Every time we see a goal we try expanding it according to each
| clause definition in turn (this is what being referentially
| transparent is all about). If we get to a dead end we have to
| /backtrack/ to the last point where we had a choice and try a
| different clause (when we backtrack we also forget any
| variable bindings we might have made one the way from the
| /choice point/.)
|
| [begin code]
|
| parent(cissy, Parent)
| p1 =>
| father(cissy, Parent)
| f1 => FAIL because betty \= cissy
| f2 => FAIL because carl \= cissy
| f3 =>
| Parent = bill
|
| [end code]
|
| So one solution to parent(cissy, Parent) is Parent = bill. We
| obtained this by first expanding the goal according to p1 and
| then expanding the resulting goal (father(cissy, Parent))
| according to each of f1, f2 and f3 until we found one that
| matched.
|
| If the program later has to backtrack over this goal then we
| have to forget the binding Parent = bill and try the next
| clause for parent/2 (since there are no more clauses for
| father/2):
|
| [begin code]
|
| parent(cissy, Parent)
| p2 =>
| mother(cissy, Parent)
| m1 => FAIL because bill \= cissy
| m2 => FAIL because bill \= cissy
| m3 => Parent = betty
|
| [end code]
|
| Now, say we wanted to ask the question "which ancestor of
| carl, if any, is also a parent of bill?" Here's how things
| would proceed:
|
| [begin code]
|
| ancestor(carl, X), parent(bill, Y), X = Y
| a1 =>
| parent(carl, X), parent(bill, Y), X = Y
| p1 =>
| father(carl, X), parent(bill, Y), X = Y
| f1 => FAIL because betty \= carl
| f2 =>
| X = bill, parent(bill, Y), X = Y
| =>
| parent(bill, Y), bill = Y
| p1 =>
| father(bill, Y), bill = Y
| f1 => FAIL because betty \= bill
| f2 => FAIL because carl \= bill
| f3 => FAIL because cissy \= bill
| p2 =>
| mother(bill, Y), bill = Y
| m1 =>
| Y = alice, bill = Y
| =>
| bill = alice
| => FAIL because bill \= alice
| m2 => FAIL because carl \= bill
| m3 => FAIL because cissy \= bill
| p2 =>
| mother(carl, X), parent(bill, Y), X = Y
| m1 => FAIL because bill \= carl
| m2 =>
| X = betty, parent(bill, Y), X = Y
| =>
| parent(bill, Y), betty = Y
| p1 =>
| father(bill, Y), betty = Y
| f1 => FAIL because betty \= bill
| f2 => FAIL because carl \= bill
| f3 => FAIL because cissy \= bill
| p2 =>
| mother(bill, Y), betty = Y
| m1 =>
| Y = alice, betty = Y
| =>
| betty = alice
| => FAIL because betty \= alice
| m2 => FAIL because carl \= bill
| m3 => FAIL because cissy \= bill
| a2 =>
| [footnote: we have to supply fresh local vars in each
| expansion - here Ancestor in the definition of a2 has
| been replaced with a new variable Z.]
| parent(carl, Z), ancestor(Z, X), parent(bill, Y), X = Y
| p1 =>
| father(carl, Z), ancestor(Z, X), parent(bill, Y), X = Y
| f1 => FAIL because betty \= carl
| f2 =>
| Z = bill, ancestor(Z, X), parent(bill, Y), X = Y
| =>
| ancestor(bill, X), parent(bill, Y), X = Y
| a1 =>
| parent(bill, X), parent(bill, Y), X = Y
| p1 =>
| father(bill, X), parent(bill, Y), X = Y
| f1 => FAIL because betty \= bill
| f2 => FAIL because carl \= bill
| f3 => FAIL because cissy \= bill
| p2 =>
| mother(bill, X), parent(bill, Y), X = Y
| m1 =>
| X = alice, parent(bill, Y), X = Y
| =>
| parent(bill, Y), alice = Y
| p1 =>
| father(bill, Y), alice = Y
| f1 => FAIL because betty \= bill
| f2 => FAIL because carl \= bill
| f3 => FAIL because cissy \= bill
| p2 =>
| mother(bill, Y), alice = Y
| m1 =>
| Y = alice, alice = Y
| =>
| alice = alice
| => TRUE
|
| [end code]
|
| So our program concludes that a solution to
|
| [begin code]
|
| ancestor(carl, X), parent(bill, Y), X = Y
|
| [end code]
|
| is
|
| [begin code]
|
| X = alice, Y = alice
|
| [end code]
|
| (As was indicated earlier, in practice we'd be more likely to
| write the goal as just
|
| [begin code]
|
| ancestor(carl, X), parent(bill, X)
|
| [end code]
|
| and let Mercury work out where the implicit unification should
| go.)
|
| [The above example might give the impression that Mercury is a term
| rewriting system. This is not true (although conceivably a very
| slow Mercury implementation might use such a technique...) What
| happens behind the scenes is much more efficient, albeit
| computationally equivalent.]
|
| In practice, ninety per cent of the code one writes is
| deterministic and the majority of what's left will be
| semi-deterministic. However, there are times (like in the example
| above) when non-determinism can be used to write very clear,
| concise programs. Support for backtracking and unification is
| what distinguishes Mercury and other logic programming languages
| from functional languages.
|
| RECURSION
|
| Imperative languages like C and Java provide a whole slew of
| mechanisms for supporting iterative (looping) control flow -
| while loops, repeat-until loops, for loops and so forth.
|
| Declarative languages typically provide but one mechanism for
| such things: recursion. A recursive predicate is simply one
| that calls itself. (Later on we will discover /higher order/
| predicates and observe that the standard library supplies
| predicates that stand in for the common iterative mechanisms
| found in imperative languages.)
|
| Some simple examples:
|
| [begin code]
|
| :- func factorial(int) = int.
|
| factorial(N) =
| ( if N = 0 then 1 else N * factorial(N - 1) ).
|
| :- func fibonacci(int) = int.
|
| fibonacci(N) =
| ( if N =< 2 then 1
| else fibonacci(N - 1) + fibonacci(N - 2) ).
|
| [end code]
|
| These are examples of what is sometimes called "middle
| recursion" where the recursive call is /not/ the last thing
| that the predicate (or function) does when looping. Here,
| calls to factorial/1 finishes with a multiplication and calls
| to fibonacci/1 finish with an addition.
|
| Although middle recursion is easy to read, it incurrs some
| performance penalty in that each iteration has to record
| something extra on the stack in order to finish up the
| computation (that said, the compiler can in many cases turn
| middle recursive predicates into equivalent tail recursive
| predicates that do not incurr a stack overhead.)
|
| [XXX Include a side-bar or something explaining stack frames?]
|
| Tail recursion describes the situation where the last thing a
| predicate does is call itself. Since there is nothing left to
| do after the call, the compiler can reuse the current call's
| stack space for the recursive call, allowing the predicate to
| execute using only a fixed amount of stack space. For
| example, tail recursive implementations of the above
| functions are
|
| [begin code]
|
| factorial(N) = fac(N, 1).
|
| fac(N, X) = ( if N = 0 then X else fac(N - 1, N * X) ).
|
|
|
| fibonacci(N) = fib(1, 1, N).
|
| fib(FN_2, FN_1, N) =
| ( if N =< 2 then FN_1
| else fib(FN_1, FN_1 + FN_2, N - 1) ).
|
| [end code]
|
| UNIFICATIONS
|
| Mercury has but two basic atomic (non-compound) types of goal:
| unifications and calls.
|
| A unification is written X = Y. A unification can fail if X
| and Y are not unifiable.
|
| Two values are unifiable if they are "structurally similar" -
| that is, where you see a type constructor in one, you either
| see the same type constructor in the other (and the
| corresponding arguments are also structurally similar) or a
| variable, which will end up being bound to the corresponding
| term on the other side if the unification is successful.
|
| (Note that unlike Prolog, Mercury forbids the aliasing of
| variables where a partially instantiated data structure
| contains the same unbound variable in two different places.)
|
| Unifications, therefore, can be used to bind variables to
| values, test to see if a variable is bound to a particular
| constructor, unpack the arguments of a constructor or all of
| the above.
|
| Examples:
|
| X = 123
| - if X is unbound then it succeeds binding X to the
| value 123;
| - if X is bound then it succeeds iff the value X is
| bound to is also 123.
|
| X = Y
| - if one of X or Y is bound and the other is not then
| this succeeds with the unbound variable being bound
| to the same value as the bound variable;
| - if both are bound then this succeeds iff the two
| variables are bound to identical values.
| [XXX Should we mention partial instantiation?]
|
| X = foo(A, B, C)
| - if X is unbound and A, B and C are bound, then this
| succeeds binding X to the value foo(A, B, C) (here
| foo/3 is a type constructor);
| - if X is bound then this succeeds iff X is bound to a
| value foo(P, Q, R) and the unifications A = P, B = Q
| and C = R succeed - this may result in some of A, B
| and C being bound as a result if they are unbound at
| the time of the unification.
|
| CALLS
|
| The other kind of primitive goal supported by Mercury is the
| predicate call, p(X1, ..., Xn), which we have already seen in
| the examples above.
|
| CONJUNCTION
|
| A goal of the form G1, G2, ..., Gn is called a conjunction
| with the separating commas read as "and". A conjunction
| succeeds iff a consistent solution to each of the sub-goals G1,
| G2, ..., Gn can be found by the program.
|
| (The compiler may have to reorder the sequence of goals in a
| clause in order to satisfy the mode constraints. Although one
| can set a flag to force the compiler to do no more reordering
| than is necessary, in general this will mean that many
| optimizations will not be applicable. The upshot of this is
| that one should avoid writing code that assumes a particular
| evaluation order other than that dictated by mode
| constraints.)
|
| A conjunction executes by trying each of the goals in order.
| If a goal fails then the program backtracks to the nearest
| preceding choice-point (i.e. non-deterministic goal that may
| have other solutions).
|
| NEGATION
|
| A goal of the form not G succeeds iff G has no solutions. G
| may be a compound goal, in which case it should be enclosed in
| parentheses to avoid syntactic precedence problems.
|
| The sub-goal G is said to be in a /negated context/ and as
| such cannot bind any variables visible outside the negation
| (since the only way not G can succeed is if G fails.)
|
| IF-THEN-ELSE
|
| Mercury's if-then-else construct looks like this:
|
| [begin code]
|
| ( if ConditionGoal then YesGoal else NoGoal )
|
| [end code]
|
| (You may also see if-then-elses written as
|
| [begin code]
|
| ( ConditionGoal -> YesGoal ; NoGoal )
|
| [end code]
|
| although the author finds this style less appealing.)
|
| Note that the else clause is /not/ optional in Mercury. Also,
| while the parentheses are not always required, it is a very
| good idea to include them in order to avoid confusing errors
| with syntactic precedence.
|
| An if-then-else goal works just as you would expect, with the
| following provisos:
| - if the ConditionGoal is non-deterministic and the YesGoal
| fails then the program will backtrack into the
| ConditionGoal;
| - if there is any solution to the ConditionGoal then the
| NoGoal will never be executed;
| - the ConditionGoal may not bind any variables that are
| visible outside the if-then-else (logically the
| ConditionGoal is in a negated context and the same rules
| apply as for ordinary negated goals - however, the YesGoal
| /can/ bind externally visible variables to results obtained
| in the ConditionGoal);
| - an if-then-else goal can fail in its entirety just like any
| other compound goal.
|
| DISJUNCTION
|
| Just as conjunction lets you use "and" to construct goals,
| disjunction lets you use logical "or". A disjunctive goal
| takes the form (G1 ; G2 ; ... ; Gn) with the semicolons being
| read as "or".
|
| A disjunction succeeds iff any of its disjunct sub-goals
| succeeds. A disjunction has as many solutions as all of its
| disjuncts put together: if one disjunct fails or backtracking
| exhausts all the solutions for one disjunct then execution
| proceeds with the next disjunct. Again, the compiler is
| generally free to reorder disjuncts, although this should not
| have a visible impact on programs. Disjunctions are typically
| non-deterministic, although switch disjunctions described
| below are a special case.
|
| The clausal notation we have been using in the examples above
| is in fact convenient syntactic sugar for writing top-level
| disjunctions. For example, the ancestor/2 predicate we
| defined earlier
|
| [begin code]
|
| ancestor(Person, Ancestor) :-
| parent(Person, Ancestor).
|
| ancestor(Person, Ancestor) :-
| parent(Person, Parent),
| ancestor(Parent, Ancestor).
|
| [end code]
|
| could be written equivalently as
|
| [begin code]
|
| ancestor(Person, Ancestor) :-
| (
| parent(Person, Ancestor)
| ;
| parent(Person, Parent),
| ancestor(Parent, Ancestor)
| ).
|
| [end code]
|
| In general it is better style to use clausal form for
| top-level disjunctions.
|
| SWITCHES
|
| Mercury recognises particular forms of (semi-)deterministic
| disjunction which it can compile very efficiently.
|
| The "string" library module defines the following type:
|
| [begin code]
|
| :- type poly_type
| ---> f(float)
| ; i(int)
| ; s(string)
| ; c(char).
|
| [end code]
|
| which can be used to form heterogeneous collections of the
| primitive types (this is useful amongst other things for
| supplying argument lists to string formatting predicates.)
|
| Now, say we wanted to write a predicate that would convert any
| poly_type value into a string representation. Here's how we
| might write the code:
|
| [begin code]
|
| :- pred poly_type_to_string(poly_type, string).
| :- mode poly_type_to_string(in, out) is det.
|
| poly_type_to_string(Poly, String) :-
| ( Poly = f(Float), float_to_string(Float, String)
| ; Poly = i(Int), int_to_string(Int, String)
| ; Poly = s(String)
| ; Poly = c(Char), char_to_string(Char, String)
| ).
|
| [end code]
|
| The compiler knows that since Poly is an input variable it
| must be bound when the disjunction executes. It will then
| observe that the disjunction has one arm for each possible
| poly_type constructor (the Poly unifications) and deduce that
| for any value of Poly, exactly one disjunct applies. The
| compiler can therefore emit faster, simpler code for the
| resulting object code for this disjunction. These kinds of
| disjunction are called /switches/ because they are very
| similar to switch constructs in C. Moreover, because the rest
| of each disjunct is deterministic, the compiler can deduce
| that the switch (and hence the predicate as a whole) must be
| deterministic.
|
| Switches are a very elegant way of describing conditions based
| on unification tests and are typically more efficient than the
| equivalent if-then-else chains.
|
| (Aside: in real the above code would probably be written in
| clausal form:
|
| [begin code]
|
| poly_type_to_string(f(F), S) :- float_to_string(F, S)
| poly_type_to_string(i(I), S) :- int_to_string(I, S)
| poly_type_to_string(s(S), S).
| poly_type_to_string(c(C), S) :- char_to_string(C, S)
|
| [end code]
|
| The switch would still be detected, of course.)
|
| IMPLICATION
|
| Mercury has three types of goal for describing implicative
| relationships between goals.
|
| - G1 => G2 is shorthand for (not G1 ; G2);
| - G1 <= G2 is shorthand for (G2 => G1); and
| - G1 <=> G2 is shorthand for ((G1 => G2), (G1 <= G2)).
|
| Note that parentheses are required around G1 and G2 if they
| are not atomic goals.
|
| EXISTENTIAL QUANTIFICATION
|
| Sometimes we only need to know whether a solution exists, but
| are not interested in the result. For this we use existential
| quantification, which looks like this:
|
| (some [X, Y, Z] G)
|
| A goal of this form will succeed iff there is a solution to G,
| but any bindings for X, Y and Z will not be visible outside
| the quantification - it's rather like saying that X, Y and Z
| are local variables for the goal G.
|
| Mercury has an rule that any variables in a clause that do not
| also appear in the head are implicitly existentially
| quantified, which means you never actually need to use
| explicit existential quantification in your programs.
|
| UNIVERSAL QUANTIFICATION
|
| On the other hand, we may wish to know whether a particular
| property holds for all solutions to a particular goal. This
| is where universal quantification is useful.
|
| The goal
|
| (all [X, Y, Z] G)
|
| is equivalent to writing
|
| not (some [X, Y, Z] not G)
|
| An example of use:
|
| [begin code]
|
| :- pred all_even(list(int)).
| :- mode all_even(in) is semidet.
|
| all_even(Xs) :-
| all [X] (member(X, Xs) => X `mod` 2 = 0).
|
| [end code]
|
| HIGHER ORDER APPLICATION
|
| [XXX I'll talk about this later.]
|
| NOTE ON VARIABLE NAMES
|
| Often one is not interested in a particular output variable
| from a call or unification. In these cases you can use the
| special variable named "_" (a single underscore) which stands
| for a different "don't care" variable every time it appears.
|
| Sometimes, however, it makes programs easier to read if you do
| name don't care variables. Since variables that only appear
| once in a clause are usually the result of typographical
| error, the compiler will issue a warning when it sees such
| things. To get around this problem, /prefixing/ a variable
| name with an underscore (e.g. "_X") tells the compiler that
| you know this is a named don't care variable and that there's
| no need to issue a warning.
|
| FUNCTIONS
| DEFINITION
| PATTERN MATCHING
| RECURSION
| CONDITIONAL EXPRESSIONS
| OVERVIEW TO SEMIDET PREDICATES
| POLYMORPHISM
| * PARTIAL (SEMIDET) FUNCTIONS
| INFIX NOTATION AND SYNTACTIC SUGAR IN GENERAL
|
| INPUT AND OUTPUT
| UNIQUENESS
| * DETERMINISM RESTRICTIONS
| * DCG NOTATION IN DETAIL
|
| MODES
| DIRECTION OF DATA FLOW
| DETERMINISM CATEGORIES
| COMMITTED CHOICE
| * SUBTYPING
| * UNIQUE MODES
| DETERMINISM RESTRICTIONS
| MOSTLY UNIQUE MODES
| HIGHER ORDER MODES
| THE STANDARD FUNC MODE
| STANDARD FUNCS AND THE GROUND INST
|
| MODULES
| NAMESPACES
| OVERLOADING AND NAME RESOLUTION
| SUBMODULES
| NESTED
| SEPARATE
|
| HIGHER ORDER PROGRAMMING
| HIGHER ORDER APPLICATION
| PRED AND FUNC MODES
| * MONOMORPHISM RESTRICTION
|
| * TYPE CLASSES
| OO PROGRAMMING
| TYPE CLASS DECLARATIONS
| METHOD SIGNATURES
| TYPE CLASS CONSTRAINTS
| INSTANCE DECLARATIONS
| METHOD IMPLEMENTATIONS
| TYPE CLASS CONSTRAINTS
| EXISTENTIALLY QUANTIFIED TYPES
| USE
| WHY OUTPUT ONLY
| ...CONSTRUCTOR CLASSES
| ...FUNCTIONAL DEPENDENCIES
| RESTRICTIONS AND EXPLANATIONS THEREOF
| ON TYPE CLASS DEFINITIONS
| ON INSTANCE DEFINITIONS
|
| LISTS
| DESCRIPTION
| MAP, FOLD & MEMBER
|
| MAPS
|
| ARRAYS
|
| COMPILING PROGRAMS
| MMAKE
| COMPILER FLAGS
| COMPILATION GRADES
|
| * STORES
|
| * EXCEPTIONS
| THROWING
| CATCHING
| EFFECT ON DETERMINISM
| RESTORING (PLAIN) DETERMINISM (PROMISE_ONLY_SOLUTION)
|
| * FOREIGN LANGUAGE INTERFACE
| DECLARATIONS
| DATA TYPES
|
| * IMPURE CODE
| LEVELS OF PURITY
| EFFECT OF IMPURITY ANNOTATIONS
| PROMISING PURITY (PRAGMA PROMISE_PURE)
|
| * PRAGMAS
| INLINING
| TYPE SPECIALIZATION
| OBSOLESCENCE
| MEMOING
| ...PROMISES
|
| * DEBUGGING
| COMPILING FOR DEBUGGING
| BASIC TOUR OF THE DEBUGGER
| DECLARATIVE DEBUGGING
|
| * OPTIMIZATION
| WHEN TO DO IT AND WHEN TO AVOID IT
| PROFILING
| VARIOUS CONSIDERATIONS
| AN OVERVIEW OF CONTEMPORARY OPTIMIZER TECHNOLOGY
|
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to: mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions: mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------
More information about the developers
mailing list