[m-dev.] Inductive goals

Mark Brown mark at mercurylang.org
Sat Jan 18 15:00:44 AEDT 2014
Previous message: [m-dev.] map.merge argument order
Next message: [m-dev.] Inductive goals
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi everyone,

Here is a concrete proposal for "inductive goals" in Mercury. I'm
going to avoid calling these "loops", since logic programmers
understandably take a dim view of these (hint: if it terminates, it's
not really a loop). The goals are called inductive because they are
logically equivalent to the conjunction of a sequence of goals, with
this sequence being inductively defined. They could also be called
"for" goals, since that is the keyword they begin with.

This follows on from the "Adding Loops to Mercury" thread earlier this
month, but I've started a new thread for this specific proposal. This
is to put the design on the record. I'm not planning to implement it
myself at this time.

Would this meet people's requirements?

Cheers,
Mark.


Inductive Goals
----------------------

1. Background

The declarative semantics is essentially that of the "more
complicated" version of sequence quantifiers that Fergus Henderson
writes about [1], which are based on Peter Schachte's Sequence
Quantification [2]. The operational semantics are derived from
Sebastian Brand's generic inline iteration proposal [3], which in turn
is derived from Joachim Schimpf's Logical Loops [4].

The main challenge for Mercury is to perform the Logical Loops
translation in such a way that precise determinisms are preserved.
This proposal achieves that by using available mode information to
direct the translation.

[1] http://lists.mercurylang.org/pipermail/developers/2000-December/010812.html
[2] http://lists.mercurylang.org/pipermail/developers/2014-January/015962.html
[3] http://lists.mercurylang.org/pipermail/developers/2005-December/013919.html
[4] http://www.eclipseclp.org/reports/index.html

2. Sequences

We start by defining two kinds of sequence: which I will call map
sequences and fold sequences. For a given sequence of values, the map
sequence is just those values in the same order, while the fold
sequence is the sequence of consecutive pairs of values. So for the
values 1, 2, 3, 4, 5, we have:

elements of map sequence: 1, 2, 3, 4, 5
elements of fold sequence: (1, 2), (2, 3), (3, 4), (4, 5)

The actual values involved will be defined inductively by the goal body.

Two or more sequences "correspond" if they are all of the same length.
Note that this implies that fold sequences contain one more value than
corresponding map sequences, since they include both the initial and
final value.

3. Syntax

Inductive goals are written like this:

    for $sequence, ... mismatch $mismatch do $body

The intuitive meaning of this is to iterate over all sequences in
lockstep, executing $body for each set of corresponding elements, and
executing $mismatch if the sequences don't correspond. The mismatch
goal is optional, and defaults to throwing a software error. If given,
the mismatch goal is not allowed to succeed (i.e., it must have
determinism failure or erroneous).

Sequences are written in one of the following forms:

    $X maps $Xs
    [$Current, $Next] folds [$First, $Last]
    [$A0, $A]

The first says that $X is an element in the map sequence defined by
$Xs. The second says that ($Current, $Next) is an element in the fold
sequence whose first value is $First and whose last value is $Last.
The third is just shorthand for '[$A0, $A] folds [$A0, $A]', and can
be used if $A0 and $A are Mercury variables (the other forms can be
used with arbitrary Mercury expressions).

Mercury variables on the left hand side of 'maps' or 'folds' are
quantified in the body. Those on the right hand side are free (i.e.,
non-local to the inductive goal). For the shorthand form, $A0 and $A
stand for two pairs of variables with the same names, one pair of
which is quantified in the body while the other is free.

4. Examples

Here's how to implement a few of the existing list.m library predicates.

all_true(P, Xs) :- for X maps Xs do P(X).

map(P, Xs, Ys) :- for X maps Xs, Y maps Ys do P(X, Y).

length(Xs, N) :- for _ maps Xs, [I, I + 1] folds [0, N] do true.

map_corresponding3_foldl(P, As, Bs, Cs, Ds, !S) :-
    for A maps As, B maps Bs, C maps Cs, D maps Ds, [!S]
    mismatch throw("list.map_corresponding3_foldl: length mismatch")
    do P(A, B, C, D, !S).

The code for map3_foldl is the same as for map_corresponding3_foldl,
except that the mismatch clause isn't needed. These predicates are
effectively different modes of the same inductive goal.

Sequences that can fail can be useful:

take(N, List, Start) :-
    for
        [I, I + 1] folds [0, N],
        [[X | Xs0], Xs0] folds [List, _],  % can fail
        X maps Start
    do true.

The do goal can refer to non-local variables, although it is not
allowed to bind them. This example refers to the non-local Stream
variable to print the elements in a list:

    write_elements(Stream, Xs, !IO) :-
        for X maps Xs, [!IO] do
            write(Stream, X, !IO),
            nl(Stream, !IO).

Filtering is not done by the construct itself, but is easily done in the body:

filter_map_corresponding(F, As, Bs) = Cs :-
    for A maps As, B maps Bs, [Cs1, Cs0] folds [Cs, []]
    mismatch throw("filter_map_corresponding/2: length mismatch")
    do
        ( if C = F(A, B) then
            Cs1 = [C | Cs0]
        else
            Cs1 = Cs0
        ).

Here's an example that calls P on all the unordered pairs of elements
of list Xs, with accumulator !A:

    for [[X | Tail], Tail] folds [Xs, []], [!A] do
        for Y maps Tail, [!A] do
            P(X, Y, !A).

Here's an example of iterating over a sequence defined by the
functions empty/0, next/1 and value/1:

    for [S, next(S)] folds [Start, empty], [!IO] do write(value(S), !IO).

Here's a similar example using function empty/0 and predicate next/3:

    for [!S] folds [Start, empty], [!IO] do next(I, !S), write(I, !IO).

One thing that inductive goals can't do is use a predicate to decide
if the sequence is empty. The test for whether the sequence is empty
is done by unifying the sequence with a specific value. So an integer
sequence can end with a test like I = N, but not one like I > N. The
same effect, however, can be obtained by adding an extra sequence of
booleans that say whether the induction should continue:

    % Counts while $Test succeeds.
    for [I, I + 1] folds [0, _], [!Continue] folds [yes, no] do
        ( if $Test then $Body else !:Continue = no ).

As a final example, inductive goals can also be used to generate
sequences of all lengths. This goal generates all lists of 0s and 1s:

    for X maps Xs do ( X = 0 ; X = 1 ).

5. Semantics

The meaning of the inductive goal is given by universally quantifying
over all sequence positions. So for example the goal:

    for
        $X maps $Xs,
        [$Current, $Next] folds [$First, $Last],
        [$A0, $A],
        ...
    do
        $Body

means the following:

    all [I] (
        $X is the I'th element of $Xs  /\
        $Current is the I'th value of the first fold sequence  /\
        $Next is the (I+1)'th value of the first fold sequence  /\
        $A0 is the I'th value of the second fold sequence  /\
        $A is the (I + 1)'th value of the second fold sequence  /\
       ...
    =>
        $Body
    )

6. Types and Modes

Types are determined by the type signatures of the sequence
expressions, which are as follows:

    all [T] (T maps list(T))
    all [T] ([T, T] folds [T, T])

These ensure each sequence is typed consistently, but allow different
sequences to have unrelated types.

Like types, modes are determined by the mode signatures of the
sequence expressions. For each form of sequence there is a selection
of modes to choose from. This choice is made for each sequence
independently. The modes are as follows:

    % Input modes:
    in(I) maps in(list(I))
    [in(I), out(I)] folds [in(I), in(I)]

    % Output modes:
    out(I) maps out(list(I))
    [in(I), out(I)] folds [in(I), out(I)]
    [mdi, muo] folds [mdi, muo]
    [di, uo] folds [di, uo]
    [out(I), in(I)] folds [out(I), in(I)]

Sequences with input modes determine the number of iterations that are
generated. Sequences with output modes work with any number of
iterations.

In order to schedule an inductive goal, the mode analyser needs to
schedule all sequences at the same time. This involves checking that,
for each fold sequence, at least one of the terms on the right hand
side is instantiated. Scheduling should probably be delayed until as
many sequences are in the input mode as possible, which will ensure
the strongest determinism result. Note that all inst variables will be
bound when the goal is scheduled, so inst variables won't need to
appear in generated code.

7. Determinism

In Mercury, the determinism of a goal or expression is specified by
the answers to three questions:

    - Can the goal or expression fail?
    - Can the goal produce multiple solutions? (Expressions aren't
allowed to do this.)
    - Does the goal or expression have to be evaluated in a committed
choice context?

For any of these questions, if the answer is yes for the body goal or
the head expressions on the right hand side of the sequence operators,
then it is yes for the inductive goal as a whole. This is also true
for the expressions on the left hand side of the sequence operators,
with one exception. The exception is that, in the input mode for
'folds', the first argument on the left hand side is never tested
against the value of the second argument on the right hand side; had
it taken that value, the inductive goal would already have terminated.
So this expression does not itself need to be proven never to fail. If
it does actually ever fail on some other value, the inductive goal
either fails or the mismatch goal is executed (behaviour in this
respect is undefined - in the translation below it fails if all
sequences are output, but throws the mismatch exception in all other
modes).

Aside from the previous paragraph, the only other way an inductive
goal can fail is if the mismatch goal fails, which can only occur if
it has a failure determinism. The only other way an inductive goal can
produce multiple solutions is if all sequences are in the output mode.
There are no other circumstances in which an inductive goal must be
evaluated in a committed choice context.

8. Translation

Inductive goals are translated at compile time into a call to a
compiler generated predicate. The predicate is passed the following
arguments:

    - One argument for each non-local variable occurring in the body
or in the mismatch goal.
    - One argument for each map sequence.
    - Two arguments for each fold sequence.

In the first case, the arguments are the non-local variable
themselves. In the other two cases, the arguments are those
expressions appearing on the right hand side of the relevant sequence.

For example, Separator is non-local in the following goal:

    for X maps Xs, [!IO] do write(X, !IO), write_string(Separator, !IO)

This is translated into the call:

    $predname(Separator, Xs, !IO)

where $predname is the name of the generated predicate.

In the signature of the generated predicate, the types of the
parameters are determined directly from the types of the arguments
listed above. For each of the non-local arguments the mode is 'in(I)',
where I is the inst of the non-local at the time the call is
scheduled. For the other arguments, the mode is determined by the
selected mode of the sequence in question. The determinism can be
inferred according to the rules given earlier.

For the example above, $predname would be given the following signature:

    :- pred $predname(string::in, list(T)::in, io::di, io::uo) is det.

The head of the generated clause uses the same names for the non-local
head variables as are used in the original goal. For map sequences,
the head variables are named _M1, _M2, etc, and for fold sequences
they are named _Fa1, _Fb1, _Fa2, _Fb2, etc.

For each sequence we define two goals, the end goal and the step goal.
The end goal defines the end of the sequence, while the step goal
defines each step along the way. For i'th map sequence '$X maps $Xs',
the goals are:

    end goal:    _Mi = []
    step goal:   _Mi = [$X | _Mri]

where _Mri is a fresh variable. For the i'th fold sequence '[$Current,
$Next] folds [$First, $Last]', the goals are:

    end goal:    _Fai = _Fbi
    step goal:   _Fai = $Current

The generated clause makes one recursive call to itself. The
non-locals are passed as is. For the i'th map sequence, the recursive
argument is _Mri. For the i'th fold sequence, the two arguments are
$Next and _Fbi.

The general form of the generated clause body depends on which of the
sequences are used in an input mode, and which are output. If there is
at least one input sequence, the body has this general form:

    ( if
        conjunction of input end goals
    then
        conjunction of output end goals
    else if
        conjunction of input step goals
    then
        conjunction of output step goals,
        $Body,
        recursive call
    else
        $Mismatch
    ).

If all of the sequences are output, the body has this general form:

    (
        conjunction of end goals
    ;
        conjunction of step goals,
        $Body,
        recursive call
    ).

The example from above is repeated here:

    for X maps Xs, [!IO] do write(X, !IO), write_string(Separator, !IO)

For this inductive goal, which is called with the first sequence in
the input mode, we generate the following clause (with state variable
!IO expanded out to _IO_i):

    $predname(Separator, _M1, _Fa1, _Fb1) :-
        ( if
            _M1 = []
        then
            _Fa1 = _Fb1
        else if
            _M1 = [X | _Mr1]
        then
            _Fa1 = _IO_0,
            write(X, _IO_0, _IO_1),
            write_string(Separator, _IO_1, _IO_2),
            $predname(Separator, _Mr1, _IO_2, _Fb1)
        else
            error("length mismatch in inductive goal")
        ).

9. Example translations

Here are the translations for a few of the examples from earlier. The
declarations include the determinism that ought to be inferred for the
introduced predicates.

Here is length/2 again:

    length(Xs, N) :- for _ maps Xs, [I, I + 1] folds [0, N] do true.

In the (in, out) mode, this translates to:

    length(Xs, N) :- aux_1(Xs, 0, N).

    :- pred aux_1(list(T)::in, int::in, int::out) is det.

    aux_1(_M1, _Fa1, _Fb1) :-
        ( if
            _M1 = []
        then
            _Fa1 = _Fb1
        else if
            _M1 = [_ | _Mr1]
        then
            _Fa1 = I,
            true,
            aux_1(_Mr1, I + 1, _Fb1)
        else
            error("length mismatch in inductive goal")
        ).

Here is map_corresponding3_foldl/7 again:

    map_corresponding3_foldl(P, As, Bs, Cs, Ds, !S) :-
        for A maps As, B maps Bs, C maps Cs, D maps Ds, [!S]
        mismatch throw("list.map_corresponding3_foldl: length mismatch")
        do P(A, B, C, D, !S).

The mode of the sequences is (in, in, in, out, out), so we generate this:

    :- pred aux_2(
        pred(A, B, C, D, S, S)::in(pred(in, in, in, out, in, out) is det),
        list(A)::in, list(B)::in, list(C)::in, list(D)::out,
        S::in, S::out) is det.

    aux_2(P, _M1, _M2, _M3, _M4, _Fa1, _Fb1) :-
        ( if
            _M1 = [], _M2 = [], _M3 = []
        then
            _M4 = [],
            _Fa1 = _Fb1
        else if
            _M1 = [A | _Mr1], _M2 = [B | _Mr2], _M3 = [C | _Mr3]
        then
            _M4 = [D | _Mr4],
            _Fa1 = _S_0,
            P(A, B, C, D, _S_0, _S_1),
            aux_2(P, _Mr1, _Mr2, _Mr3, _Mr4, _S_1, _Fb1)
        else
            throw("list.map_corresponding3_foldl: length mismatch")
        ).

If the mode of the sequences was (in, out, out, out, in, out), and
without a mismatch clause, then we would instead generate code like
map3_foldl:

    :- pred aux_3(
        pred(A, B, C, D, S, S)::in(pred(in, out, out, out, in, out) is det),
        list(A)::in, list(B)::out, list(C)::out, list(D)::out,
        S::in, S::out) is det.

    aux_3(P, _M1, _M2, _M3, _M4, _Fa1, _Fb1) :-
        ( if
            _M1 = []
        then
            _M2 = [], _M3 = [], _M4 = [],
            _Fa1 = _Fb1
        else if
            _M1 = [A | _Mr1],
        then
            _M2 = [B | _Mr2], _M3 = [C | _Mr3], _M4 = [D | _Mr4],
            _Fa1 = _S_0,
            P(A, B, C, D, _S_0, _S_1),
            aux_3(P, _Mr1, _Mr2, _Mr3, _Mr4, _S_1, _Fb1)
        else
            error("length mismatch in inductive goal")
        ).

Here is take/3 again:

    take(N, List, Start) :-
        for
            [I, I + 1] folds [0, N],
            [[X | Xs0], Xs0] folds [List, _],  % can fail
            X maps Start
        do true.

We would generate:

    :- pred aux_4(list(T)::out, int::in, int::in, list(T)::in,
list(T)::out) is semidet.

    aux_4(_M1, _Fa1, _Fb1, _Fa2, _Fb2) :-
        ( if
            _Fa1 = _Fb1
        then
            _M1 = [],
            _Fa2 = _Fb2
        else if
            _Fa1 = I
        then
            _M1 = [X | _Mr1],
            _Fa2 = [X | Xs0],
            true,
            aux_4(_Mr1, I + 1, _Fb1, Xs0, _Fb2)
        else
            error("length mismatch in inductive goal")
        ).

Note that the unification of _Fa2 and [X | Xs0] can fail, making the
whole predicate semidet. This occurs when there are not enough
elements in List for the given value of N.

Here's the goal to generate all lists of zeroes and ones:

    for X maps Xs do ( X = 0 ; X = 1 )

We would generate:

    :- pred aux_5(list(int)::out) is multi.

    aux_5(_M1) :-
        (
            _M1 = []
        ;
            _M1 = [X | _Mr1],
            ( X = 0 ; X = 1 ),
            aux_5(_Mr1)
        ).
Previous message: [m-dev.] map.merge argument order
Next message: [m-dev.] Inductive goals
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the developers mailing list