[m-dev.] proposal

Peter Schachte schachte at cs.mu.OZ.AU
Fri Aug 27 13:41:34 AEST 2004


Hi all,

Here is my proposal for resources in Mercury.  Comments are very
welcome.

This doesn't cover the problem of ensuring that resource usage follows
a policy (except for very simple policies).  I'll address that later.
Also, I'm not really happy yet with the syntax, particularly the
'using' operator.  I'm happy to change the syntax, but it's probably
better to focus on the semantics first.

Ralph:  could you please forward this to the HAL mailing list (and add
me to that list, as well)?  Thanks.

-- 
Peter Schachte              So easily do weak men put in high positions turn
schachte at cs.mu.OZ.AU        villains.
www.cs.mu.oz.au/~schachte/      -- Dmitry Pisarev 
Phone: +61 3 8344 1338      
-------------- next part --------------
Motivation

Mercury's state variable (!) notation makes threading state through a program
considerably easier.  Its use can make a program much more succinct and
readable by removing as many as half of the apparent arguments in a goal or
clause head, and by removing the need for many "tie the knot" unifications.
The price paid for this is that in clauses written with state variables,
conjunction is not commutative (or, if you prefer, comma is not conjunction).
This seems a fair price to pay for more readable code.

However, state variables only partially solve some of these problems, and
create a new one.  These include:

     o	It is still necessary to change every call and every clause head of
        every predicate involved when a new value needs to be threaded.  It
        is often the case that predicates that thread one value through will
        need to thread others later.  It would be better if only the :- pred
        declarations needed to be changed to thread more or fewer values.

     o	There is no abstraction of what is being threaded:  the type and modes
        must be repeated everywhere the value flows.  State variables are a
        useful ad hoc tool for localized dataflow, but for more pervasive
        threading a formal, checked declaration of the dataflow would be
        useful documentation.  Specifying types and modes of threaded values
        in only one place also prevents (rather than just detecting) type or
        mode errors, and makes maintenance easier.

     o	With state variable syntax, the apparent arity of a predicate call
        differs from that of the :- pred declaration.

     o	State variables cannot hide a dataflow in cases where the user does
        not need to know about it, such as a constraint store.  This leaves
        implementors of constraint solvers to use impurity to hide that
        flow, which seems rather heavy-weight for this purpose.

     o	State variables cannot help the programmer find where values are
        threaded but not actually needed.

This proposal seeks to address those problems.


Proposal

We propose adding a declaration:

	:- resource Name : Type :: Inst.

eg,

	:- resource var_supply : int :: ground.

Since the specified Inst will usually be ground, that should be the default
if the ":: Inst" part is omitted.  Note that resource names are atoms; this
syntactically distinguishes them from state variables.

In some cases, there will be a particular initial value for a resource that
will most often be used when the resource's scope is entered.  In this case,
one may also specify an initial value by ending the resource declaration with
"= Expression."  For example, a resource holding a set of students that
passed a subject might be specified as:

	:- resource passing_students : set(student) = empty.

To use a resource, we name it in some pred or func declarations:

	:- pred Name(Types...) using [Resource, ..., Resource].

The compiler transforms this into ordinary Mercury code by appending to the
argument list, for each listed resource Name : Type :: Inst, two arguments of
type Type.  Then for each mode, an argument of mode Inst>>Inst and one of
mode free>>Inst is added.  If Inst is unique, then unique>>dead and
free>>unique are added instead.  Note that the programmer need add nothing to
the mode declarations, and nothing but the resource name to the predicate
declaration.  For example,

	:- pred compute_marks(list(studinfo), list(int))
		using [passing_students].
	:- mode compute_marks(in, out) is det.

would be transformed to

	:- pred compute_marks(list(studinfo), list(int), 
			      set(student), set(student)).
	:- mode compute_marks(in, out, in, out) is det.

Sometimes one only wishes to pass a resource into a predicate; ie, it is
read-only for that predicate.  And occasionally a predicate treats a resource
as write-only.  For these cases, we may precede a resource name in a using
clause with !. or !: for input only and output only resources, respectively.
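
To make the three shapes concrete, here is a minimal hand-written sketch of
the ordinary Mercury that the transformation might produce.  The resource
name 'limit' and the predicates set_limit, get_limit and bump_limit are
hypothetical, and the code assumes present-day Mercury module syntax
(io.print rather than io__print), which goes beyond the proposal itself.

```mercury
:- module resource_shapes.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

    % Hypothetical source:
    %   :- pred set_limit(int::in) using [!:limit] is det.
    % Output-only: a single added argument of mode out.
:- pred set_limit(int::in, int::out) is det.

set_limit(New, Limit) :-
    Limit = New.

    % Hypothetical source:
    %   :- pred get_limit(int::out) using [!.limit] is det.
    % Input-only: a single added argument of mode in.
:- pred get_limit(int::out, int::in) is det.

get_limit(Value, Limit) :-
    Value = Limit.

    % Hypothetical source:
    %   :- pred bump_limit using [limit] is det.
    % Read-write: an (in, out) argument pair.
:- pred bump_limit(int::in, int::out) is det.

bump_limit(Limit0, Limit) :-
    Limit = Limit0 + 1.

main(!IO) :-
    set_limit(5, Limit0),
    bump_limit(Limit0, Limit1),
    get_limit(Value, Limit1),
    io.print([Limit0, Value], !IO),
    io.nl(!IO).
```

Here Limit0 is bound to 5 and Value to 6; only bump_limit receives the
full argument pair.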

In the code, calls to predicates that use any resources must be preceded with
a ! symbol.  This serves to remind the reader that more arguments are passed
to this predicate than appear textually, and that reordering such calls with
respect to one another changes the meaning of the predicate.  It also tells
the compiler to add extra arguments to the goal for each resource used by the
called predicate.

By default, naming the resources used by a call is optional.  The reader can
look at a pred or func declaration to see which resources are used by a call,
just as they would look to see what arguments of what types a predicate or
function takes.  Since resources are passed in the order listed in the pred
declaration, and the names and types are standardized, resources should be
easier for a programmer to deal with than ordinary arguments.

If the programmer wishes to include the names of resources used in a call for
documentation purposes, she may write:

      Goal using [Resource_1, Resource_2, ...]

where each Resource is a resource name.  If the compiler switch
--no-infer-resources is specified, all calls using resources must have
complete lists of resources used.

The using clause may also be used to specify an initial value for a resource
and/or obtain the final value.  For this, a Resource listed in a using clause
may be a term !Name = (Initial,Final) or !.Name = Initial or !:Name = Final.
As a special case, the (Initial,Final) pair may be specified as a !Resource
name or a !State Variable.  If an initial value for a resource is specified
in the resource declaration and not in the using clause, that value is
passed.  Note that when a using clause specifies an initial value for a
resource that already has a value, it creates a dynamic scope:  the current
value of the resource is saved before invoking Goal, and restored after it
completes.
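
One way to read the dynamic scope rule is as plain argument threading in
which the scoped goal simply gets its own argument pair.  The following is
a sketch under that assumption; the resource 'counter' and the predicates
bump and outer are hypothetical, and the "source" forms in the comments are
only one guess at how the proposed syntax would be written.

```mercury
:- module scope_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

    % Hypothetical source:
    %   :- pred bump using [counter] is det.
:- pred bump(int::in, int::out) is det.

bump(!Counter) :-
    !:Counter = !.Counter + 1.

    % Hypothetical source:
    %   outer(Inner) :-
    %       ! bump,
    %       ! bump using [!counter = (100, Inner)],
    %       ! bump.
:- pred outer(int::out, int::in, int::out) is det.

outer(Inner, !Counter) :-
    bump(!Counter),
    % The scoped call runs on its own initial value.  The enclosing
    % value of the resource is untouched, which is the "save and
    % restore" effect.
    bump(100, Inner),
    bump(!Counter).

main(!IO) :-
    outer(Inner, 0, Final),
    io.print([Inner, Final], !IO),
    io.nl(!IO).
```

With an initial counter of 0, Inner is 101 and Final is 2: the inner scope
never disturbs the outer thread of values.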

Resources are used within a clause exactly as state variables:  !.Resource
denotes the current value of a resource, !:Resource denotes the next value,
and !Resource, in a call, denotes the two separate arguments !.Resource,
!:Resource.
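
For comparison, this is roughly what a var_supply resource would expand to
when written by hand with today's state variables.  It is a sketch, not
output of any real compiler pass; fresh_var and fresh_pair are hypothetical
names, and the "source" forms in the comments assume the syntax proposed
above.

```mercury
:- module var_supply_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

    % Hypothetical source:
    %   :- pred fresh_var(int::out) using [var_supply] is det.
    %   fresh_var(Var) :-
    %       Var = !.var_supply,
    %       !:var_supply = !.var_supply + 1.
:- pred fresh_var(int::out, int::in, int::out) is det.

fresh_var(Var, !Supply) :-
    Var = !.Supply,
    !:Supply = !.Supply + 1.

    % Hypothetical source:
    %   :- pred fresh_pair(int::out, int::out) using [var_supply] is det.
    %   fresh_pair(A, B) :- ! fresh_var(A), ! fresh_var(B).
:- pred fresh_pair(int::out, int::out, int::in, int::out) is det.

fresh_pair(A, B, !Supply) :-
    fresh_var(A, !Supply),
    fresh_var(B, !Supply).

main(!IO) :-
    fresh_pair(A, B, 0, Supply),
    io.print([A, B, Supply], !IO),
    io.nl(!IO).
```

Starting from 0, A is 0, B is 1 and the final supply is 2.  In the
resource version only the pred declarations would mention var_supply; the
clause heads and calls would carry no extra arguments.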


Modules

If a :- resource declaration appears in an implementation section, it is not
visible outside that module.  In particular, exported predicates cannot use
such resources.  If it appears in the interface section, however, it becomes
publicly visible, and may be used by exported predicates.  In such cases, the
calling predicate must supply those resources.


Hidden Resources

In some cases, a module designer may wish to hide the presence of resources
from module users.  For example, while a constraint store may be needed by
the module implementor to store constraints, the user need know nothing about
the store.  This is sensible because all operations to add constraints are
commutative with respect to the constraint store arguments.  That is

	    op1(X1, ... Xm, S0, S1), op2(Y1, ... Yn, S1, S2)

is equivalent to

	    op2(Y1, ... Yn, S0, S1), op1(X1, ... Xm, S1, S2)

For such families of operations, the user should not be required to prefix
goals with !, since even without the resource arguments, the predicates are
purely logical.
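
As a small illustration of the commutativity claim, the sketch below
models a constraint store as a set of integers and checks that two "tell"
operations give the same final store in either order.  The predicate tell
and the choice of set(int) as the store are assumptions made only for this
example; it uses set.insert, set.init and set.equal from the current
Mercury standard library.

```mercury
:- module commutative_store.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module set.

    % Hypothetical source:
    %   :- pred tell(int::in) using [commutative(store)] is det.
    % Adding a constraint only ever grows the store.
:- pred tell(int::in, set(int)::in, set(int)::out) is det.

tell(Constraint, !Store) :-
    set.insert(Constraint, !Store).

main(!IO) :-
    S0 = set.init,
    % Order one: tell 1, then tell 2.
    tell(1, S0, A1),
    tell(2, A1, A2),
    % Order two: tell 2, then tell 1.
    tell(2, S0, B1),
    tell(1, B1, B2),
    ( if set.equal(A2, B2) then
        io.write_string("same store\n", !IO)
    else
        io.write_string("different store\n", !IO)
    ).
```

Because set insertion is order-independent, both orders yield the same
store, which is what would justify dropping the ! prefix on such calls.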

Predicates promise that they use a resource in a purely commutative way by
wrapping the resource name with 'commutative' in the pred or func declaration
using clause, eg:

	:- func cint + cint = cint using [commutative(cint_store)].

When some operations are not commutative with respect to this resource, such
as asking whether a constraint holds in the current constraint store, then
users must know about the constraint store to understand that operations
cannot be reordered.  Those predicates' use of the resource must not be
promised commutative, ensuring that calls to them must be adorned with the !
operator.  Note that "ask" predicates will use resources in an input-only
way, so they will only have one argument added, making them naturally
commutative with one another.

Hidden resources may also pervade any program that uses the module that
exports them.  In this case, the programmer should specify an initial value
for the resource by adding "= Expr" to the resource declaration, eg:

	:- resource var_supply : int = 0.

This will be used as the initial value of the resource in the using clause
the compiler will wrap around as much of the program as necessary to
encompass all uses of that hidden resource.


Higher Order Usage

Higher order calls will sometimes need to have extra argument pairs added for
resources.  I see three different approaches to handling this.  For the
moment, I propose the first, and simplest, of these.  In the longer term, the
others may be preferable.

Approach 1:  Require a full using clause on all higher-order calls.

Approach 2:  Attach a list of resources to higher order types.  A predicate
that expects a higher order argument would then accept any higher order term
of the right type, provided all the resources demanded by the higher
order term are available to the predicate.  This would allow the compiler to
generate the appropriate using clause itself.  However, it would require that
all higher order terms passed to a predicate use exactly the same resources.

Approach 3:  Approach 2 could be generalized to accept any higher order terms
as long as the resources used are a subset of those available, by having the
compiler automatically generate higher-order argument-dropping combinator
predicates/functions, and using them to wrap the higher order argument
actually passed.  For example, if I pass a closure that expects a resource
foo to a predicate that also uses resources bar and zip, the closure C gets
wrapped in a closure like drop_1_3_of_3(C).  When this is actually invoked,
it will be called as

	 drop_1_3_of_3(C, Bar0, Bar, Foo0, Foo, Zip0, Zip)

where drop_1_3_of_3 is defined as:

	drop_1_3_of_3(Closure, A, B, C, D, E, F) :-
		B = A,
		Closure(C, D),
		F = E.

This would impose some overhead on higher order calls, but perhaps this could
be mitigated with program specialization.
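
A typed version of such a combinator can be written in ordinary Mercury
today.  The sketch below assumes all three resources are ints threaded as
(in, out) pairs, and it passes the two unused pairs through unchanged so
that every output argument is bound; both choices are assumptions made for
the example, as is the closure that adds one to foo.

```mercury
:- module drop_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.

    % Wrap a closure that threads only the middle resource (foo) so
    % that it can be called with the argument pairs for bar, foo, zip.
:- pred drop_1_3_of_3(pred(int, int), int, int, int, int, int, int).
:- mode drop_1_3_of_3(pred(in, out) is det, in, out, in, out, in, out)
    is det.

drop_1_3_of_3(Closure, Bar0, Bar, Foo0, Foo, Zip0, Zip) :-
    Bar = Bar0,             % bar is not used by the closure
    Closure(Foo0, Foo),     % only foo is actually threaded
    Zip = Zip0.             % zip is not used by the closure

main(!IO) :-
    Closure = (pred(F0::in, F::out) is det :- F = F0 + 1),
    drop_1_3_of_3(Closure, 10, Bar, 20, Foo, 30, Zip),
    io.print([Bar, Foo, Zip], !IO),
    io.nl(!IO).
```

Here Bar stays 10, Foo becomes 21 and Zip stays 30.  A compiler following
Approach 3 would need one such combinator per combination of resource
positions it encounters.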


The Compiler

The compiler can infer which predicates use which resources based on uses of
!.Resource and !:Resource, and should issue error messages for missing
resources in using clauses.  Furthermore, it can issue warnings where
unneeded resources are listed, or where they should have been listed as
!.input-only or !:output-only.  Where pred declarations are inferred, the
compiler should emit them suitably adorned with using clauses as needed.

An alternative approach would be to compile the code to pass resources in
memory, rather than as arguments.  This would probably be more efficient, as
it would avoid shuffling resources among registers (and pseudo-registers
actually passed in memory) as they are threaded from one predicate to
another.  It would probably also make implementing higher order resources
easier and more efficient.  But in either case, the semantics should be the
same:  viewed as a transformation adding extra arguments to predicates.


Backward Compatibility

This proposal is largely backward compatible.  In particular, the state
variable syntax remains unchanged.  The library should be augmented with
resource declarations as appropriate; at least, the io module should define
an io resource, which would be used by all predicates that expect an
io__state argument pair.  This, too, is backward compatible, as existing
calls to io predicates will not have the ! prefix, so will not have arguments
added.  Since the argument types and arities will be correct without adding
resource arguments, the compiler should not complain.

Of course, introducing two new operators ('resource' and 'using') may cause
some compatibility problems.  Hopefully these will be few.


Examples

Here are a few examples written using resources:

Firstly, assume the interface section of the io module contains:

	:- resource io : io__state :: unique.
	...
	:- pred io__print(T::in) using [io] is det.
	...

Then hello.m might look like:

	:- module hello.
	:- interface.
	:- import_module io.
	:- pred main using [io] is det.

	:- implementation.

	main :-
		! print("hello "),
		! print("world!"),
		! nl.

This would be translated to:

	:- module hello.
	:- interface.
	:- import_module io.
	:- pred main(io__state::di, io__state::uo) is det.

	:- implementation.

	main(IO0, IO) :-
		print("hello ", IO0, IO1),
		print("world!", IO1, IO2),
		nl(IO2, IO).

More on I/O:

Resources neatly handle issues with having "current" input and output
streams, and separate versions of all I/O operations with and without an
explicit stream argument.  I'm assuming here that exceptions are used to
handle I/O errors, as it simplifies the code considerably.  If we declare

	:- resource io     : io__state :: unique.
	:- resource input  : input_stream = stdin.
	:- resource output : output_stream = stdout.

	:- pred open_input(string::in) using [!:input, io] is det.
	:- pred read_char(char::out) using [!.input, io] is det.

etc, then when we just need one input file, we can let the input resource
carry the "current" input stream:

	! open_input(Filename),
	! read_char(Firstchar),
	...

These would be translated to

	open_input(Filename, Input, IO0, IO1),
	read_char(Firstchar, Input, IO1, IO),
	...

But when we want to use two input files, we can do

	! open_input(Filename1) using [!:input=Stream1],
	! open_input(Filename2) using [!:input=Stream2],
	! read_char(Firstchar1) using [!.input=Stream1],
	! read_char(Firstchar2) using [!.input=Stream2],
	...

Or if we use one stream much more than any other, we can also do

	! open_input(Filename1),
	! open_input(Filename2) using [!:input=Stream2],
	! read_char(Firstchar1),
	! read_char(Firstchar2) using [!.input=Stream2],
	...

Both of these would be translated to

	open_input(Filename1, Stream1, IO0, IO1),
	open_input(Filename2, Stream2, IO1, IO2),
	read_char(Firstchar1, Stream1, IO2, IO3),
	read_char(Firstchar2, Stream2, IO3, IO4),
	...

The difference between the two is that the second one propagates the input
resource beyond the given code.


A bigger example:

	:- resource predtable : map(predspec, predinfo) = map__empty.
	:- resource vartable : map(varid, predinfo) = map__empty.
	:- resource varsupply : varid = initial_varsupply.


	:- pred process_file(string::in) using [io].

	process_file(Name) :-
	    (	compute_output_name(Name, Outname),
		! open_input(Name),
		! open_output(Outname),
		% This is OK because predtable has a default initial value
		! compile_stream using [predtable],
		! close_output,
		! close_input
	    ) using [input,output].


	:- pred compile_stream using [predtable, !.input, !.output, io] is det.

	compile_stream :-
		! read_source_code,
		! generate_code.

	:- pred read_source_code using [predtable, !.input, io] is det.

	read_source_code :-
		! read_term(Term),
		(   at_end_of_file(Term) ->
			true
		;   is_directive(Term) ->
			! handle_directive(Term),
			! read_source_code
		;   % Use the default initial varsupply:
		    ! handle_clause(Term) using [varsupply],
		    ! read_source_code
		).

	:- pred handle_directive(term::in) using [predtable].
	:- pred handle_clause(term::in) using [predtable, varsupply].

And so on.  This would translate to:

	:- pred process_file(string::in, io__state::di, io__state::uo).

	process_file(Name, IO0, IO) :-
		compute_output_name(Name, Outname),
		open_input(Name, Instream, IO0, IO1),
		open_output(Outname, Outstream, IO1, IO2),
		compile_stream(map__empty, _, Instream, Outstream, IO2, IO3),
		close_output(Outstream, IO3, IO4),
		close_input(Instream, IO4, IO).


	:- pred compile_stream(map(predspec,predinfo)::in,
			       map(predspec,predinfo)::out,
			       input_stream::in, output_stream::in,
			       io__state::di, io__state::uo) is det.

	compile_stream(Predtable0, Predtable, Instream, Outstream, IO0, IO) :-
		read_source_code(Predtable0, Predtable1, Instream, IO0, IO1),
		generate_code(Predtable1, Predtable, Instream, Outstream,
			      IO1, IO).

	:- pred read_source_code(map(predspec,predinfo)::in,
			       map(predspec,predinfo)::out,
			       input_stream::in,
			       io__state::di, io__state::uo) is det.

	read_source_code(Predtable0, Predtable, Instream, IO0, IO) :-
		read_term(Term, Instream, IO0, IO1),
		(   at_end_of_file(Term) ->
			Predtable = Predtable0,
			IO = IO1
		;   is_directive(Term) ->
			handle_directive(Term, Predtable0, Predtable1),
			read_source_code(Predtable1, Predtable, Instream,
					 IO1, IO)
		;   % Use the default initial varsupply:
		    handle_clause(Term, Predtable0, Predtable1,
				  initial_varsupply, _),
		    read_source_code(Predtable1, Predtable, Instream,
					 IO1, IO)
		).


