[mercury-users] Pred defns

Richard A. O'Keefe ok at atlas.otago.ac.nz
Wed Apr 8 14:26:45 AEST 1998


This whole thread on argument threading and updating has been most
interesting.  Ralph Becket and Fergus Henderson have said that
	RB ... this facility ... should look simple and *elegant*!
	RH [it must] *be* simple and elegant.  
	
Frankly, all of the concrete proposals I've seen so far have been
rather appalling messes of punctuation that don't satisfy these criteria.

I'm co-teaching a 4th year functional programming course this year.
Making sure I do all the student exercises and assignments, in both
SML/NJ 110 and Clean 1.2.4 is proving quite instructive.  The latest
thing I've done is a little genetic algorithm kit, and of course
while it does no I/O it _does_ have a random number generator state
popping up all over the place.  Instead of

	breed :: Population -> Population

you have

	breed :: Population -> RNG_State -> (Population, RNG_State)

How do functional languages handle this kind of thing?

    1.  Lexical scope.

	make_random_list f x n s
	 = loop [] n s
	   where loop w 0 s = (w, s)
	         loop w m s = let (c,s) = f x s in loop [c:w] (m-1) s
	
	The inner function has only three arguments, all of them
	'changing'; it can refer to f and x without having to pass
	them around (unlike 'flat' languages like Prolog and Erlang).

	Nested predicate definitions would in no way detract from the
	'logic'-based nature of Mercury.  The source-to-source
	transform is, if anything, simpler than the DCG transform.

	There's another application of lexical scope there too.
	let (c,s) = f x s in ...
	the s on the right hand side refers to the old state,
	the s on the left hand side is a new variable that refers to the
	new state.  To be honest, my actual code uses s, s`, s``, s```,
	while my ML code uses s, s', s'', s''' (and I'm _always_ using
	the wrong kind of prime in each language, _sigh_).

	Again, there's nothing wrong with lexically scoped variables;
	indeed, they are _standard_ in ordinary logical notation (to
	the limited extent that anything is standard in mathematics).
	We only really need name decoration schemes when we _want_ to
	refer to multiple states of the 'same' thing at once, and of
	course _that_ interferes with the uniqueness of reference that
	implementation using destructive update depends on.

    2.	Combinators.

	For example, suppose I have two ordinary functions

	    f :: b -> c			// a, b, c are type variables		
	    g :: a -> b			// in Clean, use 'a 'b 'c in ML.

	I can compose them using 

	    (o) infixr 9 :: (b -> c) (a -> b) a -> c
	    (o) f g x = f (g x)

	Similarly, if I have two 'functions' that mutate a random number
	generator

	    f :: b RNG_State -> (c, RNG_State)
	    g :: a RNG_State -> (b, RNG_State)

	I can compose them using

	    (compose) infixr 9 ::
		(b RNG_State -> (c, RNG_State))
		(a RNG_State -> (b, RNG_State)) ->
		(a RNG_State -> (c, RNG_State))
	    (compose) f g x s0 = let (y,s1) = g x s0 in f y s1

	Now I can chain together as many functions as I want:
	    f compose g compose h
	isn't any harder to write than
	    f o g o h

	This stuff leads on to 'parsing combinators' and 'monads' and
	all that kind of stuff, without any _syntactic_ extensions at all.

    3.  What-you-see-is-not-what-the-compiler-has-to-do.

	There's a certain amount of _syntactic_ clumsiness in passing
	back pairs and triples, and there's this subliminal feeling
	that all of these tuples will actually be _constructed_ and
	that doing so will waste space and time.  It's enough to make
	anyone long for Prolog/Mercury syntax, where passing back
	multiple results is so clean, so simple.

	But functional language compilers do strange and wonderful
	things.  If a compiler _knows_ that function f always returns
	an N-tuple, it can instead compile it so that the elements of
	the tuple are left in N 'result registers'.  Calls that don't
	extract the elements can have a boxing step inserted, or the
	compiler can generate two copies of the function.  I suppose
	we could call this technique 'untupling'.

	The Mercury compiler knows just as much about the programs it
	is compiling as the average functional language compiler.  And
	if it doesn't already do some kinds of untupling, it will
	eventually have to.  Just at the moment, I've got functional
	syntax paged in, so I'll stick with that.  Consider two types:

	:: Date = MkDate {yr::Int, mo::Int, dy::Int}
	:: Person = MkPerson {name::String, birth::Date}

	In typical Prolog implementations,
	a person(Name,date(Yr,Mo,Dy)) term would consist of two separate
	linked records, requiring 7 words.  Accessing someone's birth
	year, X = person(_,date(Y,_,_)) requires traversing an extra
	pointer.  I'm sure Mercury can turn this into the equivalent of
	C's "Y = X->name->yr" without breathing hard.  With the type
	information to hand, a functional language compiler can store
	the date record directly inside the person record (which also
	requires that the garbage collector support embedded objects,
	but that's certainly doable), getting the equivalent of C's
	"Y = X->name.yr", a single memory reference.  I don't know if
	Mercury does this yet, but it has the information it needs; it
	_could_ use this method, and if the Mercury project continues
	long enough I'm sure it _will_ do this.  This is another
	instance of 'untupling'.

	Indeed, before assuming that packaging several state variables
	into one record is going to be inefficient, it might be an idea
	to try it out and see if it really is.  My expectation is that
	very occasionally it _will_ be a performance problem, but in
	the presence of destructive-update-based-on-uniqueness-info,
	it will be as efficient as the traditional implementation of
	access to variables in outer blocks in Pascal or Algol, because
	it will _be_ that implementation, and even without destructive
	update, the time may well be going somewhere else.

By the way, there was a paper not so very many decades ago, which
showed how the very carefully designed imperative language 'Euclid'
could indeed be seen quite directly as a pure functional language.
(The crucial point was that a Euclid compiler is required to enforce
a ban on aliasing.)  One of the Xerox blue-and-white reports, as I recall.




More information about the users mailing list