[mercury-users] Prolog DCGs Pattern Matching in Mercury
    Ralph Becket 
    rafe at cs.mu.OZ.AU
       
    Wed Apr 30 10:07:31 AEST 2003
    
    
  
Goncalo Jorge Coelho e Silva, Wednesday, 30 April 2003:
> 
>  Recognizing terms which are a mix
> of atomic things (like strings) and 
> variables, without using 'if the else'
> or switches.
Currently (and for the forseeable future, other than in an experimental
branch of the compiler) Mercury does not support partially instantiated
data (i.e. data containing uninstantiated subterms), if that is what you
mean by "variables".
For example, the following is valid Prolog, but not Mercury:
				% Initially X, A, B, C are free
				% (uninstantiated).
	X = foo(A, B, C),	% Now X = foo(A, B, C).
	A = 123			% Now X = foo(123, B, C).
Prolog also allows variable aliasing which Mercury does not.  The
following is valid Prolog, but not Mercury:
				% Initially X, A are free
				% (uninstantiated).
	X = foo(A, B, C),	% Now X = foo(A, A).
	A = 123			% Now X = foo(123, 123).
> ie.:
>  recognizing patterns in a list such as
> 		push_structure FunctorArity, Xi
>  where Xi is a variable
> 		set_variable Xi
>  where Xi is a variable, or
> 		set_value Vn
>  where Vn is a variable
> 
> like [push_structure f/2, x2, set_variable x2, ...]
These are not valid Mercury terms (the syntax is wrong.)  You'd need to
use something like
	[push_structure(f/2), x2, set_variable(x2), ...]
noting that x2, for instance, is a *representation of a variable* and
not a Mercury variable (if it was a Mercury variable its name would
have to start with a capital letter - X2 - and would lead to a compile
time error if X2 is free at the point where the list is constructed
(as I pointed out above, you can't have partially instantiated data
structures in Mercury.)
> >There are handy predicates in the library string to convert strings to 
> >lists of chars and vice versa: string__from_char_list, 
> >string__from_rev_char_list and string__to_char_list. 
> 
> Right. For now, I'm not sure whether I can use
> that for this issue. That would imply analising
> char by char (or actually, lst of chars by list
> of chars). DCGs in Prolog would enable me to
> identify the whole pattern.
DCG parsers *do* work datum by datum, regardless of whether you're using
them in Mercury or Prolog (have a look at the DCG transformation in the
reference manual.)  The only reason you can write the following in
Prolog
animal(dog) --> "dog".
animal(cat) --> "cat".
...
is because "dog" is syntactic sugar for [d, o, g] in Prolog.  The
expanded form of these clauses is
animal(dog, DCG0, DCG) :- DCG0 = [d | DCG1],
			  DCG1 = [o | DCG2],
			  DCG2 = [g | DCG ].
animal(cat, DCG0, DCG) :- DCG0 = [c | DCG1],
			  DCG1 = [a | DCG2],
			  DCG2 = [t | DCG ].
...
In Mercury you'd have to write
animal(dog) ---> [d, o, g].
animal(cat) ---> [c, a, t].
because Mercury strings are not syntactic sugar for lists of characters.
To call the Mercury version of animal//1 given a string, I'd do
something like this:
	Chars0 = string__to_char_list(String),
	animal(Animal, Chars0, Chars1),
	RestOfString = string__from_char_list(Chars1)
(although I probably would want to do more parsing with Chars1 rather
than turn it straight back into a string.)
> So, on the cases I mentioned, I'd be handling pattern
> matching for a list of strings as input.
>  Thing is, in the DCG clause, there's part of the
> string in the input, I'd like to unify with a 
> variable.
I don't understand this.  You need to be very specific when presenting a
specification.
>  % HEAP[H] <- Vm ;    sets a Vm on the Hth array position (HEAP)
>  % H <- H+1;          adds the value of H by 1
>  
> code_gen([set_value V | Tail],Heap, H) --> 
> 		 	{ 
> 		 	 array__array_set(Heap,H,V), 
> 		 	 H_New = H + 1, 
> 		 	} 
> 			code_gen(Tail,H_New,Heap). 
Again...  *DON'T* abuse DCG notation this way, even if it has been
abused in some Mercury programs.
Do you understand the DCG transformation?  Do you understand what it
is for?  Do you understand its implications for determinism?  If not,
then you shouldn't be using it.
Let's assume your code_gen/3 predicate has the following specification:
code_gen(Program, OldHeap, NewHeap) if NewHeap is the result of
executing Program on OldHeap, where Program is a list of instructions
(of course, code_gen is a misleading name since it runs code rather than
generate it.)
Without DCGs we could write
		% code_gen/3 would be better written as a function...
		%
	code_gen([], Heap, Heap).
	code_gen([Instruction | Program], Heap0, Heap) :-
		run_instr(Instruction, Heap0, Heap1),
		code_gen(Program, Heap1, Heap).
	run_instr(set_value(Address, Value), Heap0, Heap) :-
		set_heap(Heap0, Address, Value, Heap).
	...
If you wanted to use DCG notation then you'd write something like
	code_gen(Program, Heap0, Heap) :-
		run_prog(Heap0, Heap, Program, _).
	run_prog(Heap, Heap) -->
		=([]).
	run_prog(Heap0, Heap) -->
		[set_value(Address, Value)],
		{ set_heap(Heap0, Address, Value, Heap1) },
		run_prog(Heap1, Heap).
	...
but this is, IMHO, less elegant.
>  No quite...
>  Simply to produce code with side-effects on the 
> main program, without building fancy auxiliary predicates,
> just like PROLOG would enable.
> 
> So, to sum it all up, is this possible in Mercury?
You *can* write impure code in Mercury, but what would be the point?
Rather than printing stuff out in the course of a computation, why
not compute *what* you are going to print, then print that afterwards?
This separates computation proper from IO which is usually a Good Thing.
What's wrong with using auxiliary predicates?  They usually make code
easier to read.
- Ralph
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
    
    
More information about the users
mailing list