[m-rev.] lex and moose changed

Peter Schachte schachte at cs.mu.OZ.AU
Fri Aug 3 17:54:07 AEST 2001


On Thu, Aug 02, 2001 at 02:54:22AM -0700, Ralph Becket wrote:
> > From: Michael Day [mailto:mikeday at bigpond.net.au] 
> > Sent: 02 August 2001 02:56
> > 
> > What do people think about scanning using predicates and backtracking?
> > I've written some rules like this:
> > 
> > whitespace  --> \[' ','\t','\n'].
> > digit       --> \['0','1','2','3','4','5','6','7','8','9'].
> > number      --> +digit.
> > fraction    --> \('.'), number.
> > exponent    --> \['e','E'], ?(\['+','-']), number.
> > float       --> ((number, ?(fraction)) ; fraction), ?(exponent).
> > 
> > rule([]) --> +whitespace.
> > rule([float]) --> float.
> > rule([number]) --> number.
...
> It's hard to make this sort of thing work on files without reading
> the whole thing in first, although that's probably much less of a
> problem these days.

A sequence or stream type class should solve that.  I'd define a sequence as
having an empty/1 predicate and a next/3 predicate that returns the next
element and the rest of the sequence.  A stream would be the same as a
sequence, except that it would have only unique (destructive) modes.

Then you just need a predicate char/3:

	char(C, S0, S) :- next(S0, C, S).

and you can use char(C) wherever you would use [C] in a grammar.  char is
probably not the right name for this; perhaps an operator would make it
prettier.
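
Here is a minimal sketch of how that might look, restricted to sequences of
characters and using only the non-unique modes.  The names char_seq,
list_seq, list_seq_empty, list_seq_next and two_chars are made up for
illustration, and the wrapper type is only there so that a plain list can be
made an instance.  I haven't compiled this, so treat it as an outline rather
than a finished design.

```mercury
:- module seq_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char, list, string.

    % A sequence of characters: either it is empty, or it can be
    % split into its next element and the rest.
:- typeclass char_seq(S) where [
    pred empty(S::in) is semidet,
    pred next(S::in, char::out, S::out) is semidet
].

    % Hypothetical wrapper so that a list of chars can be an instance.
:- type list_seq ---> list_seq(list(char)).

:- instance char_seq(list_seq) where [
    pred(empty/1) is list_seq_empty,
    pred(next/3) is list_seq_next
].

:- pred list_seq_empty(list_seq::in) is semidet.
list_seq_empty(list_seq([])).

:- pred list_seq_next(list_seq::in, char::out, list_seq::out) is semidet.
list_seq_next(list_seq([C | Cs]), C, list_seq(Cs)).

    % char(C) consumes one element from any char_seq.
:- pred char(char::out, S::in, S::out) is semidet <= char_seq(S).
char(C, S0, S) :- next(S0, C, S).

    % In a grammar rule, char(C) plays the role of [C].
:- pred two_chars(char::out, char::out, S::in, S::out) is semidet
    <= char_seq(S).
two_chars(A, B) --> char(A), char(B).

main(!IO) :-
    ( if two_chars(A, B, list_seq(string.to_char_list("hi!")), _Rest) then
        io.write_string(string.from_char_list([A, B]), !IO),
        io.nl(!IO)
    else
        io.write_string("no match\n", !IO)
    ).
```

A file-backed instance would implement next/3 by reading on demand, which is
where the unique-mode stream variant would come in.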

> I confess I've often found it hard to cleanly accumulate the
> matched characters to the point where I can do something useful
> with them.  For example, your number rule would be more useful
> if it told you what number it had identified.  You then end up
> with stuff like
> 
> digit(0) --> ['0'].
> ...
> digit(9) --> ['9'].
> 
> number(N)     --> number(0, N).
> number(N0, N) --> ( if digit(D) then number(10 * N0 + D, N)
>                                 else { N = N0 }
>                   ).

How about putting this in the library somewhere:

	in_range(Lo, Hi, Val) -->
		[Val],
		{ Lo =< Val },
		{ Val =< Hi }.

then you can do:

	digit(N) --> in_range('0', '9', D), { N = to_int(D) - to_int('0') }.

Your definition of number looks pretty reasonable to me.  If you don't
define it this way, you'll have to write a separate function to translate
from a string to a number, which will look a lot like this anyway.
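
Put together over a plain list of characters, the pieces might look roughly
like the sketch below.  It is uncompiled and rests on a few assumptions: the
module name and the main/2 driver are made up for illustration, and in_range
compares character codes via char.to_int, on the assumption that =< from the
int module applies to integers only.

```mercury
:- module number_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char, int, list, string.

    % Match one character whose code lies between those of Lo and Hi.
    % Assumption: compare codes, since int's =< works on integers.
:- pred in_range(char::in, char::in, char::out,
    list(char)::in, list(char)::out) is semidet.
in_range(Lo, Hi, Val) -->
    [Val],
    { char.to_int(Lo) =< char.to_int(Val) },
    { char.to_int(Val) =< char.to_int(Hi) }.

    % Match one decimal digit and return its value.
:- pred digit(int::out, list(char)::in, list(char)::out) is semidet.
digit(N) -->
    in_range('0', '9', D),
    { N = char.to_int(D) - char.to_int('0') }.

    % Ralph's definition: accumulate digits left to right.
:- pred number(int::out, list(char)::in, list(char)::out) is det.
number(N) --> number(0, N).

:- pred number(int::in, int::out, list(char)::in, list(char)::out) is det.
number(N0, N) -->
    ( if digit(D) then
        number(10 * N0 + D, N)
    else
        { N = N0 }
    ).

main(!IO) :-
    number(N, string.to_char_list("123abc"), Rest),
    io.format("number = %d, rest = %s\n",
        [i(N), s(string.from_char_list(Rest))], !IO).
```

Note that number as written also succeeds on input with no leading digits,
returning 0 and consuming nothing, so a caller that needs at least one digit
would have to match digit first.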


-- 
Peter Schachte                     Even if you are a minority of one, the
mailto:schachte at cs.mu.OZ.AU        truth is the truth.
http://www.cs.mu.oz.au/~schachte/      -- Mahatma Gandhi 
Phone: +61 3 8344 9166             
PGP key available, see web page    