[m-rev.] lex and moose changed
Peter Schachte
schachte at cs.mu.OZ.AU
Fri Aug 3 17:54:07 AEST 2001
On Thu, Aug 02, 2001 at 02:54:22AM -0700, Ralph Becket wrote:
> > From: Michael Day [mailto:mikeday at bigpond.net.au]
> > Sent: 02 August 2001 02:56
> >
> > What do people think about scanning using predicates and backtracking?
> > I've written some rules like this:
> >
> > whitespace --> \[' ','\t','\n'].
> > digit --> \['0','1','2','3','4','5','6','7','8','9'].
> > number --> +digit.
> > fraction --> \('.'), number.
> > exponent --> \['e','E'], ?(\['+','-']), number.
> > float --> ((number, ?(fraction)) ; fraction), ?(exponent).
> >
> > rule([]) --> +whitespace.
> > rule([float]) --> float.
> > rule([number]) --> number.
...
> It's hard to make this sort of thing work on files without reading
> the whole thing in first, although that's probably much less of a
> problem these days.
A sequence or stream type class should solve that. I'd define a sequence as
having an empty/1 predicate and a next/3 predicate that returns the next
element and the rest of the sequence. A stream would be the same as a
sequence, except it would have only unique (destructive) modes.
Then you just need a predicate char/3:
    char(C, S0, S) :- next(S0, C, S).
and you can use char(C) where you would use [C] in a grammar. char is
probably not the right name for this. Maybe some operator to make it
prettier.
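
To make the idea concrete, here is a minimal sketch of such a type class for
character sequences only. Everything beyond empty/1, next/3 and char/3 is an
assumption made for illustration: the module name seq_sketch, the class name
char_seq, the wrapper type char_list, and the example rule digit_char are all
hypothetical. Only the non-destructive "sequence" modes are shown; a stream
version would use unique modes instead.

```mercury
:- module seq_sketch.
:- interface.
:- import_module list.

    % A sequence of characters: test for emptiness, or split off
    % the next character and the rest of the sequence.
:- typeclass char_seq(S) where [
    pred empty(S::in) is semidet,
    pred next(S::in, char::out, S::out) is semidet
].

    % Hypothetical wrapper so that a plain list of chars can be an instance.
:- type char_list ---> char_list(list(char)).
:- instance char_seq(char_list).

    % char(C) plays the role of [C] in a grammar rule.
:- pred char(char::out, S::in, S::out) is semidet <= char_seq(S).

    % Example rule: match one decimal digit character.
:- pred digit_char(char::out, S::in, S::out) is semidet <= char_seq(S).

:- implementation.
:- import_module char.

:- instance char_seq(char_list) where [
    pred(empty/1) is char_list_empty,
    pred(next/3) is char_list_next
].

:- pred char_list_empty(char_list::in) is semidet.
char_list_empty(char_list([])).

:- pred char_list_next(char_list::in, char::out, char_list::out) is semidet.
char_list_next(char_list([C | Cs]), C, char_list(Cs)).

char(C, S0, S) :- next(S0, C, S).

digit_char(C) --> char(C), { char.is_digit(C) }.
```

Because the rules are written against the class rather than against lists,
the same grammar could in principle run over a file-backed instance without
reading the whole file in first.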
> I confess I've often found it hard to cleanly accumulate the
> matched characters to the point where I can do something useful
> with them. For example, your number rule would be more useful
> if it told you what number it had identified. You then end up
> with stuff like
>
> digit(0) --> ['0'].
> ...
> digit(9) --> ['9'].
>
> number(N) --> number(0, N).
> number(N0, N) --> ( if digit(D) then number(10 * N0 + D, N)
> else { N = N0 }
> ).
How about putting this in the library somewhere:
    in_range(Lo, Hi, Val) -->
        [Val],
        { Lo =< Val },
        { Val =< Hi }.
then you can do:
    digit(N) --> in_range('0', '9', D), { N = to_int(D) - to_int('0') }.
Your definition of number looks pretty reasonable to me. If you don't
define it this way, you'll have to write a separate function to translate
from a string to a number, which will look a lot like this anyway.
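
Put together, the pieces might look like the following sketch over plain
lists of characters. Several details are assumptions rather than anything
settled in this thread: the module name number_sketch and the helper name
number_acc are hypothetical, in_range is specialised to characters and
compares character codes with char.to_int (the library's =< is defined on
numeric types, not generically), and number requires at least one digit.

```mercury
:- module number_sketch.
:- interface.
:- import_module list.

    % Parse a non-empty run of decimal digits into its integer value.
:- pred number(int::out, list(char)::in, list(char)::out) is semidet.

:- implementation.
:- import_module char, int.

    % Consume one character whose code lies between those of Lo and Hi.
:- pred in_range(char::in, char::in, char::out,
    list(char)::in, list(char)::out) is semidet.
in_range(Lo, Hi, Val) -->
    [Val],
    { char.to_int(Lo) =< char.to_int(Val) },
    { char.to_int(Val) =< char.to_int(Hi) }.

    % A single digit, returned as its numeric value.
:- pred digit(int::out, list(char)::in, list(char)::out) is semidet.
digit(N) -->
    in_range('0', '9', D),
    { N = char.to_int(D) - char.to_int('0') }.

number(N) -->
    digit(D),
    number_acc(D, N).

    % Accumulate further digits; stop at the first non-digit.
:- pred number_acc(int::in, int::out,
    list(char)::in, list(char)::out) is det.
number_acc(N0, N) -->
    ( if digit(D) then
        number_acc(10 * N0 + D, N)
    else
        { N = N0 }
    ).
```

With these definitions, a call such as number(N, ['4', '2', 'x'], Rest)
should bind N to 42 and leave ['x'] in Rest.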
--
Peter Schachte Even if you are a minority of one, the
mailto:schachte at cs.mu.OZ.AU truth is the truth.
http://www.cs.mu.oz.au/~schachte/ -- Mahatma Gandhi
Phone: +61 3 8344 9166
PGP key available, see web page