[m-rev.] lex and moose changed

Michael Day mikeday at bigpond.net.au
Thu Aug 2 11:56:19 AEST 2001


> As a suggested application, here's a token type I'd like to have produced by
> the lexer:
>
> 	:- type token --->
> 		plus ; minus ; times ; divide ; int(int) ; ident(string).
>
> Hopefully this is simple enough to be easy to code, and complex enough to
> illustrate the differences between the approaches.

What do people think about scanning using predicates and backtracking?
I've written some rules like this:

whitespace  --> \[' ','\t','\n'].
digit       --> \['0','1','2','3','4','5','6','7','8','9'].
number      --> +digit.
fraction    --> \('.'), number.
exponent    --> \['e','E'], ?(\['+','-']), number.
float       --> ((number, ?(fraction)) ; fraction), ?(exponent).

rule([]) --> +whitespace.
rule([float]) --> float.
rule([number]) --> number.

These all operate on strings using first_char (I wrote them before I found
out that it's horrendously inefficient...) and would probably be better off
working on lists of characters or generic streams or something.
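
For comparison, here is a rough sketch of a scanner for Peter's token type
that works on a list of characters instead of calling first_char on a string.
It is only an illustration, not the rules above: the module name scan_sketch
and the predicates scan/2 and operator/2 are made up for this example, and it
assumes a recent standard library (char.is_digit, list.take_while,
string.to_int, string.from_char_list). It uses if-then-else rather than
backtracking, so each int or ident is the longest match.

```mercury
:- module scan_sketch.
:- interface.
:- import_module char, list.

:- type token
    --->    plus ; minus ; times ; divide ; int(int) ; ident(string).

    % scan(Chars, Tokens): Tokens is the token sequence for Chars.
    % Fails if a character that starts no token is seen, or if a
    % run of digits does not fit in an int.
:- pred scan(list(char)::in, list(token)::out) is semidet.

:- implementation.
:- import_module string.

scan([], []).
scan([C | Cs], Tokens) :-
    ( if char.is_whitespace(C) then
        % Whitespace produces no token.
        scan(Cs, Tokens)
    else if char.is_digit(C) then
        % Take the longest run of digits starting at C.
        list.take_while(char.is_digit, Cs, Ds, Rest),
        string.to_int(string.from_char_list([C | Ds]), N),
        scan(Rest, Tokens0),
        Tokens = [int(N) | Tokens0]
    else if char.is_alpha(C) then
        % Assumed identifier syntax: a letter, then letters or digits.
        list.take_while(char.is_alnum, Cs, As, Rest),
        scan(Rest, Tokens0),
        Tokens = [ident(string.from_char_list([C | As])) | Tokens0]
    else
        % Anything else must be one of the four operators.
        operator(C, Tok),
        scan(Cs, Tokens0),
        Tokens = [Tok | Tokens0]
    ).

:- pred operator(char::in, token::out) is semidet.

operator('+', plus).
operator('-', minus).
operator('*', times).
operator('/', divide).
```

With this, scan(string.to_char_list("x1 + 42"), Tokens) should succeed with
Tokens = [ident("x1"), plus, int(42)], and fail on input such as "x $ y"
because '$' does not start any token.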

Maybe it would be worth making a little collection of scanners that solve
Peter's example using different approaches so that they can be compared
for clarity and efficiency, and illuminate people like myself who still
don't really know much about using Mercury/Prolog for scanning and
parsing...

(Maybe take this to mercury-users, too?)

Michael



