[m-rev.] lex and moose changed

Ralph Becket rbeck at microsoft.com
Thu Aug 2 00:31:51 AEST 2001


> From: Holger Krug [mailto:hkrug at rationalizer.com]
> Sent: 30 July 2001 07:52
> 
> Changes concerning lex:
> 
> %---------------------------------------------------------------------------%
> %
> % 07/26/01 hkrug at rationalizer.com:
> %    * the value contained in a token now may belong to an arbitrary
> %      Mercury type, not only string as before
> %    * lexemes now may contain a function to compute an arbitrary
> %      token value from the string matched
> %    * the lexeme definition by the user now is more difficult,
> %      because the user has to name higher order functions
> %    * the advantage is that no postprocessing is needed to evaluate
> %      the string, hence the interface to the consumer of the tokens
> %      is cleaner
> %
> %---------------------------------------------------------------------------%

This change is mainly concerned with changing the lexer result type from

:- type lexer_result(Token)
	--->	ok(Token)           % For noval tokens.
	;	ok(Token, string)
	;	eof
	;	error(int).

to just

:- type lexer_result(Token)
	--->	ok(Token)           % Token matched.
	;	eof
	;	error(int).

and requiring that each lexeme be associated with a function
constructing a token of the required type from the matched
string.

I really don't think this is a good change to lex.  It complicates
the interface and the implementation without gaining anything much
in terms of utility.

The effect of this change can be obtained with a thin wrapper
around lex__read//1 on a per-application basis, e.g.

my_read(T) -->
	lex__read(Result),
	{	Result = error(_), ...
	;	Result = eof, ...
	;	Result = ok(Token),         T = convert(Token, "")
	;	Result = ok(Token, String), T = convert(Token, String)
	}.

This doesn't add any real cost in terms of computation or 
complexity to the application.  My own opinion is that converting
strings into other representations is properly done by the parser.
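
To make the wrapper concrete, here is a rough sketch of what a
convert/2 might look like for a small lexer.  The module name and the
token and value types are made up for the example and are not part of
lex; I'm also assuming string__det_to_int is the conversion you want
for numbers (it throws an exception if the string is not a number).

```mercury
:- module convert_sketch.
:- interface.

	% Hypothetical token type, as handed back by lex__read//1.
:- type token
	--->	number
	;	ident
	;	plus.

	% Hypothetical type the parser actually wants to consume.
:- type value
	--->	num(int)
	;	id(string)
	;	op_plus.

	% Build a value from a token and the string it matched.
:- func convert(token, string) = value.

:- implementation.
:- import_module string.

convert(number, String) = num(string__det_to_int(String)).
convert(ident,  String) = id(String).
convert(plus,   _)      = op_plus.
```

The point is that all the string handling lives in one small function
owned by the application, and lex itself stays as it is.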

I'd like to hear what other people think, though.



> Changes concerning moose:
> 
> %---------------------------------------------------------------------------%
> %
> % 07/24/01 hkrug at rationalizer.com:
> %    * added option --unique-state/-u

Another option would be to pass in appropriate modes for the lexer state,
e.g. {in, in, out} vs {ui, di, uo}.  But that's a minor change that isn't
urgent.

> %    * `parser_state' renamed to `lexer_state'
> %    * the `lexer_state' is managed using di/uo modes, if
> %      called with --unique-state
> %    * predicates `write_types', `write_rule_num_type',
> %      `write_state_num_type' and `write_action_type' added
> %    * changed type for rule number from `int' to generated
> %      discriminated union type `rule_num' allowing some procedures to
> %      be declared `det' instead of `semidet'
> %    * changed type for state number from `int' to generated
> %      discriminated union type `state_num' allowing some procedures to
> %      be declared `det' instead of `semidet'
> %    * changed definition of type `parsing_action' which now includes
> %      not only the kind of action to be taken but also the value
> %      specifying the concrete action
> %    * obviously unused dump options removed from usage message
> %
> %---------------------------------------------------------------------------%
> 
> Attention: moose now depends on lex, because the following type forms
> the interface of moose with its lexer:
> 
> :- type lex__lexer_result(Token)
>     --->    ok(Token)                   % Token matched.
>     ;       eof                         % End of input.
>     ;       error(int).                 % No matches for string at this offset.

Again, I don't think lex and moose need to be tied together.

You could just define a type moose__lexer_result/1 as above and require
the application's lex__read//1 wrapper to do the conversion.  This
wouldn't cost anything since that's what the wrapper is doing anyway.
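
Roughly, I have something like the following in mind.  This is only a
sketch: both types are declared in one made-up module here so that the
example stands alone, with lex_result standing in for the current
lex__lexer_result/1 and moose_result for the proposed
moose__lexer_result/1.  The name to_moose_result and the Convert
argument are also just for illustration.

```mercury
:- module result_sketch.
:- interface.

	% Stand-in for the existing lex result type.
:- type lex_result(Token)
	--->	ok(Token)           % For noval tokens.
	;	ok(Token, string)
	;	eof
	;	error(int).

	% Stand-in for a result type owned by moose.
:- type moose_result(Token)
	--->	ok(Token)           % Token matched.
	;	eof                 % End of input.
	;	error(int).         % No matches for string at this offset.

	% Map a lex result onto a moose result, using an
	% application-supplied function to build the final token.
:- func to_moose_result(func(Tok, string) = T, lex_result(Tok))
		= moose_result(T).

:- implementation.

to_moose_result(_,       eof)             = eof.
to_moose_result(_,       error(Offset))   = error(Offset).
to_moose_result(Convert, ok(Tok))         = ok(Convert(Tok, "")).
to_moose_result(Convert, ok(Tok, String)) = ok(Convert(Tok, String)).
```

With the types in their own modules, moose only needs to know about
its own result type and never has to import lex.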

I'll go over the moose changes in detail and get back with a review ASAP.

Cheers,

Ralph