[m-dev.] cc_multi or det ?

Fergus Henderson fjh at cs.mu.OZ.AU
Tue Aug 7 01:48:14 AEST 2001


On 06-Aug-2001, Ralph Becket <rbeck at microsoft.com> wrote:
> > From: Fergus Henderson [mailto:fjh at cs.mu.OZ.AU]
> > Sent: 06 August 2001 16:11
> > 
> > For such a situation, I'd recommend declaring the entry point `cc_multi'.
> > The user can always add `promise_only_solution' themselves if need be.
> > 
> > Alternatively, you could provide two different entry points,
> > a cc_multi one named `lex__main' (or whatever you want to call your entry
> > point), and a det one named `lex__promise_only_solution_main' that just
> > wraps promise_only_solution around `lex__main'.
> 
> The interface is that the user supplies a list of 
> (regexp - token_constructor) pairs, where each token_constructor is
> a *function* from strings to tokens.  The lexer internals catch any
> lex__excn(string) exceptions raised by a token_constructor in order
> to construct an error result (e.g. a token_constructor for ints may
> want to report overflow errors this way).
> 
> Since the determinism of a token_constructor has to be det, it seems
> reasonable that the determinism of the lexer itself will also be det
> and that the use of promise_only_solution internally is legitimate.
> 
> My feeling is that other exceptions could be raised all over the shop,
> but that doesn't warrant changing all dets into cc_multis.

Raising exceptions never requires changing the determinism.
It's only *catching* exceptions that is ever problematic.
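
A minimal sketch of that distinction, assuming the standard library's
`exception' module (throw/1 and try/2). The module name and the
function `checked_div' are hypothetical, invented only for illustration:

```mercury
:- module raise_vs_catch.
:- interface.
:- import_module io.

	% main has to be cc_multi because it *catches* an exception.
:- pred main(io::di, io::uo) is cc_multi.

:- implementation.
:- import_module exception, int.

	% Raising: this function stays det even though it may throw.
:- func checked_div(int, int) = int.
checked_div(X, Y) = Z :-
	(if Y = 0 then
		throw("division by zero")
	else
		Z = X // Y
	).

	% Catching: try/2 is cc_multi, so its caller becomes cc_multi too
	% (unless the caller wraps it in promise_only_solution).
main(IO0, IO) :-
	try((pred(R::out) is det :- R = checked_div(1, 0)), Result),
	(
		Result = succeeded(V),
		write_int(V, IO0, IO1),
		nl(IO1, IO)
	;
		Result = exception(_),
		write_string("caught an exception\n", IO0, IO)
	).
```

Only the predicate that calls try/2 had its determinism affected;
checked_div itself is still plain det.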

Consider the following hypothetical token constructor function,
for constructing a floating point token with overflow checking.
The floating point token is presumed to have matched the regexp

	[0-9]+\.[0-9]*(E[0-9]+)?

and the constructor function converts the value,
checking for overflow:

	:- type token --> float_token(float).

	:- func make_float_token(string) = token.
	make_float_token(String) = float_token(Val) :-
		(if string__sub_string_search(String, "E", Pos) then
			Mantissa = string__left(String, Pos),
			Exponent = string__right(String, length(String) - Pos - 1),
			Val = convert_mantissa(Mantissa) *
				10.0 `pow` convert_exponent(Exponent)
		else
			Val = convert_mantissa(String)
		).

	:- func convert_mantissa(string) = float.
	convert_mantissa(String) = Val :-
		(if
			string__to_float(String, Float),
			\+ is_inf_or_nan(Float)
		then
			Val = Float
		else
			throw(lex__excn("Floating point mantissa out of range"))
		).
		
	:- func convert_exponent(string) = int.
	convert_exponent(String) = Val :-
		(if
			string__to_int(String, Exp),
			Exp >= float_min_exponent,
			Exp =< float_max_exponent
		then
			Val = Exp
		else
			throw(lex__excn("Floating point exponent out of range"))
		).

	is_inf_or_nan(_) :- ...
	float_min_exponent = ...
	float_max_exponent = ...

Now, suppose you use this function with lex__promise_only_solution_main.

Exercises for the reader:


	Will that work?


	If not, why not? 
	(What case could cause problems?)


	Was the problem easy to spot or hard to spot?


	Would you have even thought to look for the problem if
	lex__promise_only_solution_main was named lex__main?


-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.