[m-dev.] For review: improvements to string.m

Fergus Henderson fjh at cs.mu.OZ.AU
Tue Sep 19 03:36:10 AEDT 2000


On 18-Sep-2000, Ralph Becket <rbeck at microsoft.com> wrote:
> Estimated hours taken: 
> 
> Fixed some inefficiencies in the string module.
> 
> library/string.m:
> 	The following previously made use of string__first_char/3
> 	which is heinously inefficient in most cases:
> 		string__base_string_to_int/3
> 		string__is_alpha/1
> 		string__is_alpha_or_underscore/1
> 		string__is_alnum_or_underscore/1
> 		string__duplicate_char/3
> 	I have recoded them to use string__foldl/5 or a new predicate,
> 	string__all_match/2, or string__from_char_list/2, which should
> 	be far more economical.

Those changes look great.

> 	I have also added the following functions:
> 		string__det_string_to_int/1
> 		string__det_base_string_to_int/2
> 	the former using 10 as the base.

These changes look OK, but they are not quite sufficiently well
documented.

These changes are not really related to the previous ones, and
could be submitted as a separate change.

> +:- type int_acc
> +	--->	nil			% Haven't accumulated anything yet
> +	;	acc(int, int).		% Sign, NumSoFar
> +
>  string__base_string_to_int(Base, String, Int) :-
> -	( string__first_char(String, Char, String1) ->
> -		( Char = ('-') ->
> -			string__base_string_to_int_2(Base, String1, 0,
> Int1),
> -			Int is 0 - Int1
> -		; Char = ('+') ->
> -			string__base_string_to_int_2(Base, String1, 0, Int)
> -		;
> -			string__base_string_to_int_2(Base, String, 0, Int)
> -		)
> -	;
> -		Int = 0
> -	).

I think it would quite likely be more efficient to leave that call to
string__first_char, and to just recode string__base_string_to_int_2
using string__foldl.  That way, we avoid a test (for nil/0 or acc/2),
a multiplication (`Sign *') and a memory allocation (for acc/2) for
each character, and the only extra cost is one string allocation.

Alternatively, you could implement string__substring_foldl,
which is like string__foldl except that it works on a substring
of a string whose position is determined by two additional arguments
(the indices of the substring's starting and ending characters).
Oh, I see that `string__foldl2' already does that.
You could just rename `string__foldl2'.

Using string__substring_foldl would allow you to avoid the one string
allocation from the call to string__first_char.

> @@ -1765,6 +1767,12 @@
>  
>  :- func string__format(string, list(string__poly_type)) = string.
>  
> +	% Convert a base 10 string to an int.
> +	%
> +:- func string__det_string_to_int(string) = int.
> +
> +:- func string__det_base_string_to_int(int, string) = int.

You should document what `string__det_base_string_to_int' does.
And you should document what these functions do if the string
is not valid syntax for an int.  Also s/base 10/signed base 10/.

These functions should be mentioned in the NEWS file.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list