[m-rev.] Minor addition to string.m

Peter Moulder pmoulder at csse.monash.edu.au
Thu Jan 30 16:35:45 AEDT 2003


Various minor things.

Remember of course to add a NEWS entry at some point.

> +%	A synonym for index_dex/2:
                            ^^^
			    det

> +:- func string ^ unsafe_elem(int) = char.
> +%	A synonym for index_dex/2:
                      ^^^^^^^^^
		      unsafe_index

> +:- func string__chomp(string) = string.
> +%	string__chomp(String):
> +%	`String' minus any trailing newline characters; equivalent
> +%	to string__strip__suffix(pred('\n'::in) is semidet, String).

In Perl, chomp removes no more than one newline.  E.g. "foo\n\n" -> "foo\n".
If the name `chomp' is retained, then I think the previously-posted
behaviour/implementation should be used.
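
To make the difference concrete, here is a minimal sketch in C rather than
Mercury.  The names chomp_one and strip_newlines are hypothetical and are
not part of string.m or of the patch; they only illustrate the two
behaviours being compared.

```c
#include <string.h>

/* Hypothetical helper: remove at most one trailing '\n', which is what
   Perl's chomp does with the default record separator.
   Modifies s in place. */
void chomp_one(char *s)
{
    size_t n = strlen(s);
    if (n > 0 && s[n - 1] == '\n') {
        s[n - 1] = '\0';
    }
}

/* Hypothetical helper: remove every trailing '\n', which is what the
   comment in the patch describes. Modifies s in place. */
void strip_newlines(char *s)
{
    size_t n = strlen(s);
    while (n > 0 && s[n - 1] == '\n') {
        s[--n] = '\0';
    }
}
```

Given "foo\n\n", chomp_one leaves "foo\n" whereas strip_newlines leaves
"foo".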

> +:- func string__strip_whitespace_prefix(string) = string.

Information in case people wish to challenge the wordy names:

  Python uses lstrip, rstrip, strip; Visual Basic uses LTrim, RTrim, Trim.

  Java, SQL and PHP use `trim'.  Neither Java nor PHP has {l,r}strip
  equivalents that I can see.  SQL uses literally wordy syntax:
  trim([leading | trailing | both] [<characters>] from <string>)

  I couldn't find such a function for OCaml, and I don't have Haskell docs.
  (In Perl one would use a regexp substitution.)

> +prefix_length(P, S) = prefix_length_2(0, length(S), P, S).

Hmm, it would be nice to avoid the `length' call in the case of C
strings (assuming that prefixes tend to be very short relative to the
string length).  Perhaps add a comment to this effect but retain the
`length' implementation until it actually affects someone.
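
For the record, a rough sketch of what the length-free version could look
like at the C level.  This is only an illustration under the assumption of
NUL-terminated strings; the function names are hypothetical and this is
not how string.m is implemented.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical C analogue of prefix_length: count the leading characters
   of s that satisfy pred.  The scan stops at the first character that
   fails pred or at the terminating NUL, so strlen(s) is never computed. */
size_t prefix_length(const char *s, int (*pred)(int))
{
    size_t i = 0;
    while (s[i] != '\0' && pred((unsigned char) s[i])) {
        i++;
    }
    return i;
}

/* Example use: length of the leading whitespace, using the standard
   isspace from <ctype.h> as the predicate. */
size_t whitespace_prefix_length(const char *s)
{
    return prefix_length(s, isspace);
}
```

The cost is proportional to the prefix length rather than to the string
length.  The same trick does not carry over to suffixes, where the end of
the string has to be found first.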

> +prefix_length_2(I, N, P, S) =
> +	( if I < N, P(S ^ elem(I)) then prefix_length_2(I + 1, N, P, S)
> +				   else I
> +	).

In a similar vein, one can use

  ( if I < N then
  	( if P(S ^ unsafe_elem(I)) then ... else I )
  else
  	I
  ).

This one I think is worth doing.

Similarly for suffix_length_2.


Btw, I notice that the C implementation of string__index calls strlen.
I think the best implementation would be to implement string__index in
terms of a string__index_len(I, N) function: with inlining, there's a
good chance that the Mercury compiler will avoid multiple calls to
string__length, whereas it's very difficult for a C compiler to
coalesce strlen calls.
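
A sketch of the shape I have in mind, written in C purely for
illustration.  The names index_len, index_checked and count_char are
hypothetical, and size_t indices are an assumption made for brevity (the
Mercury version takes an int and must also reject negative indices).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical analogue of the proposed string__index_len: a
   bounds-checked lookup that is told the length rather than
   computing it. */
bool index_len(const char *s, size_t len, size_t i, char *out)
{
    if (i < len) {
        *out = s[i];
        return true;
    }
    return false;
}

/* The plain checked lookup expressed in terms of index_len; this is
   the only place that calls strlen. */
bool index_checked(const char *s, size_t i, char *out)
{
    return index_len(s, strlen(s), i, out);
}

/* A caller that loops computes the length once and reuses it, so the
   loop as a whole stays linear in the length of s. */
size_t count_char(const char *s, char c)
{
    size_t len = strlen(s);
    size_t count = 0;
    char ch;
    for (size_t i = 0; index_len(s, len, i, &ch); i++) {
        if (ch == c) {
            count++;
        }
    }
    return count;
}
```

In C the caller has to hoist the length by hand as count_char does; the
hope is that with inlining the Mercury compiler can do the equivalent
automatically.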

pjm.


