[m-rev.] Minor addition to string.m

Ralph Becket rafe at cs.mu.OZ.AU
Fri Jan 31 15:07:58 AEDT 2003


Peter Moulder, Thursday, 30 January 2003:
> 
> In perl, chomp removes no more than one newline.  E.g. "foo\n\n" -> "foo\n".
> If the name `chomp' is retained, then I think the previously-posted
> behaviour/implementation should be used.

I considered this.  I'm not sure the difference would be significant,
since chomp is almost certain to be used only to snip off the "\n" after
a call to read_line_as_string.  I will change it to snip off just one
newline if anybody feels strongly.
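
To make the two behaviours concrete, here is a minimal sketch in C rather
than Mercury.  The names chomp_one and chomp_all are made up for
illustration and are not part of string.m; both work in place on a
NUL-terminated buffer.

```c
#include <assert.h>
#include <string.h>

/* Perl-style chomp: remove at most one trailing '\n', in place. */
void chomp_one(char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    if (n > 0 && s[n - 1] == '\n') {
        s[n - 1] = '\0';
    }
}

/* The alternative: remove every trailing '\n', in place. */
void chomp_all(char *s)
{
    assert(s != NULL);
    size_t n = strlen(s);
    while (n > 0 && s[n - 1] == '\n') {
        s[--n] = '\0';
    }
}
```

For a line that ends in a single "\n" the two agree; they differ only on
input such as "foo\n\n".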

> > +:- func string__strip_whitespace_prefix(string) = string.
> 
> Information in case people wish to challenge the wordy names:

The prevailing Mercury culture is to choose unambiguous names over
concise ones :-)

I'm in favour of using shorter names, but couldn't think of any that
wouldn't raise objections.  {l,r,}strip would work for me, with overloaded
versions taking an extra predicate argument for the non-whitespace
specific versions.

> In a similar vein, one can use
> 
>   ( if I < N then
>   	( if P(S ^ unsafe_elem(I)) then ... else I )
>   else
>   	I
>   ).

I've applied the s/elem/unsafe_elem/ suggestion.  I don't think adding
the extra if-then-else makes the code any more readable.  The existing
formulation, I < N, P(S ^ unsafe_elem(I)) works fine with the current
compiler, but the correct solution is for us to add an explicit ordered
conjunction which would forbid the reordering of its conjuncts for any
reason.  It's a SMOP, if anyone has the time.
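
For comparison, C's && is exactly such an ordered conjunction: the left
operand is always evaluated first, so the bounds test guards the unchecked
access.  A minimal sketch follows, with hypothetical names (unsafe_elem,
is_space and skip_while here are illustrative C functions, not the
Mercury library predicates).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Unchecked access: the caller must guarantee that i is in range. */
static char unsafe_elem(const char *s, size_t i)
{
    return s[i];
}

/* Illustrative predicate: is c a whitespace character? */
static int is_space(char c)
{
    return isspace((unsigned char)c) != 0;
}

/* Return the first index at or after i where p fails, or n if none. */
size_t skip_while(const char *s, size_t n, size_t i, int (*p)(char))
{
    assert(s != NULL && i <= n);
    /* && evaluates left to right, so i < n is always tested first. */
    while (i < n && p(unsafe_elem(s, i))) {
        i++;
    }
    return i;
}
```

If the two operands of && were swapped, unsafe_elem could read past the end
of the string, which is the hazard a reordering compiler would introduce.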

> Btw, I notice that the C implementation of string__index calls strlen.
> I think the best implementation would be to implement string__index in
> terms of a string__index_len(I, N) function: with inlining, there's a
> good chance that the Mercury compiler will avoid multiple calls to
> string__length, whereas it's very difficult for a C compiler to
> coalesce strlen calls.

I'd rather have our own string representation that kept track of string
length separately and just have a utility function like

:- type string == {int, c_pointer}.

:- func c_string(string) = c_pointer.
:- mode c_string(in    ) = out is det.
:- mode c_string(out   ) = in  is det.

and call strlen for the (out) = in mode.  I've been bitten by the
problem of not being able to embed '\0's in strings before because of
their representation.

That said, there are also strong arguments in favour of keeping Mercury
and C string representations identical.
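
As a rough illustration of the trade-off, here is what a length-carrying
representation might look like on the C side.  This is only a sketch under
my own assumptions: counted_string, from_c_string, from_buffer and
counted_index are hypothetical names, not anything in the Mercury runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: the length is kept beside the data. */
typedef struct {
    size_t len;
    const char *data;
} counted_string;

/* Wrap a NUL-terminated C string; the only place strlen is needed. */
counted_string from_c_string(const char *s)
{
    assert(s != NULL);
    counted_string cs = { strlen(s), s };
    return cs;
}

/* Wrap a buffer of known length; it may contain embedded '\0' bytes. */
counted_string from_buffer(const char *buf, size_t len)
{
    assert(buf != NULL);
    counted_string cs = { len, buf };
    return cs;
}

/* Bounds-checked index without strlen: returns 1 and sets *out on success. */
int counted_index(counted_string cs, size_t i, char *out)
{
    if (i >= cs.len) {
        return 0;
    }
    *out = cs.data[i];
    return 1;
}
```

Indexing no longer scans the string, and "a\0b" keeps its full length of 3,
but every value crossing to or from C needs a conversion step, which is the
cost of the two representations no longer being identical.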

	Ralph
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------


