[m-rev.] for review: mercury implementation of string.m
Peter Ross
pro at missioncriticalit.com
Wed Jun 19 01:23:32 AEST 2002
rafe wrote:
> Michael Day, Monday, 17 June 2002:
> >
> > Right. How about adding a general string buffer type, containing a string
> > and an index into that string (is anything else required?) that can be
> > used by lex and other modules that process strings in this way so that
> > first_char can be deprecated/removed?
>
> first_char was always a bad idea. [unsafe_]index and fold[lr] are the
> right tools for the job.
>
Agreed.
> Constructing strings is a different matter. If you use clever
> representations, such as concatenation trees of substrings, you will
> likely also have to have user-defined equality since there will be
> multiple ways to represent each string. This brings its own set of
> problems. I'm inclined to the idea that a separate data structure is
> appropriate for building strings.
>
This is true. I came across the same problem when looking at writing a
typeclass-based stream library for I/O: you really don't want to use the
string primitives for doing I/O. You just want some abstract type that
you can get the resulting string out of when finished. I did a similar
thing for string__format, where I tried to minimise the number of
strings allocated in order to improve its performance.
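A minimal sketch of that kind of abstract builder type, written in C rather than Mercury. The names strbuf, strbuf_init, strbuf_append and strbuf_finish are hypothetical and do not correspond to anything in the Mercury library; the point is only that pieces accumulate in one growable buffer and a single string is produced at the end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append-only string buffer (illustration only). */
typedef struct {
    char   *data;  /* storage; NUL-terminated only by strbuf_finish */
    size_t  len;   /* bytes currently used */
    size_t  cap;   /* bytes currently allocated */
} strbuf;

static void strbuf_init(strbuf *b)
{
    assert(b != NULL);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}

/* Append the C string s. Returns 0 on success, -1 if allocation fails. */
static int strbuf_append(strbuf *b, const char *s)
{
    size_t n = strlen(s);
    size_t need = b->len + n + 1;   /* always keep room for the final NUL */

    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        while (cap < need) {
            cap *= 2;               /* geometric growth: amortised O(1) appends */
        }
        char *p = realloc(b->data, cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n);
    b->len += n;
    return 0;
}

/* Return the finished string (caller frees it) and reset the buffer.
   Returns NULL only if allocating an empty result fails. */
static char *strbuf_finish(strbuf *b)
{
    char *result;

    if (b->data == NULL) {
        result = malloc(1);
        if (result != NULL) {
            result[0] = '\0';
        }
        return result;
    }
    b->data[b->len] = '\0';
    result = b->data;
    strbuf_init(b);
    return result;
}
```

With this shape, building a string from k pieces costs a handful of reallocations rather than k intermediate strings.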
> I believe the reason for not having clever (immutable) string
> representations is that the C interface is easier to work with if you
> just use C strings, which couldn't be more basic (I've been bitten once
> or twice by the fact that you can't have NULs in Mercury strings if
> you're using a C back-end.) I'm not 100% convinced this is the right
> approach... Pete, Tyson?
>
The reference manual states that strings are represented by the typedef
MR_String, which is equivalent to char *. A clever string
representation would have to preserve this at the C code boundary,
and I think changing that constraint would break too much code.
Strings are just too fundamental a data type not to be used heavily.
However, if a representation could be found that is cheap to marshal
across this boundary, then I would be all for it.
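One candidate, offered purely as an assumption and not as anything the Mercury runtime does, is a length-prefixed layout where the handle still points at NUL-terminated bytes. C code can keep treating the handle as a char *, while code that knows about the hidden header gets the true length, embedded NULs included. The names lstr_header, lstr_new, lstr_len and lstr_free are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: [size_t len][bytes ...]['\0'].
   The handle given to callers points at bytes, not at the header. */
typedef struct {
    size_t len;
    char   bytes[];   /* C99 flexible array member */
} lstr_header;

/* Copy len bytes from src (which may contain NULs) into a new string. */
static char *lstr_new(const char *src, size_t len)
{
    lstr_header *h = malloc(sizeof(lstr_header) + len + 1);
    if (h == NULL) {
        return NULL;
    }
    h->len = len;
    memcpy(h->bytes, src, len);
    h->bytes[len] = '\0';          /* keep it usable as a plain C string */
    return h->bytes;
}

/* True length in O(1). Valid ONLY for pointers returned by lstr_new. */
static size_t lstr_len(const char *s)
{
    assert(s != NULL);
    const lstr_header *h =
        (const lstr_header *)(s - offsetof(lstr_header, bytes));
    return h->len;
}

static void lstr_free(char *s)
{
    if (s != NULL) {
        free(s - offsetof(lstr_header, bytes));
    }
}
```

Passing such a string to C costs nothing, but the reverse direction is the weak point: a char * produced by ordinary C code has no header, so it would have to be copied through lstr_new before lstr_len could be used on it. C functions such as strlen also still stop at the first embedded NUL.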
Pete