[m-rev.] for review: string switches using tries

Peter Wang novalazy at gmail.com
Tue Feb 24 16:43:57 AEDT 2015


On Tue, 24 Feb 2015 16:00:07 +1100 (EST), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> 
> 
> On Tue, 24 Feb 2015 12:32:48 +1100, Peter Wang <novalazy at gmail.com> wrote:
> > The implementation will not work in general if the host compiler and the
> > target differ in the string encoding, e.g. the compiler uses UTF-8 but
> > the target uses UTF-16.
> > 
> > The fix should only require that we replace string.{to,from}_code_unit_list
> > with functions that deal in the code units of the TARGET string encoding,
> > and build tries from that.  The standard library does not yet have
> > string.{to,from}_{utf8,utf16}_code_unit_list so, for now, the safe option
> > is to disable the trie implementation when the string encodings differ.
> 
> Agreed. I have added a line that disables the use of tries
> if --cross-compiling is set. However, I believe the existing code
> in string.m that deals with Unicode, which I think you wrote,
> assumes utf8.

string.m presents an encoding-agnostic interface.  The implementation
only allows UTF-8 for the C/Erlang backends, and UTF-16 for the Java/C#
backends.
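
To make the mismatch concrete, here is a rough sketch in Python (not
Mercury) of how the same string yields different code unit sequences under
the two encodings.  The helper names are made up for illustration and are
not part of the Mercury standard library.

```python
def utf8_code_units(s):
    """Return the UTF-8 code units (8-bit values) of s."""
    return list(s.encode("utf-8"))


def utf16_code_units(s):
    """Return the UTF-16 code units (16-bit values) of s."""
    data = s.encode("utf-16-le")
    # Each UTF-16 code unit occupies two bytes in the little-endian encoding.
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


# ASCII strings give the same sequence either way; anything else does not.
print(utf8_code_units("abc"), utf16_code_units("abc"))
print(utf8_code_units("\u00e9"), utf16_code_units("\u00e9"))
```

A trie built by the compiler from one sequence would therefore not match
the code units that the target sees at run time whenever a switch arm
contains a non-ASCII string.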

> > > +                NullCodeUnit = 0,    % Match the terminating NULL character.
> > 
> > s/NULL/NUL  or  s/NULL/null
> > 
> > This won't work for backends which do not use a NUL terminator.
> 
> No, it won't, but those systems are practically extinct, and I don't think
> we support Mercury on any of them.

The non-C backends do not use null-terminated strings.
Will this code not apply to them?
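
For what it is worth, a trie does not need a terminator to mark the end of
a string.  The following is only a sketch in Python, assuming UTF-16 code
units as the target encoding; it is not what the proposed diff does, and
all the names are hypothetical.  The case number is recorded on the node
where the string ends, so no NUL code unit is ever matched.

```python
def code_units(s):
    """UTF-16 code units of s (assumed target encoding for this sketch)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def new_node():
    # "case" stays None unless some switch arm's string ends at this node.
    return {"children": {}, "case": None}


def build_trie(cases):
    """cases maps each switch arm's string to its case number."""
    root = new_node()
    for s, case_id in cases.items():
        node = root
        for unit in code_units(s):
            node = node["children"].setdefault(unit, new_node())
        node["case"] = case_id
    return root


def lookup(trie, s):
    """Return the case number for s, or None if no arm matches."""
    node = trie
    for unit in code_units(s):
        node = node["children"].get(unit)
        if node is None:
            return None
    return node["case"]


trie = build_trie({"ab": 1, "abc": 2})
print(lookup(trie, "ab"), lookup(trie, "abc"), lookup(trie, "a"))
```

The lookup walks exactly as many code units as the string has, which
requires knowing the string's length up front rather than scanning for a
terminator.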

Peter


