[m-rev.] for review: string switches using tries

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue Feb 24 16:57:28 AEDT 2015



On Tue, 24 Feb 2015 16:43:57 +1100, Peter Wang <novalazy at gmail.com> wrote:
> string.m presents an encoding agnostic interface.  The implementation
> only allows UTF-8 for the C/Erlang backends, and UTF-16 for the Java/C#
> backends.

I wasn't aware of the second part of that. Thanks for telling me.

> > > > +                NullCodeUnit = 0,    % Match the terminating NULL character.
> > > 
> > > s/NULL/NUL  or  s/NULL/null
> > > 
> > > This won't work for backends which do not use a NUL terminator.
> > 
> > No, it won't, but those systems are practically extinct, and I don't think
> > we support Mercury on any of them.
> 
> The non-C backends do not use null-terminated strings.
> Will this code not apply to them?

At the moment, the code that invokes the generation of trie switches
has an explicit test requiring the C backend. I disabled the use of tries
for the other backends until the issue of handling binops properly
during target code output has been resolved. But you are right:
for backends whose strings are not NUL-terminated, we would also need
to handle the ends of strings differently. The obvious solution would be
to compute the length of the string at the start, and then at each trie
node just check whether the depth of that node equals the length.
However, I don't know whether the extra cost of these checks would make
tries slower than hash switches.
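
To make the two end-of-string strategies concrete, here is a rough C sketch
of a trie switch on the strings "car", "cat", "do" and "dog". It is
hand-written for illustration only and is not the code the compiler
generates; the function names match_nul and match_len, the case numbers
and the -1 "no match" result are all made up for this example. The first
function treats the terminating NUL as just another code unit to match,
while the second is given the length up front and compares it against
the depth of each trie node.

```c
#include <stddef.h>

/* Hypothetical case numbers: "car" = 0, "cat" = 1, "do" = 2, "dog" = 3.
 * Both functions return -1 when the string matches none of the cases. */

/* Variant 1: NUL-terminated strings. The end of the string is matched
 * like any other code unit, so no length is needed. A unit is read only
 * after the previous one was found to be non-NUL, so we never read past
 * the terminator. */
static int match_nul(const char *s)
{
    switch (s[0]) {
    case 'c':
        if (s[1] != 'a') return -1;
        switch (s[2]) {
        case 'r': return s[3] == '\0' ? 0 : -1;
        case 't': return s[3] == '\0' ? 1 : -1;
        default:  return -1;
        }
    case 'd':
        if (s[1] != 'o') return -1;
        switch (s[2]) {
        case '\0': return 2;                      /* "do" ends here */
        case 'g':  return s[3] == '\0' ? 3 : -1;
        default:   return -1;
        }
    default:
        return -1;
    }
}

/* Variant 2: strings without a terminator. The length is computed once
 * by the caller, and each trie node compares its depth against len
 * before reading the code unit at that depth. */
static int match_len(const char *s, size_t len)
{
    if (len == 0) return -1;                      /* depth 0 */
    switch (s[0]) {
    case 'c':
        if (len == 1 || s[1] != 'a') return -1;   /* depth 1 */
        if (len == 2) return -1;                  /* "ca" is not a case */
        switch (s[2]) {
        case 'r': return len == 3 ? 0 : -1;
        case 't': return len == 3 ? 1 : -1;
        default:  return -1;
        }
    case 'd':
        if (len == 1 || s[1] != 'o') return -1;   /* depth 1 */
        if (len == 2) return 2;                   /* "do" ends here */
        if (s[2] != 'g') return -1;
        return len == 3 ? 3 : -1;
    default:
        return -1;
    }
}
```

In this sketch the second variant does one extra comparison against len
at every node, which is the extra cost in question; whether that cost
matters in practice would have to be measured.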

Zoltan.



