[m-rev.] for review: Reduce memory allocation in string.to_upper, string.to_lower.
Peter Wang
novalazy at gmail.com
Thu Jun 23 14:23:52 AEST 2016
On Thu, 23 Jun 2016 12:06:06 +1000 (AEST), Julien Fischer <jfischer at opturion.com> wrote:
>
> Hi Peter,
>
> On Mon, 20 Jun 2016, Peter Wang wrote:
>
> > library/string.m:
> > Implement to_upper(in, uo) and to_lower(in, uo) with foreign
> > code, not creating intermediate character lists.
> >
> > Implement to_upper(in, in) and to_lower(in, in) modes without
> > allocating memory.
> >
> > Be more specific in documentation about which characters are
> > affected by some functions/predicates.
>
> ...
>
> > diff --git a/library/string.m b/library/string.m
> > index 55ef8b3..c13f7d4 100644
> > --- a/library/string.m
> > +++ b/library/string.m
> > @@ -768,19 +768,19 @@
> > %
> >
> > % Convert the first character (if any) of a string to uppercase.
> > - % Note that this only converts unaccented Latin letters.
> > + % Note that this only converts letters (a-z) in the ASCII range.
> > %
>
> It may be worth extending that comment to say that base letters that lie
> in the ASCII range in strings containing combining characters will also
> be converted, for example:
>
> io.write_string("a\u0301\n", !IO) ==> á
> io.write_string(to_upper("a\u0301\n"), !IO) ==> Á

Here's an attempt at the wording:

    to_upper
        Converts a string to uppercase.
        Only letters (a-z) in the ASCII range are converted.

        This function transforms each code point individually.
        Letters that occur within a combining sequence will be converted,
        whereas the precomposed character equivalent to the combining
        sequence would not be converted. For example:

            to_upper("a\u0301") ==> "A\u0301"   % á decomposed
            to_upper("\u00e1")  ==> "\u00e1"    % á precomposed

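To make the distinction concrete, here is a rough sketch in Python that models the behaviour described in the wording above. It is not the Mercury implementation, and `ascii_to_upper` is a made-up name used only for illustration. It assumes the conversion works as stated: each code point is examined on its own and only a-z is changed.

```python
import unicodedata


def ascii_to_upper(s: str) -> str:
    """Uppercase a-z only, one code point at a time (hypothetical helper)."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


decomposed = "a\u0301"   # 'a' followed by U+0301 COMBINING ACUTE ACCENT
precomposed = "\u00e1"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE

# The two spellings are canonically equivalent under NFC normalization...
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# ...but only the decomposed one contains an ASCII letter to convert.
upper_decomposed = ascii_to_upper(decomposed)    # "A\u0301"
upper_precomposed = ascii_to_upper(precomposed)  # "\u00e1", unchanged
```

So two strings that render identically can give different results, which is the reason for spelling this out in the documentation.
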
Peter