[m-rev.] for review: Make string.append(out, out, in) work with ill-formed sequences.

Mark Brown mark at mercurylang.org
Wed Oct 23 16:30:51 AEDT 2019


Hi Peter,

On Wed, Oct 23, 2019 at 3:02 PM Peter Wang <novalazy at gmail.com> wrote:

> library/string.m:
>     Simplify string.append(out, out, in) and make it work sensibly in
>     the presence of ill-formed code unit sequences, breaking the input
>     string after each code point or code unit in an ill-formed sequence.
>

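The reverse mode as described doesn't match the forwards mode, which can
join together two ill-formed sequences to make a valid code point :-(
The reverse mode would never split that code point back into the same two
arguments.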
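
To make the mismatch concrete, here is a rough sketch in Python rather than
Mercury, assuming UTF-8 code units. The names is_well_formed, boundaries and
splits are made up for illustration; splits only mimics the break points
described in the log message and is not the code in library/string.m.

```python
# Illustration only: UTF-8 bytes in Python, not Mercury's string.append.

def is_well_formed(b: bytes) -> bool:
    """True if b is a well-formed UTF-8 sequence."""
    try:
        b.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def boundaries(b: bytes) -> list:
    """Offsets after each code point, or after each code unit of an
    ill-formed sequence (hypothetical helper)."""
    points = [0]
    i = 0
    while i < len(b):
        step = 1  # ill-formed: advance by a single code unit
        for n in range(1, 5):
            chunk = b[i:i + n]
            if len(chunk) == n and is_well_formed(chunk):
                step = n  # a whole code point starts here
                break
        i += step
        points.append(i)
    return points

def splits(b: bytes) -> list:
    """All (prefix, suffix) pairs the reverse mode would enumerate."""
    return [(b[:p], b[p:]) for p in boundaries(b)]

euro = "€".encode("utf-8")        # three code units, one code point
left, right = euro[:1], euro[1:]  # each half is ill-formed on its own

# Forwards: appending the two ill-formed halves gives a well-formed string.
joined = left + right

# Backwards: the same string is only ever split around the whole code point,
# so (left, right) is not among the solutions.
pairs = splits(joined)
```

Here joined is well-formed although neither left nor right is, and pairs
contains only the two splits around the whole code point, so the forwards
mode accepts a pair of arguments that the reverse mode never produces.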

I can think of two changes to the declarative semantics that could resolve
this:

1. Disallow the case where we make a valid code point by appending (some
part of) an ill-formed sequence at the end of the first argument with one
at the start of the second argument.

2. Disallow _any_ ill-formed sequence at the start of the second argument.

The latter would affect more programs, but is probably better as the test
would be more efficient in the commonly-used forwards mode.
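
As a rough sketch of what each restriction would test in the forwards mode
(again in Python, assuming UTF-8 code units; code_point_len,
allowed_by_option_1 and allowed_by_option_2 are hypothetical names, and the
option 1 check is deliberately naive):

```python
# Illustration only: not Mercury code, and not a proposed implementation.

def code_point_len(b: bytes, i: int) -> int:
    """Length of the well-formed UTF-8 code point starting at b[i], or 0."""
    for n in range(1, 5):
        chunk = b[i:i + n]
        try:
            if len(chunk) == n and len(chunk.decode("utf-8")) == 1:
                return n
        except UnicodeDecodeError:
            pass
    return 0

def allowed_by_option_1(a: bytes, b: bytes) -> bool:
    """Option 1: reject only if a code point would span the join.
    Naive version: rescan the joined string from the start and check
    that the join is still a boundary."""
    joined = a + b
    i = 0
    while i < len(a):
        i += code_point_len(joined, i) or 1
    return i == len(a)

def allowed_by_option_2(a: bytes, b: bytes) -> bool:
    """Option 2: reject any ill-formed sequence at the start of the
    second argument; the first argument is not inspected at all."""
    return len(b) == 0 or code_point_len(b, 0) > 0

# Two ill-formed halves of U+20AC: both options reject the append.
halves = (b"\xe2", b"\x82\xac")

# A stray continuation byte after well-formed text: no code point is
# formed, so option 1 allows it, but option 2 still rejects it.
stray = (b"abc", b"\x82")
```

The stray case is the kind of program that only option 2 would affect. In
return, the option 2 test looks at nothing but the first few code units of
the second argument.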

Whatever the case, the documentation should clarify the semantics.

Mark