[m-rev.] for review: Make string.append(out, out, in) work with ill-formed sequences.

Mark Brown mark at mercurylang.org
Wed Oct 23 17:07:48 AEDT 2019


On Wed, Oct 23, 2019 at 5:05 PM Peter Wang <novalazy at gmail.com> wrote:
>
> On Wed, 23 Oct 2019 16:30:51 +1100, Mark Brown <mark at mercurylang.org> wrote:
> > Hi Peter,
> >
> > On Wed, Oct 23, 2019 at 3:02 PM Peter Wang <novalazy at gmail.com> wrote:
> >
> > > library/string.m:
> > >     Simplify string.append(out, out, in) and make it work sensibly in
> > >     the presence of ill-formed code unit sequences, breaking the input
> > >     string after each code point or code unit in an ill-formed sequence.
> > >
> >
> > This doesn't match the forwards mode, which can join together two
> > ill-formed sequences to make a valid code point :-(
>
> Yes, I see.
>
> > I can think of two changes to the declarative semantics that could resolve
> > this:
> >
> > 1. Disallow the case where we make a valid code point by appending (some
> > part of) an ill-formed sequence at the end of the first argument with one
> > at the start of the second argument.
> >
> > 2. Disallow _any_ ill-formed sequence at the start of the second argument.
> >
> > The latter would affect more programs, but is probably better as the test
> > would be more efficient in the commonly-used forwards mode.
> >
> > Whatever the case, the documentation should clarify the semantics.
>
> I can accept requiring a separate predicate in the case someone actually
> needs to join two ill-formed sequences to form a valid code point,
> which surely would be very rare.
>
> How about deprecating and removing the nondet mode of string.append?
> It can be supported as a separate predicate.
>

Fine by me.

Mark
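
For reference, the mismatch between the two modes discussed in the quoted
text can be reproduced outside Mercury. The following is a minimal sketch in
Python, not the library/string.m implementation. It assumes a UTF-8 string
representation and uses bytes in place of Mercury strings. The helper names
is_well_formed and split_points are hypothetical and exist only for this
illustration. split_points models the rule from the quoted change description:
break after each code point, or after each code unit of an ill-formed sequence.

```python
def is_well_formed(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def split_points(data: bytes) -> list[int]:
    """Hypothetical model of the proposed backwards mode.

    Returns the offsets at which the input may be broken: after each
    well-formed code point, or after each single code unit (byte) that
    belongs to an ill-formed sequence.
    """
    points = [0]
    i = 0
    while i < len(data):
        step = 1  # default: a lone code unit of an ill-formed sequence
        for width in (1, 2, 3, 4):
            chunk = data[i:i + width]
            if len(chunk) < width:
                break
            # The shortest prefix that decodes is exactly one code point.
            if is_well_formed(chunk):
                step = width
                break
        i += step
        points.append(i)
    return points


# U+00E9 is encoded as the two code units C3 A9 in UTF-8.
first = b"a\xc3"    # ends with a dangling lead byte: ill-formed
second = b"\xa9b"   # starts with a stray continuation byte: ill-formed
joined = first + second

# Forwards mode: two ill-formed arguments join into a well-formed string.
forwards_ok = (not is_well_formed(first)
               and not is_well_formed(second)
               and is_well_formed(joined))

# Backwards mode under the proposed rule: offset 2 is never a break point,
# so the pair (first, second) is never enumerated as a solution.
backwards_offsets = split_points(joined)
```

Here forwards_ok holds while len(first) does not appear in backwards_offsets,
which is the inconsistency that both proposed changes to the declarative
semantics would remove.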

