[m-rev.] for review: encoding chars as uint8s or uint16s

Peter Wang novalazy at gmail.com
Wed Jul 21 16:47:39 AEST 2021


On Tue, 20 Jul 2021 21:04:30 +1000 Julien Fischer <jfischer at opturion.com> wrote:
> 
> Hi Peter,
> 
> On Tue, 20 Jul 2021, Peter Wang wrote:
> 
> > On Tue, 20 Jul 2021 14:58:56 +1000 Julien Fischer <jfischer at opturion.com> wrote:
> >>
> >> For review by anyone.
> >>
> >> There's one other thing I would like feedback on here.  All of these
> >> predicates currently fail when the input is either a surrogate (which
> >> is fine) or if the character is outside the valid Unicode code point
> >> range (0x0 .. 0x10ffff).  I think the latter case should cause an
> >> exception to be thrown.
> >>
> >
> > I don't exactly object to throwing an exception, but chars are always
> > supposed to be in the Unicode range. Obviously, it's possible to
> > introduce a bad char via the FFI.
> 
> Exactly, and having these predicates call error in that case seems like
> the more useful thing to do.

Ok.

> (If we do not do that, we should at least
> change their documentation to say that they fail if the character is a
> surrogate *or* if it is an illegal code point.)

That would be confusing given the definition of `char'.

> The revised diff below does that and also calls error on an
> illegal code point.

That's fine.
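
For anyone skimming the thread, the agreed behaviour can be sketched
roughly as follows.  This is C rather than Mercury, and every name in it
(enc_status, encode_utf16) is made up for illustration; it is not the
code in the diff.  In Mercury the surrogate case would be ordinary
predicate failure and the out-of-range case a thrown exception; the
sketch models both as distinct status values so they are easy to tell
apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes, standing in for success, failure and error. */
typedef enum {
    ENC_OK,                 /* encoded successfully */
    ENC_FAIL_SURROGATE,     /* corresponds to the predicate failing */
    ENC_ERROR_OUT_OF_RANGE  /* corresponds to throwing an exception */
} enc_status;

/*
 * Encode code point c as one or two UTF-16 code units.
 * On ENC_OK, out[0 .. *len - 1] holds the code units.
 */
static enc_status encode_utf16(uint32_t c, uint16_t out[2], size_t *len)
{
    assert(out != NULL && len != NULL);

    /* Outside 0x0 .. 0x10ffff: not a char at all (e.g. it came in via the FFI). */
    if (c > 0x10FFFF) {
        return ENC_ERROR_OUT_OF_RANGE;
    }
    /* Surrogate code points have no UTF-16 encoding of their own. */
    if (c >= 0xD800 && c <= 0xDFFF) {
        return ENC_FAIL_SURROGATE;
    }
    /* Basic Multilingual Plane: a single code unit. */
    if (c < 0x10000) {
        out[0] = (uint16_t)c;
        *len = 1;
        return ENC_OK;
    }
    /* Supplementary planes: a leading and a trailing surrogate. */
    c -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (c >> 10));
    out[1] = (uint16_t)(0xDC00 | (c & 0x3FF));
    *len = 2;
    return ENC_OK;
}
```

The only point is that the two bad inputs are reported differently: a
surrogate is a legitimate "no answer", while a value above 0x10ffff
means something upstream has already gone wrong.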

Peter

