[m-users.] Confused by action of string.prefix_length

Peter Wang novalazy at gmail.com
Wed Jun 8 11:13:57 AEST 2022


On Tue, 07 Jun 2022 22:47:59 +0100 "Sean Charles (emacstheviking)" <objitsu at gmail.com> wrote:
> I have this code:
> 
> :- pred pd_fence(string::in, int::out) is semidet.
> 
> pd_fence(S, N) :-
>     N = string.prefix_length(is_tilde, S),
>     N > 2,
>     trace[io(!Dbg)]
>     (
>         io.format("N is %i for %s\n", [i(N), s(S)], !Dbg)
>     ).
> 
> And when I run it with some test strings I get this output; the values seem to be exactly twice the number of tilde characters and I don’t understand why!

The tildes in your email are U+02DC SMALL TILDE (˜).
Each U+02DC takes two UTF-8 code units (i.e. bytes) to encode, and
string.prefix_length returns the length of the prefix it finds
in terms of code units, not characters. That is why N comes out as
exactly twice the number of tildes.

The normal ASCII tilde character is U+007E TILDE (~) which takes
one byte to encode in UTF-8.
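
The doubling can be checked outside Mercury. Below is a minimal Python sketch of the same arithmetic. It assumes UTF-8 string encoding, and the helper name tilde_prefix is made up for illustration; it is not part of any Mercury API. It counts the leading tildes once as characters and once as UTF-8 code units.

```python
SMALL_TILDE = "\u02dc"  # U+02DC SMALL TILDE, two bytes in UTF-8
ASCII_TILDE = "\u007e"  # U+007E TILDE, one byte in UTF-8


def tilde_prefix(s: str, tilde: str) -> tuple[int, int]:
    """Return (characters, UTF-8 code units) of the leading run of `tilde`."""
    chars = 0
    for ch in s:
        if ch != tilde:
            break
        chars += 1
    # Encode only the matching prefix and count its bytes.
    units = len(s[:chars].encode("utf-8"))
    return chars, units


print(tilde_prefix(SMALL_TILDE * 3 + " code", SMALL_TILDE))  # (3, 6)
print(tilde_prefix(ASCII_TILDE * 3 + " code", ASCII_TILDE))  # (3, 3)
```

The second number in each pair is the one that corresponds to a length measured in code units: three SMALL TILDE characters give 6, while three ASCII tildes give 3.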

Peter

