[m-users.] Confused by action of string.prefix_length
Peter Wang
novalazy at gmail.com
Wed Jun 8 11:13:57 AEST 2022
On Tue, 07 Jun 2022 22:47:59 +0100 "Sean Charles (emacstheviking)" <objitsu at gmail.com> wrote:
> I have this code:
>
> :- pred pd_fence(string::in, int::out) is semidet.
>
> pd_fence(S, N) :-
>     N = string.prefix_length(is_tilde, S),
>     N > 2,
>     trace [io(!Dbg)]
>     (
>         io.format("N is %i for %s\n", [i(N), s(S)], !Dbg)
>     ).
>
> And when I run it with some test strings I get this output. The values seem to be exactly twice the number of tilde characters and I don't understand why!
The tildes in your email are U+02DC SMALL TILDE (˜).
Each U+02DC takes two UTF-8 code units (i.e. bytes) to encode, and
string.prefix_length returns the length of the prefix it finds
in terms of code units.
The normal ASCII tilde character is U+007E TILDE (~) which takes
one byte to encode in UTF-8.
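
To see the difference directly, here is a minimal sketch rather than a
definitive program. Your is_tilde/1 was not shown, so the version below is
a hypothetical stand-in that accepts both characters. The module name
tilde_demo and the helper report/3 are likewise made up for illustration.
It assumes a C grade (where strings are UTF-8) and a standard library
recent enough to provide string.count_code_points.

```mercury
:- module tilde_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char.
:- import_module list.
:- import_module string.

    % Hypothetical stand-in for the original is_tilde/1 (not shown in the
    % post): succeeds for U+007E TILDE and for U+02DC SMALL TILDE.
:- pred is_tilde(char::in) is semidet.

is_tilde(C) :-
    ( C = '~'
    ; char.to_int(C) = 0x02DC
    ).

    % Print the prefix length in code units (what string.prefix_length
    % returns) next to the number of code points in that same prefix.
:- pred report(string::in, io::di, io::uo) is det.

report(S, !IO) :-
    Units = string.prefix_length(is_tilde, S),
    Prefix = string.left(S, Units),             % Units counts code units
    Points = string.count_code_points(Prefix),
    io.format("code units: %d, code points: %d\n",
        [i(Units), i(Points)], !IO).

main(!IO) :-
    report("~~~ ascii tildes", !IO),
    report("\u02dc\u02dc\u02dc small tildes", !IO).
```

In a C grade the first line should report equal counts, while the second
should report twice as many code units as code points. If you want N to
count characters, use the code point count of the prefix rather than the
value returned by string.prefix_length.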
Peter