[m-rev.] for review: utf-8 improvements
Julien Fischer
juliensf at csse.unimelb.edu.au
Mon Mar 26 14:58:26 AEDT 2012
On Mon, 26 Mar 2012, Peter Wang wrote:
> If necessary I'll submit just the bug fixes separately for 11.07.
I don't see any reason why the whole diff shouldn't go onto the 11.07
branch.
> ---
>
> Branches: main, 11.07
>
> Optimise some UTF-8 routines in C grades and fix a few bugs.
>
> library/string.m:
> Avoid function calls in unsafe_index, unsafe_index_next, and
> unsafe_prev_index in the ASCII case.
>
> Handle illegal code unit at start of string in first_char(in, uo, in)
> and first_char(in, uo, uo) modes.
>
> runtime/mercury_string.c:
> runtime/mercury_string.h:
> Fix a bug where MR_utf8_next would not advance from pos 0. Fortunately
> MR_utf8_next is only rarely called, to skip past illegal code units.
>
> Delete redundant initial test in MR_utf8_prev.
>
> Add MR_utf8_get_mb to extract multibyte code points only.
> Unroll a loop.
>
> Add MR_utf8_get_next_mb to extract multibyte code points only.
>
> Make MR_utf8_prev_get avoid an extra function call in the ASCII case.
>
> Use MR_Integer consistently for string offsets instead of int.
That looks fine.
Julien.
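
The ASCII fast path described in the log above can be illustrated with a
minimal sketch in C. All names here (sketch_get_next, sketch_get_next_mb)
are hypothetical and the code is not taken from runtime/mercury_string.c.
It assumes a NUL-terminated UTF-8 string and a signed offset type, and it
returns -1 without advancing when it meets an illegal code unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
** Slow path (hypothetical): decode a multibyte sequence at s[*pos].
** On success, return the code point and advance *pos past it.
** On an ill-formed sequence, return -1 and leave *pos unchanged.
*/
static int32_t
sketch_get_next_mb(const char *s, ptrdiff_t *pos)
{
    const unsigned char *p;
    int32_t             c;
    int32_t             min;    /* smallest code point for this length */
    int                 extra;  /* number of continuation bytes */
    int                 i;

    assert(s != NULL && pos != NULL && *pos >= 0);
    p = (const unsigned char *) s + *pos;

    if ((p[0] & 0xE0) == 0xC0) {
        c = p[0] & 0x1F; extra = 1; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {
        c = p[0] & 0x0F; extra = 2; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {
        c = p[0] & 0x07; extra = 3; min = 0x10000;
    } else {
        return -1;      /* not a valid multibyte lead byte */
    }

    for (i = 1; i <= extra; i++) {
        /* A NUL terminator fails this test, so we never read past it. */
        if ((p[i] & 0xC0) != 0x80) {
            return -1;
        }
        c = (c << 6) | (p[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and out-of-range values. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        return -1;
    }

    *pos += extra + 1;
    return c;
}

/*
** Fast path (hypothetical): an ASCII code unit is handled inline with a
** single comparison; only non-ASCII input reaches the slow path.
** The caller is expected to check for the end of the string first.
*/
static inline int32_t
sketch_get_next(const char *s, ptrdiff_t *pos)
{
    unsigned char b = (unsigned char) s[*pos];

    if (b < 0x80) {
        (*pos)++;
        return b;
    }
    return sketch_get_next_mb(s, pos);
}
```

With this split, a caller scanning mostly-ASCII text pays for one
comparison per code unit, and the multibyte decoder is only entered when
the high bit is set. A caller that receives -1 has to step over the
illegal code unit itself, since the offset is left where it was.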