[m-rev.] for review: Make generic versions of string.to_upper/lower preserve ill-formed sequences.
Mark Brown
mark at mercurylang.org
Tue Nov 5 02:08:45 AEDT 2019
This looks fine.
On Mon, Nov 4, 2019 at 4:52 PM Peter Wang <novalazy at gmail.com> wrote:
>
> library/string.m:
> Make generic implementations of string.to_upper and string.to_lower
> preserve ill-formed sequences. (The foreign language implementations
> already did so.)
> ---
> library/string.m | 38 ++++++++++++++++----------------------
> 1 file changed, 16 insertions(+), 22 deletions(-)
>
> diff --git a/library/string.m b/library/string.m
> index 275a49728..c65e857f8 100644
> --- a/library/string.m
> +++ b/library/string.m
> @@ -4468,11 +4468,15 @@ to_upper(S1) = S2 :-
> :- pragma promise_equivalent_clauses(to_upper/2).
>
> to_upper(StrIn::in, StrOut::uo) :-
> -    % XXX ILSEQ to_char_list and from_char_list cannot handle ill-formed
> +    % Use to_code_unit_list instead of to_char_list to preserve ill-formed
>      % sequences.
> -    to_char_list(StrIn, List),
> -    char_list_to_upper(List, ListUpp),
> -    from_char_list(ListUpp, StrOut).
> +    to_code_unit_list(StrIn, CodeList0),
> +    list.map(to_upper_code_unit, CodeList0, CodeList),
> +    ( if from_code_unit_list_allow_ill_formed(CodeList, StrPrime) then
> +        StrOut = StrPrime
> +    else
> +        unexpected($pred, "string.from_code_unit_list_allow_ill_formed failed")
> +    ).
>
> to_upper(X::in, Y::in) :-
>      length(X, LenX),
> @@ -4525,13 +4529,6 @@ to_upper(X::in, Y::in) :-
> StrOut = new String(cs);
> ").
>
> -:- pred char_list_to_upper(list(char)::in, list(char)::out) is det.
> -
> -char_list_to_upper([], []).
> -char_list_to_upper([X | Xs], [Y | Ys]) :-
> -    char.to_upper(X, Y),
> -    char_list_to_upper(Xs, Ys).
> -
> :- pred check_upper_loop(string::in, string::in, int::in, int::in) is semidet.
>
> check_upper_loop(X, Y, Index, End) :-
> @@ -4564,11 +4561,15 @@ to_lower(S1) = S2 :-
> :- pragma promise_equivalent_clauses(to_lower/2).
>
> to_lower(StrIn::in, StrOut::uo) :-
> -    % XXX ILSEQ to_char_list and from_char_list cannot handle ill-formed
> +    % Use to_code_unit_list instead of to_char_list to preserve ill-formed
>      % sequences.
> -    to_char_list(StrIn, List),
> -    char_list_to_lower(List, ListLow),
> -    from_char_list(ListLow, StrOut).
> +    to_code_unit_list(StrIn, CodeList0),
> +    list.map(to_lower_code_unit, CodeList0, CodeList),
> +    ( if from_code_unit_list_allow_ill_formed(CodeList, StrPrime) then
> +        StrOut = StrPrime
> +    else
> +        unexpected($pred, "string.from_code_unit_list_allow_ill_formed failed")
> +    ).
>
> to_lower(X::in, Y::in) :-
>      length(X, LenX),
> @@ -4621,13 +4622,6 @@ to_lower(X::in, Y::in) :-
> StrOut = new String(cs);
> ").
>
> -:- pred char_list_to_lower(list(char)::in, list(char)::out) is det.
> -
> -char_list_to_lower([], []).
> -char_list_to_lower([X | Xs], [Y | Ys]) :-
> -    char.to_lower(X, Y),
> -    char_list_to_lower(Xs, Ys).
> -
> :- pred check_lower_loop(string::in, string::in, int::in, int::in) is semidet.
>
> check_lower_loop(X, Y, Index, End) :-
> --
> 2.23.0
>
> _______________________________________________
> reviews mailing list
> reviews at lists.mercurylang.org
> https://lists.mercurylang.org/listinfo/reviews
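For readers following the patch, here is a minimal standalone sketch of the code-unit approach it switches to. It is not the library code: the module name ilseq_upper_demo and the helper ascii_upper_code_unit are hypothetical stand-ins (the patch uses a helper named to_upper_code_unit whose definition is not shown in the diff), and the example assumes a grade where strings are UTF-8, so that 0xff is a code unit that cannot occur in a well-formed string.

```mercury
:- module ilseq_upper_demo.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

main(!IO) :-
    % 'a', an ill-formed code unit, 'b' (assumes UTF-8 string encoding).
    Units0 = [0x61, 0xff, 0x62],
    ( if
        string.from_code_unit_list_allow_ill_formed(Units0, Str0),
        % Same shape as the patched clause: split into code units,
        % map each unit, then rebuild without validating the result.
        string.to_code_unit_list(Str0, Units1),
        list.map(ascii_upper_code_unit, Units1, Units2),
        string.from_code_unit_list_allow_ill_formed(Units2, Str)
    then
        % Print the code units of the result so the ill-formed unit
        % can be inspected directly.
        string.to_code_unit_list(Str, Units),
        io.write(Units, !IO),
        io.nl(!IO)
    else
        io.write_string("conversion failed\n", !IO)
    ).

    % Hypothetical helper: upper-case ASCII 'a'..'z' and leave every
    % other code unit (including ill-formed ones) unchanged.
    %
:- pred ascii_upper_code_unit(int::in, int::out) is det.

ascii_upper_code_unit(Unit0, Unit) :-
    ( if Unit0 >= 0x61, Unit0 =< 0x7a then
        Unit = Unit0 - 0x20
    else
        Unit = Unit0
    ).
```

Because the mapping only touches code units in the ASCII lower-case range, any unit that is not part of a valid encoding passes through untouched, which is the property the patch is after. A char-list round trip cannot offer this, as the removed XXX ILSEQ comment notes.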