[m-rev.] for review: Avoid intermediate strings in string.replace_all.

Mark Brown mark at mercurylang.org
Thu Nov 7 20:59:41 AEDT 2019


This looks fine.

On Thu, Nov 7, 2019 at 3:22 PM Peter Wang <novalazy at gmail.com> wrote:
>
> library/string.m:
>     Implement string.replace_all using unsafe_append_string_pieces to
>     avoid intermediate strings. Use unsafe_sub_string_search_start to
>     avoid repeated range checks.
> ---
>  library/string.m | 36 +++++++++++++++++++-----------------
>  1 file changed, 19 insertions(+), 17 deletions(-)
>
> diff --git a/library/string.m b/library/string.m
> index 95d944926..149047677 100644
> --- a/library/string.m
> +++ b/library/string.m
> @@ -4826,17 +4826,19 @@ replace(Str, Pat, Subst, Result) :-
>      ],
>      unsafe_append_string_pieces(Pieces, Result).
>
> -replace_all(S1, S2, S3) = S4 :-
> -    replace_all(S1, S2, S3, S4).
> +replace_all(Str, Pat, Subst) = Result :-
> +    replace_all(Str, Pat, Subst, Result).
>
>  replace_all(Str, Pat, Subst, Result) :-
>      ( if Pat = "" then
>          replace_all_empty_pat(Str, Subst, Result)
>      else
> -        PatLength = length(Pat),
> -        replace_all_loop(Str, Pat, Subst, PatLength, 0, [], ReversedChunks),
> -        Chunks = list.reverse(ReversedChunks),
> -        Result = append_list(Chunks)
> +        % Using substring instead of string avoids two calls to string.length
> +        % every time that SubstPiece appears in Pieces.
> +        SubstPiece = substring(Subst, 0, length(Subst)),
> +        replace_all_loop(Str, Pat, length(Pat), SubstPiece, 0, [], RevPieces),
> +        list.reverse(RevPieces, Pieces),
> +        unsafe_append_string_pieces(Pieces, Result)
>      ).
>
>  :- pred replace_all_empty_pat(string::in, string::in, string::uo) is det.
> @@ -4883,19 +4885,19 @@ prepend_code_units(Str, FirstIndex, Index, Codes0, Codes) :-
>          prepend_code_units(Str, FirstIndex, Index - 1, Codes1, Codes)
>      ).
>
> -:- pred replace_all_loop(string::in, string::in, string::in,
> -    int::in, int::in, list(string)::in, list(string)::out) is det.
> +:- pred replace_all_loop(string::in, string::in, int::in, string_piece::in,
> +    int::in, list(string_piece)::in, list(string_piece)::out) is det.
>
> -replace_all_loop(Str, Pat, Subst, PatLength, BeginAt,
> -        RevChunks0, RevChunks) :-
> -    ( if sub_string_search_start(Str, Pat, BeginAt, Index) then
> -        Initial = unsafe_between(Str, BeginAt, Index),
> -        Start = Index + PatLength,
> -        replace_all_loop(Str, Pat, Subst, PatLength, Start,
> -            [Subst, Initial | RevChunks0], RevChunks)
> +replace_all_loop(Str, Pat, PatLength, SubstPiece, BeginAt,
> +        RevPieces0, RevPieces) :-
> +    ( if unsafe_sub_string_search_start(Str, Pat, BeginAt, PatStart) then
> +        InitialPiece = substring(Str, BeginAt, PatStart),
> +        RevPieces1 = [SubstPiece, InitialPiece | RevPieces0],
> +        replace_all_loop(Str, Pat, PatLength, SubstPiece, PatStart + PatLength,
> +            RevPieces1, RevPieces)
>      else
> -        EndString = unsafe_between(Str, BeginAt, length(Str)),
> -        RevChunks = [EndString | RevChunks0]
> +        TailPiece = substring(Str, BeginAt, length(Str)),
> +        RevPieces = [TailPiece | RevPieces0]
>      ).
>
>  %---------------------%
> --
> 2.23.0
>
> _______________________________________________
> reviews mailing list
> reviews at lists.mercurylang.org
> https://lists.mercurylang.org/listinfo/reviews


More information about the reviews mailing list