[m-rev.] for post-commit review: optimize more computed_gotos
Peter Wang
novalazy at gmail.com
Mon Jul 15 11:49:53 AEST 2024
On Mon, 15 Jul 2024 01:10:16 +1000 "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> Add a guard to an optimization of computed gotos.
>
> compiler/llds.m:
> Add a field to the computed_goto LLDS instruction that specifies
> the maximum value the index rval, if that maximum is known.
>
> compiler/peephole.m:
> I recently add an optimization that replaces a computed_goto that has
> two possible targets with an if-then-else, with the condition being
> a bitmap in which the index rval selects a bit position. As Peter
> pointed out, this is unsafe if the index rval can exceed the length
> of the bitmap, which is 32 bits, so apply this optimization only if
> (a) the maximum possible value of the index rval is known, and
> (b) it is less than 32.
>
> In cases where this extra test makes that optimization inapplicable,
> try to apply a simple X = val1 or X = val2 test.
> diff --git a/compiler/llds.m b/compiler/llds.m
> index 5beac50b5..86a83695e 100644
> --- a/compiler/llds.m
> +++ b/compiler/llds.m
> @@ -363,8 +363,8 @@
> % do_redo, etc can get optimized into invocations of the macros
> % fail(), redo(), etc.
>
> - ; computed_goto(rval, list(maybe(label)))
> - % Evaluate rval, which should be an integer, and jump to the
> + ; computed_goto(rval, maybe(int), list(maybe(label)))
> + % Evaluate the rval, which should be an integer, and jump to the
> % (rval+1)th label in the list. e.g. computed_goto(2, [A, B, C, D])
> % will branch to label C. A label that isn't there implicitly means
> % "not reached".
The comment doesn't mention the second argument.
> diff --git a/compiler/peephole.m b/compiler/peephole.m
> index 94ad9f510..71756c686 100644
> --- a/compiler/peephole.m
> +++ b/compiler/peephole.m
> @@ -193,12 +197,13 @@ peephole_match(InvalidPatterns, Instr0, Instrs0, Instrs) :-
> % can be replaced with a conditional branch followed by an unconditional
> % goto.
> %
> -:- pred peephole_match_computed_goto(rval::in, list(maybe(label))::in,
> - string::in, list(instruction)::in, list(instruction)::out) is semidet.
> -:- pragma inline(pred(peephole_match_computed_goto/5)).
> +:- pred peephole_match_computed_goto(rval::in, maybe(int)::in,
> + list(maybe(label))::in, string::in,
> + list(instruction)::in, list(instruction)::out) is semidet.
> +:- pragma inline(pred(peephole_match_computed_goto/6)).
>
> -peephole_match_computed_goto(SelectorRval, MaybeLabels, Comment0,
> - Instrs0, Instrs) :-
> +peephole_match_computed_goto(SelectorRval, MaybeMaxIndex, MaybeLabels,
> + Comment0, Instrs0, Instrs) :-
> build_computed_goto_target_map(MaybeLabels, 0,
> one_or_more_map.init, LabelMap),
> one_or_more_map.to_assoc_list(LabelMap, LabelValsList),
> @@ -208,7 +213,22 @@ peephole_match_computed_goto(SelectorRval, MaybeLabels, Comment0,
> GotoInstr = llds_instr(goto(code_label(Label)), Comment0),
> Instrs = [GotoInstr | Instrs0]
> else if
> - LabelValsList = [LabelValsA, LabelValsB]
> + LabelValsList = [LabelValsA, LabelValsB],
> + peephole_pick_fewer_vals_label(LabelValsA, LabelValsB,
> + FewerValsLabel, FewerOoMVals, OtherLabel),
> + FewerOoMVals = one_or_more(FirstVal, LaterVals),
> + ( if
> + MaybeMaxIndex = yes(MaxIndex),
> + MaxIndex < 32
Add a comment that 32 is the length of the bitmap, on all platforms.
> + then
> + Method = method_bitmap
> + else if
> + LaterVals = [LaterVal0]
> + then
> + Method = method_or(LaterVal0)
> + else
> + fail
> + )
> then
> % The only info we have about which label execution will go to
> % more often at runtime is the number of values associated with
> @@ -226,24 +246,31 @@ peephole_match_computed_goto(SelectorRval, MaybeLabels, Comment0,
> % of one conditional and one unconditional branch with just one
> % conditional branch effectively trumping the pipeline considerations
> % above.
> - peephole_pick_fewer_vals_label(LabelValsA, LabelValsB,
> - FewerValsLabel, FewerOoMVals, OtherLabel),
> - FewerOoMVals = one_or_more(FirstVal, LaterVals),
> (
> LaterVals = [],
> CondRval = binop(int_cmp(int_type_int, eq),
> SelectorRval, const(llconst_int(FirstVal)))
> ;
> LaterVals = [_ | _],
> - build_offset_mask([FirstVal | LaterVals], 0u, Mask),
> - SelectorRvalUint = cast(lt_int(int_type_uint), SelectorRval),
> - QueryRval = binop(
> - unchecked_left_shift(int_type_uint, shift_by_uint),
> - const(llconst_uint(1u)), SelectorRvalUint),
> - SelectedBitUintRval = binop(bitwise_and(int_type_uint),
> - const(llconst_uint(Mask)), QueryRval),
> - CondRval = binop(int_cmp(int_type_uint, ne),
> - SelectedBitUintRval, const(llconst_uint(0u)))
> + (
> + Method = method_bitmap,
> + build_offset_mask([FirstVal | LaterVals], 0u, Mask),
> + SelectorRvalUint = cast(lt_int(int_type_uint), SelectorRval),
> + QueryRval = binop(
> + unchecked_left_shift(int_type_uint, shift_by_uint),
> + const(llconst_uint(1u)), SelectorRvalUint),
> + SelectedBitUintRval = binop(bitwise_and(int_type_uint),
> + const(llconst_uint(Mask)), QueryRval),
> + CondRval = binop(int_cmp(int_type_uint, ne),
> + SelectedBitUintRval, const(llconst_uint(0u)))
Mention that SelectorRval =< MaxIndex, which is less than 32,
so we won't encounter undefined behaviour on the platforms we target.
> + ;
> + Method = method_or(LaterVal),
> + CondRvalA = binop(int_cmp(int_type_int, eq),
> + SelectorRval, const(llconst_int(FirstVal))),
> + CondRvalB = binop(int_cmp(int_type_int, eq),
> + SelectorRval, const(llconst_int(LaterVal))),
> + CondRval = binop(logical_or, CondRvalA, CondRvalB)
> + )
> ),
> CommentInstr = llds_instr(comment(Comment0), ""),
> BranchUinstr = if_val(CondRval, code_label(FewerValsLabel)),
> MaybeInstr = yes(Instr)
> ;
That looks fine, otherwise.
Peter
More information about the reviews
mailing list