[m-rev.] for review: better lookahead for switch detection
Julien Fischer
jfischer at opturion.com
Sat Nov 15 01:07:52 AEDT 2025
On Fri, 14 Nov 2025 at 18:50, Zoltan Somogyi <zoltan.somogyi at runbox.com> wrote:
> Let switch detection handle deeper disjunctions.
>
> compiler/switch_detection.m:
> The existing switch detection algorithm does a single forward
> traversal of the procedure body, doing only very limited lookahead.
> This prevents it from recognizing some switches. To fix this, add
> a new prepass that provides unlimited lookahead.
>
> Add one use of this lookahead information. Later diffs should add
> more uses.
>
> compiler/print_help.m:
> Fix the code that served as motivation for this change.
> It was a block of unifications that was intended to supply
> the lookahead that the old algorithm needed but did not have,
> but it actually lied: it said that the the following disjunction
> covered cons_ids that it actually did not cover. The fact that
> the compiler did not diagnose this lie until now was a bug.
> (After this diff, the compiler now reports an error.)
>
> deep_profiler/message.m:
> Avoid the limitation of the new algorithm that was discussed today
> on m-dev.
...
> diff --git a/compiler/switch_detection.m b/compiler/switch_detection.m
> index 458ed7193..cdceef692 100644
> --- a/compiler/switch_detection.m
> +++ b/compiler/switch_detection.m
...
> @@ -1583,6 +1696,837 @@ count_covered_cons_ids([Case | Cases]) = CaseCount + CasesCount :-
> CaseCount = 1 + list.length(OtherConsIds),
> CasesCount = count_covered_cons_ids(Cases).
>
> +%---------------------------------------------------------------------------%
> +%---------------------------------------------------------------------------%
> +%
> +% The data structures we use in scout_disjunctions_in_goal,
Finish this first sentence here ...
> which is
> +% a pre-pass before the main procedure body traversal that actually decides
> +% which disjunctions can be turned into switches, and they can possibly
... and turn this into a separate sentence.
Also: *if* they can possibly be turned ...
> +% be turned into more than one kind of switch, which one we should choose.
> +% The result of scouting is information about the terrain ahead, which
> +% in this case means information about deconstruction unifications
> +% and disjunctions that the main traversal has not yet seen.
> +%
> +% At the moment, we use scouting results at only point in the main traversal.
> +% However, this may change in the future.
> +%
> +
> +:- type scout_disj_info
> + ---> scout_disj_info(
> + % Conceptually, both of these are read-only, though in
> + % actuality, we update module_info when we handle cases
> + % inside switches.
> + sdi_module_info :: module_info,
> + sdi_var_table :: var_table,
> +
> + % These are the data structures we are constructing.
> + % The one we really want is the disjunction_info_map;
> + % we build the disjunct_info_map as a stepping stone to it.
> + sdi_disjunction_info_map :: disjunction_info_map,
> + sdi_disjunct_info_map :: disjunct_info_map
> + ).
> +
> +%---------------------%
> +
> + % Maps the id of a disjunction to information about that disjunction.
> +:- type disjunction_info_map == map(disjunction_id, disjunction_info).
> +
> + % Map the id of a disjunct to information about that disjunct.
> +:- type disjunct_info_map == map(disjunct_id, disjunct_info).
> +
> +%---------------------%
> +
> + % The type whose values we use to identify a disjunction.
> + % The goal specified by the given goal_id will be a disj(...) goal.
> +:- type disjunction_id
> + ---> disjunction_id(goal_id).
> +
> + % The type whose values we use to identify a disjunct in a disjunction.
> + % The goal specified by the given goal_id will have a disj(...) goal
> + % as its immediate parent.
> +:- type disjunct_id
> + ---> disjunct_id(goal_id).
> +
> +%---------------------%
> +
> + % The information that scouting finds about a disjunction.
> + % There should be one of these in the disjunction_info_map
> + % for every disjunction in the procedure body.
> +:- type disjunction_info
> + ---> disjunction_info(
> + dni_arms :: one_or_more(disjunct_id_info),
> + % The list of disjuncts in the disjunction.
> + % This field is not yet used.
> +
> + dni_summary_map :: all_arms_summary_map
> + ).
> +
> +:- type all_arms_summary_map == map(prog_var, var_all_arms_summary).
> +
> +:- type var_all_arms_summary
> + ---> var_all_arms_summary(
> + set(switchable_cons_id),
> + is_sub_disj_needed
> + ).
> +
> +:- type one_arm_summary_map == map(prog_var, var_one_arm_summary).
> +
> +:- type var_one_arm_summary
> + ---> voas_deconstruct(deconstruct_info)
> + ; voas_sub_disjunction(var_all_arms_summary).
Document the types related to arms here.
> +:- type is_sub_disj_needed
> + ---> sub_disj_is_not_needed
> + ; sub_disj_is_needed.
> +
> +:- type switchable_cons_id =< cons_id
> + ---> du_data_ctor(du_ctor)
> + ; some_int_const(some_int_const)
> + ; float_const(float)
> + ; char_const(char)
> + ; string_const(string).
> +
> +%---------------------%
> +
> +:- type maybe_in_zone
> + ---> in_zone(disjunct_id)
> + % We are in one of the disjuncts of a disjunction; the argument
> + % specifies the disjunct. And we are within the initial sequence
> + % of unifications within that disjunct. (We treat calls from the
> + % clause head as unifications for this purpose.)
> + %
> + % As soon as we leave this initial part of a disjunct,
> + % we switch to not_in_zone. The only deconstruction unifications
> + % we consider for switch detection are the ones that occur
> + % "in the zone".
> + %
> + % We originally adopted this rule to reduce the cost (in compile
> + % time) of searching for deconstruction unifications that denote
> + % switch arms. However, converting such a deconstruction
> + % unification into the test for a switch arm can also change
> + % the order execution of the disjunct's conjuncts, and restricting
> + % the reordering to happend only among unifications eliminates
s/happend/happen/
> + % any concerns about changing the operational semantics of the
> + % procedure in terms of exceptions being raised or nontermination
> + % being introduced. (Function calls from clause heads do not have
> + % a clearly specified order with respect to goals in the clause
> + % body, which is why we allow reordering with respect to them.)
> + %
> + % The effect on compile times is no longer meaningful, but the
> + % effect on operational semantics is still relevant.
> + %
> + % There is also an ergonomic argument here: requiring unifications
> + % that effectively serve as case constants in C switch statements
> + % to be near the start of their switch arms keeps code readable,
> + % compared to a hypothetical alternative arrangement in which
> + % we allow unifications from the ends of possibly-long disjuncts
> + % to provide the cons_id that identifies a switch arm.
> + ; not_in_zone.
> +
> +:- type disjunct_id_info
> + ---> disjunct_id_info(disjunct_id, disjunct_info).
> +
> + % The information that scouting finds about a disjunct.
> + % There should be one of these in the disjunct_info_map
> + % for every disjunct in the procedure body.
> +:- type disjunct_info
> + ---> disjunct_info(
> + di_iz_deconstruct_map :: in_zone_deconstruct_map,
> + % We record info about at most one disjunction, since
> + % a disjunction ends the zone.
> + di_iz_sub_disjunctions :: maybe(disjunction_id)
> + ).
> +
> + % Note that we map not just the deconstructed variable
> + % to a deconstruct_info, but also all variables equivalent to it.
> +:- type in_zone_deconstruct_map == map(prog_var, deconstruct_info).
> +
> +:- type deconstruct_info
> + ---> deconstruct_info(
> + % This goal ...
> + goal_id,
> +
> + % ... deconstructs this variable ...
> + prog_var,
> +
> + % ... which is part of this equivalence class ...
> + set(prog_var),
> +
> + % ... with this cons_id.
> + switchable_cons_id
> + ).
> +
> +%---------------------------------------------------------------------------%
> +
> +:- pred scout_disjunctions_in_goal(hlds_goal::in, instmap::in, instmap::out,
> + maybe_in_zone::in, maybe_in_zone::out, subst_db::in, subst_db::out,
> + scout_disj_info::in, scout_disj_info::out) is det.
> +
> +scout_disjunctions_in_goal(Goal, InstMap0, InstMap,
> + !InZone, !SubstDb, !ScoutInfo) :-
> + Goal = hlds_goal(GoalExpr, GoalInfo),
> + (
> + GoalExpr = unify(_, _, _, _, _),
> + scout_disjunctions_in_unify_expr(GoalExpr, GoalInfo, InstMap0,
> + !.InZone, !SubstDb, !ScoutInfo)
> + ;
> + ( GoalExpr = generic_call(_, _, _, _, _)
> + ; GoalExpr = plain_call(_, _, _, _, _, _)
> + ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
> + ),
> + ( if goal_info_has_feature(GoalInfo, feature_from_head) then
> + true
> + else
> + !:InZone = not_in_zone
> + )
> + ;
> + GoalExpr = conj(ConjType, Conjuncts),
> + (
> + ConjType = plain_conj,
> + scout_disjunctions_in_conjuncts(Conjuncts, InstMap0,
> + !InZone, !SubstDb, !ScoutInfo)
> + ;
> + ConjType = parallel_conj,
> + (
> + Conjuncts = []
> + ;
> + Conjuncts = [HeadConjunct | TailConjuncts],
> + % The first parallel conjunct can be in the zone;
> + % any later conjuncts cannot be in the zone.
> + scout_disjunctions_in_goal(HeadConjunct, InstMap0, InstMap1,
> + !.InZone, _, !SubstDb, !ScoutInfo),
> + scout_disjunctions_in_conjuncts(TailConjuncts, InstMap1,
> + not_in_zone, _, !SubstDb, !ScoutInfo),
> + !:InZone = not_in_zone
> + )
> + )
> + ;
> + GoalExpr = disj(Disjuncts),
> + (
> + Disjuncts = []
> + ;
> + Disjuncts = [HeadDisjunct | TailDisjuncts],
> + scout_disjunctions_in_disjuncts(HeadDisjunct, TailDisjuncts,
> + InstMap0, !.SubstDb,
> + HeadDisjunctIdInfo, TailDisjunctIdInfos, !ScoutInfo),
> +
> + OoMDisjunctIdsInfos =
> + one_or_more(HeadDisjunctIdInfo, TailDisjunctIdInfos),
> + construct_disjunction_info(!.ScoutInfo, OoMDisjunctIdsInfos,
> + DisjunctionInfo),
> +
> + DisjunctionGoalId = goal_info_get_goal_id(GoalInfo),
> + DisjunctionId = disjunction_id(DisjunctionGoalId),
> + DisjunctionInfoMap0 = !.ScoutInfo ^ sdi_disjunction_info_map,
> + map.det_insert(DisjunctionId, DisjunctionInfo,
> + DisjunctionInfoMap0, DisjunctionInfoMap),
> + !ScoutInfo ^ sdi_disjunction_info_map := DisjunctionInfoMap,
> +
> + (
> + !.InZone = in_zone(DisjunctId),
> + DisjunctInfoMap0 = !.ScoutInfo ^ sdi_disjunct_info_map,
> + map.lookup(DisjunctInfoMap0, DisjunctId, DisjunctInfo0),
> + DisjunctInfo0 =
> + disjunct_info(DeconstructMap0, SubDisjunctions0),
> + expect(unify(SubDisjunctions0, no), $pred,
> + "SubDisjunctions0 != no"),
> + SubDisjunctions = yes(DisjunctionId),
> + DisjunctInfo = disjunct_info(DeconstructMap0, SubDisjunctions),
> + map.det_update(DisjunctId, DisjunctInfo,
> + DisjunctInfoMap0, DisjunctInfoMap),
> + !ScoutInfo ^ sdi_disjunct_info_map := DisjunctInfoMap
> + ;
> + !.InZone = not_in_zone
> + ),
> + !:InZone = not_in_zone
> + )
> + ;
> + GoalExpr = switch(Var, _CanFail, Cases),
> + scout_disjunctions_in_cases(Var, Cases, InstMap0,
> + !.SubstDb, !ScoutInfo),
> + !:InZone = not_in_zone
> + ;
> + GoalExpr = if_then_else(_Vars, Cond, Then, Else),
> + scout_disjunctions_in_goal(Cond, InstMap0, InstMap1,
> + not_in_zone, _, !.SubstDb, SubstDbCond, !ScoutInfo),
> + scout_disjunctions_in_goal(Then, InstMap1, _,
> + not_in_zone, _, SubstDbCond, _, !ScoutInfo),
> + scout_disjunctions_in_goal(Else, InstMap0, _,
> + not_in_zone, _, !.SubstDb, _, !ScoutInfo),
> + !:InZone = not_in_zone
> + ;
> + GoalExpr = negation(SubGoal),
> + scout_disjunctions_in_goal(SubGoal, InstMap0, _,
> + not_in_zone, _, !.SubstDb, _, !ScoutInfo),
> + !:InZone = not_in_zone
> + ;
> + GoalExpr = scope(Reason, SubGoal),
> + (
> + Reason = from_ground_term(_, FgtKind),
> + (
> + FgtKind = from_ground_term_deconstruct,
> + SubGoal = hlds_goal(SubGoalExpr, _),
> + ( if
> + SubGoalExpr = conj(plain_conj, [HeadSubGoal | _]),
> + HeadSubGoal = hlds_goal(HeadSubGoalExpr, HeadSubGoalInfo),
> + HeadSubGoalExpr = unify(_, _, _, _, _)
> + then
> + scout_disjunctions_in_unify_expr(HeadSubGoalExpr,
> + HeadSubGoalInfo, InstMap0, !.InZone,
> + !SubstDb, !ScoutInfo)
> + % Ignore the goals after HeadSubGoal; nothing in them
> + % could interest us.
> + else
> + unexpected($pred, "unexpected goal in fgt scope")
> + )
> + ;
> + ( FgtKind = from_ground_term_initial
> + ; FgtKind = from_ground_term_construct
> + ; FgtKind = from_ground_term_other
> + )
> + % Ignore the scope; nothing in it could interest us.
> + )
> + ;
> + ( Reason = exist_quant(_, _)
> + ; Reason = disable_warnings(_, _)
> + ; Reason = promise_solutions(_, _)
> + ; Reason = promise_purity(_)
> + ; Reason = require_detism(_)
> + ; Reason = commit(_)
> + ; Reason = barrier(_)
> + ; Reason = trace_goal(_, _, _, _, _)
> + ; Reason = loop_control(_, _, _)
> + ),
> + scout_disjunctions_in_goal(SubGoal, InstMap0, _,
> + !.InZone, _, !.SubstDb, _, !ScoutInfo)
> + ;
> + ( Reason = require_complete_switch(_RequiredVar)
> + ; Reason = require_switch_arms_detism(_RequiredVar, _)
> + ),
> + scout_disjunctions_in_goal(SubGoal, InstMap0, _,
> + not_in_zone, !:InZone, !.SubstDb, _, !ScoutInfo),
> + expect(unify(!.InZone, not_in_zone), $pred,
> + "in_zone after switch-related reason")
> + )
> + ;
> + GoalExpr = shorthand(ShortHand),
> + (
> + ShortHand = atomic_goal(_GoalType, _Outer, _Inner,
> + _MaybeOutputVars, MainGoal, OrElseGoals, _OrElseInners),
> + scout_disjunctions_in_goal(MainGoal, InstMap0, _,
> + not_in_zone, _, !.SubstDb, _, !ScoutInfo),
> + scout_disjunctions_in_orelse_goals(OrElseGoals, InstMap0,
> + !.SubstDb, !ScoutInfo),
> + !:InZone = not_in_zone
> + ;
> + ShortHand = try_goal(_MaybeIO, _ResultVar, SubGoal),
> + scout_disjunctions_in_goal(SubGoal, InstMap0, _,
> + not_in_zone, _, !.SubstDb, _, !ScoutInfo),
> + !:InZone = not_in_zone
> + ;
> + ShortHand = bi_implication(_, _),
> + % These should have been expanded out by now.
> + unexpected($pred, "bi_implication")
> + )
> + ),
> + apply_goal_instmap_delta(Goal, InstMap0, InstMap).
> +
> +:- pred scout_disjunctions_in_unify_expr(hlds_goal_expr::in(goal_expr_unify),
> + hlds_goal_info::in, instmap::in, maybe_in_zone::in,
> + subst_db::in, subst_db::out,
> + scout_disj_info::in, scout_disj_info::out) is det.
> +
> +scout_disjunctions_in_unify_expr(GoalExpr, GoalInfo, InstMap0,
> + InZone0, !SubstDb, !ScoutInfo) :-
> + GoalExpr = unify(XVar, RHS, _, Unification, _),
> + % For both rhs_var and rhs_functor, we record the effect of the
> + % unification on the substitution database even when we are
> + % outside the zone. This extra info won't help us find more
> + % aliases for in-zone deconstructions this disjunct (since there
*in* this disjunct
> + % aren't any more past the end of the zone), but the extra information
> + % in the substitution database may help us find more aliases inside
> + % nested disjunctions.
> + (
> + RHS = rhs_lambda_goal(_, _, _, _, VarsModes, _, LambdaGoal),
> + % We need to insert the initial insts for the lambda variables
> + % in the instmap before processing the lambda goal.
s/in/into/
> + ModuleInfo = !.ScoutInfo ^ sdi_module_info,
> + instmap.pre_lambda_update(ModuleInfo, VarsModes, InstMap0, InstMap1),
> + % LambdaGoal is may be in_zone from the point of view of the code
Delete "is".
> + % outside this unification, but the proper perspective for
> + % this call is the code *inside* the lambda goal. And from that
> + % point of view, LambdaGoal is not inside *any* disjunction.
> + scout_disjunctions_in_goal(LambdaGoal, InstMap1, _,
> + not_in_zone, _, !.SubstDb, _, !ScoutInfo)
> + ;
> + RHS = rhs_var(YVar),
> + record_var_var_unify(XVar, YVar, !SubstDb)
> + ;
> + RHS = rhs_functor(ConsId, _IsExistConstr, YVars),
> + (
> + ( ConsId = du_data_ctor(_)
> + ; ConsId = some_int_const(_)
> + ; ConsId = float_const(_)
> + ; ConsId = char_const(_)
> + ; ConsId = string_const(_)
> + ),
> + (
> + Unification = assign(_, _),
> + unexpected($pred, "assign")
> + ;
> + Unification = simple_test(_, _),
> + unexpected($pred, "simple_test")
> + ;
> + Unification = construct(_, _, _, _, _, _, _)
> + ;
> + Unification = deconstruct(_, _, _, _, _, _),
> + (
> + InZone0 = in_zone(DisjunctId),
> + GoalId = goal_info_get_goal_id(GoalInfo),
> + record_deconstruct(GoalId, XVar, coerce(ConsId),
> + !.SubstDb, DisjunctId, !ScoutInfo)
> + ;
> + InZone0 = not_in_zone
> + )
> + ;
> + Unification = complicated_unify(_, _, _),
> + unexpected($pred, "test")
s/test/complicated_unify/
That looks fine otherwise.
Julien.
More information about the reviews
mailing list