[m-rev.] for review: Fix bug with typechecking of coercions.

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue Jul 2 14:51:47 AEST 2024


On 2024-07-01 16:52 +10:00 AEST, "Peter Wang" <novalazy at gmail.com> wrote:
> Fix a problem where the order that the coercions are typechecked
> can cause type variables to be bound differently.
> 
> For example, assuming citrus =< fruit:
> 
>     X = [] : list(T),
>     coerce(X) = _ : list(fruit),
>     coerce(X) = _ : list(citrus)
> 
> T could end up bound to either 'fruit' or 'citrus' depending on which
> coercion is checked first, due to unification of the type variable in
> the coerce argument type, and the target type of the coercion.

That seems like a *very* rare situation. Did you come across it "in the wild",
or did you construct it when you found something that made you go "hang on ..."?

>  :- pred type_assign_get_var_types(type_assign::in,
> diff --git a/compiler/typecheck.m b/compiler/typecheck.m
> index 74ed33ad8..73750e2b4 100644
> --- a/compiler/typecheck.m
> +++ b/compiler/typecheck.m
> @@ -1065,9 +1065,13 @@ report_coercion(TypeAssign, Coercion, !Info) :-
>          Status = need_to_check,
>          unexpected($pred, "need to check")
>      ;
> -        Status = unsatisfiable,
> +        Status = unsatisfiable_invalid,
>          Spec = report_invalid_coerce_from_to(ClauseContext, Context, TVarSet,
>              FromType, ToType)
> +    ;
> +        Status = unsatisfiable_unresolved,
> +        Spec = report_unresolved_coerce_from_to(ClauseContext, Context,
> +            TVarSet, FromType, ToType)
>      ;

As your function name suggests, "unsatisfiable_unresolved"
is not describing a unsatisfiable coercion; it is reporting a coercion
that can be satisfied in more than one way. I would rename that status
to either just "unresolved", or even better, to "not_yet_resolved",
since it can get resolved with more typechecking. This turns into an error
only if no more typechecking can be done.

> --- a/compiler/typecheck_clauses.m
> +++ b/compiler/typecheck_clauses.m

> @@ -2503,8 +2502,8 @@ compare_types_corresponding(TypeTable, TVarSet, Comparison,
>  
>  %---------------------------------------------------------------------------%
>  
> -    % Remove satisfied coerce constraints from each type assignment,
> -    % then drop any type assignments with unsatisfied coerce constraints
> +    % Check coerce constraints in each type assignment to see if they can be
> +    % satisfied. Drop any type assignments with unsatisfied coerce constraints
>      % if there is at least one type assignment that does satisfy coerce
>      % constraints.
>      %

"that DOES satisfy" or "that CAN satisfy"? Either way, that comment should
explain *why* you answer that question the way you do.

> -check_pending_coerce_constraints(_TypeTable, [], [], !TypeAssign).
> -check_pending_coerce_constraints(TypeTable, [Coercion0 | Coercions0],
> -        KeepCoercions, !TypeAssign) :-
> +check_pending_coerce_constraints_to_fixpoint(TypeTable, Coercions0, Coercions,
> +        !TypeAssign) :-
> +    check_pending_coerce_constraints_loop(TypeTable, Coercions0,
> +        KeepCoercions, DelayedCoercions, !TypeAssign),
> +    ( if
> +        KeepCoercions = [],
> +        list.length(Coercions0) = list.length(DelayedCoercions) : int

I would call length/2 twice and unify the vars containing the lengths;
it is longer, but clearer. Especially if you take the length of Coercions0
*before* the call to the loop.

Alternatively, I would just pass a flag through the loop that says
"did the loop do anything other than delay?". That's probably
even simpler.

> @@ -2605,12 +2651,10 @@ type_assign_has_only_satisfied_coerce_constraints(TypeAssign) :-
>  coerce_constraint_is_satisfied(Coercion) :-
>      Coercion = coerce_constraint(_FromType, _ToType, _Context, Status),
>      require_complete_switch [Status]
> -    (
> -        Status = need_to_check, fail
> -    ;
> -        Status = unsatisfiable, fail
> -    ;
> -        Status = satisfied_but_redundant
> +    ( Status = need_to_check, fail
> +    ; Status = unsatisfiable_invalid, fail
> +    ; Status = unsatisfiable_unresolved, fail
> +    ; Status = satisfied_but_redundant
>      ).

The "fails" are hard to see here.
I would use a switch with two arms, one with "fail" as the action
and one with no action, each listing just the function symbols
with that action.
  
>  %---------------------------------------------------------------------------%
> diff --git a/compiler/typecheck_errors.m b/compiler/typecheck_errors.m
> index 035e3671a..c8234e550 100644
> --- a/compiler/typecheck_errors.m
> +++ b/compiler/typecheck_errors.m

> @@ -280,10 +283,11 @@ report_invalid_coerce_from_to(ClauseContext, Context, TVarSet,
>      % XXX TYPECHECK_ERRORS
>      % This code can generate some less-than-helpful diagnostics.
>      %
> -    % - For tests/invalid/coerce_infer.m and some others, it says that
> +    % - For tests/invalid/coerce_unify_tvars.m and some others, it says that
>      %   you cannot coerce from one anonymous type variable to another.
>      %
>      % Is there something we can report that would be more helpful?
> +    % In most cases, we report that the coerced argument type is unresolved.

Would that make more sense if you switched the order of the last two lines?

> +report_unresolved_coerce_from_to(ClauseContext, Context, TVarSet,
> +        FromType, ToType) = Spec :-
> +    InClauseForPieces = in_clause_for_pieces(ClauseContext),
> +    FromTypeStr = mercury_type_to_string(TVarSet, print_num_only, FromType),
> +    ToTypeStr = mercury_type_to_string(TVarSet, print_num_only, ToType),
> +    ErrorPieces = [words("error: cannot coerce from")] ++
> +        [quote(FromTypeStr), words("to"), quote(ToTypeStr), suffix("."), nl] ++
> +        [words("The type of the coerced argument is")] ++
> +        color_as_incorrect([words("unresolved.")]) ++ [nl],
> +    Spec = spec($pred, severity_error, phase_type_check, Context,
> +        InClauseForPieces ++ ErrorPieces).

I think this error message text very much embodies your view as the
compiler implementor. From a user's point of view, the "cannot coerce" part
is just the consequence; the actual *error* is a type being unresolved.
That being so, I would (a) put that part first in the message, and, more important,
I would (b) name the variable whose type is unresolved (at least if it actually
*has* a name).

Apart from that, the diff seems fine.

Zoltan.


More information about the reviews mailing list