[m-rev.] for tryout: type_spec_constrained_preds

Zoltan Somogyi zoltan.somogyi at runbox.com
Thu Feb 1 00:41:51 AEDT 2024


On 2024-01-31 20:08 +11:00 AEDT, "Julien Fischer" <jfischer at opturion.com> wrote:
> Compiling with the pragma above
> results in the following error:
> 
>     Making Mercury/asm_fast.gc/x86_64-pc-linux-gnu/Mercury/opts/csv.typed_reader.opt
>     In `:- pragma type_spec' declaration for function `csv.make_error_message'/1:
>       error: variables `V_1, V_2' do not occur in the `:- func' declaration.
> 
> Changing apply_to_superclasses to do_not_apply_to_superclasses, gives
> the following error:
> 
>     In `:- pragma type_spec' declaration for function
>       `csv.record_parser.get_client_quotation_mark_in_unquoted_field'/1:
>        error: variable `V_3' does not occur in the `:- func' declaration.

Thanks for the tryout; I will look into these.

Would you mind if I copied some parts of your csv and/or json libraries
to the test case for this diff? I am aware that the existing test case only
scratches the surface, and needs to be significantly expanded.

> That said, I'm not convinced that most of them are actually called,
> since there are a number of class method calls remaining in the HLDS
> dumps. 

That may or may not be a problem. The higher order specialization pass,
which should replace non-type-specialized calls with type-specialized calls,
does not itself delete any predicates that have become dead as a result,
and until dead pred/proc elimination is called, those will remain in HLDS dumps.
Also, exported unspecialized versions cannot be deleted.

> (In the case of read_from_file/6, the call to the get/4
> method is still present and that would presumably call the unspecialised
> versions.)

The pred id on the call should tell you which version is being called.
(I don't think the code that redirects the call to the specialized version
changes the *name* of the callee in the plain_call goal_expr, since the
name is not used by any of the later passes, but it has to change the pred_id.)

To make HLDS dumps easier to read, I will look into making this code
update the name field as well, as a separate change.

>> I can see four possible approaches to solving this problem.
>>
>> Approach one would simply disallow the occurrence of such variables
>> on the LHSs of the third arg, and generate an error message if they
>> nevertheless occur.
> 
> Making such variables an error is my preference.
> 
>> Approach two would allow such variables, but ignore them.
> 
> No!

I brought that up only as a device to introduce the following approaches.

>> I would prefer approach four, though I could also live with approach one.
>> I view approach two as violating the law of least astonishment. As for
>> approach three, it could work, but any documentation of its semantics
>> would be significantly more complicated than a documentation of the
>> semantics of approach four, and its use in practice would require
>> more attention as well. With approach four, the compiler can help
>> diagnose e.g. unintentional spelling mismatches between the first arg
>> and the later args; with approach three, it cannot.
> 
> I am struggling to see the use case for approaches (3) and (4).
> Is it that they also allow you to specialise unconstrained type
> variables?

Yes. The problem I am trying to address is this: if a predicate
has three type vars in its signature, say A, B, and C, then having three
type_spec pragmas, each specializing exactly one of these type vars,
will generate three specialized versions of the predicate, but there
will be no version that specializes any two or all three type vars.
In this sense, type_spec pragmas don't *compose*.

The motivation for the "ignore if not applicable" rule is this:

- Say the constraints in a type_spec_constrained_preds pragma's
  first arg may include type var A, but not B or C.

- Some predicate we want to specialize includes all of A, B and C
  in its type signature, with A occurring in a position that matches
  the type_spec_constrained_preds pragma's constraints.
  In this case, we want B and C to be specialized alongside A.

- Some other predicates are like that one, but they have only
  one of B and C. We want those to be specialized on either
  both A and B, or both A and C, depending on which they have,
  but we don't want an error message about the other one
  not being there.
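
The "ignore if not applicable" rule above can be modelled in a few lines.
This is a minimal Python sketch, not compiler code: the function name,
the dictionary representation of a substitution, and the example types
are all assumptions made for illustration.

```python
def applicable_substitution(pragma_subst, pred_type_vars):
    """Keep only the substitutions whose LHS type var occurs in the
    predicate's signature; silently skip the rest (no error)."""
    return {var: ty for var, ty in pragma_subst.items()
            if var in pred_type_vars}


# Hypothetical substitution requested by one pragma: A is the type var
# covered by the constraints, B and C are the additional ones.
pragma_subst = {"A": "int", "B": "string", "C": "float"}

# Three hypothetical predicates with different sets of type vars.
all_three = applicable_substitution(pragma_subst, {"A", "B", "C"})
only_a_b = applicable_substitution(pragma_subst, {"A", "B"})
only_a_c = applicable_substitution(pragma_subst, {"A", "C"})

print(all_three)
print(only_a_b)
print(only_a_c)
```

Each predicate ends up with a single specialized version covering every
listed type var that it actually has, which is the composition that
separate single-variable type_spec pragmas do not provide.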

Another possible solution, which would go with approach one
above but which would be more complicated to implement,
would be to redefine type_spec pragmas to MAKE them compose.
We have never needed this before, because *if you know which pred
you want to type-specialize*, then you know what type vars occur in it,
and it is simpler to list them all in a single type_spec pragma than
to write e.g. one type_spec pragma for each type variable.
The issue arises now because type_spec_constrained_preds
can apply to many different predicates, which may have different
sets of type vars in their signatures.

Composition is also complicated by the question of what we should do
when two type_spec pragmas specify that the same type var should be
specialized when it has two distinct but unifiable shapes. For example,
if one says "specialize if A = map(int, _)", and another says "specialize
if A = map(_, string)". Do you generate just those two versions, or
do you also specialize separately for A = map(int, string)?
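
To make the overlap concrete, here is a minimal Python sketch that
computes the common instance of two such shapes. It is an illustration
only and says nothing about what the compiler does or should do. The
tuple encoding of types, the ANY marker standing for `_`, and the
function name are assumptions; it handles only anonymous type variables,
so no variable bindings need to be tracked.

```python
ANY = None  # stands for an anonymous type variable `_`


def unify(t1, t2):
    """Return the most general common instance of two type shapes.

    Types are either ANY, a string naming an atomic type, or a tuple
    (constructor, arg1, arg2, ...). Raises ValueError if the shapes
    have no common instance.
    """
    if t1 is ANY:
        return t2
    if t2 is ANY:
        return t1
    if isinstance(t1, tuple) and isinstance(t2, tuple):
        if t1[0] != t2[0] or len(t1) != len(t2):
            raise ValueError("constructor or arity mismatch")
        args = tuple(unify(a, b) for a, b in zip(t1[1:], t2[1:]))
        return (t1[0],) + args
    if t1 == t2:
        return t1
    raise ValueError("shapes are not unifiable")


shape1 = ("map", "int", ANY)      # A = map(int, _)
shape2 = ("map", ANY, "string")   # A = map(_, string)
overlap = unify(shape1, shape2)   # the candidate third version
print(overlap)
```

The overlap here is map(int, string). A call with that type matches both
pragmas, so composition would have to pick one of the two versions or
generate a third.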

>> I believe that
>>
>> - we should report duplicates when both type_spec pragmas are user-provided,
>> - but not if one or both pragmas are compiler generated.
> 
> That seems sensible.

I did not anticipate objections to that one :-)

>> - The second args of two type_spec pragmas differ if and only if they differ
>>  after
>>
>>  - sorting the type var substitutions of which they consist
>>    on the name of the LHS variable, and
>>
>>  - replacing all type variables that occur in the types on the RHSs
>>    of those type var substitutions with distinct anonymous type vars.
>>    (In fact, we should consider whether we should allow ANY non-anonymous
>>    type variables to occur in the RHS types.)
> 
> Looking at what the standard library and compiler do, the RHS are nearly
> always ground types.

Which is why I thought this proposal was worth looking into.

> (The one exception is var(_) in the standard
> library.)  In practice, we do use non-anonymous type variables in the RHS types.

Do you know of any such use where the named type var occurs more than once?
If it doesn't, then the type var may as well be anonymous, and the compiler
could, and should, accept it on that basis.
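
The comparison rule quoted above (sort on the LHS variable name, then
anonymize the type variables on the RHSs) can be sketched as follows.
This is a Python model under stated assumptions, not the compiler's code:
a substitution is a list of (LHS name, RHS type) pairs, RHS types are
strings or tuples (constructor, arg1, ...), and a string counts as a type
variable if it starts with an uppercase letter or an underscore.

```python
def is_type_var(name):
    """Mercury variables start with an uppercase letter or underscore."""
    return name[:1].isupper() or name.startswith("_")


def anonymize(ty):
    """Replace every type variable in an RHS type with the marker "_"."""
    if isinstance(ty, tuple):
        return (ty[0],) + tuple(anonymize(arg) for arg in ty[1:])
    return "_" if is_type_var(ty) else ty


def normalize(subst):
    """Sort on the LHS variable name, then anonymize each RHS."""
    pairs = sorted(subst, key=lambda pair: pair[0])
    return tuple((lhs, anonymize(rhs)) for lhs, rhs in pairs)


def is_duplicate(subst1, subst2):
    """Two second args count as the same iff their normal forms match."""
    return normalize(subst1) == normalize(subst2)


first = [("V", ("list", "T")), ("K", "int")]
second = [("K", "int"), ("V", ("list", "_"))]
third = [("K", "string"), ("V", ("list", "_"))]

print(is_duplicate(first, second))
print(is_duplicate(first, third))
```

Under this model a RHS such as map(T, T) normalizes to the same form as
map(_, _), even though the two denote different sets of types. That is
why it matters whether any existing pragma uses a named type var more
than once.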

> The above seems reasonable.

Thank you.

Zoltan.

