[m-rev.] for review: add "did you mean" reminders for mistyped predicate names
Julien Fischer
jfischer at opturion.com
Tue Sep 26 01:22:23 AEST 2023
On Mon, 25 Sep 2023, Zoltan Somogyi wrote:
> On 2023-09-25 21:14 +10:00 AEST, "Julien Fischer" <jfischer at opturion.com> wrote:
>>
>> On Mon, 25 Sep 2023, Zoltan Somogyi wrote:
>>
>
>>>> compiler/typecheck_error_undef.m:
>>> If we are reporting an error for a reference an undefined predicate,
>>
>> *to* an undefined predicate
>>
>>> check whether any predicates exist whose names are "close enough"
>>> to the name of the referenced predicate, and if yes, add to the
>>> error message a line containing "(Did you mean x, y or z?)".
>>>
>>> (Doing the same for references to undefined functions is future work.
>>> The two are separate because things that look like function references
>>> can also refer to e.g. data constructors.)
>>
>> There's quite a bit of scope for doing something similar for other undefined
>> entities. (Although, I'm sure you are aware of that.)
>
> Yep, though it is worth doing only for the entities that people generate
> undefined references to reasonably often. Besides predicates and function
> symbols, I think only type names, and maybe module names, may be
> frequent enough. I am pretty sure that inst, mode and event names are not.
> (Field names count as function names, due to field access syntax.)
> Can you think of any others?
Typeclass names.
>>> + ( if
>>> + AddeddumPieces = [],
>>> + KindQualMsgs = []
>>> + then
>>> + KindQualMsgs = [],
>>> + % It seems that regardless of missing qualifications or equal signs,
>>> + % the reference is to the wrong name. See if we can mention some
>>> + % similar names that could be the one they intended.
>>> + get_known_pred_names(PredicateTable, KnownPredNames),
>>> + BaseName = unqualify_name(SymName),
>>> + % Note: name_is_close_enough below depends on all costs here
>>> + % except for case changes being 2u.
>>> + Params = edit_params(2u, 2u, case_sensitive_replacement_cost, 2u),
>>> + string.to_char_list(BaseName, BaseNameChars),
>>> + list.map(string.to_char_list, KnownPredNames, KnownPredNamesChars),
>>> + find_closest_seqs(Params, BaseNameChars, KnownPredNamesChars,
>>> + Cost, HeadBestNameChars, TailBestNamesChars),
>>> + BestNamesChars = [HeadBestNameChars | TailBestNamesChars],
>>> + list.map(string.from_char_list, BestNamesChars, BestNames),
>>> + ( if
>>> + % Don't offer a string as a replacement for itself.
>>> + Cost > 0u,
>>> + % Don't offer a string as a replacement if it is too far
>>> + % from the original, either.
>>> + list.filter(name_is_close_enough(Cost, BaseName),
>>> + BestNames, CloseEnoughBestNames),
>>> + CloseEnoughBestNames = [_ | _]
>>> + then
>>
>> I suggest imposing a limit on the number of close enough names that are
>> printed, around a dozen should suffice; I can't imagine the resulting error
>> messages being particularly meaningful if there are more.
>
> The code at the moment prints all the candidate strings that have
> the same edit distance from the undefined reference. This means that
> *if* there were ever more than 12 of them, we would have no useful
> way to pick the 12 most likely to be the one the programmer wants.
> I would punt this problem to the first time we ever encounter it for real.
> I don't think that will ever happen on non-contrived code.
It seems like the kind of thing you could stumble into with machine
generated Mercury code. (Whether that counts as contrived or not ...)
...
>>> +%---------------------------------------------------------------------------%
>>> diff --git a/tests/invalid/qual_basic_test2.err_exp b/tests/invalid/qual_basic_test2.err_exp
>>> index 65e27c4a1..249d29c01 100644
>>> --- a/tests/invalid/qual_basic_test2.err_exp
>>> +++ b/tests/invalid/qual_basic_test2.err_exp
>>> @@ -1,2 +1,3 @@
>>> qual_basic_test2.m:019: In clause for predicate `main'/2:
>>> qual_basic_test2.m:019: error: undefined predicate `io.io__write_string'/3.
>>> +qual_basic_test2.m:019: (Did you mean `write_string'?)
>>
>> That seems a little confusing; the actual error is that the module io.io
>> doesn't exist.
>
> I agree, but the problem is not in the new code; it is given the sym_name
> as qualified(unqualified(io"), "io__write_string"), i.e. it is given a string with
> a double underscore in the *base* name. Given that __s as module separators
> have gone the way of the dodo in current code, I don't really feel like
> chasing down the reason for that.
I don't think mixing module qualifiers has ever worked. We should
consider dropping support for __ as a module qualifier.
...
> There was a question about edit_distance.m that I forgot to ask in my
> original email. Should we add a version of the find_closest_seq predicate
> that explicitly works on strings? It would do all the required conversions
> between strings and char lists, so users wouldn't have to. We could call it
> find_closest_string.
I don't see a problem with adding.
Julien.
More information about the reviews
mailing list