[m-dev.] two bugs in rotd compilers

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Mar 29 02:00:46 AEDT 2024


A few hours ago, I found out that the trie string switch code
I recently committed generates code does the wrong thing.
I of course tested it before commit, but the testing itself had a bug,
which I have fixed. (The test case that was supposed to test trie string switches
is compiled in a way that is intended to force the use of trie switches,
but due to a cut-and-paste error, the code in switch_gen.m was looking at
the value of the wrong option.) That's why I committed a small diff
disabling trie switches until I fix the issue.

The bug does not seem to be in the new trie switch code itself, but in
the code in c_util.m that it invokes to construct an escaped string from a code unit list.
When that code unit list ends in the middle of a multi-code-unit code point,
that code returned a string constant that was not what I expected.

In preparation for fixing this bug, I started by clarifying the suspect code,
replacing the use of reversed lists with cords. I tested it in asm_fast.gc,
which worked, and in the java grade, which did not, but the error had
nothing to do with my change. The output from the java bootcheck is
attached, and so are the two files involved in the two errors reported
by the Java compiler.

I went to testing to see how long this bug has been there. I found no reports
of java tests, but I did see that the csharp grade last worked in the rotd for march 13,
and stopped working in the rotd for march 14. The only commit between
those two rotds is 5e687dd6afdf554e9216e280e5f6a7e809c71afb.
I think that this diff did not add the bug to the compiler, but only tickled it.
I think it did so by adding a new subtype, whose particulars the compiler
couldn't handle properly.

The issue involves the subtype whole_ptag_info, which is a subtype of
whole_ptagS_info. (The subtype requires the second arg to be the empty list
of ptags, so it corresponds to just ONE ptag.) The definition of the supertype,
in backend_libs__tag_switch_util.java, has four NAMED fields, while the
references to values of the subtype use F1 and F4 to reference fields.
This seems to be because the subtype did not specify the field names.
In fact, as I am writing this, a bootcheck in java grade in which I modified
the subtype to repeat the same field names as in the supertype
has just starting making stage 3, which means it got past the error.
I am commiting this workaround, whose diff is attached.

I don't think this issue has come up, since the subsection in the manual (4.2.4)
says nothing about field names.

I can see three ways forward.

Way one: forbid field names in subtype definitions.
Way two: allow field names in subtype definitions, but ONLY if they
repeat exactly the field names in the supertype (and hence by transitivity,
in the base type).
Way three: allow field names in subtype definitions, even if they differ
from the supertype.

I strongly dislike way three, but I am ok with ways one and two.
However, whatever we pick

- has to be documented in the manual, and
- has to be enforced by the compiler.

I can do both those things, once we have consensus on the
way selection.

All three ways also need the MLDS code generator to be fixed.
I can see two possible approaches for the fix. One is ensure that we use
the base type's field names to get access to the values of the fields
in subtypes. The other is to ensure that the value of every field
can be accessed using the protocol we use for unnamed fields
even if the field has a name. Not knowing Java or C# well,
I don't know which approach would be easier.

Peter, since you did the original subtype implementation,
you know the code involved here the best. Do you have some time
you can spare to work on this fix?

Zoltan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Out.mar28a.gz
Type: application/x-gzip
Size: 19427 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/developers/attachments/20240329/887d6e91/attachment-0003.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ll_backend__tag_switch.java.gz
Type: application/x-gzip
Size: 27091 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/developers/attachments/20240329/887d6e91/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: backend_libs__tag_switch_util.java.gz
Type: application/x-gzip
Size: 25105 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/developers/attachments/20240329/887d6e91/attachment-0005.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DIFF.wti
Type: application/octet-stream
Size: 1369 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/developers/attachments/20240329/887d6e91/attachment-0001.obj>


More information about the developers mailing list