[m-rev.] for review: replace item_blocks with parse trees during pre-HLDS passes

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Mar 13 08:18:00 AEDT 2020



On Fri, 13 Mar 2020 01:38:07 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote: 
> On Tue, 10 Mar 2020, Zoltan Somogyi wrote:
> 
> > For review by Julien, since he kindly volunteered.
> > Thank you in advance. Note that some of the XXXs
> > in the log message are questions for you. I would
> > also appreciate it if, besides reviewing the diff, you
> > also tried it out on the programs you are working on.
> 
> I have done so and have not encountered any problems.  One thing I did
> encounter due to the changes in warnings for unused modules is:
> 
>      search.m:043:   warning: module `fd' is imported in the interface, but it is
>      search.m:043:   not used in the interface.
>      search.m:091: Warning: this `:- import_module' declaration for module `fd' in
>      search.m:091:   the implementation section is redundant, given the
>      search.m:091:   `:- import_module' declaration for the same module in the
>      search.m:091:   interface section.
>      search.m:043:   The previous declaration is here.
> 
> We could argubly omit the second warning in cases like the above.

Agreed, but the problem is that those two warnings are generated by two different
modules, and the second one printed is actually generated first, so we can't use
a simple "have we warned about the interface import being unused" test to
avoid generating the warning about the implementation import being shadowed.
The only way to handle this would be to delay the generation of the warnings about
shadowing until *after* module qualification discovers that an interface import
is not needed, which is not a simple change. I will look into whether it is possible
to do that at reasonable cost after I commit this diff, but my guess is "no",
given that any improvement in usability would be minor.

> > Make the pre-HLDS passes use file-kind-specific parse trees.
> > 
> > Replacing item blocks file-kind-specific kinds of section markers with
> > file-kind-specific parse trees has several benefits.
> > 
> > - It allows us to encode the structural invariants of each kind of file
> >   we read in in the type of its representation. This the detection of
> 
>     ... using the type of its representation ...
> 
>     This *makes* the detection of ...

Done, with slightly different wording in the first case.

> >   any accidental violations of those invariants trivial.
> > 
> > - Since each file-kind-specific parse tree has separate lists for separate
> >   kinds of items, code that wants to operate on one or a few kinds of items
> >   can just operate on those kinds of items, without having to traverse
> >   item blocks containing many other kinds of items as well. The most
> >   important consequence of this is not the improved efficiency, though
> >   that is nice, but the increased clarity of the code.
> > 
> > - The new design is much more flexible. For example, it should be possible
> >   to record that e.g. an interface file we read in as a indirect dependency
> >   (i.e. a file we read not because its module was imported by the module
> >   we are compiling, but because its module was imported by *another* imported
> >   module) should be used *only* for the purpose it was read in for. This should
> >   avoid situations where deleting an import of A from a module, because it
> >   is not needed anymore, leads the compiler to generate an error message
> >   about a missing import of module B. This can happen if (a) module B
> >   always *should* have been imported, since it is used, but (b) module A's
> >   import of module B lead to module B's interface being available *without*
> >   an import of B.
> >
> >   Specifically, this flexility should enable us to establish each module's
> 
> s/flexility/flexibility/

Done.

> >   .int file as the single source of truth about how values of each type
> >   defined in that module should be represented. When compiling each source
> >   file, this approach requires the compiler to read in that module's .int file
> >   but using only the type_repn items from that .int file, and nothing else.
> 
> ...
> 
> > This change is big and therefore hard to review. Therefore in many files,
> > this change adds "XXX CLEANUP" comments to draw attention to places that
> > have issues that should be fixed, but whose fixes should come later, in
> > separate diffs.
> 
> You've got at least one of the comments labelled YYY; is that supposed
> to be an XXX CLEANUP one as well?

No, that was supposed to be "cleanup before sending out for review" :-(,
but at least it has now been cleaned up, before commit.

> >     - The impl (short for implementation) pragmas are named so
> >       precisely because they may appear only in implementation sections.
> >       They give the compiler information information that is private
> 
> Doubled-up "information".
> 
> > compiler/get_dependencies.m:
> >     Add operations to gather implicit references to builtin modules
> >     (which there have to be made available even without an explicit
> 
> 
> Delete "there".

Both done.
 
> > compiler/split_parse_tree_src.m:
> >     Rearchitect how this module separates out nested submodules from within
> >     the main module in a file.
> >
> >     Another of the jobs of this module is to generate error messages for
> >     when module A includes module B twice, with special exception for
> 
>      ... with *a* special exception ...
> 
> >     the case where A's interface contains nested submodule A.B's interface, and
> >     A's implementation contains nested submodule A.B's implementation.
> >     Other than that, has been split_parse_tree_src.m any duplicate inclusions
> >     of submodules, whether via nesting or via include_module declarations.
> 
> What is that last sentence trying to say?

Yep, that paragraph was sent unfinished. Have a look to see whether the updated
version in the attached updated log file is readable.

> > compiler/error_util.m:
> >     Add an id field to all error_specs, which by convention should be
> >     filled in with $pred. This allows a person debugging the compiler
> >     where in the code an undesired error message is coming from
> >     significantly easier than was previously possible.
> 
> There appear to be some words missing from that last sentence.

Fixed.

> > 
> > compiler/comp_unit_interface.m:
> >     Operate on as specific kinds of parse trees as the interface of this
> >     module will allow. (We could operate on more speficic parse trees
> 
> s/speficic/specific/

Ditto (and spellchecked the log).

> > compiler/options.m:
> >     Add an option to control whether errors and/or warnings detecting
> >     when deciding what should go into a .intN file be printed,
> >     thus (potentially) preventing the creation of that file.
> >
> >     XXX Should this option be documented for users?
> 
> Yes.
>
> > NEWS:
> >     Mention that we now generate warnings for unused import_module and
> >     use_module declarations in the interface even if the module has
> >     submodules.
> >
> >     XXX Should we report this as a breaking change?
> 
> Yes.

OK, please have a look at the attached DIFF.NEWS.

> > tests/invalid/foreign_enum_invalid.err_exp:
> >     Expect a different wording for an error message. It is more "standard"
> >     but slightly less informative.
> >     XXX Is this ok?
> 
> Yes, that's fine.  Knowing the type the foreign_enum is for isn't particularly
> relevant for the error in question.

My thoughts exactly.

> > +% just put a breakpoint on do_write_error_spec, wait for it to be invoked
> > +% to print the error_spec whose origin you want to track down, and then
> > +% look its id field. Even if the predicate this identifies has several pieces
> > +% of code that construct specs, the scope in which you have to search for it
> > +% will be easily manageable.
> 
> I think a better alternative here would be to have a developer-only option
> that cause the compiler to print the id at the end of each error message.

That is an excellent idea. I have implemented it, named the option
--print-error-spec-id, and replaced the above with a description
of its use.

I followed all your other suggestions.

> The change is fine otherwise.

Thank you very much.

Zoltan.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Log.pt2
Type: application/octet-stream
Size: 28197 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20200313/03e480d9/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DIFF.NEWS
Type: application/octet-stream
Size: 3086 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20200313/03e480d9/attachment-0003.obj>


More information about the reviews mailing list