[m-dev.] making .trans_opt files in parallel

Peter Wang novalazy at gmail.com
Wed Dec 14 13:00:55 AEDT 2022


On Tue, 13 Dec 2022 17:18:58 +1100 "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> 
> 2022-12-13 16:00 GMT+11:00 "Peter Wang" <novalazy at gmail.com>:
> >> What we want to specify is effectively a dag; if module B is reachable
> >> from module A, then module A may depend on B.trans_opt; otherwise,
> >> it may not. Each no_trans_opt pragma would put only one constraint
> >> on the dag, and there would be no easy way to check that two such
> >> pragmas in different modules basically rule out both A using B.trans_opt
> >> and B using A.trans_opt, which would not be a correctness problem,
> >> but may be a performance problem.
> >> 
> > 
> > Yes, I don't really mind that.
> 
> That is certainly a valid position.
> 
> > (I don't think .trans_opt files provide
> > much value as it is, correct me if I'm wrong.)
> 
> They allow intermodule inlining of small but possibly often-used
> predicates and functions. Whether that is "much value" is in the eye
> of the beholder. I happen to think it is borderline useful. I agree that
> the termination and other analysis results in .trans_opt files were
> much more useful for writing papers than they are in practice :-(

I only see termination and exception info in .trans_opt, those being the
analyses that are enabled for the standard library. Would either affect
intermodule inlining? I haven't checked.

> >> I think it would be better to specify the whole dag all at once,
> >> probably as the contents of a file, whose name can be passed
> >> to mmc in an option. Unfortunately, I don't know of a simple way
> >> to encode dag in a text file in a way that is
> >> 
> >> - unambiguous,
> >> - easy to read and understand,
> >> - but does not reduce all dags to a total order.
> >> 
> > 
> > An obvious place to start is something like the dot language, e.g.
> > 
> >     bar -> foo;
> >     baz -> bar;
> 
> I believe the write_graph predicate in generate_dep_d_files.m
> already writes out the imports graph using that language.

Right, --imports-graph generates dot files, but filter_imports_graph
deliberately omits all standard library modules, for the reason you
discovered.

> I think we are talking at cross purposes because we have two different dags
> in mind. Your GRAPH contains an edge for every single import, use and include
> declaration in the program; that indeed would be a pain to maintain. I am talking
> about a smaller dag derived from something like GRAPH, which just decides
> "is the compilation of A.m allowed to read B.trans_opt?".

Ok. The GRAPH file was just to show what the syntax might look like in
practice.

To clarify, when generating target code, the compiler requires that
.trans_opt files for any directly or indirectly imported modules to have
been created/updated, so there is no need to restrict which .trans_opt
files are allowed to be read.

It is only when making a .trans_opt file, when .trans_opt files for
other modules within the same cycle may not have been created/updated
yet, where we need to limit which .trans_opt files are allowed to be
read.

> > 2. Do we require modules that don't participate in a cycle to be
> > included in the dag? There's no technical reason to do so, so it's
> > just busywork.
> 
> From this point of view, I would divide modules into three categories, not two.
> These would be:
> 
> 1. Modules that neither depend on other modules, nor do any other modules
>     depend on them.
> 
> 2. Modules that either depend on other modules or have other modules
>    depend on them, but are not in a cycle.
> 
> 3. Modules that are part of a cyclic dependency.
> 
> We agree that type 3 modules should be in the dag. I think type 2 modules
> should also be in the dag, because including their position in the dag
> in a file that is available to compiler invocations is a simple and effective way
> to specify which module can read which other module's .trans_opt file.
> 

I don't see any reason to prevent reading a type 2 module's .trans_opt
file.

> > 3. Do we necessarily require a dag? Some smaller cycles may be
> > acceptable, letting the compiler impose its arbitrary ordering
> > as it does now.
> 
> That depends on which "we" you are talking about.
> 
> An invocation of mmc --intermod-opt A.m definitely needs to know

(I think you mean mmc --make-trans-opt A.m )

> whether it is allowed to read B.trans_opt, which means it needs to know
> the part of the dag connecting A and B. So mmc needs access to arbitrary
> pieces of the dag on demand, which means it needs to be able to access
> the whole dag.
>
> This is a different question from whether we require users to fully
> specify a dag. Given a user-specified set of modules which all depend
> on one another, the compiler could impose its own order on them
> without requiring the user to do so. The order could be alphabetical
> (a smaller version of the system we have now), or it could be based on
> the order in which they are written in a file, such as the list of sccs above.
> The latter would have the advantage that it would allow sufficiently
> knowledgeable users to specify the total order derived from the dag.

My question was actually: should the compiler report an error if the
user provides a graph containing cycles (not a dag), or work with the
graph as best it can? No need to answer that, though.


> scc 1: m1
> scc 2: m2 m3
> scc 3: m4
> etc
>
> This implicitly specifies a dag. (The "scc N:" part is just my explanation;
> it wouldn't be part of the file.) Unfortunately, the dag it specifies is not
> suited for making .trans_opt files in parallel, your original goal.

The --generate-module-order option creates the same sort of file.
I think it *would* be suited to making .trans_opt files in parallel
actually, and would be easier to manage. You just need to break up large
SCCs into smaller SCCs, and choose an ordering. Modules higher in the
order can read from .trans_opt files lower in the order. I will look
into it.

Peter


More information about the developers mailing list