[m-dev.] question about mode representation

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Mar 30 17:02:31 AEDT 2018

I have been thinking about changing how we attach mode information to goals
as a way of preparing the compiler for constraint based mode analysis.

At the moment, we attach mode information to goals in the form of
instmap_deltas. These are NOT self-contained. For example, if the delta
says that the new inst of X is "bound(f(ground, ground))", you don't know
whether this goal is X's producer or consumer. It is its producer
if X's previous inst was "free", but its consumer if X's previous inst
was "ground". This is why compiler passes that care about such distinctions
have to have the current instmap threaded through them.
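
To make the ambiguity concrete, here is a minimal sketch in Python (not
Mercury, and not compiler code). Insts are modelled as plain strings, and
classify is a hypothetical helper invented for this illustration; the
compiler's real inst representation is far richer.

```python
# Hypothetical model: an inst is a string, and an instmap_delta maps
# variable names to their new insts.
FREE = "free"

def classify(prev_inst, new_inst):
    """Decide whether a goal produces or consumes a variable.

    This needs the inst *before* the goal; the delta entry alone
    cannot answer the question.
    """
    if prev_inst == FREE and new_inst != FREE:
        return "producer"
    return "consumer"

delta = {"X": "bound(f(ground, ground))"}

# The same delta gives two different answers, depending on the
# instmap that was current before the goal.
print(classify("free", delta["X"]))    # producer
print(classify("ground", delta["X"]))  # consumer
```

The second argument never changes between the two calls, which is why
the current instmap has to be threaded through any pass that asks this
question.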

A constraint based mode analysis algorithm would make decisions about
"does this goal produce X, or consume it?" directly, so translating
its decisions back into the form of instmap_deltas would be a giant step
backwards. We need a new type to record, in each goal's goal_info,
information such as

- which variables (or parts of variables) did this goal produce?
- which variables (or parts of variables) had the range of their possible
  values reduced (e.g. from "f or g or h" to just "f or h")?

Like instmap_deltas, it will have a way to record that the program point
after the goal is unreachable. It will also need to have ways to record
which variables become non-unique (when an alias to them is created), with a later
extension to record which variables become unique again (when their last alias
goes out of scope).
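
As a rough illustration of the kind of record described above, here is
a sketch in Python. The class name GoalMode, all of its field names and
the is_producer_of method are assumptions made up for this example, not
a proposed design. For simplicity it tracks whole variables only, not
parts of variables.

```python
from dataclasses import dataclass, field

@dataclass
class GoalMode:
    """Hypothetical self-contained mode record for one goal."""
    # Variables that this goal binds.
    produced: set = field(default_factory=set)
    # Variables whose set of possible functors this goal reduced,
    # mapped to the functors that remain possible afterwards.
    narrowed: dict = field(default_factory=dict)
    # Variables that lose uniqueness because this goal aliases them.
    made_non_unique: set = field(default_factory=set)
    # True if the program point after the goal is unreachable.
    unreachable: bool = False

    def is_producer_of(self, var):
        # Answerable from the record alone, with no instmap in hand.
        return var in self.produced

# A goal that binds X, and narrows Y from "f or g or h" to "f or h".
gm = GoalMode(produced={"X"}, narrowed={"Y": {"f", "h"}})
print(gm.is_producer_of("X"))  # True
print(gm.is_producer_of("Y"))  # False
```

The point of the sketch is only that questions such as "does this goal
produce X?" can be answered from the record by itself.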

I have two questions for everyone.

The first is: what should this new goal_info field, and its type, be called?
The obvious name, mode, is taken (it is a keyword), and we already use
the next most obvious one, mer_mode. I have three names that I think would work,
though I am not particularly enthusiastic about any of them: goal_mode,
prod_con_mode and pc_mode. Unlike "instmap_delta", "goal_mode" puts the
emphasis on the goal, not the insts. The other two are *slightly* misleading:
while producer-consumer information is the most important kind of information
the field/type would contain, it would not be the *only* kind. The same
problem would arise for gen_con_mode (generator-consumer), while its
abbreviation gc_mode would be just too confusing, since gc usually means
something totally different.

Any opinions?

The second question concerns practicalities. Any system like this has to be
implemented gradually, in several steps, probably over months:

- First adding a goal_mode type, hiding its implementation behind
  an abstraction barrier.

- Adding a pass, invoked after mode analysis (and after each mode reanalysis),
  to translate a subset of the information in instmap_deltas to values
  of the new type, and attach this info to each goal.

- Changing some parts of the compiler to use the new field to make decisions
  instead of instmap_deltas.

- Repeating the previous two steps over and over again, expanding the "subset"
  each time.
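
The translation pass in the second step could look roughly like the
following Python sketch. Everything in it is hypothetical: goals are
plain dicts, insts are strings, a variable missing from the instmap is
assumed to be free, and the "subset" of information translated is just
the set of variables each goal produces.

```python
def attach_goal_modes(goals, initial_instmap):
    """Walk a conjunction in order, threading the instmap once, and
    attach a self-contained "goal_mode" entry to each goal."""
    instmap = dict(initial_instmap)
    for goal in goals:
        delta = goal["instmap_delta"]
        # A variable is produced if it was free before the goal
        # and is no longer free after it.
        produced = {
            var for var, new_inst in delta.items()
            if instmap.get(var, "free") == "free" and new_inst != "free"
        }
        goal["goal_mode"] = {"produced": produced}
        # Apply the delta to get the instmap after this goal.
        instmap.update(delta)
    return goals

conj = [
    {"name": "X = f(A, B)",
     "instmap_delta": {"X": "bound(f(ground, ground))"}},
    {"name": "p(X, Y)",
     "instmap_delta": {"X": "bound(f(ground, ground))", "Y": "ground"}},
]
attach_goal_modes(conj, {"A": "ground", "B": "ground"})
for g in conj:
    print(g["name"], sorted(g["goal_mode"]["produced"]))
```

After this one walk, a later pass can read "produced" from each goal
without having the instmap threaded through it.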

For all that time, the compiler will need more memory to represent programs
than it does right now, and initially, the time needed to fill in the
new field won't generate any returns in terms of performance. (There will be
such returns later, when switching to use the new field allows a compiler
pass to avoid having the current instmap threaded through it.)

Are people ok with that?

