[m-rev.] for preliminary review: bug 493 & 532 alterative fix

Zoltan Somogyi zoltan.somogyi at runbox.com
Thu May 13 18:07:23 AEST 2021


2021-05-13 17:10 GMT+10:00 "Peter Wang" <novalazy at gmail.com>:
> I think this diff is complete except for updating the (very extensive)
> comments. I will do that if you think this is the right way to go.

I think your fix will work, just as mine does. However, they do generate
different code in MLDS grades, because your fix allows common.m
to replace X = f(...) with X = Y, which affects the ability of the next
invocation of mark_static_terms to mark users of X static.

The immediate question is, which is better, and I don't think that question
can be answered without data. Specifically,

- which fix yields faster code, and
- which fix can make more data static.

The longer term question, which may make the performance question
moot, is whether we should switch from *marking* static terms to
*making* static terms. Both the code in var_locn.m to recognize
static terms on the fly in the LLDS backend, and mark_static_terms.m
for the MLDS backend, predate the ability of the compiler to
record static terms using the mechanisms in const_struct.m,
and refer to them via X = ground_term_const(N, ...), since I added
that capability in 2012 as part of a series of changes to make
the compiler able to handle very large programs.

We could do this by

- First changing the code of mark_static_terms.m, so that when it finds
  a unification X = f(Arg1, ..., ArgM) where the rhs is static, it does not just set
  its how_to_construct field, but replaces it with X = ground_term_const(N),
  after putting f(...) into the const struct db (if it isn't already there).
  To make this possible, it should carry along a map from vars V
  to constants C, to which it adds when it finds V = C. One of these
  Vs would be added to the const struct db only when used as an arg
  in a constant term.

- Second, renaming the module to reflect its updated function.

- Third, invoking it regardless of the backend being targeted.
  This should allow us to rip out from the LLDS backend the code
  now there for discovering static cells.

Exactly *when* this pass should be done is not obvious.
Doing it at the start of the middle end has the potential
to make the compiler faster by reducing the size of the procedure
bodies that every later pass will traverse, but doing it late would
allow it to catch any static data *created* by optimization passes.

Since loop invariants wants to know what data is static, doing it
earlier seems unavoidable. We could then repeat it just before
code generation, either as a matter of course, or if some compiler
pass that may have generated static data has in the meantime set a flag
requesting it. We could even make this functionality just one of the
tasks of simplification, which it can execute when requested.

This approach solves both 493 and 532, by not leaving a time gap
between mark_static_terms making a decision and the MLDS code
generator acts on it, a gap in which the decision's record could be
screwed up by common.m. It also simplifies common.m's job,
since it would be obvious that X = ground_term_const(...) is NOT
worth replacing with X = Y, even if Y is the same constant.
(Because X = Y would never be faster than X = ground_term_const(...),
but could be slower if it requires Y to be stored across a call or
a resume point setup.)

The downside is that we would need to teach constant propagation,
and maybe come other compiler components such as format_call.m,
to process not just terms created by construction unifications,
but also terms in the const struct db. (format_call.m needs to know
e.g. whether X ++ Y, where X and Y are possibly static strings,
when looked at as a format string, matches a given list of poly_types.)

Opinions?

Zoltan.


More information about the reviews mailing list