[m-dev.] question about data representation decisions

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Mar 30 17:31:02 AEDT 2018

At the moment, each invocation of the compiler decides
the representation of every du type visible in the current module,
whether or not the type is *defined* in the current module.

This means that if a type foo is visible in five modules,
the compilations of all five modules must *independently* make
the same decision. They obviously use the same algorithm,
but they need the same data as well. This is why we sometimes
bust abstraction boundaries. If a module abstract-exports a type bar
and privately defines bar as an equivalence to float, then any du type
foo it defines that contains a field of type bar will cause this module
to reserve *two* words for that field on 32-bit machines, while
compilations of other modules will reserve only the standard *one* word
for that field, *unless* the equivalence is exported.
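
To make the scenario concrete, here is a minimal Python sketch of how two
compilations can disagree. It is a toy model, not the compiler's actual
code: the names field_words, defining_module_view and importing_module_view
are made up for illustration, and it assumes a 32-bit target on which
a float takes two words and every other field takes one.

```python
# Toy model: number of words reserved for a field on a 32-bit target.
# Assumption: a float needs two words, every other type needs one.

def field_words(field_type, visible_equivs):
    """Expand field_type through the equivalences this compilation can see,
    then return how many words a field of the resulting type would get."""
    seen = set()
    while field_type in visible_equivs and field_type not in seen:
        seen.add(field_type)
        field_type = visible_equivs[field_type]
    return 2 if field_type == "float" else 1

# The defining module sees its own private definition bar == float.
defining_module_view = {"bar": "float"}
# An importing module sees bar only as an abstract type.
importing_module_view = {}

words_in_definer = field_words("bar", defining_module_view)
words_in_importer = field_words("bar", importing_module_view)
print(words_in_definer, words_in_importer)
```

Both calls run the same algorithm, but on different data, so they reach
different answers for the same field of foo.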

I propose that we switch to a different approach. Instead of
recording in interface files the *information needed to reach
the right decisions* about data type representations, we should
record *the decisions themselves*. This would require adding
a new kind of item that can occur only in compiler generated
interface files, but we agreed several months ago that this is OK.

I see two advantages of the approach I am proposing.

First, it has a much simpler correctness argument about
ensuring consistency: the compiler decides the representation
of a du type when it compiles the module that defines that type,
and every compiler invocation on any other module uses the
record of that decision. If the defining module changes,
the decision may change, but then so will the interface file
containing that decision, which will require a recompile
of all other modules dependent on it.
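
The consistency argument can be sketched in Python as follows. This is
only a toy model under stated assumptions: decide_repn, write_interface
and lookup_repn are hypothetical names, the "interface file" is just
a dictionary, and a float is assumed to take two words on a 32-bit
target while every other field takes one.

```python
# Toy model of recording the decision itself in the interface file.
# All names here are hypothetical; this is not the compiler's code.

def decide_repn(type_name, field_types, private_equivs):
    """Run once, by the compilation of the module that defines type_name.
    It may consult equivalences that are private to that module."""
    def words(field_type):
        expanded = private_equivs.get(field_type, field_type)
        return 2 if expanded == "float" else 1
    return {"type": type_name,
            "field_words": [words(t) for t in field_types]}

def write_interface(decisions):
    """The interface carries the decisions, not the private equivalences."""
    return {d["type"]: d["field_words"] for d in decisions}

def lookup_repn(interface, type_name):
    """Every other compilation reads the recorded decision as-is."""
    return interface[type_name]

foo_decision = decide_repn("foo", ["int", "bar"], {"bar": "float"})
interface = write_interface([foo_decision])
importer_view = lookup_repn(interface, "foo")
print(importer_view)
```

In this model an importing compilation cannot disagree with the defining
one, because it never runs the decision procedure at all; it only reads
the result.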

Second, and more subtly, in the scenario above, we *won't*
have to export the bar == float equivalence as a global equivalence;
we would export it only as something that applies to the fields of foo
that are bars. Occurrences of bars in other contexts would not be
affected. This would let us reduce the amount of module-private
information exposed in interface files, which should in turn reduce
the number of unnecessary recompilations. The simpler correctness
argument should make this easier to implement.


