[m-users.] Single file deployment (was: Link-time optimization)

Massimo Dentico m.dentico at virgilio.it
Mon Jun 15 08:15:11 AEST 2020


On 13/06/2020 22:00, Zoltan Somogyi wrote:
> 
> No, the ratio is not too high, and hasn't changed all that much since that paper.
> I just checked: the compiler directory in the Mercury system has about
> 500,000 lines of Mercury code (which occupies about 21 megabytes),
> which in the hlc.gc grade is translated into just shy of 2,600,000 lines of
> C code (which occupies about 110 megabytes).

I see now what motivated your objection. Well no, writing (manually)
half a million lines of Mercury code would defeat my purpose.

The objective is to reduce the amount of code one has to write manually.
It is known from the software engineering literature that the number of
defects in a code base is strongly correlated with its size; in fact, a
metric often used is the number of defects per kLOC (thousands of lines
of source code). At the same time, the number of LOC produced§ per day
is roughly independent of the programming language used.

§ Or better, as Edsger W. Dijkstra would say, "spent": «My point today
is that, if we wish to count lines of code, we should not regard them as
"lines produced" but as "lines spent": the current conventional wisdom
is so foolish as to book that count on the wrong side of the ledger.»
 From "On the cruelty of really teaching computing science"
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1036.html

So I do not expect to go beyond about 50,000 LOC of Mercury, an order
of magnitude less than the Mercury compiler code base. Let's adopt the
ratio implied by the figures you mentioned for the Mercury code base
(2,600,000 / 500,000 = 5.2 LOC of C per LOC of Mercury) as a reasonable
estimate. That gives about 260,000 LOC of C, in the range of the SQLite
amalgamation.
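
The estimate above can be reproduced with a few lines of Python. This is
only a sketch of the arithmetic: the function and variable names are
made up for illustration, and the inputs are the approximate figures
quoted in this thread, not new measurements.

```python
# Approximate figures quoted above for the Mercury compiler directory.
COMPILER_MERCURY_LOC = 500_000
COMPILER_C_LOC = 2_600_000  # "just shy of" this in the hlc.gc grade


def c_loc_per_mercury_loc(mercury_loc, c_loc):
    """Lines of generated C per line of Mercury source."""
    return c_loc / mercury_loc


def estimate_c_loc(mercury_loc, ratio):
    """Projected size of the generated C, assuming the ratio carries over."""
    return round(mercury_loc * ratio)


ratio = c_loc_per_mercury_loc(COMPILER_MERCURY_LOC, COMPILER_C_LOC)
estimated = estimate_c_loc(50_000, ratio)  # 50,000 is my expected upper bound

print(f"C LOC per Mercury LOC: {ratio:.1f}")
print(f"Estimated generated C: {estimated:,} LOC")
```

The main assumption is that the expansion ratio of the compiler's own
code is representative of other Mercury programs, which may not hold.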


> ............................................ So the ratio is definitely
> not dramatically high.

This is what impressed me about your paper: high-level features are
translated in a quite straightforward way, yet the result achieves very
competitive performance.


> [...]
> 
>>> .......................................... This will mean that you
>>> wouldn't be able to compile that .c file with any C compiler options
>>> that call for the use of any algorithm whose complexity is O(n^2) or
>>> worse in the number of functions in the file. The last time I looked,
>>> this meant that even -O2 was off the table (though admittedly
>>> that was more than a decade ago). So the resulting executable
>>> code may be compact, but would also be quite slow.
>>
>> The assumption here is that -O3 is always beneficial.
> 
> I was talking about the effect of losing -O2, not -O3. -O2 is useful
> much more often than -O3.

Sorry, I was misled by that "even". However, now I think it is clear
that I was referring to a much smaller C file than you thought, which
would be within the reach of current hardware and compilers.
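
To put a rough number on this, here is a small sketch of how the cost of
a quadratic pass would shrink for the smaller file. It assumes, purely
for illustration, that the number of functions scales proportionally
with LOC and that constant factors are the same; the function name is
hypothetical and nothing here is measured from a real C compiler.

```python
def relative_speedup(large_n, small_n, exponent=2):
    """How many times cheaper a pass costing n**exponent becomes
    when the input shrinks from large_n to small_n."""
    return (large_n / small_n) ** exponent


# Compiler-sized code base vs. the size I have in mind (Mercury LOC).
compiler_loc = 500_000
expected_loc = 50_000

quadratic = relative_speedup(compiler_loc, expected_loc)          # O(n^2) pass
cubic = relative_speedup(compiler_loc, expected_loc, exponent=3)  # O(n^3) pass

print(f"O(n^2) pass: about {quadratic:.0f} times cheaper")
print(f"O(n^3) pass: about {cubic:.0f} times cheaper")
```

So an input ten times smaller makes a quadratic pass about a hundred
times cheaper, under the assumptions stated above.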

--
Massimo A. Dentico


