[m-users.] Link-time optimization

Massimo Dentico m.dentico at virgilio.it
Sun Jun 14 04:48:14 AEST 2020

On 13/06/2020 18:21, Zoltan Somogyi wrote:
> 2020-06-13 12:24 GMT+10:00 Peter Wang<novalazy at gmail.com>:
>> On Fri, 12 Jun 2020 22:27:28 +0200 Massimo Dentico <m.dentico at virgilio.it> wrote:
>>> b. How much effort would be needed to modify the Mercury compiler
>>>      (specifically the back-end that produces high-level C code)
>>>      to produce a single stand-alone C file? That is, a single source file
>>>      easy to include in other C projects, which has *only* the necessary
>>>      machinery from the run-time library and refers to C standard
>>>      libraries only, with an option to avoid them too.
>> A lot of effort.
> As Peter says, with a lot of effort, you could implement what you propose.
> However, the result will very likely *not* be all that useful. Putting all
> the C code generated by the Mercury compiler into a single .c file
> will generate a huge .c file for any Mercury program of a nontrivial size,
> especially when compiled in a debug grade.

Obviously this single C file would be for distribution only, not
for development. So no debug grade involved (I presume here that the
developers using the library would not be interested in learning

I have read Henderson & Somogyi, "Compiling Mercury to high-level C
code" (2002) and I was under the impression that the ratio of lines of
C code produced for one line of Mercury code was not dramatically high.
Would you have the time to explain why this is not (or is no longer)
the case?

Anyway, I'm not particularly worried by this, because I would like to use
Mercury only for the inference engine part of an application. The rest
will be in another language.

> .......................................... This will mean that you
> wouldn't be able to compile that .c file with any C compiler options
> that call for the use of any algorithm whose complexity is O(n^2) or
> worse in the number of functions in the file. The last time I looked,
> this meant that even -O2 was off the table (though admittedly
> that was more than a decade ago). So the resulting executable
> code may be compact, but would also be quite slow.

The assumption here is that -O3 is always beneficial. I beg to differ:
I have examined the assembly output of GCC (and, to a lesser extent, of
Clang) and I have often found that -Os (which enables all -O2
optimizations that do not typically increase code size) is more than
adequate for a lot of use cases. Of course hot spots, especially in
numerical code, require special care. But I would not use Mercury for
numerical code.

Anyway consider that:

1. header-only libraries [1] and Single Compilation Unit [2] are not
    uncommon nowadays;
2. in my reply to Peter Wang I mentioned CMI, the Cross-Module Inliner
    [3], which was developed in the early 2000s and does more or less
    what I asked for above (it's unmaintained and written in Haskell;
    I'll see what I can do);
3. the SQLite developers distribute an easy-to-use amalgamation file [4]
    that is nearly 230,000 lines long and is 7.73 MiB in size (version

If compilation time were unbearable for SCU, there would have been no
such developments.
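
To make the single-compilation-unit idea concrete, here is a minimal
sketch in C of what an amalgamated file might look like. All the names
(engine_count_matches, match_one, and the "formerly match.c" split) are
hypothetical and chosen only for illustration; this is not output of the
Mercury compiler. The point is that helpers which used to be external
symbols of separate modules can become static, so the C compiler sees
caller and callee in one translation unit and is free to inline across
the former module boundary.

```c
/* engine_amalgamated.c: hypothetical single-file distribution.
 * Two former modules are concatenated into one translation unit. */
#include <assert.h>
#include <stddef.h>

/* Public interface: what a separate engine.h would otherwise declare. */
int engine_count_matches(const int *facts, size_t n, int goal);

/* Formerly match.c (hypothetical). Now static: it is invisible to the
 * including project and is a candidate for inlining. */
static int match_one(int fact, int goal)
{
    return fact == goal;
}

/* Formerly engine.c (hypothetical). Counts the facts equal to goal. */
int engine_count_matches(const int *facts, size_t n, int goal)
{
    int count = 0;
    size_t i;

    /* A null pointer is accepted only together with a zero length. */
    assert(facts != NULL || n == 0);
    for (i = 0; i < n; i++) {
        count += match_one(facts[i], goal);
    }
    return count;
}
```

A client project would add this one file to its build and call
engine_count_matches() directly. Whether the compiler actually inlines
match_one() depends on the optimization level chosen.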


1. https://en.wikipedia.org/wiki/Header-only
2. https://en.wikipedia.org/wiki/Single_Compilation_Unit
3. https://www.cs.utah.edu/flux/knit/cmi.html
4. https://www.sqlite.org/amalgamation.html

Massimo A. Dentico

