[m-users.] Link-time optimization

Zoltan Somogyi zoltan.somogyi at runbox.com
Sun Jun 14 06:00:53 AEST 2020


2020-06-14 04:48 GMT+10:00 Massimo Dentico<m.dentico at virgilio.it>:
>> Putting all
>> the C code generated by the Mercury compiler into a single .c file
>> will generate a huge .c file for any Mercury program of a nontrivial size,
>> especially when compiled in a debug grade.
> 
> Obviously this single C file would be for distribution only, not
> for development.

That would help, since a long compile time wouldn't be as much of a problem.

> So no debug grade involved (I presume here that the
> developers using the library would not be interested in learning
> Mercury.)

And this would help too, since debug grades generate much bigger .c files
than non-debug grades.

> I have read Henderson & Somogyi, "Compiling Mercury to high-level C
> code" (2002) and I was under the impression that the ratio of lines of
> C code produced for one line of Mercury code was not dramatically high.
> Do you care/have time to explain why this is not (or is no more)
> the case?

No, the ratio is not too high, and hasn't changed all that much since that paper.
I just checked: the compiler directory in the Mercury system has about
500,000 lines of Mercury code (which occupies about 21 megabytes),
which in the hlc.gc grade is translated into just shy of 2,600,000 lines of
C code (which occupies about 110 megabytes). So the ratio is definitely
not dramatically high. I was just drawing attention to the point that C compilers
are tuned for acceptable compilation times on hand-written C code, and a
single C source file containing over two million lines of code would be
much, much bigger than anything they are usually asked to compile.
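
For what it's worth, those figures put the expansion ratio at roughly five
lines of C per line of Mercury. A throwaway C snippet doing the arithmetic
(the constants are just the numbers quoted above, not fresh measurements):

    /* Throwaway sketch: the expansion ratio implied by the figures above. */
    #include <stdio.h>

    int main(void)
    {
        const double mercury_lines = 500000.0;  /* Mercury source, compiler dir */
        const double c_lines = 2600000.0;       /* generated C, hlc.gc grade */

        printf("lines of C per line of Mercury: %.1f\n",
            c_lines / mercury_lines);           /* prints 5.2 */
        return 0;
    }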

As a compiler writer myself, I know that I prefer to avoid quadratic (or worse)
algorithms when N is unbounded, but I don't mind using them when
N tends to be naturally limited by the way people program. Since good
programming practice frowns on putting such huge amounts of code into a single
source file, I don't expect C compiler writers to go out of their way to avoid
algorithms that are quadratic in the number of functions in a file.
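
To make the kind of algorithm I have in mind concrete, here is a sketch
(purely illustrative, not code from gcc or from any other real compiler)
of a per-file pass whose cost is quadratic in the number of functions in
the translation unit:

    #include <stddef.h>

    struct func_info {
        const char *name;   /* placeholder for whatever the pass records */
    };

    static void analyse_pair(const struct func_info *a,
        const struct func_info *b)
    {
        (void) a;
        (void) b;           /* stand-in for the real inter-procedural work */
    }

    /* O(n^2) in n, the number of functions in the file. */
    void analyse_translation_unit(const struct func_info *funcs, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            for (size_t j = i + 1; j < n; j++) {
                analyse_pair(&funcs[i], &funcs[j]);
            }
        }
    }

With the few hundred functions you find in a hand-written file, the pairwise
loop costs nothing; with the tens of thousands of functions that an
amalgamation of the Mercury compiler's output would put into one file, it
would dominate the compile time.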

> Anyway I'm not particularly worried by this because I would like to use
> Mercury only for the inference engine part of an application. The rest
> will be in another language.

This will help by keeping down N as well, so (depending on how big
that inference part is) my warning may not apply to your use case.

>> .......................................... This will mean that you
>> wouldn't be able to compile that .c file with any C compiler options
>> that call for the use of any algorithm whose complexity is O(n^2) or
>> worse in the number of functions in the file. The last time I looked,
>> this meant that even -O2 was off the table (though admittedly
>> that was more than a decade ago). So the resulting executable
>> code may be compact, but would also be quite slow.
> 
> The assumption here is that -O3 is always beneficial.

I was talking about the effect of losing -O2, not -O3. -O2 is useful
much more often than -O3.

> 3. the SQLite developers distribute an easy to use amalgamation file [4]
>    that is nearly 230,000 lines long and is 7.73 MiBs in size (version
>    3.32.2).

I did not know that, but note that this is still only about one tenth
the size of the .c files of the Mercury compiler. Automatic tools such as
the Mercury compiler can generate much more output (in this case C code)
than programmers can write by hand.

Zoltan.

