[m-rev.] for review: start using --generate-dependencies-ints

Julien Fischer jfischer at opturion.com
Sun Oct 29 13:24:09 AEDT 2023


On Sat, 21 Oct 2023, Zoltan Somogyi wrote:

> On 2023-10-21 15:20 +11:00 AEDT, "Julien Fischer" <jfischer at opturion.com> wrote:
>>> On 2023-10-20 22:27 +11:00 AEDT, "Julien Fischer" <jfischer at opturion.com> wrote:
>>>> mmc --make (which is used to build the C# library) supports a no-op
>>>> depend target, but doesn't recognise depend_ints. So we either need to
>>>> similarly support depend_ints as a no-op target for mmc --make, or
>>>> make the above target in the Mmakefile dependent on the grade and not
>>>> use depend_ints in the non-C grades.
>>>
>>> I think the former is preferable. Can you test whether just extending the filter
>>> on line 108 of make.top_level.m to cover ".depend_ints" would work?
>>
>> I have committed a diff that does this and fixes the problem. There's
>> another issue with this change: it is orders of magnitude slower on
>> Windows.
>>
>> On my Linux machine a top-level mmake depend is now ~17 seconds; on
>> Windows (using MSYS2) it is ~21 minutes.
>
> That is a LOT bigger multiplier than what I was expecting,
> especially given that --generate-dependencies-ints saves
> a LOT of process creation compared to the old approach,
> and I know process creation is much slower on Windows than on linux.

I think we've just found out how much slower ... :-(

>> The primary culprit for this
>> seems to be all the file copying that is going on.
>
> How certain is that assertion?

Very certain. Each call to module_cmds.copy_file/7 from
module_cmds.copy_dot_tmp_to_base_file_create_file/8 is taking on
average 500ms on my Windows machine.  Indeed, using the "slow"
path in copy_file/7, i.e. via do_copy_file/5, is *much* faster
(~7ms a copy) on the same machine.

My guess would be that we actually want to use one of the Windows API's
file copying functions here; I'll give that a go.

> Do you have profiling data?

I did try building a hlc.gc compiler with gprof style profiling enabled.
Getting mercury_compile.exe to link in such a situation is beyond the
amount of time I have available to devote to it.  (It should be
possible, since I can profile C programs on the same machine with
gprof.)

> If not, could you get some without undue effort?

In the end, I just ended up instrumenting the relevant code in module_cmds.m
and writing the times to the progress stream.
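
For illustration only, that kind of instrumentation looks roughly like the
Python sketch below. It is not the code in module_cmds.m; the names
timed_copy and copy_in_process are hypothetical, and the in-process copy
loop only stands in for whichever copy routine is being measured.

```python
import io
import os
import sys
import tempfile
import time


def copy_in_process(src, dst, chunk_size=64 * 1024):
    """Copy src to dst with plain reads and writes (hypothetical stand-in)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)


def timed_copy(copy_func, src, dst, progress_stream=sys.stderr):
    """Run copy_func(src, dst) and report its wall-clock time in ms."""
    start = time.perf_counter()
    copy_func(src, dst)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # One line per copy, so slow calls stand out in the build output.
    progress_stream.write(f"copy {src} -> {dst}: {elapsed_ms:.1f}ms\n")
    return elapsed_ms
```

Wrapping each candidate copy routine in the same timer, and averaging over
many files, is enough to compare two code paths without a working profiler.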

> The code for writing all the .intN files initially writes to a tmp file,
> so it can compare the new .intN file with its old version, and decide
> whether to update the corresponding timestamp file accordingly.
> If there is no old version to compare with, then on linux, we just
> rename the tmp file to the non-tmp file. Does this rename require
> a file copy on windows? If yes, then it would make sense to
> avoid this two-step process by writing to the non-tmp file directly
> if it does not exist initially (which should be a common case
> when making the dependencies). I think it should be possible
> to do this in a way that wouldn't incur extra cost on linux,
> and would avoid the copy on windows. The code required
> would have to be significantly more complex than the current code,
> though.

I wouldn't bother; the excess time is all in file copying.
The above adds nothing measurable to it.
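
For anyone following along, the tmp-file scheme Zoltan describes above
amounts to something like the Python sketch below. This is only an
illustration of the idea, not the compiler's code: the function name and
the ".tmp" suffix are assumptions, and it uses a rename in every branch,
whereas the slow calls measured above are file copies.

```python
import os
import tempfile


def update_file_if_changed(path, new_contents):
    """Write new_contents (bytes) to path via a tmp file.

    Returns "created", "updated" or "unchanged". A caller would only touch
    the corresponding timestamp file in the first two cases.
    """
    tmp_path = path + ".tmp"  # assumed naming, for illustration
    with open(tmp_path, "wb") as f:
        f.write(new_contents)

    if not os.path.exists(path):
        # No old version to compare with: just move the tmp file into place.
        os.replace(tmp_path, path)
        return "created"

    with open(path, "rb") as f:
        old_contents = f.read()
    if old_contents == new_contents:
        # Nothing changed: drop the tmp file, leave the old file untouched.
        os.remove(tmp_path)
        return "unchanged"

    os.replace(tmp_path, path)
    return "updated"
```

The suggested optimisation would skip the tmp file entirely in the
"created" case and write to the final name directly.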

Julien.

