[m-rev.] for review: start using --generate-dependencies-ints

Julien Fischer jfischer at opturion.com
Sat Oct 21 16:25:47 AEDT 2023


On Sat, 21 Oct 2023, Zoltan Somogyi wrote:

>
> On 2023-10-21 15:20 +11:00 AEDT, "Julien Fischer" <jfischer at opturion.com> wrote:
>>> On 2023-10-20 22:27 +11:00 AEDT, "Julien Fischer" <jfischer at opturion.com> wrote:
>>>> mmc --make (which is used to build the C# library) supports a no-op
>>>> depend target, but doesn't recognise depend_ints. So we either need to
>>>> similarly support depend_ints as a no-op target for mmc --make, or
>>>> make the above target in the Mmakefile dependent on the grade and not
>>>> use depend_ints in the non-C grades.
>>>
>>> I think the former is preferable. Can you test whether just extending the filter
>>> on line 108 of make.top_level.m to cover ".depend_ints" would work?
>>
>> I have committed a diff that does this and fixes the problem. There's
>> another issue with this change: it is orders of magnitude slower on
>> Windows.
>>
>> On my Linux machine a top-level mmake depend is now ~17 seconds; on
>> Windows (using MSYS2) it is ~21 minutes.
>
> That is a LOT bigger multiplier than what I was expecting,
> especially given that --generate-dependencies-ints saves
> a LOT of process creation compared to the old approach,
> and I know process creation is much slower on Windows than on linux.
>
>> The primary culprit for this
>> seems to be all the file copying that is going on.
>
> How certain is that assertion?

At the moment it's based on me turning on --verbose and watching the
output of "tail -f mer_std.dep_err".

> Do you have profiling data? If not, could you get some without undue
> effort?

If you mean mprof-style profiling data, no: time profiling is not
supported on Windows (no interval timer). I will instrument the compiler
to record some timings.
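
As a rough illustration of that kind of instrumentation, here is a minimal
sketch in Python rather than Mercury. The names `timed`, `timings` and
`report` are hypothetical and not part of the compiler; the idea is only to
accumulate wall-clock time per labelled step so that the slowest step
stands out.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per label. The labels are hypothetical.
timings = defaultdict(float)


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the with-block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def report():
    """Return (label, seconds) pairs, slowest first."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)
```

Wrapping each candidate step (writing the tmp file, installing it, touching
the timestamp file) in `with timed("..."):` would show whether the install
step really dominates.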

> The code for writing all the .intN files initially writes to a tmp file,
> so it can compare the new .intN file with its old version, and decide
> whether to update the corresponding timestamp file accordingly.
> If there is no old version to compare with, then on linux, we just
> rename the tmp file to the non-tmp file. Does this rename require
> a file copy on windows? If yes, then it would make sense to
> avoid this two step by writing to the non-tmp file directly
> if it does not exist initially (which should be a common case
> when making the dependencies). I think it should be possible
> to do this in a way that wouldn't incur extra cost on linux,
> and would avoid the copy on windows. The code required
> would have to be significantly more complex than the current code,
> though.

File renaming on Windows should just be a call to rename() (actually,
_wrename, but that difference is unimportant here.)  I'll need to
take a look at what exactly the compiler is doing on Windows here, since
based on the output of --verbose, we are apparently copying them:

% Updating interface:
% `array.int3' has been CREATED.
% Invoking system command `cp   array.int3.tmp   array.int3'...
% done.
% Touching `array.date3'...  done.
% Writing output to array2d.int3.tmp... done
% Updating interface:
% `array2d.int3' has been CREATED.
% Invoking system command `cp   array2d.int3.tmp   array2d.int3'...
% done.
% Touching `array2d.date3'...  done.
% Writing output to assoc_list.int3.tmp... done
% Updating interface:
% `assoc_list.int3' has been CREATED.
% Invoking system command `cp   assoc_list.int3.tmp   assoc_list.int3'...
% done.
% Touching `assoc_list.date3'...  done.
% Writing output to backjump.int3.tmp... done
% Updating interface:
% `backjump.int3' has been CREATED.
% Invoking system command `cp   backjump.int3.tmp   backjump.int3'...
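
For reference, the update step being discussed can be sketched as follows.
This is Python rather than Mercury, and it is not the compiler's actual
code: the function name, its arguments, and the "CHANGED"/"UNCHANGED"
labels are assumptions made for illustration (only "CREATED" appears in the
log above). It installs the tmp file with a rename instead of spawning a
`cp` process.

```python
import os
from pathlib import Path


def same_contents(path_a, path_b):
    """Compare two files byte for byte (interface files are small)."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return fa.read() == fb.read()


def update_interface(tmp_path, final_path, date_path):
    """Install tmp_path as final_path and touch date_path if it changed.

    Returns "CREATED", "CHANGED" or "UNCHANGED" (hypothetical labels).
    """
    if not os.path.exists(final_path):
        # No old version to compare with: a rename is enough, no copy.
        os.replace(tmp_path, final_path)
        status = "CREATED"
    elif same_contents(tmp_path, final_path):
        # Nothing changed: discard the tmp file, leave the timestamp alone.
        os.remove(tmp_path)
        return "UNCHANGED"
    else:
        os.replace(tmp_path, final_path)
        status = "CHANGED"
    Path(date_path).touch()
    return status
```

Python's `os.replace` renames the file in place on both Linux and Windows,
so no process is created and the file contents are not copied.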

>> While I suspect
>> the times are, more or less, equivalent to what was required for
>> building all the interface files under the old approach,
>
> The amount of work required should be less than the work
> required under the old approach, since we save having to read
> the .int3 and .int0 files we wrote. The amount of time may be more,
> since --generate-dependencies-ints generates and writes the
> interface files in sequence, while mmake and mmc --make
> can do so in parallel.
>
> On linux, the work saved is so much, and the cost of writing
> all the interface files is so small, that the loss of parallelism
> has no real noticeable effect. On windows, with the base cost
> being enormously higher, ... you get the effect you saw.
>
> One way to fix this would be to keep using --generate-dependencies
> (no -ints at the end) on Windows.

That may end up being the short-term workaround.

>> the main
>> problem is that there is very little visual feedback on what is happening.
>> (I initially thought that the compiler had gone into a loop, until I
>> turned on --verbose.)
>
> Fixing that would be easy. We could
>
> - print a message at the start about "making dependencies and all interface files",
> - print a message "about to make all .intN files" at the start of each phase, and/or
> - print "made/about to make x.intN" for each interface file.
>
> I would be fine with any combination, and we could even pick different combinations
> on different OSs. What do other people prefer?

I think it should just mimic the output of --make when it is creating the
interface files as closely as practicable.
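
To make the three suggested levels of feedback concrete, here is a minimal
sketch in Python. The function name, its parameters and the exact message
wording are assumptions that follow the suggestions quoted above; this is
not the output format of mmc --make.

```python
import sys


def report_progress(modules, suffixes, out=None, per_phase=True, per_file=True):
    """Print progress messages while (notionally) making interface files.

    modules and suffixes are supplied by the caller; per_phase and per_file
    select which of the suggested message levels are printed.
    """
    if out is None:
        out = sys.stdout
    print("Making dependencies and all interface files", file=out)
    for suffix in suffixes:
        if per_phase:
            print(f"About to make all {suffix} files", file=out)
        for module in modules:
            # The real work of writing the file would happen here.
            if per_file:
                print(f"Made {module}{suffix}", file=out)
```

For example, `report_progress(["array", "array2d"], [".int3"])` prints one
start message, one phase message and one line per module.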

Julien.

