[m-rev.] for review: parsing .used files
Julien Fischer
jfischer at opturion.com
Tue Apr 20 17:02:01 AEST 2021
Hi Zoltan,
On Mon, 19 Apr 2021, Zoltan Somogyi wrote:
> 2021-04-19 16:35 GMT+10:00 "Julien Fischer" <jfischer at opturion.com>:
>>> + % Alternatively, the .used file could contain two terms, the version
>>> + % number info, and everything else. We could then select the predicate
>>> + % we use to read in everything else based on the version number.
>>
>> Using a separate initial term for the version number would be my preference.
>
> OK. This happens to be relevant right now, because since I sent that diff,
> I have started work on changing how recompilation.usage.m works.
> It has traditionally also interleaved (a) figuring out what information
> should go into the .used file, and (b) actually writing it out. My current
> incomplete diff changes this so that it first constructs a single term
> that represents the information to be written out, and then just writes it out.
> This incomplete diff now bootchecks with one obvious exception:
> since the code writing out .used files has switched to a new format
> but the code reading them in hasn't, all the recompilation tests fail.
>
> The type that defines my proposed new file format is attached.
> I think it is a reasonable compromise between our needs
> as compiler writers for expressive function symbol names,
> without making those names too long when we have to look at
> .used files themselves. (Neither of which users have to do.)
> I look forward to your feedback on this design.
I think that's fine. I would reorder the definitions so that
the type used_classes/0 occurs immediately after used_file_items/0.
I would also suggest that the new type be defined in its own module
rather than in recompilation.usage.
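For concreteness, the following is a rough sketch of the shape I have in
mind for that module and for the separate initial version term; the module,
type, field and predicate names are only placeholders, not a proposal for
the actual names:

:- module used_file.            % Placeholder name.
:- interface.

:- import_module io.

    % The initial term in a .used file: just the file format version,
    % on its own, so that the predicate used to read the rest of the
    % file can be selected based on it.
    %
:- type used_file_version
    --->    used_file_version(
                ufv_major   :: int,
                ufv_minor   :: int
            ).

:- pred read_used_file_version(io.read_result(used_file_version)::out,
    io::di, io::uo) is det.

:- implementation.

read_used_file_version(Result, !IO) :-
    % io.read returns error/2 for a term that is malformed or that does
    % not match the expected type; the caller can treat that, or a version
    % number it does not recognise, the same way as any other syntax
    % error, i.e. by recompiling everything.
    io.read(Result, !IO).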
> A couple of other questions about next steps.
Taking a step back, there are a couple of issues with smart
recompilation. The first is that it does not work with mmc --make.
As a consequence I suspect few, if any, users actually use it.
(At least, I think it doesn't work with mmc --make, and the compiler
certainly generates warnings saying that is the case ...)
Second, it does not work with --intermodule-optimization.
(Again, the compiler generates warnings to this effect.)
The compiler's usage message (and XXXs in the code itself) suggests that
it could be supported, although I wonder whether that's useful.
> First, my diff cannot be committed without updating the code
> that reads in and uses the contents of .used files. That second
> part means updating the code in recompilation.check.m to work
> based on the terms of the type in the attachment. Supporting
> backward compatibility would require adding code to transform
> the existing data structure to this one, which is a nontrivial amount
> of work. Is there any point in doing this work?
No.
> I don't think so, since the recompilation package is set up so that
> any syntax error in a .used file is handled by simply rebuilding
> everything, and having the .used file contain a new format is a
> guaranteed syntax error (since the version number will change, even if
> nothing else does). But I am willing to be persuaded that backward
> compatibility is needed.
>
> The second question is about timestamps. We currently represent them
> as strings that have to fit a strict format: yyyy-mm-dd hh:mm:ss.
> I was thinking about a du representation, with six integer fields,
> most of which can be uint8. I see no downside to this change;
> does anyone else?
I don't see the need for it; see below.
> Another aspect of this is that a du type has room for an OS native
> representation of time. For unix, that would be seconds since 1970 jan 1;
> I don't know what Windows and other OSs use.
Seconds are too coarse-grained a resolution. All the platforms we
support can do better. (Some of them make it a little more difficult
than others, but the support is there.)
> For .used files, we could
> write out both the OS native representation, which would mean nothing
> to humans, and the yyyy-mm-dd hh:mm:ss that they can read, and
> then pay attention to only the native representation after reading it
> back in, unless that reading takes place on a different OS, in which case
> we would compute the native representation from the human-readable
> one just the way we do now.
I think we only need the native representation; the only people who
are going to read .used files are developers, and converting an epoch
timestamp to a readable time isn't that difficult.
> And yet another aspect that we should think about is the inclusion
> of sub-second-resolution time information. Some OSs now support
> nanosecond resolution in e.g. file modification times, though of course
> not all of that resolution is useful yet.
I suggest using a representation of timestamps based on Java's Instant
type (java.time.Instant). That should cover all of the points above,
except readability.
In Mercury, that would be something like the following:
:- type timestamp
    --->    timestamp(
                seconds     :: int64,   % Seconds from the epoch.
                nanos       :: uint32   % [0, 999,999,999]
            ).
where nanos is the number of nanoseconds further along the time line
from the seconds field.
This will work with clocks whose resolution ranges from one second down
to one nanosecond (i.e. it will be portable).
(I think ISO 8601 has a readable representation of this format, e.g.
2021-04-20T11:07:22.956087, but for this use case you could probably
just write the raw components of the timestamp and be done with it.)
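By "write the raw components" I mean nothing fancier than writing the
timestamp term itself in standard term syntax, roughly like this (the
predicate name is made up, and it assumes the io module is imported):

:- pred write_timestamp(timestamp::in, io::di, io::uo) is det.

write_timestamp(Timestamp, !IO) :-
    % io.write emits standard Mercury term syntax, so the same term can
    % be read straight back in with io.read.
    io.write(Timestamp, !IO),
    io.write_string(".\n", !IO).

As far as I can tell, the structural ordering on timestamp/2 (seconds
first, then nanos) also coincides with chronological order, so the usual
compare/3 is all the "is this file newer" checks would need.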
Julien.