[m-rev.] for review: parsing .used files

Julien Fischer jfischer at opturion.com
Mon Apr 19 16:35:43 AEST 2021


Hi Zoltan,

On Sun, 18 Apr 2021, Zoltan Somogyi wrote:

> For review by anyone.

> Separate out the parsing of .used files.
> 
> compiler/recompilation.check.m:
>     The main predicate of this module, should_recompile_3, used to interleave
>     the parsing the contents of the .used file, and checking which modules,

Delete the first "the" on that line.

>     if any, need to recompiled. And the parsing code itself was split up

    ... to be recompiled.

>     among more predicates than was necessary.
>
>     This diff arranges for all the parsing of .used files to happen
>     *before* we try to figure out what needs to recompiled, which allows
>     each predicate to handle one task, not two.
>
>     Make the parsing code both simpler and more efficient by reading in
>     the whole file as a string at the start.
>
>     Add XXXs about simplifications that would require some changes to
>     the file format.
>
>     Make the parsing code indicate errors using error returns,
>     instead of indicating errors by throwing exceptions. (The rest of the code
>     still uses exceptions.)
>
>     Replace an if-then-else chain with a switch, to improve maintainability.
>
>     Consistently use Used, not Usage, as the variable name prefix
>     referring to things related .used files.
> 
> compiler/recompilation.usage.m:
>     s/usage/used/ here as well.

...

> diff --git a/compiler/recompilation.check.m b/compiler/recompilation.check.m
> index b35d12e29..a1c9ca62f 100644
> --- a/compiler/recompilation.check.m
> +++ b/compiler/recompilation.check.m


> @@ -369,22 +326,163 @@ require_recompilation_if_not_up_to_date(RecordedTimestamp, TargetFile,
>          record_recompilation_reason(Reason, !Info)
>      ).
> 
> -:- pred parse_name_and_arity_to_used(term::in, item_name::out) is semidet.
> +%---------------------------------------------------------------------------%
> +
> +:- type used_file
> +    --->    used_file(
> +                % XXX document the meanings of these fields.
> +                module_timestamp,
> +                list(module_name),
> +                resolved_used_items,
> +                list(item_name),
> +                list(recomp_used_module)
> +            ).
> +
> +:- pred read_and_parse_used_file(string::in, string::in, int::in,
> +    line_context::in, line_posn::in, recomp_parse_term(used_file)::out) is det.
> +
> +read_and_parse_used_file(UsedFileName, UsedFileString, MaxOffset,
> +        !.LineContext, !.LinePosn, ParseUsedFile) :-
> +    % XXX
> +    % The contents of the *entire .used file* should be a simple large term
> +    % of a type that is designed to represent its entire contents, since
> +    % that would allow a single call to io.read to read it all in, allowing us
> +    % to dispense with pretty much all of the code handling syntax errors
> +    % in this module. (Since the contents of .used files are machine generated,
> +    % we do not need to strive to generate user-friendly error messages;
> +    % a simple indication of the error's presence and location will do.)
> +    %
> +    % That type, used_file, should depend *only* on public type definitions,
> +    % unlike e.g. the representation of sets, which is hidden behind
> +    % an abstraction barrier. This is because we don't want changes hidden by
> +    % those abstraction barriers to affect the file format.
> +    %
> +    % We could then handle transitions between file format versions either
> +    %
> +    % - by trying the parse the contents of the .used file first as a member
> +    %   of the new type, and if that failed, as a member of the old type, or
> +    %
> +    % - by having both formats represented by two different function symbols
> +    %   in the same type.
> +    %
> +    % Alternatively, the .used file could contain two terms, the version
> +    % number info, and everything else. We could then select the predicate
> +    % we use to read in everything else based on the version number.

Using a separate initial term for the version number would be my preference.

The diff itself looks fine.

Julien.


More information about the reviews mailing list