[m-dev.] foldl2 and recursive_fold2 in dir.m have the wrong result type

Julien Fischer jfischer at opturion.com
Sat May 15 15:30:50 AEST 2021


Hi Zoltan,

On Fri, 14 May 2021, Zoltan Somogyi wrote:

> The attached diff makes module_cmds not abort if dir.m has an error on a file
> that is not relevant to the question module_cmds.m is trying to answer. I expected
> this to to reduce the number of test case failures in the java grade, but it does not do so.
> Based on a few spot checks, test cases that used to fail due to unrelated files
> not being found now fail due to the class loader not being able to find
> .class files that (to my naive mind) it *should* be able to find :-(

Hmmm, that's annoying.  I had a quick search about, but I think we will
have to dig into the class loader source code to see exactly what it's
doing.

> Despite that, this diff should be a useful step forward.

Agreed.

> For review by Julien.

> Look for only relevant .class files.

I suggest:

     Look for only relevant .class files when creating Java archives.
> 
> library/dir.m:
>     Add a new predicate general_foldl2. Unlike foldl2 and recursive_foldl2,
>     this predicate can be asked to continue going after finding errors,

Delete "going".

>     in which case it will return a structured description of those errors.
>
>     To make this possible, generalize the infrastructure shared by foldl2,
>     recursive_foldl2 and now general_foldl2.
> 
> NEWS:
>     Announce the new predicate.
> 
> compiler/module_cmds.m:
>     Use the new predicate instead of recursive_foldl2.

Append:

       when creating Java archives.

> Doing so means that
>     when e.g. bootcheck runs several test cases at the same time in the same
>     directory, the traversal in one test case will NOT stop when a different
>     test case deletes one of its .class files between the time the first test
>     case has read a directory, and the time it gets around to processing those
>     entries. Instead, general_foldl2 will simply return an error description
>     in each such case. If an error description is for a file that is not
>     relevant to the current test case, we will now ignore it.
>
>     This should eliminate many spurious test case failures in Java grades
>     that were caused by such interference between test cases.

...

> diff --git a/library/dir.m b/library/dir.m
> index 1fcf5d2e8..f47b67ca5 100644
> --- a/library/dir.m
> +++ b/library/dir.m
> @@ -198,18 +198,19 @@
>      pred(string, string, io.file_type, bool, T, T, io, io).
>  :- inst foldl_pred == (pred(in, in, in, out, in, out, di, uo) is det).
> 
> -    % foldl2(P, DirName, InitialData, Result, !IO).
> +    % foldl2(Pred, DirName, InitialData, Result, !IO):
>      %
> -    % Apply `P' to all files and directories in the given directory.
> +    % Apply `Pred' to all files and directories in the given directory.
>      % Directories are not processed recursively.
> -    % Processing will stop if the boolean (Continue) output of P is bound
> +    % Processing will stop if the boolean (Continue) output of Pred is bound
>      % to `no'.
>      % The order in which the entries are processed is unspecified.
>      %
>  :- pred foldl2(foldl_pred(T)::in(foldl_pred), string::in,
>      T::in, io.maybe_partial_res(T)::out, io::di, io::uo) is det.
> 
> -    % recursive_foldl2(P, DirName, FollowSymLinks, InitialData, Result, !IO).
> +    % recursive_foldl2(Pred, DirName, FollowSymLinks, InitialData, Result,
> +    %   !IO):
>      %
>      % As above, but recursively process subdirectories.
>      % Subdirectories are processed depth-first, processing the directory itself
> @@ -220,6 +221,73 @@
>      string::in, bool::in, T::in, io.maybe_partial_res(T)::out,
>      io::di, io::uo) is det.
> 
> +:- type fold_params
> +    --->    fold_params(
> +                fp_subdirs      :: maybe_subdirs,
> +                fp_on_error     :: on_error
> +            ).
> +
> +:- type maybe_subdirs
> +    --->    do_not_enter_subdirs
> +    ;       enter_subdirs(maybe_follow_symlinks).
> +
> +:- type maybe_follow_symlinks
> +    --->    do_not_follow_symlinks
> +    ;       follow_symlinks.
> +
> +:- type on_error
> +    --->    on_error_stop
> +    ;       on_error_keep_going.
> +
> +:- type file_error
> +    --->    file_error(string, file_operation, io.error).
> +            % file_errorx(PathName, Op, Error) means that

There's a stray 'x' there.

Also, there's no shortage of space so I suggest s/Op/Operation/

> +            % when we tried to perform Op on PathName, the result was Error.
> +            % PathName should specify a file name relative to the

Should?  Why not:

    PathName specifies a file name relative to ....

?

> +            % directory name given to general_foldl2.
> +
> +:- type file_operation
> +    --->    file_open
> +    ;       file_close
> +    ;       file_get_id
> +    ;       file_get_type
> +    ;       file_check_accessibility

To address an XXX below: maybe file_check_accessibility should
encode which access types were tested for?  (i.e. the contents
of the second argument of io.check_file_accessibility.)

> +    ;       file_read_dir_entry.
> +
> +    % general_foldl2(Params, Pred, DirName, Data0, Data, Errors, !IO).
> +    %
> +    % A generalised version of the above, whose behavior is controlled
> +    % by setting up Params.
> +    %
> +    % Whether we recursively process subdirectories depends on whether
> +    % the fp_subdirs field of Params is do_not_enter_subdirs or enter_subdirs.
> +    % If it is do_not_enter_subdirs, then we do not process subdirectories
> +    % at all. If it is enter_subdirs, then we process subdirectories depth
> +    % first, processing each subdirectory before its contents.

    processing each subdirectory in a directory before its other contents.

> +    %
> +    % Whether we recursively process subdirectories referenced by symlinks
> +    % depends on the argument of enter_subdirs.
> +    %
> +    % When we encounter an error, such as a failure to open a directory
> +    % for reading, we record an error, but what happens after that

s/an/that/

> +    % depends on the fp_on_error field of Params. If this field is
> +    % on_error_stop, then we stop the traversal then and there,
> +    % Therefore with that setting, we will return at most one error.
> +    % If it is on_error_keep_going, we continue with the traversal.
> +    % Therefore with that setting, we can return more than one error.

That last bit is missing some punctuation however I suggest the
following

     If the field is on_error_stop, then we stop the traversal and will
     return at most one error.
     If the field is on_error_keep_oing, we continue the traversal and
     may return more than one error.

  +    % Regardless of the setting of fp_on_error, we stop the traversal
> +    % if Pred returns Continue = `no'.
> +    %
> +    % In all of these cases, the value of Data will reflect the

Delete "of these".

> +    % the results of all the invocations of Pred during the traversal
> +    % up to the time the traversal was stopped either by an error,
> +    % by Continue = no, or by running out of files to traverse.

In the previous paragraph no was quoted; here it isn't.

...

> -:- pred check_dir_readable(string::in, io.res::out, io::di, io::uo) is det.
> +:- pred check_dir_readable(string::in, maybe_file_error::out,
> +    io::di, io::uo) is det.
> 
> -check_dir_readable(DirName, Res, !IO) :-
> -    io.file_type(yes, DirName, FileTypeRes, !IO),
> +check_dir_readable(DirName, Result, !IO) :-
> +    io.file_type(yes, DirName, FileTypeResult, !IO),
>      (
> -        FileTypeRes = ok(FileType),
> +        FileTypeResult = ok(FileType),
>          (
>              FileType = directory,
> -            io.check_file_accessibility(DirName, [read, execute], Res, !IO)
> +            io.check_file_accessibility(DirName, [read, execute],
> +                CheckResult, !IO),
> +            (
> +                CheckResult = ok,
> +                Result = mfe_ok(unit)
> +            ;
> +                CheckResult = error(IOError),
> +                Error = file_error(DirName, file_check_accessibility, IOError),
> +                Result = mfe_error(Error)
> +            )
>          ;
>              ( FileType = regular_file
>              ; FileType = symbolic_link
> @@ -1682,47 +1812,54 @@ check_dir_readable(DirName, Res, !IO) :-
>              ; FileType = shared_memory
>              ; FileType = unknown
>              ),
> -            Res = error(make_io_error(
> -                "dir.foldl2: pathname is not a directory"))
> +            % XXX The top level caller may not be dir.foldl2.

That's an existing problem but we should fixe by passing the identity of the
top-level caller down.

> +            % XXX The message is too verbose for use in a full file_error.

I'm not sure what you mean by that one.

> +            IOError = make_io_error("dir.foldl2: pathname is not a directory"),
> +            % XXX Should file_check_accessibility be something else?

See my comment above.

> +            Error = file_error(DirName, file_check_accessibility, IOError),
> +            Result = mfe_error(Error)
>          )
>      ;
> -        FileTypeRes = error(Error),
> -        Res = error(Error)
> +        FileTypeResult = error(IOError),
> +        Result = mfe_error(file_error(DirName, file_get_type, IOError))
>      ).

The change is otherwise fine.
(I would note, in passing, that the Java code in this module could
probably revised in light of Java library additions; I will take a look
at that after this is committed.)

Julien.


More information about the developers mailing list