[m-rev.] for review: proper mmc --make speedup

Julien Fischer juliensf at csse.unimelb.edu.au
Mon Jul 7 00:12:32 AEST 2008


On Fri, 27 Jun 2008, Peter Wang wrote:

> Estimated hours taken: 12
> Branches: main
>
> Speed up dependency computation in `mmc --make'.  It was spending the majority
> of its time taking unions of sets.  Therefore change to a set representation
> which is fast for unions, sparse_bitset.  To make use of bitsets, maintain a
> mapping between module_names and integers, and dependency_files and integers.
> Use version_hash_tables and version_arrays to hold the forward and reverse
> mappings.

> (We don't really need version types.)

Why are you using them then?  Presumably to avoid having to muck about
with the insts on real hash tables and arrays?

> Also cache the results of `non_intermod_direct_imports' and `get_file_name'.
>
> In one workspace, the time for `mmc --make' to check that all the .c files in
> the `compiler' directory were up-to-date dropped from ~120 seconds to ~20
> seconds after these changes.

Great!

> compiler/make.m:
> 	Add fields to the `make_info' structure to hold the module_name- and
> 	dependency_file-to-index mappings.
>
> 	Change the dependency_status map from a tree234 map to a
> 	version_hash_table.
>
> 	Add fields to cache results for `get_file_name' and
> 	`non_intermod_direct_imports'.
>
> compiler/make.dependencies.m:
> 	Add the equivalence type `deps_set(T)' to make it easier to try
> 	different set representations in the future.  Change the set
> 	representation as above.
>
> 	Add procedures to convert between indices and the real values, and
> 	index sets and the real sets.
>
> 	Conform to the data representation changes.
>
> 	Add caching to `non_intermod_direct_imports'.
>
> 	Replace two generate-and-test sites with deterministic code.
>
> compiler/make.util.m:
> 	Add hash functions for `module_name' and `dependency_file' values.
>
> 	Add caching to `get_file_name' (when Search = yes).
>
> 	Delete an unused predicate.
>
> compiler/make.module_target.m:
> compiler/make.program_target.m:
> 	Conform to changes.
>
> 	Minor cleanups.
>
> Index: compiler/make.dependencies.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/make.dependencies.m,v
> retrieving revision 1.47
> diff -u -p -r1.47 make.dependencies.m
> --- compiler/make.dependencies.m	27 Mar 2008 02:29:41 -0000	1.47
> +++ compiler/make.dependencies.m	27 Jun 2008 02:00:54 -0000
> @@ -17,18 +17,43 @@
> :- module make.dependencies.
> :- interface.
>
> +:- import_module enum.
> +
> %-----------------------------------------------------------------------------%
>
> +    % Dependency computation does a lot of unions so we use a set
> +    % representation suited to that purpose, namely bitsets.  We can't store
> +    % module_names and dependency_files in those sets, so we keep a mapping
> +    % between module_name <-> module_index and dependency_file <->
> +    % dependency_file_index in the make_info structure and work with sets of
> +    % indices instead.
> +    %
> +    % sparse_bitset is faster than tree_bitset by my tests.
> +    %
> +:- type deps_set(T) == sparse_bitset(T).
> +% :- type deps_set(T) == tree_bitset(T).
> +
> +:- type module_index.
> +:- instance enum(module_index).
> +
> +:- type dependency_file_index.
> +:- instance enum(dependency_file_index).
> +
>     % find_module_deps(ModuleName, Succeeded, Deps, !Info, !IO).
>     %
>     % The reason we don't return maybe(Deps) is that with `--keep-going'
>     % we want to do as much work as possible.
>     %
> :- type find_module_deps(T) ==
> -    pred(module_name, bool, set(T), make_info, make_info, io, io).
> +    pred(module_index, bool, deps_set(T), make_info, make_info, io, io).
> :- inst find_module_deps ==
>     (pred(in, out, out, in, out, di, uo) is det).
>
> +:- type find_module_deps_plain_set(T) ==
> +    pred(module_index, bool, set(T), make_info, make_info, io, io).
> +:- inst find_module_deps_plain_set ==
> +    (pred(in, out, out, in, out, di, uo) is det).
> +
> :- type dependency_file
>     --->    dep_target(target_file)
>                         % A target which could be made.
> @@ -41,7 +66,7 @@
>     % a target type given a module name.
>     %
> :- func target_dependencies(globals::in, module_target_type::in) =
> -    (find_module_deps(dependency_file)::out(find_module_deps)) is det.
> +    (find_module_deps(dependency_file_index)::out(find_module_deps)) is det.
>
>     % Union the output set of dependencies for a given module
>     % with the accumulated set. This is used with
> @@ -49,9 +74,34 @@
>     % module_names to find all target files for those modules.
>     %
> :- pred union_deps(find_module_deps(T)::in(find_module_deps),
> -    module_name::in, bool::out, set(T)::in, set(T)::out,
> +    module_index::in, bool::out, deps_set(T)::in, deps_set(T)::out,
>     make_info::in, make_info::out, io::di, io::uo) is det.
>
> +:- pred union_deps_plain_set(
> +    find_module_deps_plain_set(T)::in(find_module_deps_plain_set),
> +    module_index::in, bool::out, set(T)::in, set(T)::out,
> +    make_info::in, make_info::out, io::di, io::uo) is det.
> +
> +:- pred deps_set_foldl3_maybe_stop_at_error(bool::in,
> +    foldl3_pred_with_status(T, Acc, Info, IO)::in(foldl3_pred_with_status),
> +    deps_set(T)::in, bool::out, Acc::in, Acc::out, Info::in, Info::out,
> +    IO::di, IO::uo) is det <= enum(T).

Why are the last two arguments polymorphic there?


...


> +%-----------------------------------------------------------------------------%
> +
> +:- type deps_result(T) == pair(bool, deps_set(T)).

I would make that into a proper d.u. type.

 	:- type deps_result(T) ---> deps_result(bool, deps_set(T)).

> +:- type module_deps_result == deps_result(module_index).
> +
> +union_deps(FindDeps, ModuleIndex, Success, Deps0, Deps, !Info, !IO) :-
> +    FindDeps(ModuleIndex, Success, Deps1, !Info, !IO),
> +    Deps = union(Deps0, Deps1).

That looks okay otherwise.

Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list