[m-rev.] for review: proper mmc --make speedup
Julien Fischer
juliensf at csse.unimelb.edu.au
Mon Jul 7 00:12:32 AEST 2008
On Fri, 27 Jun 2008, Peter Wang wrote:
> Estimated hours taken: 12
> Branches: main
>
> Speed up dependency computation in `mmc --make'. It was spending the majority
> of its time taking unions of sets. Therefore change to a set representation
> which is fast for unions, sparse_bitset. To make use of bitsets, maintain a
> mapping between module_names and integers, and dependency_files and integers.
> Use version_hash_tables and version_arrays to hold the forward and reverse
> mappings.
> (We don't really need version types.)
Why are you using them then? Presumably to avoid having to muck about
with the insts on real hash tables and arrays?
> Also cache the results of `non_intermod_direct_imports' and `get_file_name'.
>
> In one workspace, the time for `mmc --make' to check that all the .c files in
> the `compiler' directory were up-to-date dropped from ~120 seconds to ~20
> seconds after these changes.
Great!
> compiler/make.m:
> Add fields to the `make_info' structure to hold the module_name- and
> dependency_file-to-index mappings.
>
> Change the dependency_status map from a tree234 map to a
> version_hash_table.
>
> Add fields to cache results for `get_file_name' and
> `non_intermod_direct_imports'.
>
> compiler/make.dependencies.m:
> Add the equivalence type `deps_set(T)' to make it easier to try
> different set representations in the future. Change the set
> representation as above.
>
> Add procedures to convert between indices and the real values, and
> index sets and the real sets.
>
> Conform to the data representation changes.
>
> Add caching to `non_intermod_direct_imports'.
>
> Replace two generate-and-test sites with deterministic code.
>
> compiler/make.util.m:
> Add hash functions for `module_name' and `dependency_file' values.
>
> Add caching to `get_file_name' (when Search = yes).
>
> Delete an unused predicate.
>
> compiler/make.module_target.m:
> compiler/make.program_target.m:
> Conform to changes.
>
> Minor cleanups.
>
> Index: compiler/make.dependencies.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/make.dependencies.m,v
> retrieving revision 1.47
> diff -u -p -r1.47 make.dependencies.m
> --- compiler/make.dependencies.m 27 Mar 2008 02:29:41 -0000 1.47
> +++ compiler/make.dependencies.m 27 Jun 2008 02:00:54 -0000
> @@ -17,18 +17,43 @@
> :- module make.dependencies.
> :- interface.
>
> +:- import_module enum.
> +
> %-----------------------------------------------------------------------------%
>
> + % Dependency computation does a lot of unions so we use a set
> + % representation suited to that purpose, namely bitsets. We can't store
> + % module_names and dependency_files in those sets, so we keep a mapping
> + % between module_name <-> module_index and dependency_file <->
> + % dependency_file_index in the make_info structure and work with sets of
> + % indices instead.
> + %
> + % sparse_bitset is faster than tree_bitset by my tests.
> + %
> +:- type deps_set(T) == sparse_bitset(T).
> +% :- type deps_set(T) == tree_bitset(T).
> +
> +:- type module_index.
> +:- instance enum(module_index).
> +
> +:- type dependency_file_index.
> +:- instance enum(dependency_file_index).
> +
> % find_module_deps(ModuleName, Succeeded, Deps, !Info, !IO).
> %
> % The reason we don't return maybe(Deps) is that with `--keep-going'
> % we want to do as much work as possible.
> %
> :- type find_module_deps(T) ==
> - pred(module_name, bool, set(T), make_info, make_info, io, io).
> + pred(module_index, bool, deps_set(T), make_info, make_info, io, io).
> :- inst find_module_deps ==
> (pred(in, out, out, in, out, di, uo) is det).
>
> +:- type find_module_deps_plain_set(T) ==
> + pred(module_index, bool, set(T), make_info, make_info, io, io).
> +:- inst find_module_deps_plain_set ==
> + (pred(in, out, out, in, out, di, uo) is det).
> +
> :- type dependency_file
> ---> dep_target(target_file)
> % A target which could be made.
> @@ -41,7 +66,7 @@
> % a target type given a module name.
> %
> :- func target_dependencies(globals::in, module_target_type::in) =
> - (find_module_deps(dependency_file)::out(find_module_deps)) is det.
> + (find_module_deps(dependency_file_index)::out(find_module_deps)) is det.
>
> % Union the output set of dependencies for a given module
> % with the accumulated set. This is used with
> @@ -49,9 +74,34 @@
> % module_names to find all target files for those modules.
> %
> :- pred union_deps(find_module_deps(T)::in(find_module_deps),
> - module_name::in, bool::out, set(T)::in, set(T)::out,
> + module_index::in, bool::out, deps_set(T)::in, deps_set(T)::out,
> make_info::in, make_info::out, io::di, io::uo) is det.
>
> +:- pred union_deps_plain_set(
> + find_module_deps_plain_set(T)::in(find_module_deps_plain_set),
> + module_index::in, bool::out, set(T)::in, set(T)::out,
> + make_info::in, make_info::out, io::di, io::uo) is det.
> +
> +:- pred deps_set_foldl3_maybe_stop_at_error(bool::in,
> + foldl3_pred_with_status(T, Acc, Info, IO)::in(foldl3_pred_with_status),
> + deps_set(T)::in, bool::out, Acc::in, Acc::out, Info::in, Info::out,
> + IO::di, IO::uo) is det <= enum(T).
Why are the last two arguments polymorphic there?
...
> +%-----------------------------------------------------------------------------%
> +
> +:- type deps_result(T) == pair(bool, deps_set(T)).
I would make that into a proper d.u. type.
:- type deps_result(T) ---> deps_result(bool, deps_set(T)).
> +:- type module_deps_result == deps_result(module_index).
> +
> +union_deps(FindDeps, ModuleIndex, Success, Deps0, Deps, !Info, !IO) :-
> + FindDeps(ModuleIndex, Success, Deps1, !Info, !IO),
> + Deps = union(Deps0, Deps1).
That looks okay otherwise.
Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to: mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions: mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
More information about the reviews
mailing list