[m-rev.] for review: inst_results
Julien Fischer
juliensf at csse.unimelb.edu.au
Mon Apr 23 00:48:45 AEST 2012
On Wed, 18 Apr 2012, Zoltan Somogyi wrote:
> A large part of the cost of a large ground term is incurred not when the
> term is constructed, but when it is used. The inst of the term will be huge,
> and will typically have to be traversed many times. Some of those traversals
> would be linear if not for the fact that, in order to avoid infinite loops
> on recursive insts, the predicate doing the traversal has to keep a set
> of the insts visited so far. When the traversal is in the middle of the ground
> term's inst, it is looking up that inst in a set of the insts of its containing
> terms all the way up to the root. When the ground term contains a list with
> many repeated elements near the start, the cost of the traversal is cubic
> in the length of the list: a linear number of set membership tests, each
> of which tests the current inst against a linear number of large insts,
> the test itself being linear.
>
> This diff aims to totally sidestep all that. It extends the mer_inst type
> to allow (but not require) the creator of an inst to record what the outcome
> of some tests on the inst would be. Is it ground? Does it contain "any"?
> What inst names and types may it contain? If the creator records this answer,
> which the code that creates ground terms does, then many tests will now run
> in CONSTANT time, not linear, quadratic or cubic.
>
> We do this only for bound insts. While the concept can apply to all insts,
> for small insts it can cost more to interpret the results term than to
> do the test directly. Insts cannot be large without being composed mostly
> of bound insts, so by recording this info only for bound insts, we can
> speed up the handling of all large insts.
>
> This also has the side benefit that in many cases, a traversal that operates
> on an inst will often do so in order to compute an updated version of that
> inst. In many cases, the updated version is the same as the original version,
> but since the traversal has to be prepared for updates, it makes a copy
> of the inst anyway. The result of the traversal is thus an inst that has
> the same value as the original inst but not the same address. This makes
> it useless to try to do equality checks of related insts in constant time
> by looking at the pointers. With this diff, many such traversals can be
> avoided, allowing the updated inst to keept the address as well as the value
s/keept/keep/
> of the corresponding original inst.
>
> Without this diff, the compiler takes more than 10 seconds to compile
> zm_rcpsp_cpx.m, with most of that time being spent in mode checking.
> With this diff, it takes less than 5 seconds. Basically, mode checking
> went from 6+ seconds to 1. i
Nicely done!
> The profile of the compiler is now flat on this input; no single pass
> takes much more time than the others.
How extensively did you check this with different combinations of grade
and optimization options?
> The speed of the compiler is unaffected on tools/speedtest. (Actually, it
> gets a very slight speedup, but it is in the noise.)
>
> compiler/prog_data.m:
> Change the bound/2 functor of the mer_inst type to bound/3, adding
> a field that gives the outcome of some common tests performed on insts.
> When we attach insts to the variables representing parts of ground
> terms, we mark the insts accordingly. This allows us to perform many
> tests on insts in constant time, not in a time that is linear,
> quadratic or worse in the size of the inst.
>
...
> Index: compiler/hlds_code_util.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/hlds_code_util.m,v
> retrieving revision 1.44
> diff -u -b -r1.44 hlds_code_util.m
> --- compiler/hlds_code_util.m 16 Jun 2011 06:42:14 -0000 1.44
> +++ compiler/hlds_code_util.m 16 Apr 2012 10:53:33 -0000
> @@ -157,17 +157,28 @@
> :- pred is_valid_mutable_inst_2(module_info::in, mer_inst::in,
> set(inst_name)::in) is semidet.
>
> -is_valid_mutable_inst_2(_, any(shared, _), _).
> -is_valid_mutable_inst_2(ModuleInfo, bound(shared, BoundInsts), Expansions) :-
> - list.member(bound_functor(_, Insts), BoundInsts),
> - list.member(Inst, Insts),
> - is_valid_mutable_inst_2(ModuleInfo, Inst, Expansions).
> -is_valid_mutable_inst_2(_, ground(shared, _), _).
> -is_valid_mutable_inst_2(ModuleInfo, defined_inst(InstName), Expansions0) :-
> +is_valid_mutable_inst_2(ModuleInfo, Inst, Expansions0) :-
> + (
> + ( Inst = any(Uniq, _)
> + ; Inst = ground(Uniq, _)
> + ),
> + Uniq = shared
> + ;
> + Inst = bound(shared, _, BoundInsts),
> + % XXX This looks wrong to me. It says that Inst is a valid mutable inst
> + % if SOME ArgInst inside Inst is a valid mutable inst, whereas I think
> + % ALL arguments of ALL functors should be required to be valid mutable
> + % insts.
> + list.member(bound_functor(_, ArgInsts), BoundInsts),
> + list.member(ArgInst, ArgInsts),
> + is_valid_mutable_inst_2(ModuleInfo, ArgInst, Expansions0)
Yes, it is wrong. It should indeed check that all arguments of all fuctors are
valid mutable insts.
The change appears fine otherwise.
Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to: mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions: mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
More information about the reviews
mailing list