[m-rev.] for review: inst_results

Julien Fischer juliensf at csse.unimelb.edu.au
Mon Apr 23 00:48:45 AEST 2012

On Wed, 18 Apr 2012, Zoltan Somogyi wrote:

> A large part of the cost of a large ground term is incurred not when the
> term is constructed, but when it is used. The inst of the term will be huge,
> and will typically have to be traversed many times. Some of those traversals
> would be linear if not for the fact that, in order to avoid infinite loops
> on recursive insts, the predicate doing the traversal has to keep a set
> of the insts visited so far. When the traversal is in the middle of the ground
> term's inst, it is looking up that inst in a set of the insts of its containing
> terms all the way up to the root. When the ground term contains a list with
> many repeated elements near the start, the cost of the traversal is cubic
> in the length of the list: a linear number of set membership tests, each
> of which tests the current inst against a linear number of large insts,
> the test itself being linear.
> This diff aims to totally sidestep all that. It extends the mer_inst type
> to allow (but not require) the creator of an inst to record what the outcome
> of some tests on the inst would be. Is it ground? Does it contain "any"?
> What inst names and types may it contain? If the creator records this answer,
> which the code that creates ground terms does, then many tests will now run
> in CONSTANT time, not linear, quadratic or cubic.
> We do this only for bound insts. While the concept can apply to all insts,
> for small insts it can cost more to interpret the results term than to
> do the test directly. Insts cannot be large without being composed mostly
> of bound insts, so by recording this info only for bound insts, we can
> speed up the handling of all large insts.
> This also has the side benefit that in many cases, a traversal that operates
> on an inst will often do so in order to compute an updated version of that
> inst. In many cases, the updated version is the same as the original version,
> but since the traversal has to be prepared for updates, it makes a copy
> of the inst anyway. The result of the traversal is thus an inst that has
> the same value as the original inst but not the same address. This makes
> it useless to try to do equality checks of related insts in constant time
> by looking at the pointers. With this diff, many such traversals can be
> avoided, allowing the updated inst to keept the address as well as the value


> of the corresponding original inst.
> Without this diff, the compiler takes more than 10 seconds to compile
> zm_rcpsp_cpx.m, with most of that time being spent in mode checking.
> With this diff, it takes less than 5 seconds. Basically, mode checking
> went from 6+ seconds to 1. i

Nicely done!

> The profile of the compiler is now flat on this input; no single pass
> takes much more time than the others.

How extensively did you check this with different combinations of grade
and optimization options?

> The speed of the compiler is unaffected on tools/speedtest. (Actually, it
> gets a very slight speedup, but it is in the noise.)
> compiler/prog_data.m:
> 	Change the bound/2 functor of the mer_inst type to bound/3, adding
> 	a field that gives the outcome of some common tests performed on insts.
> 	When we attach insts to the variables representing parts of ground
> 	terms, we mark the insts accordingly. This allows us to perform many
> 	tests on insts in constant time, not in a time that is linear,
> 	quadratic or worse in the size of the inst.

> Index: compiler/hlds_code_util.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/hlds_code_util.m,v
> retrieving revision 1.44
> diff -u -b -r1.44 hlds_code_util.m
> --- compiler/hlds_code_util.m	16 Jun 2011 06:42:14 -0000	1.44
> +++ compiler/hlds_code_util.m	16 Apr 2012 10:53:33 -0000
> @@ -157,17 +157,28 @@
> :- pred is_valid_mutable_inst_2(module_info::in, mer_inst::in,
>     set(inst_name)::in) is semidet.
> -is_valid_mutable_inst_2(_, any(shared, _), _).
> -is_valid_mutable_inst_2(ModuleInfo, bound(shared, BoundInsts), Expansions) :-
> -    list.member(bound_functor(_, Insts), BoundInsts),
> -    list.member(Inst, Insts),
> -    is_valid_mutable_inst_2(ModuleInfo, Inst, Expansions).
> -is_valid_mutable_inst_2(_, ground(shared, _), _).
> -is_valid_mutable_inst_2(ModuleInfo, defined_inst(InstName), Expansions0) :-
> +is_valid_mutable_inst_2(ModuleInfo, Inst, Expansions0) :-
> +    (
> +        ( Inst = any(Uniq, _)
> +        ; Inst = ground(Uniq, _)
> +        ),
> +        Uniq = shared
> +    ;
> +        Inst = bound(shared, _, BoundInsts),
> +        % XXX This looks wrong to me. It says that Inst is a valid mutable inst
> +        % if SOME ArgInst inside Inst is a valid mutable inst, whereas I think
> +        % ALL arguments of ALL functors should be required to be valid mutable
> +        % insts.
> +        list.member(bound_functor(_, ArgInsts), BoundInsts),
> +        list.member(ArgInst, ArgInsts),
> +        is_valid_mutable_inst_2(ModuleInfo, ArgInst, Expansions0)

Yes, it is wrong.  It should indeed check that all arguments of all fuctors are
valid mutable insts.

The change appears fine otherwise.

mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au

More information about the reviews mailing list