[m-rev.] for review: from_ground_term_construct
Zoltan Somogyi
zs at csse.unimelb.edu.au
Fri Dec 12 16:55:05 AEDT 2008
As mentioned on mercury-developers, I would like everyone to review their
own modules. I am also looking for volunteers for reviewing the modules
that noone who is around these days owns.
Read the diff to hlds_goal.m first; without it, the rest won't be full sense.
The diff was done with -b, so the indentation may not seem like it is ok,
but it is. (Without -b, the diff would be even bigger.) You can't apply
a -b diff and get something readable, so the full workspace is available
at /home/taura/workspaces/zs/ws74.
There is one issue marked XXX below; I will fix it before commit.
Zoltan.
-----------------------------------------------------------------------------
Speed up the compiler's handling of code that constructs large ground terms
by specializing the treatment of such code.
This diff reduces the compilation time for training_cars_full.m from 97.5
seconds to 31.5 seconds on alys, my laptop. The time on tools/speedtest
stays pretty much the same.
compiler/hlds_goal.m:
Record the classification of from_ground_term scopes as purely
constructing terms, purely deconstructing them or something other.
Fix an old potential bug: variables inside the construct_how fields
of unifications weren't being renamed along with other variables.
This is a bug if any part of the compiler later looks at those
variables. (I am not sure whether or not this happens.)
compiler/superhomogenous.m:
Provisionally mark newly constructed static terms as being
from_ground_term_construct. Mode checking will either confirm this
or change the scope kind.
compiler/options.m:
compiler/handle_options.m:
Add a new option, from_ground_term_threshold, that allows the user to
set the boundary between ground terms that get scopes and ground terms
do not. I plan to experiment with different settings later.
compiler/modes.m:
Make this classification. For scopes that construct ground terms,
use a specialized algorithm that avoids quadratic behavior.
(It does not access the unify_inst_table, which is where the
factor of N other than the length of the goal list came from.)
The total size of the instmap_deltas, if printed out, still looks like
O(N^2) in size, but due to structure sharing it needs only O(N) memory.
For scopes that construct ground terms, set the determinism information
so that det_analysis.m doesn't have to traverse such scopes.
When handling disjunctions, check whether some nonlocals of the
disjunctions are constructed by from_ground_term_construct scopes.
For any such nonlocals, set their insts to just ground, throwing away
the precise information we have about exactly what function symbols
they and ALL their subterms are bound to. This is HUGE win, since
it allows us avoid spending a lot of time building a huge merge_inst
table, which later passes of the compiler (e.g. equiv_type_hlds) would
then have to spend similarly huge times traversing.
This approach does have a down side. If lots of arms of a disjunction
bind a nonlocal to a large ground term, but a few bind it to a SMALL
ground term, a term below the from_ground_term_threshold, this
optimization won't kick in. That could be one purpose of the new
option. It isn't documented yet; I will seek feedback about its
usefulness first.
compiler/modecheck_unify.m:
Handle the three different kinds of right hand sides separately.
This yields a small speedup, because now we don't test rhs_vars and
rhs_functors (the common right hand sides) for a special case
(goals containing "any" insts) that is applicable only to
rhs_lambda_goals.
compiler/unique_modes.m:
Don't traverse scopes that construct ground terms, since modes.m has
already done everything that needs to be done.
compiler/det_analysis.m:
Don't traverse scopes that construct ground terms, since modes.m has
already done the needed work.
compiler/instmap.m:
Add a new predicate for use by modes.m.
Many predicate names in this module were quite uninformative; give them
informative names.
compiler/polymorphism.m:
If this pass invalidates the from_ground_term_construct invariants,
then mark the relevant scope as from_ground_term_other.
Delete two unused access predicates.
compiler/equiv_type_hlds.m:
Don't traverse scopes that construct ground terms, since modes.m
ensures that their instmap deltas do not contain typed insts, and
thus the scope cannot contain types that need to be expanded.
Convert some predicates to single clauses.
compiler/goal_form.m:
compiler/goal_util.m:
In predicates that test goals for various properties, don't traverse
scopes that construct ground terms when the outcome of the test
is the same for all such scopes.
Convert some predicates to single clauses.
compiler/simplify.m:
Do not look for common structs in from_ground_term_construct scopes,
both because this speeds up the compiler, and because retaining
references to ground terms is in fact a pessimization, not an
optimization. This is because (a) those references need to be stored in
stack slots across calls, and (b) the C code generators ensure that
the cells representing ground terms will be shared as needed.
If all arms of a switch are from_ground_term_construct scopes,
do not merge the instmap_deltas from those arms, since this is
both very time-consuming (XXX not anymore) and extremely unlikely
to improve the instmap_delta.
Disable common_struct in from_ground_term_construct scopes,
since for these scopes, it is actually a pessimization.
Do not delete from_ground_term_construct scopes, since many
compiler passes can now use them.
Do some manual deforestation, break up some large predicates,
and give better names to some.
compiler/liveness.m
Special-case the handling from_ground_term_construct scopes. This
allows us to traverse them just once instead of three times, and this
traversal is simpler and faster than any of the three.
In some traversals, we were switching on the goal type twice; once
in e.g. detect_liveness_in_goal_2, and once by calling
goal_expr_has_subgoals. Eliminate the double switching by merging
the relevant predicates. (The double-switching structure was easier
to work with before we had multi-cons-id switches.)
compiler/typecheck.m:
Move a lookup after a test, so we don't have to do it if the test
fails.
Provide a specialized mode for a predicate. This should allow the
compiler to eliminate an argument and a test in the common case.
Note a possible chance for a speedup.
compiler/typecheck_info.m:
Don't apply empty substitutions to the types of a possibly very large
set of variables.
compiler/quantification.m:
Don't quantify from_ground_term_construct scopes. They are created
correctly quantified, and any compiler pass that invalidates that
quantification also removes the from_ground_term_construct mark.
Don't apply empty renamings to a possibly very large set of variables.
Move the code for handling scopes to its own predicate, to avoid
overwhelming the code that handles other kinds of goals. Even from
this, factor out the renaming code, since it is needed only for
some kinds of scopes.
Make some predicate names better reflect what the predicate does.
compiler/pd_cost.m:
For from_ground_term_construct scopes, instead of computing their cost
by adding up the costs of the goals inside, make their cost a constant,
since binding a variable to a static term takes constant time.
compiler/pd_info.m:
Add prefixes on field names to avoid ambiguities.
compiler/add_heap_ops.m:
compiler/add_trail_ops.m:
compiler/closure_analysis.m:
compiler/constraint.m:
compiler/cse_detection.m:
compiler/dead_proc_elim.m:
compiler/deep_profiling.m:
compiler/deforest.m:
compiler/delay_construct.m:
compiler/delay_partial_inst.m:
compiler/dep_par_conj.m:
compiler/distance_granularity.m:
compiler/exception_analysis.m:
compiler/follow_code.m:
compiler/follow_vars.m:
compiler/format_call.m:
compiler/granularity.m:
compiler/higher_order.m:
compiler/implicit_parallelism.m:
compiler/inlining.m:
compiler/interval.m:
compiler/lambda.m:
compiler/lco.m:
compiler/live_vars.m:
compiler/loop_inv.m:
compiler/middle_rec.m:
compiler/mode_util.m:
compiler/parallel_to_plain_conj.m:
compiler/saved_vars.m:
compiler/stm_expand.m:
compiler/store_alloc.m:
compiler/stratify.m:
compiler/structure_reuse.direct.detect_garbage.m:
compiler/structure_reuse.lbu.m:
compiler/structure_sharing.analysis.m:
compiler/switch_detection.analysis.m:
compiler/trail_analysis.m:
compiler/term_pass1.m:
compiler/tupling.m:
compiler/unneeded_code.m:
compiler/untupling.m:
compiler/unused_args.m:
These passes have nothing to do in from_ground_term_construct scopes,
so don't traverse them.
In some modules (e.g. dead_proc_elim), some traversals had to be kept.
In loop_inv.m, replace a code structure that updated accumulators
with functions (which prevented the natural use of state variables),
that in lots of places reconstructed the term it had just
deconstructed, and obscured the identical handling of different kinds
of goals, with a structure based on predicates, state variables and
shared code for different goal types where possible.
In store_alloc.m, avoid some double switching on the same value.
In stratify.m, unneeded_code.m and unused_args.m, rename predicates
to avoid ambiguities.
compiler/goal_path.m:
compiler/goal_util.m:
compiler/implementation_defined_literals.m:
compiler/intermode.m:
compiler/mark_static_terms.m:
compiler/ml_code_gen.m:
compiler/mode_ordering.m:
compiler/ordering_mode_constraints.m:
compiler/prop_mode_constraints.m:
compiler/purity.m:
compiler/rbmm.actual_region_arguments.m:
compiler/rbmm.add_rbmm_goal_infos.m:
compiler/rbmm.condition_renaming.m:
compiler/rbmm.execution_path.m:
compiler/rbmm.region_transformation.m:
compiler/structure_reuse.direct.choose_reuse.m:
compiler/structure_reuse.indirect.m:
compiler/structure_reuse.lfu.m:
compiler/structure_reuse.versions.m:
compiler/term_const_build.m:
compiler/term_traversal.m:
compiler/unused_imports.m:
Mark places where we cannot (yet) special case
from_ground_term_construct scopes.
In structure_reuse.lfu.m, turn nested if-then-elses into a switch in.
compiler/size_prof.m:
Turn from_ground_term_construct scopes into from_ground_term_other
scopes, since in term size profiling grades, we need to attach sizes to
terms.
Give predicates better names.
compiler/*.m:
Minor changes to conform to the changes above.
compiler/make_hlds_passes.m:
With -S, print statistics after the third pass over items, since
this is the time-consuming one.
compiler/mercury_compile.m:
Conform to the new names of some predicates.
When declining to output a HLDS dump because it would be identical to
the previous dump, don't confuse the user either by being silent about
the decision, or by leaving an old dump laying around that could be
mistaken for a new one.
tools/binary:
tools/binary_step:
Bring these tools up to date.
compiler/Mmakefile:
Add an int3s target for use by the new code in the tools. The
Mmakefiles in the other directories with Mercury code already have
such a target.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to: mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions: mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
More information about the reviews
mailing list