[m-rev.] for prelim review: fix compiler performance problems on large programs
Julien Fischer
juliensf at cs.mu.OZ.AU
Tue Mar 28 12:07:16 AEDT 2006
On Mon, 27 Mar 2006, Zoltan Somogyi wrote:
> This is a request for a preliminary review. It containms some pieces of code
> marked by ZZZ that need minor cleanups, and the final set of changes to
> expected out files is not yet complete (the required bootcheck started
> just now.) If I find a need for significant fixes, I will put up an interdiff
> for review, but I hope that won't be necessary.
>
> Zoltan.
>
> Fix several performance bugs that showed up when the compiler was invoked on
> Douglas Auclair's training_cars example. Also fix some minor problems that
> made it harder to find the information needed to localize those problems.
>
> training_cars.m is hard to compile quickly because it is big in two dimensions:
> it has lots of clauses, and each clause has big terms.
>
> My laptop still tries to swap itself to death on the full version of
> training_cars.m (it has only 512 Mb), but the compiler now works fine
> on a version containing about 20% of its clauses, whereas previously
> it couldn't compile it at all.
That should be added as a test case, but whether it it is run will need to be
controlled by an environment variable, e.g. ENABLE_BIG_TEST_CASES or
something. Since, it's causing your laptop to thrash, I'd hate to think about
what it might do to mine.
> In most cases, the changes convert N^2 algorithms to NlogN algorithms.
> They probably have higher constant factors and may yield small slowdowns
> for small N, but this is probably not noticeable. Avoiding bad worst case
> behavior is more important.
>
> compiler/superhomogeneous.m:
> Record the number of goals inserted in each goal being converted
> to superhomogeneous form. If this exceeds a threshold, wrap a
> from_ground_term scope around it.
>
> Put the predicates into a more cohesive sequence.
>
> compiler/field_access.m:
> Work with the code in superhomogeneous to record the number of inserted
> goals. Reorder the arguments of some performances to be consistent
> with the predicates in superhomogeneous.m.
>
> compiler/modes.m:
> Use the from_ground_term scope to reverse the list of inserted
> unifications if necessary. It is much more efficient to do this here
> than to let it happen by sequences of delays and wakeups. That would
> have quadratic complexity; this is linear.
>
> This is what I originally introduced from_ground_term scopes for.
> Then, the overhead was too high, because I added one scope per function
> symbol. This version should be fine, since there is at most one scope
> added per argument of an atom (clause head or call).
>
> compiler/modes.m:
> compiler/unique_modes.m:
> When we are processing goals inside a from_ground_term scope, record
> this fact.
>
> compiler/mode_info.m:
> Make it possible to record this fact.
>
> compiler/modecheck_unify.m:
> When we are inside a from_ground_term scope, don't try to update the
> insts of vars on the right hand sides of construction unifications.
> Since these variables came from expansion to superhomogeneous form,
> those variables won't occur in any following code, so updating their
> state is useless, and the algorithm we used to do so is linear in the
> size of the inst. Since the size of an inst of a variable that results
> from superhomogeneous expansion is itself on average proportional to
> the size of the original term, this change turns a quadratic algorithm
> into a linear one.
>
> compiler/inst_match.m:
> Use balanced trees instead of ordered lists to represents sets of
> expansions, since these sets can be large.
>
> Note an opportunity for further improvement.
>
> compiler/inst_util.m:
> Note another opportunity for further improvement.
>
> compiler/instmap.m:
> Rename several predicates to avoid ambiguities.
>
> compiler/cse_detection.m:
> We used to print statistics for the processing of each procedure
> without saying which procedure it is for; fix this.
>
> compiler/switch_detection.m:
> Don't print progress messages for predicates with no procedures,
> since they would be misleading.
>
> compiler/higher_order.m:
> Change an algorithm that was quadratic in the number of arms
> for merging the information from the different arms of disjunctions
> and switches to an NlogN algorithm.
>
> Change the algorithm for merging the info from two branches
> that quadratic in the number of variables per arm to an NlogN
> algorithm.
>
> Changed some type equivalences to notag types to aid robustness.
>
> compiler/quantification.m:
> Rename several predicates to avoid ambiguities.
>
> The sets of variables in different arms of disjunctions and switches
> tend to have relatively small intersections. Yet the algorithms we
> used to compute the set of variables free in the disjunction or switch
> included the variables from the already processed arms in the sets
> being accumulated when processing later arms, leading the quadratic
s/leading the/leading to the/
> behavior. This diff changes the algorithm to process each arm
> independently, and then use a more balanced algorithm to summarize
> the result.
>
> Specialize the predicates that compute sets of free vars in various
> HLDS fragments to work either with ordinary_nonlocals or
> code_gen_nonlocals without making the same decision repeatedly.
>
> Move some code out of large predicates into predicates of their own.
>
> compiler/Mercury.options:
> Specify the compiler option that can exploit this specialization
> to make the code run faster.
>
> compiler/simplify.m:
> Use a more efficient data structure for recording the parameters
> of an invocation of simplification.
>
> Change some predicate names and function symbol names to avoid
> ambiguity.
You should check that my changes to the warnings for obsolete pragmas
conform to the any new naming scheme as well (I think I committed them
after you made this diff).
> compiler/common.m:
> compiler/deforest.m:
> compiler/deforest.m:
> compiler/make_hlds_warn.m:
> compiler/mercury_compile.m:
> compiler/pd_util.m:
> compiler/stack_opt.m:
> compiler/term_constr_build.m:
> Conform to the changes in simplify.m and/or instmap.m.
>
> compiler/mercury_compile.m:
> Fix a bug in progress messages for polymorphism.m.
>
> compiler/equiv_type_hlds.m:
> Most of the time, substitutions inside insts have no effect, because
> very few insts include any reference to a types. Instead of the old
> approach of building new insts and then throwing them away if they
> are the same as the old ones, don't build new insts at all if the
> old inst contains no types.
>
> compiler/common.m:
> Change some predicate names to make them clearer.
>
> compiler/hlds_clauses.m:
> Record the number of clauses so far, to allow a more informative
> progress message to be printed.
>
> compiler/add_clause.m:
> Print this more informative progress message.
>
> Conform to the changes in superhomogeneous.m.
>
> compiler/code_gen.m:
> Use the context of the predicate's first clause (which will be the
> context of the first clause head) as the context of the predicate's
> interface events. Unlike the context of the body goal, this won't
> be affected by program transformations such as wrapping a
> from_ground_term scope around some goals. It is better for users
> anyway, since the old policy lead to contexts in the middle of
> procedure bodies if the top level goal was a disjunction, switch or
> if-then-else.
>
> tests/debugger/*.exp:
> Update the expected outputs to conform to the change to code_gen.m.
...
> Index: compiler/add_clause.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/add_clause.m,v
> retrieving revision 1.19
> diff -u -b -r1.19 add_clause.m
> --- compiler/add_clause.m 24 Mar 2006 03:03:37 -0000 1.19
> +++ compiler/add_clause.m 27 Mar 2006 02:44:47 -0000
> @@ -44,12 +44,13 @@
> % the goal, to rename it apart from the other clauses.
> %
> :- pred transform_goal(goal::in, prog_substitution::in, hlds_goal::out,
> - prog_varset::in, prog_varset::out,
> + int::out, prog_varset::in, prog_varset::out,
> module_info::in, module_info::out, qual_info::in, qual_info::out,
> svar_info::in, svar_info::out, io::di, io::uo) is det.
Would it be possible to use an equivalence type for the int argument
here (and below)?
> -:- pred qualify_lambda_mode_list(list(mer_mode)::in, list(mer_mode)::out,
> - prog_context::in, qual_info::in, qual_info::out, io::di, io::uo) is det.
> +:- pred qualify_lambda_mode_list_if_not_opt_imported(
> + list(mer_mode)::in, list(mer_mode)::out, prog_context::in,
> + qual_info::in, qual_info::out, io::di, io::uo) is det.
...
> % Switches are treated in exactly the same way as disjunctions.
> %
> :- pred traverse_cases(list(case)::in, list(case)::out,
> higher_order_info::in, higher_order_info::out) is det.
>
> -traverse_cases([], [], !Info).
> -traverse_cases([case(ConsId, Goal0) | Cases0], [case(ConsId, Goal) | Cases],
> - !Info) :-
> - get_pre_branch_info(PreInfo, !Info),
> - traverse_goal_2(Goal0, Goal, !Info),
> - get_post_branch_info(PostInfo0, !Info),
> - traverse_cases_2(PreInfo, Cases0, Cases, PostInfo0, PostInfo, !Info),
> - set_post_branch_info(PostInfo, !Info).
> +traverse_cases(Cases0, Cases, !Info) :-
> + % We handle empty lists separately because merge_post_branch_infos_into_one
> + % works only on nonempty lists.
> + (
> + Cases0 = [],
> + error("traverse_cases: empty list of cases")
Call unexpected/2 here.
...
> Index: compiler/pd_util.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/pd_util.m,v
> retrieving revision 1.49
> diff -u -b -r1.49 pd_util.m
> --- compiler/pd_util.m 21 Mar 2006 22:25:27 -0000 1.49
> +++ compiler/pd_util.m 25 Mar 2006 13:57:05 -0000
> @@ -45,7 +45,7 @@
>
...
> +:- pred case_vars(nonlocals_to_recompute, list(case),
> + set_of_var, set_of_var, set_of_var, set_of_var).
> +:- mode case_vars(in(ordinary_nonlocals), in, in, out, in, out) is det.
> +:- mode case_vars(in(code_gen_nonlocals), in, in, out, in, out) is det.
> +
> +case_vars(NonLocalsToRecompute, Cases, !Set, !LambdaSet) :-
> + compute_case_vars(NonLocalsToRecompute, Cases,
> + [], CaseSets, [], CaseLambdaSets),
> + (
> + CaseSets = [],
> + error("case_vars: no cases")
call unexpected/2 here.
Otherwise, that looks okay so far.
Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list