[m-rev.] for prelim review: fix compiler performance problems on large programs

Julien Fischer juliensf at cs.mu.OZ.AU
Tue Mar 28 12:07:16 AEDT 2006


On Mon, 27 Mar 2006, Zoltan Somogyi wrote:

> This is a request for a preliminary review. It containms some pieces of code
> marked by ZZZ that need minor cleanups, and the final set of changes to
> expected out files is not yet complete (the required bootcheck started
> just now.)  If I find a need for significant fixes, I will put up an interdiff
> for review, but I hope that won't be necessary.
>
> Zoltan.
>
> Fix several performance bugs that showed up when the compiler was invoked on
> Douglas Auclair's training_cars example. Also fix some minor problems that
> made it harder to find the information needed to localize those problems.
>
> training_cars.m is hard to compile quickly because it is big in two dimensions:
> it has lots of clauses, and each clause has big terms.
>
> My laptop still tries to swap itself to death on the full version of
> training_cars.m (it has only 512 Mb), but the compiler now works fine
> on a version containing about 20% of its clauses, whereas previously
> it couldn't compile it at all.

That should be added as a test case, but whether it it is run will need to be
controlled by an environment variable, e.g. ENABLE_BIG_TEST_CASES or
something.  Since, it's causing your laptop to thrash, I'd hate to think about
what it might do to mine.

> In most cases, the changes convert N^2 algorithms to NlogN algorithms.
> They probably have higher constant factors and may yield small slowdowns
> for small N, but this is probably not noticeable. Avoiding bad worst case
> behavior is more important.
>
> compiler/superhomogeneous.m:
> 	Record the number of goals inserted in each goal being converted
> 	to superhomogeneous form. If this exceeds a threshold, wrap a
> 	from_ground_term scope around it.
>
> 	Put the predicates into a more cohesive sequence.
>
> compiler/field_access.m:
> 	Work with the code in superhomogeneous to record the number of inserted
> 	goals. Reorder the arguments of some performances to be consistent
> 	with the predicates in superhomogeneous.m.
>
> compiler/modes.m:
> 	Use the from_ground_term scope to reverse the list of inserted
> 	unifications if necessary. It is much more efficient to do this here
> 	than to let it happen by sequences of delays and wakeups. That would
> 	have quadratic complexity; this is linear.
>
> 	This is what I originally introduced from_ground_term scopes for.
> 	Then, the overhead was too high, because I added one scope per function
> 	symbol. This version should be fine, since there is at most one scope
> 	added per argument of an atom (clause head or call).
>
> compiler/modes.m:
> compiler/unique_modes.m:
> 	When we are processing goals inside a from_ground_term scope, record
> 	this fact.
>
> compiler/mode_info.m:
> 	Make it possible to record this fact.
>
> compiler/modecheck_unify.m:
> 	When we are inside a from_ground_term scope, don't try to update the
> 	insts of vars on the right hand sides of construction unifications.
> 	Since these variables came from expansion to superhomogeneous form,
> 	those variables won't occur in any following code, so updating their
> 	state is useless, and the algorithm we used to do so is linear in the
> 	size of the inst. Since the size of an inst of a variable that results
> 	from superhomogeneous expansion is itself on average proportional to
> 	the size of the original term, this change turns a quadratic algorithm
> 	into a linear one.
>
> compiler/inst_match.m:
> 	Use balanced trees instead of ordered lists to represents sets of
> 	expansions, since these sets can be large.
>
> 	Note an opportunity for further improvement.
>
> compiler/inst_util.m:
> 	Note another opportunity for further improvement.
>
> compiler/instmap.m:
> 	Rename several predicates to avoid ambiguities.
>
> compiler/cse_detection.m:
> 	We used to print statistics for the processing of each procedure
> 	without saying which procedure it is for; fix this.
>
> compiler/switch_detection.m:
> 	Don't print progress messages for predicates with no procedures,
> 	since they would be misleading.
>
> compiler/higher_order.m:
> 	Change an algorithm that was quadratic in the number of arms
> 	for merging the information from the different arms of disjunctions
> 	and switches to an NlogN algorithm.
>
> 	Change the algorithm for merging the info from two branches
> 	that quadratic in the number of variables per arm to an NlogN
> 	algorithm.
>
> 	Changed some type equivalences to notag types to aid robustness.
>
> compiler/quantification.m:
> 	Rename several predicates to avoid ambiguities.
>
> 	The sets of variables in different arms of disjunctions and switches
> 	tend to have relatively small intersections. Yet the algorithms we
> 	used to compute the set of variables free in the disjunction or switch
> 	included the variables from the already processed arms in the sets
> 	being accumulated when processing later arms, leading the quadratic

s/leading the/leading to the/

> 	behavior. This diff changes the algorithm to process each arm
> 	independently, and then use a more balanced algorithm to summarize
> 	the result.
>
> 	Specialize the predicates that compute sets of free vars in various
> 	HLDS fragments to work either with ordinary_nonlocals or
> 	code_gen_nonlocals without making the same decision repeatedly.
>
> 	Move some code out of large predicates into predicates of their own.
>
> compiler/Mercury.options:
> 	Specify the compiler option that can exploit this specialization
> 	to make the code run faster.
>
> compiler/simplify.m:
> 	Use a more efficient data structure for recording the parameters
> 	of an invocation of simplification.
>
> 	Change some predicate names and function symbol names to avoid
> 	ambiguity.

You should check that my changes to the warnings for obsolete pragmas
conform to the any new naming scheme as well (I think I committed them
after you made this diff).

> compiler/common.m:
> compiler/deforest.m:
> compiler/deforest.m:
> compiler/make_hlds_warn.m:
> compiler/mercury_compile.m:
> compiler/pd_util.m:
> compiler/stack_opt.m:
> compiler/term_constr_build.m:
> 	Conform to the changes in simplify.m and/or instmap.m.
>
> compiler/mercury_compile.m:
> 	Fix a bug in progress messages for polymorphism.m.
>
> compiler/equiv_type_hlds.m:
> 	Most of the time, substitutions inside insts have no effect, because
> 	very few insts include any reference to a types. Instead of the old
> 	approach of building new insts and then throwing them away if they
> 	are the same as the old ones, don't build new insts at all if the
> 	old inst contains no types.
>
> compiler/common.m:
> 	Change some predicate names to make them clearer.
>
> compiler/hlds_clauses.m:
> 	Record the number of clauses so far, to allow a more informative
> 	progress message to be printed.
>
> compiler/add_clause.m:
> 	Print this more informative progress message.
>
> 	Conform to the changes in superhomogeneous.m.
>
> compiler/code_gen.m:
> 	Use the context of the predicate's first clause (which will be the
> 	context of the first clause head) as the context of the predicate's
> 	interface events. Unlike the context of the body goal, this won't
> 	be affected by program transformations such as wrapping a
> 	from_ground_term scope around some goals. It is better for users
> 	anyway, since the old policy lead to contexts in the middle of
> 	procedure bodies if the top level goal was a disjunction, switch or
> 	if-then-else.
>
> tests/debugger/*.exp:
> 	Update the expected outputs to conform to the change to code_gen.m.

...

> Index: compiler/add_clause.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/add_clause.m,v
> retrieving revision 1.19
> diff -u -b -r1.19 add_clause.m
> --- compiler/add_clause.m	24 Mar 2006 03:03:37 -0000	1.19
> +++ compiler/add_clause.m	27 Mar 2006 02:44:47 -0000
> @@ -44,12 +44,13 @@
>      % the goal, to rename it apart from the other clauses.
>      %
>  :- pred transform_goal(goal::in, prog_substitution::in, hlds_goal::out,
> -    prog_varset::in, prog_varset::out,
> +    int::out, prog_varset::in, prog_varset::out,
>      module_info::in, module_info::out, qual_info::in, qual_info::out,
>      svar_info::in, svar_info::out, io::di, io::uo) is det.

Would it be possible to use an equivalence type for the int argument
here (and below)?

> -:- pred qualify_lambda_mode_list(list(mer_mode)::in, list(mer_mode)::out,
> -    prog_context::in, qual_info::in, qual_info::out, io::di, io::uo) is det.
> +:- pred qualify_lambda_mode_list_if_not_opt_imported(
> +    list(mer_mode)::in, list(mer_mode)::out, prog_context::in,
> +    qual_info::in, qual_info::out, io::di, io::uo) is det.

...

>      % Switches are treated in exactly the same way as disjunctions.
>      %
>  :- pred traverse_cases(list(case)::in, list(case)::out,
>      higher_order_info::in, higher_order_info::out) is det.
>
> -traverse_cases([], [], !Info).
> -traverse_cases([case(ConsId, Goal0) | Cases0], [case(ConsId, Goal) | Cases],
> -        !Info) :-
> -    get_pre_branch_info(PreInfo, !Info),
> -    traverse_goal_2(Goal0, Goal, !Info),
> -    get_post_branch_info(PostInfo0, !Info),
> -    traverse_cases_2(PreInfo, Cases0, Cases, PostInfo0, PostInfo, !Info),
> -    set_post_branch_info(PostInfo, !Info).
> +traverse_cases(Cases0, Cases, !Info) :-
> +    % We handle empty lists separately because merge_post_branch_infos_into_one
> +    % works only on nonempty lists.
> +    (
> +        Cases0 = [],
> +        error("traverse_cases: empty list of cases")

Call unexpected/2 here.

...

> Index: compiler/pd_util.m
> ===================================================================
> RCS file: /home/mercury/mercury1/repository/mercury/compiler/pd_util.m,v
> retrieving revision 1.49
> diff -u -b -r1.49 pd_util.m
> --- compiler/pd_util.m	21 Mar 2006 22:25:27 -0000	1.49
> +++ compiler/pd_util.m	25 Mar 2006 13:57:05 -0000
> @@ -45,7 +45,7 @@
>

...

> +:- pred case_vars(nonlocals_to_recompute, list(case),
> +    set_of_var, set_of_var, set_of_var, set_of_var).
> +:- mode case_vars(in(ordinary_nonlocals), in, in, out, in, out) is det.
> +:- mode case_vars(in(code_gen_nonlocals), in, in, out, in, out) is det.
> +
> +case_vars(NonLocalsToRecompute, Cases, !Set, !LambdaSet) :-
> +    compute_case_vars(NonLocalsToRecompute, Cases,
> +        [], CaseSets, [], CaseLambdaSets),
> +    (
> +        CaseSets = [],
> +        error("case_vars: no cases")

call unexpected/2 here.

Otherwise, that looks okay so far.

Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list