[m-rev.] For review by Fergus: Loop Invariant Hoisting
Ralph Becket
rafe at cs.mu.OZ.AU
Wed Oct 30 12:48:37 AEDT 2002
Fergus Henderson, Wednesday, 16 October 2002:
> On 16-Oct-2002, Ralph Becket <rafe at cs.mu.OZ.AU> wrote:
> > I've just done a bootcheck on an updated tree and all is well.
>
> I presume you bootchecked with the optimization enabled?
> (E.g. -O4 or greater?)
Bootchecked in -O2 to -O6 in both asm_fast.gc and hlc.gc, each with
--no-loop-invariants and --loop-invariants.
> Also, what effect does this have on performance? (E.g. for the compiler
> itself, or the benchmarks in the `bench' directory). We don't want to
> enable the optimization by default if it ends up making performance worse.
> It would be a good idea to run some benchmarks to double-check that the
> intended optimization really does improve things.
In asm_fast.gc:
mercury_compile.01 average of 12 with ignore=1 52.88
mercury_compile.02 average of 12 with ignore=1 52.31
mercury_compile.03 average of 12 with ignore=1 51.40
mercury_compile.04 average of 12 with ignore=1 51.96
mercury_compile.05 average of 12 with ignore=1 50.72
mercury_compile.06 average of 12 with ignore=1 50.62
mercury_compile.07 average of 12 with ignore=1 50.37
mercury_compile.08 average of 12 with ignore=1 50.55
mercury_compile.09 average of 12 with ignore=1 56.99
mercury_compile.10 average of 12 with ignore=1 57.46
In hlc.gc:
mercury_compile.11 average of 12 with ignore=1 50.39
mercury_compile.12 average of 12 with ignore=1 50.44
mercury_compile.13 average of 12 with ignore=1 52.62
mercury_compile.14 average of 12 with ignore=1 50.80
mercury_compile.15 average of 12 with ignore=1 50.38
mercury_compile.16 average of 12 with ignore=1 50.36
mercury_compile.17 average of 12 with ignore=1 51.91
mercury_compile.18 average of 12 with ignore=1 50.67
mercury_compile.19 average of 12 with ignore=1 51.61
mercury_compile.20 average of 12 with ignore=1 50.45
So it's not pessimizing. The reason we see no general improvement is, I
suspect, that almost all code written in the compiler has had any loop
invariants already hoisted by hand.
> How much space cost does it have?
> Should it be disabled if --optimize-space is specified?
The space cost is reasonably small (with/without optimization):
linv2.mercury_compile.01 7141956 bytes
linv2.mercury_compile.02 7232452 bytes
linv2.mercury_compile.03 7125124 bytes
linv2.mercury_compile.04 7209316 bytes
linv2.mercury_compile.05 6862404 bytes
linv2.mercury_compile.06 6945092 bytes
linv2.mercury_compile.07 7055428 bytes
linv2.mercury_compile.08 7146084 bytes
linv2.mercury_compile.09 6518852 bytes
linv2.mercury_compile.10 6609444 bytes
linv2.mercury_compile.11 4942144 bytes
linv2.mercury_compile.12 5003520 bytes
linv2.mercury_compile.13 4954368 bytes
linv2.mercury_compile.14 5007904 bytes
linv2.mercury_compile.15 5011712 bytes
linv2.mercury_compile.16 5064960 bytes
linv2.mercury_compile.17 5175552 bytes
linv2.mercury_compile.18 5236992 bytes
linv2.mercury_compile.19 5175552 bytes
linv2.mercury_compile.20 5236992 bytes
I've addressed your review comments:
Estimated hours taken: 900
Branches: main
Added loop-invariant hoisting optimization.
liveness.m:
Improved the debugging output to show HLDS before and after
liveness analysis.
loop_inv.m:
New module containing the implementation of the loop invariant
hoisting optimization.
mercury_compile.m:
Added loop_inv at step 34, moving inlining to step 35.
options.m:
Added bool option --loop-invariants (default `no').
This optimization is set at -O4.
passes_aux.m:
Minor changes to support introduction of the loop
invariant hoisting optmimization.
follow_code.m:
Updated to reflect the new interface to passes_aux.m.
transform_hlds.m:
Added include_module declaration for loop_inv.
Relative diff:
diff -u compiler/liveness.m compiler/liveness.m
--- compiler/liveness.m 16 Oct 2002 01:58:25 -0000
+++ compiler/liveness.m 25 Oct 2002 03:53:54 -0000
@@ -215,20 +215,20 @@
globals__lookup_int_option(Globals, debug_liveness, DebugLiveness),
pred_id_to_int(PredId, PredIdInt),
maybe_write_progress_message("\nbefore liveness",
- DebugLiveness, PredIdInt, Goal0, VarSet, ModuleInfo, IO0, IO0b),
+ DebugLiveness, PredIdInt, Goal0, VarSet, ModuleInfo, IO0, IO1),
initial_liveness(ProcInfo1, PredId, ModuleInfo, Liveness0),
detect_liveness_in_goal(Goal0, Liveness0, LiveInfo,
_, Goal1),
maybe_write_progress_message("\nafter liveness",
- DebugLiveness, PredIdInt, Goal1, VarSet, ModuleInfo, IO0b, IO1),
+ DebugLiveness, PredIdInt, Goal1, VarSet, ModuleInfo, IO1, IO2),
initial_deadness(ProcInfo1, LiveInfo, ModuleInfo, Deadness0),
detect_deadness_in_goal(Goal1, Deadness0, Liveness0, LiveInfo,
_, Goal2),
maybe_write_progress_message("\nafter deadness",
- DebugLiveness, PredIdInt, Goal2, VarSet, ModuleInfo, IO1, IO2),
+ DebugLiveness, PredIdInt, Goal2, VarSet, ModuleInfo, IO2, IO3),
(
globals__get_trace_level(Globals, TraceLevel),
@@ -240,10 +240,10 @@
delay_death_proc_body(Goal2, VarSet, Liveness0, Goal3),
maybe_write_progress_message("\nafter delay death",
DebugLiveness, PredIdInt, Goal3, VarSet, ModuleInfo,
- IO2, IO3)
+ IO3, IO4)
;
Goal3 = Goal2,
- IO3 = IO2
+ IO4 = IO3
),
globals__get_trace_level(Globals, TraceLevel),
@@ -255,7 +255,7 @@
detect_resume_points_in_goal(Goal3, Liveness0, LiveInfo,
ResumeVars0, Goal, _),
maybe_write_progress_message("\nafter resume point",
- DebugLiveness, PredIdInt, Goal, VarSet, ModuleInfo, IO3, IO),
+ DebugLiveness, PredIdInt, Goal, VarSet, ModuleInfo, IO4, IO),
proc_info_set_goal(ProcInfo1, Goal, ProcInfo2),
proc_info_set_liveness_info(ProcInfo2, Liveness0, ProcInfo).
diff -u compiler/loop_inv.m compiler/loop_inv.m
--- compiler/loop_inv.m 16 Oct 2002 04:33:38 -0000
+++ compiler/loop_inv.m 25 Oct 2002 06:33:15 -0000
@@ -2,7 +2,7 @@
% loop_inv.m
% Ralph Becket <rbeck at microsoft.com>
% Thu Nov 8 12:05:29 EST 2001
-% vim: ft=mercury ff=unix ts=4 sw=4 et tw=0 wm=0
+% vim: ft=mercury ts=4 sw=4 et tw=0 wm=0
%
%------------------------------------------------------------------------------%
%
@@ -138,7 +138,7 @@
( if
% We only want to apply this optimization to pure preds (e.g.
- % benchmark_det_loop).
+ % not benchmark_det_loop).
%
hlds_pred__pred_info_get_purity(PredInfo, pure),
@@ -615,8 +615,6 @@
% A constant construction is a construction unification with no
% arguments or which is constructed from a statically initialized
% constant.
- %
- % NOTE: we could do better, but it's not clear it's worth it.
%
:- pred const_construction(hlds_goal).
:- mode const_construction(in) is semidet.
diff -u compiler/mercury_compile.m compiler/mercury_compile.m
--- compiler/mercury_compile.m 16 Oct 2002 01:58:28 -0000
+++ compiler/mercury_compile.m 25 Oct 2002 07:09:36 -0000
@@ -1869,48 +1869,51 @@
Verbose, Stats, HLDS33),
mercury_compile__maybe_dump_hlds(HLDS33, "33", "accum"),
- mercury_compile__maybe_loop_inv(HLDS33, Verbose, Stats, HLDS34),
- mercury_compile__maybe_dump_hlds(HLDS34, "34", "loop_inv"),
+ % Hoisting loop invariants first invokes pass 34, "mark_static".
+ % "mark_static" is also run at stage 60.
+ %
+ mercury_compile__maybe_loop_inv(HLDS33, Verbose, Stats, HLDS35),
+ mercury_compile__maybe_dump_hlds(HLDS36, "35", "loop_inv"),
- mercury_compile__maybe_do_inlining(HLDS34, Verbose, Stats, HLDS35),
- mercury_compile__maybe_dump_hlds(HLDS35, "35", "inlining"),
+ mercury_compile__maybe_do_inlining(HLDS35, Verbose, Stats, HLDS36),
+ mercury_compile__maybe_dump_hlds(HLDS36, "36", "inlining"),
- mercury_compile__maybe_deforestation(HLDS35, Verbose, Stats, HLDS36),
- mercury_compile__maybe_dump_hlds(HLDS36, "36", "deforestation"),
+ mercury_compile__maybe_deforestation(HLDS36, Verbose, Stats, HLDS37),
+ mercury_compile__maybe_dump_hlds(HLDS37, "37", "deforestation"),
- mercury_compile__maybe_delay_construct(HLDS36, Verbose, Stats, HLDS37),
- mercury_compile__maybe_dump_hlds(HLDS37, "37", "delay_construct"),
+ mercury_compile__maybe_delay_construct(HLDS37, Verbose, Stats, HLDS38),
+ mercury_compile__maybe_dump_hlds(HLDS38, "38", "delay_construct"),
- mercury_compile__maybe_unused_args(HLDS37, Verbose, Stats, HLDS39),
- mercury_compile__maybe_dump_hlds(HLDS39, "39", "unused_args"),
+ mercury_compile__maybe_unused_args(HLDS38, Verbose, Stats, HLDS40),
+ mercury_compile__maybe_dump_hlds(HLDS40, "40", "unused_args"),
- mercury_compile__maybe_unneeded_code(HLDS39, Verbose, Stats, HLDS40),
- mercury_compile__maybe_dump_hlds(HLDS40, "40", "unneeded_code"),
+ mercury_compile__maybe_unneeded_code(HLDS40, Verbose, Stats, HLDS41),
+ mercury_compile__maybe_dump_hlds(HLDS41, "41", "unneeded_code"),
- mercury_compile__maybe_lco(HLDS40, Verbose, Stats, HLDS42),
- mercury_compile__maybe_dump_hlds(HLDS42, "42", "lco"),
+ mercury_compile__maybe_lco(HLDS41, Verbose, Stats, HLDS43),
+ mercury_compile__maybe_dump_hlds(HLDS43, "43", "lco"),
% DNF transformations should be after inlining.
- mercury_compile__maybe_transform_dnf(HLDS40, Verbose, Stats, HLDS44),
- mercury_compile__maybe_dump_hlds(HLDS44, "44", "dnf"),
+ mercury_compile__maybe_transform_dnf(HLDS41, Verbose, Stats, HLDS45),
+ mercury_compile__maybe_dump_hlds(HLDS45, "45", "dnf"),
% Magic sets should be the last thing done to Aditi procedures
% before RL code generation, and must come immediately after DNF.
- mercury_compile__maybe_magic(HLDS44, Verbose, Stats, HLDS46),
- mercury_compile__maybe_dump_hlds(HLDS46, "46", "magic"),
+ mercury_compile__maybe_magic(HLDS45, Verbose, Stats, HLDS47),
+ mercury_compile__maybe_dump_hlds(HLDS47, "47", "magic"),
- mercury_compile__maybe_dead_procs(HLDS46, Verbose, Stats, HLDS48),
- mercury_compile__maybe_dump_hlds(HLDS48, "48", "dead_procs"),
+ mercury_compile__maybe_dead_procs(HLDS47, Verbose, Stats, HLDS49),
+ mercury_compile__maybe_dump_hlds(HLDS49, "49", "dead_procs"),
% Deep profiling transformation should be done late in the piece
% since it munges the code a fair amount and introduces strange
% disjunctions that might confuse other hlds->hlds transformations.
- mercury_compile__maybe_deep_profiling(HLDS48, Verbose, Stats, HLDS49,
+ mercury_compile__maybe_deep_profiling(HLDS49, Verbose, Stats, HLDS50,
DeepProfilingStructures),
- mercury_compile__maybe_dump_hlds(HLDS49, "49", "deep_profiling"),
+ mercury_compile__maybe_dump_hlds(HLDS50, "50", "deep_profiling"),
- { HLDS49 = HLDS50 },
- mercury_compile__maybe_dump_hlds(HLDS50, "50", "middle_pass").
+ { HLDS50 = HLDS51 },
+ mercury_compile__maybe_dump_hlds(HLDS51, "51", "middle_pass").
%-----------------------------------------------------------------------------%
@@ -2838,12 +2841,20 @@
mercury_compile__maybe_loop_inv(HLDS0, Verbose, Stats, HLDS) -->
globals__io_lookup_bool_option(loop_invariants, LoopInv),
( { LoopInv = yes } ->
+
+ % We run the mark_static pass because we need
+ % the construct_how flag to be valid.
+ %
+ mercury_compile__maybe_mark_static_terms(HLDS0, Verbose, Stats,
+ HLDS1),
+ mercury_compile__maybe_dump_hlds(HLDS1, "34", "mark_static"),
+
maybe_write_string(Verbose,
"% Hoisting loop invariants...\n"),
maybe_flush_output(Verbose),
process_all_nonimported_procs(
update_module(hoist_loop_invariants),
- HLDS0, HLDS),
+ HLDS1, HLDS),
maybe_write_string(Verbose, "% done.\n"),
maybe_report_stats(Stats)
;
diff -u compiler/options.m compiler/options.m
--- compiler/options.m 16 Oct 2002 01:58:35 -0000
+++ compiler/options.m 25 Oct 2002 07:22:03 -0000
@@ -414,7 +414,6 @@
; optimize_saved_vars_cell_all_path_node_ratio
; optimize_saved_vars_cell_include_all_candidates
; optimize_saved_vars
- ; loop_invariants
; delay_construct
; follow_code
; prev_code
@@ -435,6 +434,7 @@
; optimize_saved_vars_cell_all_path_node_ratio
; optimize_saved_vars_cell_include_all_candidates
; optimize_saved_vars
+ ; loop_invariants
; delay_construct
; follow_code
; prev_code
@@ -923,7 +923,6 @@
optimize_duplicate_calls - bool(no),
constant_propagation - bool(no),
excess_assign - bool(no),
- loop_invariants - bool(no),
optimize_saved_vars_const - bool(no),
optimize_saved_vars_cell - bool(no),
optimize_saved_vars_cell_loop - bool(yes),
@@ -944,6 +943,7 @@
optimize_duplicate_calls - bool(no),
constant_propagation - bool(no),
excess_assign - bool(no),
+ loop_invariants - bool(no),
optimize_saved_vars_const - bool(no),
optimize_saved_vars_cell - bool(no),
optimize_saved_vars_cell_loop - bool(yes),
@@ -1469,7 +1469,6 @@
long_option("optimize-constant-propagation", constant_propagation).
long_option("optimize-saved-vars", optimize_saved_vars).
long_option("optimise-saved-vars", optimize_saved_vars).
-long_option("loop-invariants", loop_invariants).
long_option("optimize-saved-vars-const", optimize_saved_vars_const).
long_option("optimise-saved-vars-const", optimize_saved_vars_const).
long_option("optimize-saved-vars-cell", optimize_saved_vars_cell).
@@ -1489,6 +1488,7 @@
long_option("optimize-constant-propagation", constant_propagation).
long_option("optimize-saved-vars", optimize_saved_vars).
long_option("optimise-saved-vars", optimize_saved_vars).
+long_option("loop-invariants", loop_invariants).
long_option("optimize-saved-vars-const", optimize_saved_vars_const).
long_option("optimise-saved-vars-const", optimize_saved_vars_const).
long_option("optimize-saved-vars-cell", optimize_saved_vars_cell).
@@ -2053,8 +2053,7 @@
use_local_vars - bool(yes),
inline_simple_threshold - int(8),
inline_compound_threshold - int(20),
- higher_order_size_limit - int(30),
- loop_invariants - bool(yes)
+ higher_order_size_limit - int(30)
]).
% Optimization level 5: apply optimizations which may have some
@@ -2073,7 +2072,8 @@
use_local_vars - bool(yes),
inline_simple_threshold - int(8),
inline_compound_threshold - int(20),
- higher_order_size_limit - int(30)
+ higher_order_size_limit - int(30),
+ loop_invariants - bool(yes)
]).
% Optimization level 5: apply optimizations which may have some
@@ -3160,8 +3160,6 @@
"--optimize-duplicate-calls",
"\tOptimize away multiple calls to a predicate",
"\twith the same input arguments.",
- "--loop-invariants",
- "\tHoist loop invariants out of loops.",
"--delay-constructs",
"\tReorder goals to move construction unifications after",
"\tprimitive goals that can fail.",
@@ -3190,6 +3188,8 @@
"--optimize-duplicate-calls",
"\tOptimize away multiple calls to a predicate",
"\twith the same input arguments.",
+ "--loop-invariants",
+ "\tHoist loop invariants out of loops.",
"--delay-constructs",
"\tReorder goals to move construction unifications after",
"\tprimitive goals that can fail.",
only in patch2:
--- doc/user_guide.texi 22 Oct 2002 04:36:03 -0000 1.332
+++ doc/user_guide.texi 25 Oct 2002 07:22:05 -0000
@@ -5603,6 +5603,12 @@
containing large numbers of variables can cause
slow compilation.
+ at item --loop-invariants
+ at findex --loop-invariants
+Optimize loop invariants by moving computations within a loop that are
+the same on every iteration to the outside so they are only calculated
+once. (This is a separate optimization to @samp{--optimize-rl-invariants}.)
+
@sp 1
@item --no-common-struct
@findex --no-common-struct
only in patch2:
--- compiler/notes/compiler_design.html 15 Apr 2002 05:04:15 -0000 1.78
+++ compiler/notes/compiler_design.html 25 Oct 2002 03:51:15 -0000
@@ -639,10 +639,17 @@
<li>
pd_util.m contains various utility predicates.
</ul>
+
+<li> loop_inv.m: loop invariant hoisting. This transformation moves
+ computations within loops that are the same on every iteration to the outside
+ of the loop so that the invariant computations are only computed once. The
+ transformation turns a single looping predicate containing invariant
+ computations into two: one that computes the invariants on the first
+ iteration and then loops by calling the second predicate with extra arguments
+ for the invariant values.
<li> issue warnings about unused arguments from predicates, and create
- specialized versions without them (unused_args.m); type_infos are
- often unused.
+specialized versions without them (unused_args.m); type_infos are often unused.
<li> delay_construct.m pushes construction unifications to the right in
semidet conjunctions, in an effort to reduce the probability that it will
only in patch2:
--- NEWS 22 Oct 2002 13:10:33 -0000 1.274
+++ NEWS 25 Oct 2002 07:21:56 -0000
@@ -18,6 +18,7 @@
* A new `--smart-recompilation' option, for fine-grained dependency tracking.
* A new optional warning: `--warn-non-tail-recursion'.
* A new optimization: `--constraint-propagation'.
+* A new optimization: `--loop-invariants'.
* Support for arbitrary mappings from module name to source file name.
Major improvements to the Mercury debugger, including:
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list