[m-dev.] for review: rewrite of duplicate elimination.
Zoltan Somogyi
zs at cs.mu.oz.au
Mon Dec 22 20:48:39 AEDT 1997
Tyson writes:
> > runtime/mercury_tags.h:
> > Add two new macros, mask_field and const_mask_field, that behave
> > just like field and const_field except that instead of stripping
> > off a known tag from the pointer, they strip (mask) off an unknown
> > tag.
>
> s/unknown/possibly unknown/
No, in this case it definitely should be unknown. See below.
> Is there any need for the previous mechanism to remain?
> Once you are going to do a field, it doesn't matter what the
> tag is (unless the masking operation is slower than the subtraction).
> If the old mechanism is only kept for performance reasons, this should
> be documented.
Yes, masking is slower than the subtraction, because it requires a
separate instruction. The subtraction is folded into the load or store
instruction, by modifying the offset. (Although there is one architecture,
IBM Power, in which all word load and store instructions mask off the
bottom two bits automatically. I don't know if this was changed for the
PowerPC.)
Therefore you don't want to use mask_field if field will do.
> Is it worth preserving the original comments as well?
It could be, but probably not. I will modify the coe to preserve the comments
if I find I need them, but I don't expect I will need to.
> > :- pred most_specific_instr(instr::in, instr::in, instr::out) is semidet.
>
> Like the other predicate Fergus mentioned, we need a good definition
> of "most_specific_instr" so people adding llds instructions can attempt
> to get it right without asking you to write the code for them ;-).
Given that all these predicates return the original (lval,rval,instr)
in all circumstances except a very small number, I expect noone will have
any trouble whatsoever figuring out what the correct change is.
The comment I added after Fergus's review mentions this and explicitly
lists the exceptions.
> It would also be good to have a small test case that triggers this
> optimization for the test suite.
There is no need. Several modules of the compiler and the library
are affected, and dupelim is turned on by the default opt level.
If something goes wrong, the bootcheck will fail. (This is true of
all LLDS-to-LLDS optimizations and most other optimizations, since I get
most ideas for such optimizations from examining the code of the compiler.)
A diff to address your other points (and to fix a silly typo, 5 -> 6) follows.
Zoltan.
Index: dupelim.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/dupelim.m,v
retrieving revision 1.27
diff -u -u -r1.27 dupelim.m
--- dupelim.m 1997/12/22 06:58:06 1.27
+++ dupelim.m 1997/12/22 09:42:51
@@ -26,7 +26,7 @@
% and substitute this code for the code of the copy of the block
% that step 4 has decided to keep.
%
-% 5. Convert the (possibly reduced) list of basic blocks back to a
+% 6. Convert the (possibly reduced) list of basic blocks back to a
% list of instructions and substitute all references to the labels
% starting eliminated blocks to refer to their noneliminated version.
%
@@ -60,8 +60,8 @@
% cluster(Exemplar, OtherLabels) means that references to labels
% in OtherLabels can be replaced with references to Exemplar
- % once its block has been replaced with the most specific antiunified
- % version of the blocks started by Exemplar and OtherLabels.
+ % once its block has been replaced with the most specific
+ % generalization of the blocks started by Exemplar and OtherLabels.
% OtherLabels must be nonempty.
:- type cluster ---> cluster(label, list(label)).
@@ -89,6 +89,10 @@
%-----------------------------------------------------------------------------%
+% dupelim__build_maps builds up a map mapping standardized instruction
+% sequences to the label(s) that start basic blocks with that standardized
+% form, and a set showing which labels are fallen into.
+
:- pred dupelim__build_maps(list(label)::in, block_map::in,
std_map::in, std_map::out, set(label)::in, set(label)::out) is det.
@@ -111,6 +115,11 @@
dupelim__build_maps(Labels, BlockMap, StdMap1, StdMap,
FallInto1, FallInto).
+% For each set of labels that start basic blocks with identical standard forms,
+% find_clusters finds out whether we can eliminate some of those blocks;
+% if yes, it decides which blocks can be eliminated and which other block
+% should stand in their place.
+
% If two or more blocks have the same standardized form, it may be possible
% to eliminate all but one of the blocks. However, blocks that can be fallen
% into cannot be eliminated. (Actually, they could, but only by inserting
@@ -146,6 +155,12 @@
%-----------------------------------------------------------------------------%
+% For each cluster, a set of blocks in which all but one are to be eliminated
+% favor of the remaining one, find their most specific common generalization
+% (which must exist), and substitute this code for the code of the copy of
+% the block that is to be kept. Remove the eliminated labels from the
+% label sequence and map them to their replacements.
+
:- pred process_clusters(list(cluster)::in, list(label)::in, list(label)::out,
block_map::in, block_map::out,
map(label, label)::in, map(label, label)::out) is det.
@@ -167,6 +182,16 @@
map__det_update(BlockMap0, Exemplar, ExemplarInfo, BlockMap1),
process_clusters(Clusters, LabelSeq1, LabelSeq, BlockMap1, BlockMap,
ReplMap1, ReplMap).
+
+% Given the current form of a basic block (instructions and fallthrough),
+% compute its most specific generalization with the basic blocks headed
+% by the given labels, whose basic blocks are to be eliminated.
+%
+% On the same traversal of the list of to-be-eliminated labels, remove each
+% such label from the sequence of labels whose basic blocks will make up
+% the final code of the procedure, and add the mapping of the eliminated
+% label to the replacement (exemplar) label to the set of substitutions
+% that will need to be done.
:- pred process_elim_labels(list(label)::in, list(instruction)::in,
maybe(label)::in, list(label)::in, list(label)::out, block_map::in,
Index: llds.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/llds.m,v
retrieving revision 1.216
diff -u -u -r1.216 llds.m
--- llds.m 1997/12/22 06:58:20 1.216
+++ llds.m 1997/12/22 09:23:05
@@ -377,6 +377,9 @@
% is FieldNum words. If Tag is yes, the
% arg gives the value of the tag; if it is
% no, the tag bits will have to be masked off.
+ % The value of the tag should be given if
+ % it is known, since this will lead to
+ % faster code.
/* values somewhere in memory */
More information about the developers
mailing list