[m-dev.] for review: balance in 234-trees

Zoltan Somogyi zs at cs.mu.OZ.AU
Fri Apr 5 16:23:53 AEST 2002

For anyone to review.


Add facilities for measuring and improving the balance of 234-trees.

	Add two new functions. One computes information about the balance of
	a tree, while the other creates a balanced version of a possibly
	unbalanced tree.

	Both functions are exported, but neither is documented, because they
	shouldn't be a permanent part of the module. The module should
	guarantee that all 234-trees are balanced at all times, but at
	the moment it does not meet that guarantee.

	Add a long comment about the circumstances under which we fail the

	Add two new functions analogous to the two new functions in tree234.m,
	also exported and undocumented, for the same reason.

	Add code to print out 234-tree balance information. Since this code
	is not generally useful, its operative part is commented out.

cvs diff: Diffing .
cvs diff: Diffing bindist
cvs diff: Diffing boehm_gc
cvs diff: Diffing boehm_gc/Mac_files
cvs diff: Diffing boehm_gc/cord
cvs diff: Diffing boehm_gc/cord/private
cvs diff: Diffing boehm_gc/doc
cvs diff: Diffing boehm_gc/include
cvs diff: Diffing boehm_gc/include/private
cvs diff: Diffing boehm_gc/tests
cvs diff: Diffing browser
cvs diff: Diffing bytecode
cvs diff: Diffing compiler
Index: compiler/hlds_out.m
RCS file: /home/mercury1/repository/mercury/compiler/hlds_out.m,v
retrieving revision 1.282
diff -u -b -r1.282 hlds_out.m
--- compiler/hlds_out.m	28 Mar 2002 03:43:01 -0000	1.282
+++ compiler/hlds_out.m	5 Apr 2002 05:22:45 -0000
@@ -2644,6 +2644,15 @@
 	{ map__keys(VarTypes, Vars) },
 	hlds_out__write_var_types_2(Vars, Indent, VarSet, AppendVarnums,
 		VarTypes, TVarSet).
+	% Uncomment the following if you want information about
+	% balance in the 234-trees holding variable types.
+	%
+	% hlds_out__write_indent(Indent),
+	% io__write_string("% variable types map balance info:"),
+	% { map__to_assoc_list(map__compute_balance_info(VarTypes),
+	%	DepthList) },
+	% hlds_out__write_depth_list(DepthList),
+	% io__write_string("\n").
 :- pred hlds_out__write_var_types_2(list(prog_var), int, prog_varset, bool,
 	vartypes, tvarset, io__state, io__state).
@@ -2665,6 +2674,17 @@
 	hlds_out__write_var_types_2(Vars, Indent, VarSet, AppendVarnums,
 		VarTypes, TypeVarSet).
+:- pred hlds_out__write_depth_list(assoc_list(int, int)::in,
+	io__state::di, io__state::uo) is det.
+hlds_out__write_depth_list([]) --> [].
+hlds_out__write_depth_list([Depth - Count | DepthList]) -->
+	io__write_string(" "),
+	io__write_int(Depth),
+	io__write_string(" -> "),
+	io__write_int(Count),
+	hlds_out__write_depth_list(DepthList).
 :- pred hlds_out__write_typeinfo_varmap(int, bool, map(tvar, type_info_locn),
 		prog_varset, tvarset, io__state, io__state).
cvs diff: Diffing compiler/notes
cvs diff: Diffing debian
cvs diff: Diffing deep_profiler
cvs diff: Diffing deep_profiler/notes
cvs diff: Diffing doc
cvs diff: Diffing extras
cvs diff: Diffing extras/aditi
cvs diff: Diffing extras/cgi
cvs diff: Diffing extras/complex_numbers
cvs diff: Diffing extras/complex_numbers/samples
cvs diff: Diffing extras/complex_numbers/tests
cvs diff: Diffing extras/concurrency
cvs diff: Diffing extras/curs
cvs diff: Diffing extras/curs/samples
cvs diff: Diffing extras/curses
cvs diff: Diffing extras/curses/sample
cvs diff: Diffing extras/dynamic_linking
cvs diff: Diffing extras/graphics
cvs diff: Diffing extras/graphics/mercury_opengl
cvs diff: Diffing extras/graphics/mercury_tcltk
cvs diff: Diffing extras/graphics/samples
cvs diff: Diffing extras/graphics/samples/calc
cvs diff: Diffing extras/graphics/samples/maze
cvs diff: Diffing extras/graphics/samples/pent
cvs diff: Diffing extras/lazy_evaluation
cvs diff: Diffing extras/lex
cvs diff: Diffing extras/lex/samples
cvs diff: Diffing extras/logged_output
cvs diff: Diffing extras/moose
cvs diff: Diffing extras/moose/samples
cvs diff: Diffing extras/morphine
cvs diff: Diffing extras/morphine/non-regression-tests
cvs diff: Diffing extras/morphine/scripts
cvs diff: Diffing extras/morphine/source
cvs diff: Diffing extras/odbc
cvs diff: Diffing extras/posix
cvs diff: Diffing extras/quickcheck
cvs diff: Diffing extras/quickcheck/tutes
cvs diff: Diffing extras/references
cvs diff: Diffing extras/references/samples
cvs diff: Diffing extras/references/tests
cvs diff: Diffing extras/stream
cvs diff: Diffing extras/trailed_update
cvs diff: Diffing extras/trailed_update/samples
cvs diff: Diffing extras/trailed_update/tests
cvs diff: Diffing extras/xml
cvs diff: Diffing extras/xml/samples
cvs diff: Diffing java
cvs diff: Diffing java/library
cvs diff: Diffing java/runtime
cvs diff: Diffing library
Index: library/map.m
RCS file: /home/mercury1/repository/mercury/library/map.m,v
retrieving revision 1.80
diff -u -b -r1.80 map.m
--- library/map.m	24 May 2001 02:32:28 -0000	1.80
+++ library/map.m	5 Apr 2002 05:27:55 -0000
@@ -399,6 +399,17 @@
 :- pragma type_spec(map__select/2, K = var(_)).
 :- pragma type_spec(map__select/3, K = var(_)).
+	% Given a map, returns a tree whose keys are integers representing
+        % depths and in which the value associated with a key is the number of
+        % leaves in the input tree at that depth. For a perfectly balanced
+        % tree, the returned map should have just one entry, meaning that all
+        % leaves are at the same depth. However, given the current imperfection
+        % of tree234__insert, this is not guaranteed.
+:- func map__compute_balance_info(map(K, V)) = map(int, int).
+	% Given a map, return a balanced version of it.
+:- func map__balance(map(K, V)) = map(K, V).
 :- implementation.
 :- import_module std_util, require, string.
@@ -778,6 +789,10 @@
 		error("map__det_union: map__union failed")
+map__compute_balance_info(Tree) = tree234__compute_balance_info(Tree).
+map__balance(Tree) = tree234__balance(Tree).
Index: library/tree234.m
RCS file: /home/mercury1/repository/mercury/library/tree234.m,v
retrieving revision 1.35
diff -u -b -r1.35 tree234.m
--- library/tree234.m	28 Mar 2002 03:44:40 -0000	1.35
+++ library/tree234.m	5 Apr 2002 05:27:58 -0000
@@ -208,6 +208,17 @@
 :- mode uo_tree234(K, V) :: free -> uniq_tree234(K, V).
 :- mode uo_tree234       :: free -> uniq_tree234(ground, ground).
+	% Given a tree, returns a tree whose keys are integers representing
+	% depths and in which the value associated with a key is the number of
+	% leaves in the input tree at that depth. For a perfectly balanced
+	% tree, the returned map should have just one entry, meaning that all
+	% leaves are at the same depth. However, given the current imperfection
+	% of tree234__insert, this is not guaranteed.
+:- func tree234__compute_balance_info(tree234(K, V)) = tree234(int, int).
+	% Given a tree, return a balanced version of it.
+:- func tree234__balance(tree234(K, V)) = tree234(K, V).
 :- implementation.
@@ -781,15 +792,52 @@
-% tree234__insert is implemented using the simple top-down
-% approach described in eg Sedgwick which splits 4 nodes into
-% two 2 nodes on the downward traversal of the tree as we
-% search for the right place to insert the new key-value pair.
-% We know we have the right place if the subtrees of the node
-% are empty (in which case we expand the node - which will always
-% work because we already split 4 nodes into 2 nodes), or if the
-% tree itself is empty.
-% This algorithm is O(lgN).
+% tree234__insert is implemented using the top-down approach described in
+% e.g. Sedgwick. We split 4-nodes into three 2-nodes on the downward traversal
+% of the tree, but we do not simply replace the 4-node with the three 2-nodes,
+% since that would destroy the balance of the tree. Instead, we lift the
+% key-value pair of the top two-node into its parent node. If the parent node
+% was a 2-node, this effectively transforms the original tree, which will be
+% of the form
+% two(K1, V1,
+%	T0,
+%	four(K10, V10, K11, V11, K12, V12, T10, T11, T12, T13)
+% )
+% into the tree
+% three(K1, V1, K11, V11,
+%	T0,
+%	two(K10, V10, T10, T11),
+%	two(K12, V12, T12, T13)
+% )
+% before finding the right subtree to perform the insertion on. We do a similar
+% transformation if the parent node is a 3-node. In both cases, we preserve the
+% balance of the tree, in the sense that if the tree was balanced before the
+% insertion, it will be balanced after it.
+% If the child we want to insert into and its parent node are both 4-nodes,
+% then insertion will not preserve balance. However, this can happen only after
+% the following chain of events:
+% - We insert something into a 3-node with non-empty subtrees, causing its
+%   transformation to a 4-node. This also makes sure that the subtrees
+%   involved in the insertion are not 4-nodes.
+% - We later insert other items into the subtrees involved in the earlier
+%   insertion, causing them to become 4-nodes.
+% If we insert items into a tree in a monotonic order, then once we cause
+% the growth of a node from a 2-node to a 3-node or from a 3-node to a 4-node,
+% we never insert anything into the non-right-mode subtrees of that node.
+% This ensures that step 2 of the above scenario never happens. This in turn
+% ensures that a search of the 234-tree will never find two 4-nodes in a row,
+% which means that all insertions will preserve the balance of the tree.
+% We cannot guarantee the preservation of balance for other insertion patterns,
+% and indeed most random insertion patterns yield unbalanced trees.
 tree234__insert(Tin, K, V, Tout) :-
@@ -2497,6 +2545,51 @@
 	tree234__count(T2, N2),
 	tree234__count(T3, N3),
 	N is 3 + N0 + N1 + N2 + N3.
+tree234__compute_balance_info(Tree) =
+	tree234__compute_balance_info_2(Tree, 0, empty).
+:- func tree234__compute_balance_info_2(tree234(K, V), int, tree234(int, int))
+	= tree234(int, int).
+tree234__compute_balance_info_2(empty, Depth0, DepthMap0) =
+	tree234__compute_balance_info_record_depth(Depth0, DepthMap0).
+tree234__compute_balance_info_2(two(_, _, T0, T1), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap  = tree234__compute_balance_info_2(T1, Depth1, DepthMap1).
+tree234__compute_balance_info_2(three(_, _, _, _, T0, T1, T2), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap2 = tree234__compute_balance_info_2(T1, Depth1, DepthMap1),
+	DepthMap  = tree234__compute_balance_info_2(T2, Depth1, DepthMap2).
+tree234__compute_balance_info_2(four(_, _, _, _, _, _, T0, T1, T2, T3), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap2 = tree234__compute_balance_info_2(T1, Depth1, DepthMap1),
+	DepthMap3 = tree234__compute_balance_info_2(T2, Depth1, DepthMap2),
+	DepthMap  = tree234__compute_balance_info_2(T3, Depth1, DepthMap3).
+:- func tree234__compute_balance_info_record_depth(int, tree234(int, int))
+	= tree234(int, int).
+tree234__compute_balance_info_record_depth(Depth, DepthMap0) = DepthMap :-
+	( tree234__search(DepthMap0, Depth, Count0) ->
+		tree234__set(DepthMap0, Depth, Count0 + 1, DepthMap)
+	;
+		tree234__set(DepthMap0, Depth, 1, DepthMap)
+	).
+	% By inserting the elements of Tree0 into a new tree in order,
+	% we get tree234__insert to preserve balance.
+tree234__balance(Tree0) = Tree :-
+	tree234__tree234_to_assoc_list(Tree0, List),
+	tree234__assoc_list_to_tree234(List, Tree).
cvs diff: Diffing profiler
cvs diff: Diffing robdd
cvs diff: Diffing runtime
cvs diff: Diffing runtime/GETOPT
cvs diff: Diffing runtime/machdeps
cvs diff: Diffing samples
cvs diff: Diffing samples/c_interface
cvs diff: Diffing samples/c_interface/c_calls_mercury
cvs diff: Diffing samples/c_interface/cplusplus_calls_mercury
cvs diff: Diffing samples/c_interface/mercury_calls_c
cvs diff: Diffing samples/c_interface/mercury_calls_cplusplus
cvs diff: Diffing samples/c_interface/mercury_calls_fortran
cvs diff: Diffing samples/c_interface/simpler_c_calls_mercury
cvs diff: Diffing samples/c_interface/simpler_cplusplus_calls_mercury
cvs diff: Diffing samples/diff
cvs diff: Diffing samples/muz
cvs diff: Diffing samples/rot13
cvs diff: Diffing samples/solutions
cvs diff: Diffing samples/tests
cvs diff: Diffing samples/tests/c_interface
cvs diff: Diffing samples/tests/c_interface/c_calls_mercury
cvs diff: Diffing samples/tests/c_interface/cplusplus_calls_mercury
cvs diff: Diffing samples/tests/c_interface/mercury_calls_c
cvs diff: Diffing samples/tests/c_interface/mercury_calls_cplusplus
cvs diff: Diffing samples/tests/c_interface/mercury_calls_fortran
cvs diff: Diffing samples/tests/c_interface/simpler_c_calls_mercury
cvs diff: Diffing samples/tests/c_interface/simpler_cplusplus_calls_mercury
cvs diff: Diffing samples/tests/diff
cvs diff: Diffing samples/tests/muz
cvs diff: Diffing samples/tests/rot13
cvs diff: Diffing samples/tests/solutions
cvs diff: Diffing samples/tests/toplevel
cvs diff: Diffing scripts
cvs diff: Diffing tests
cvs diff: Diffing tests/benchmarks
cvs diff: Diffing tests/debugger
cvs diff: Diffing tests/debugger/declarative
cvs diff: Diffing tests/dppd
cvs diff: Diffing tests/general
cvs diff: Diffing tests/general/accumulator
cvs diff: Diffing tests/general/structure_reuse
cvs diff: Diffing tests/hard_coded
cvs diff: Diffing tests/hard_coded/exceptions
cvs diff: Diffing tests/hard_coded/purity
cvs diff: Diffing tests/hard_coded/sub-modules
cvs diff: Diffing tests/hard_coded/typeclasses
cvs diff: Diffing tests/invalid
cvs diff: Diffing tests/invalid/purity
cvs diff: Diffing tests/misc_tests
cvs diff: Diffing tests/recompilation
cvs diff: Diffing tests/tabling
cvs diff: Diffing tests/term
cvs diff: Diffing tests/valid
cvs diff: Diffing tests/warnings
cvs diff: Diffing tools
cvs diff: Diffing trace
cvs diff: Diffing util
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au

More information about the developers mailing list