[m-dev.] for review: balance in 234-trees

Zoltan Somogyi zs at cs.mu.OZ.AU
Fri Apr 5 16:23:53 AEST 2002


For anyone to review.

Zoltan.

Add facilities for measuring and improving the balance of 234-trees.

library/tree234.m:
	Add two new functions. One computes information about the balance of
	a tree, while the other creates a balanced version of a possibly
	unbalanced tree.

	Both functions are exported, but neither is documented, because they
	shouldn't be a permanent part of the module. The module should
	guarantee that all 234-trees are balanced at all times, but at
	the moment it does not meet that guarantee.

	Add a long comment about the circumstances under which we fail the
	guarantee.

library/map.m:
	Add two new functions analogous to the two new functions in tree234.m,
	also exported and undocumented, for the same reason.

compiler/hlds_out.m:
	Add code to print out 234-tree balance information. Since this code
	is not generally useful, its operative part is commented out.

cvs diff: Diffing .
cvs diff: Diffing bindist
cvs diff: Diffing boehm_gc
cvs diff: Diffing boehm_gc/Mac_files
cvs diff: Diffing boehm_gc/cord
cvs diff: Diffing boehm_gc/cord/private
cvs diff: Diffing boehm_gc/doc
cvs diff: Diffing boehm_gc/include
cvs diff: Diffing boehm_gc/include/private
cvs diff: Diffing boehm_gc/tests
cvs diff: Diffing browser
cvs diff: Diffing bytecode
cvs diff: Diffing compiler
Index: compiler/hlds_out.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/hlds_out.m,v
retrieving revision 1.282
diff -u -b -r1.282 hlds_out.m
--- compiler/hlds_out.m	28 Mar 2002 03:43:01 -0000	1.282
+++ compiler/hlds_out.m	5 Apr 2002 05:22:45 -0000
@@ -2644,6 +2644,15 @@
 	{ map__keys(VarTypes, Vars) },
 	hlds_out__write_var_types_2(Vars, Indent, VarSet, AppendVarnums,
 		VarTypes, TVarSet).
+	% Uncomment the following if you want information about
+	% balance in the 234-trees holding variable types.
+	%
+	% hlds_out__write_indent(Indent),
+	% io__write_string("% variable types map balance info:"),
+	% { map__to_assoc_list(map__compute_balance_info(VarTypes),
+	%	DepthList) },
+	% hlds_out__write_depth_list(DepthList),
+	% io__write_string("\n").
 
 :- pred hlds_out__write_var_types_2(list(prog_var), int, prog_varset, bool,
 	vartypes, tvarset, io__state, io__state).
@@ -2665,6 +2674,17 @@
 	io__write_string("\n"),
 	hlds_out__write_var_types_2(Vars, Indent, VarSet, AppendVarnums,
 		VarTypes, TypeVarSet).
+
+:- pred hlds_out__write_depth_list(assoc_list(int, int)::in,
+	io__state::di, io__state::uo) is det.
+
+hlds_out__write_depth_list([]) --> [].
+hlds_out__write_depth_list([Depth - Count | DepthList]) -->
+	io__write_string(" "),
+	io__write_int(Depth),
+	io__write_string(" -> "),
+	io__write_int(Count),
+	hlds_out__write_depth_list(DepthList).
 
 :- pred hlds_out__write_typeinfo_varmap(int, bool, map(tvar, type_info_locn),
 		prog_varset, tvarset, io__state, io__state).
cvs diff: Diffing compiler/notes
cvs diff: Diffing debian
cvs diff: Diffing deep_profiler
cvs diff: Diffing deep_profiler/notes
cvs diff: Diffing doc
cvs diff: Diffing extras
cvs diff: Diffing extras/aditi
cvs diff: Diffing extras/cgi
cvs diff: Diffing extras/complex_numbers
cvs diff: Diffing extras/complex_numbers/samples
cvs diff: Diffing extras/complex_numbers/tests
cvs diff: Diffing extras/concurrency
cvs diff: Diffing extras/curs
cvs diff: Diffing extras/curs/samples
cvs diff: Diffing extras/curses
cvs diff: Diffing extras/curses/sample
cvs diff: Diffing extras/dynamic_linking
cvs diff: Diffing extras/graphics
cvs diff: Diffing extras/graphics/mercury_opengl
cvs diff: Diffing extras/graphics/mercury_tcltk
cvs diff: Diffing extras/graphics/samples
cvs diff: Diffing extras/graphics/samples/calc
cvs diff: Diffing extras/graphics/samples/maze
cvs diff: Diffing extras/graphics/samples/pent
cvs diff: Diffing extras/lazy_evaluation
cvs diff: Diffing extras/lex
cvs diff: Diffing extras/lex/samples
cvs diff: Diffing extras/logged_output
cvs diff: Diffing extras/moose
cvs diff: Diffing extras/moose/samples
cvs diff: Diffing extras/morphine
cvs diff: Diffing extras/morphine/non-regression-tests
cvs diff: Diffing extras/morphine/scripts
cvs diff: Diffing extras/morphine/source
cvs diff: Diffing extras/odbc
cvs diff: Diffing extras/posix
cvs diff: Diffing extras/quickcheck
cvs diff: Diffing extras/quickcheck/tutes
cvs diff: Diffing extras/references
cvs diff: Diffing extras/references/samples
cvs diff: Diffing extras/references/tests
cvs diff: Diffing extras/stream
cvs diff: Diffing extras/trailed_update
cvs diff: Diffing extras/trailed_update/samples
cvs diff: Diffing extras/trailed_update/tests
cvs diff: Diffing extras/xml
cvs diff: Diffing extras/xml/samples
cvs diff: Diffing java
cvs diff: Diffing java/library
cvs diff: Diffing java/runtime
cvs diff: Diffing library
Index: library/map.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/map.m,v
retrieving revision 1.80
diff -u -b -r1.80 map.m
--- library/map.m	24 May 2001 02:32:28 -0000	1.80
+++ library/map.m	5 Apr 2002 05:27:55 -0000
@@ -399,6 +399,17 @@
 :- pragma type_spec(map__select/2, K = var(_)).
 :- pragma type_spec(map__select/3, K = var(_)).
 
+	% Given a map, returns a tree whose keys are integers representing
+        % depths and in which the value associated with a key is the number of
+        % leaves in the input tree at that depth. For a perfectly balanced
+        % tree, the returned map should have just one entry, meaning that all
+        % leaves are at the same depth. However, given the current imperfection
+        % of tree234__insert, this is not guaranteed.
+:- func map__compute_balance_info(map(K, V)) = map(int, int).
+
+	% Given a map, return a balanced version of it.
+:- func map__balance(map(K, V)) = map(K, V).
+
 :- implementation.
 :- import_module std_util, require, string.
 
@@ -778,6 +789,10 @@
 	;
 		error("map__det_union: map__union failed")
 	).
+
+map__compute_balance_info(Tree) = tree234__compute_balance_info(Tree).
+
+map__balance(Tree) = tree234__balance(Tree).
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
Index: library/tree234.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/tree234.m,v
retrieving revision 1.35
diff -u -b -r1.35 tree234.m
--- library/tree234.m	28 Mar 2002 03:44:40 -0000	1.35
+++ library/tree234.m	5 Apr 2002 05:27:58 -0000
@@ -208,6 +208,17 @@
 :- mode uo_tree234(K, V) :: free -> uniq_tree234(K, V).
 :- mode uo_tree234       :: free -> uniq_tree234(ground, ground).
 
+	% Given a tree, returns a tree whose keys are integers representing
+	% depths and in which the value associated with a key is the number of
+	% leaves in the input tree at that depth. For a perfectly balanced
+	% tree, the returned map should have just one entry, meaning that all
+	% leaves are at the same depth. However, given the current imperfection
+	% of tree234__insert, this is not guaranteed.
+:- func tree234__compute_balance_info(tree234(K, V)) = tree234(int, int).
+
+	% Given a tree, return a balanced version of it.
+:- func tree234__balance(tree234(K, V)) = tree234(K, V).
+
 :- implementation.
 
 %------------------------------------------------------------------------------%
@@ -781,15 +792,52 @@
 
 %------------------------------------------------------------------------------%
 
-% tree234__insert is implemented using the simple top-down
-% approach described in eg Sedgwick which splits 4 nodes into
-% two 2 nodes on the downward traversal of the tree as we
-% search for the right place to insert the new key-value pair.
-% We know we have the right place if the subtrees of the node
-% are empty (in which case we expand the node - which will always
-% work because we already split 4 nodes into 2 nodes), or if the
-% tree itself is empty.
-% This algorithm is O(lgN).
+% tree234__insert is implemented using the top-down approach described in
+% e.g. Sedgwick. We split 4-nodes into three 2-nodes on the downward traversal
+% of the tree, but we do not simply replace the 4-node with the three 2-nodes,
+% since that would destroy the balance of the tree. Instead, we lift the
+% key-value pair of the top two-node into its parent node. If the parent node
+% was a 2-node, this effectively transforms the original tree, which will be
+% of the form
+%
+% two(K1, V1,
+%	T0,
+%	four(K10, V10, K11, V11, K12, V12, T10, T11, T12, T13)
+% )
+%
+% into the tree
+%
+% three(K1, V1, K11, V11,
+%	T0,
+%	two(K10, V10, T10, T11),
+%	two(K12, V12, T12, T13)
+% )
+%
+% before finding the right subtree to perform the insertion on. We do a similar
+% transformation if the parent node is a 3-node. In both cases, we preserve the
+% balance of the tree, in the sense that if the tree was balanced before the
+% insertion, it will be balanced after it.
+%
+% If the child we want to insert into and its parent node are both 4-nodes,
+% then insertion will not preserve balance. However, this can happen only after
+% the following chain of events:
+%
+% - We insert something into a 3-node with non-empty subtrees, causing its
+%   transformation to a 4-node. This also makes sure that the subtrees
+%   involved in the insertion are not 4-nodes.
+%
+% - We later insert other items into the subtrees involved in the earlier
+%   insertion, causing them to become 4-nodes.
+%
+% If we insert items into a tree in a monotonic order, then once we cause
+% the growth of a node from a 2-node to a 3-node or from a 3-node to a 4-node,
+% we never insert anything into the non-right-mode subtrees of that node.
+% This ensures that step 2 of the above scenario never happens. This in turn
+% ensures that a search of the 234-tree will never find two 4-nodes in a row,
+% which means that all insertions will preserve the balance of the tree.
+%
+% We cannot guarantee the preservation of balance for other insertion patterns,
+% and indeed most random insertion patterns yield unbalanced trees.
 
 tree234__insert(Tin, K, V, Tout) :-
 	(
@@ -2497,6 +2545,51 @@
 	tree234__count(T2, N2),
 	tree234__count(T3, N3),
 	N is 3 + N0 + N1 + N2 + N3.
+
+%-----------------------------------------------------------------------------%
+
+tree234__compute_balance_info(Tree) =
+	tree234__compute_balance_info_2(Tree, 0, empty).
+
+:- func tree234__compute_balance_info_2(tree234(K, V), int, tree234(int, int))
+	= tree234(int, int).
+
+tree234__compute_balance_info_2(empty, Depth0, DepthMap0) =
+	tree234__compute_balance_info_record_depth(Depth0, DepthMap0).
+tree234__compute_balance_info_2(two(_, _, T0, T1), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap  = tree234__compute_balance_info_2(T1, Depth1, DepthMap1).
+tree234__compute_balance_info_2(three(_, _, _, _, T0, T1, T2), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap2 = tree234__compute_balance_info_2(T1, Depth1, DepthMap1),
+	DepthMap  = tree234__compute_balance_info_2(T2, Depth1, DepthMap2).
+tree234__compute_balance_info_2(four(_, _, _, _, _, _, T0, T1, T2, T3), Depth0,
+		DepthMap0) = DepthMap :-
+	Depth1 = Depth0 + 1,
+	DepthMap1 = tree234__compute_balance_info_2(T0, Depth1, DepthMap0),
+	DepthMap2 = tree234__compute_balance_info_2(T1, Depth1, DepthMap1),
+	DepthMap3 = tree234__compute_balance_info_2(T2, Depth1, DepthMap2),
+	DepthMap  = tree234__compute_balance_info_2(T3, Depth1, DepthMap3).
+
+:- func tree234__compute_balance_info_record_depth(int, tree234(int, int))
+	= tree234(int, int).
+
+tree234__compute_balance_info_record_depth(Depth, DepthMap0) = DepthMap :-
+	( tree234__search(DepthMap0, Depth, Count0) ->
+		tree234__set(DepthMap0, Depth, Count0 + 1, DepthMap)
+	;
+		tree234__set(DepthMap0, Depth, 1, DepthMap)
+	).
+
+	% By inserting the elements of Tree0 into a new tree in order,
+	% we get tree234__insert to preserve balance.
+tree234__balance(Tree0) = Tree :-
+	tree234__tree234_to_assoc_list(Tree0, List),
+	tree234__assoc_list_to_tree234(List, Tree).
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
cvs diff: Diffing profiler
cvs diff: Diffing robdd
cvs diff: Diffing runtime
cvs diff: Diffing runtime/GETOPT
cvs diff: Diffing runtime/machdeps
cvs diff: Diffing samples
cvs diff: Diffing samples/c_interface
cvs diff: Diffing samples/c_interface/c_calls_mercury
cvs diff: Diffing samples/c_interface/cplusplus_calls_mercury
cvs diff: Diffing samples/c_interface/mercury_calls_c
cvs diff: Diffing samples/c_interface/mercury_calls_cplusplus
cvs diff: Diffing samples/c_interface/mercury_calls_fortran
cvs diff: Diffing samples/c_interface/simpler_c_calls_mercury
cvs diff: Diffing samples/c_interface/simpler_cplusplus_calls_mercury
cvs diff: Diffing samples/diff
cvs diff: Diffing samples/muz
cvs diff: Diffing samples/rot13
cvs diff: Diffing samples/solutions
cvs diff: Diffing samples/tests
cvs diff: Diffing samples/tests/c_interface
cvs diff: Diffing samples/tests/c_interface/c_calls_mercury
cvs diff: Diffing samples/tests/c_interface/cplusplus_calls_mercury
cvs diff: Diffing samples/tests/c_interface/mercury_calls_c
cvs diff: Diffing samples/tests/c_interface/mercury_calls_cplusplus
cvs diff: Diffing samples/tests/c_interface/mercury_calls_fortran
cvs diff: Diffing samples/tests/c_interface/simpler_c_calls_mercury
cvs diff: Diffing samples/tests/c_interface/simpler_cplusplus_calls_mercury
cvs diff: Diffing samples/tests/diff
cvs diff: Diffing samples/tests/muz
cvs diff: Diffing samples/tests/rot13
cvs diff: Diffing samples/tests/solutions
cvs diff: Diffing samples/tests/toplevel
cvs diff: Diffing scripts
cvs diff: Diffing tests
cvs diff: Diffing tests/benchmarks
cvs diff: Diffing tests/debugger
cvs diff: Diffing tests/debugger/declarative
cvs diff: Diffing tests/dppd
cvs diff: Diffing tests/general
cvs diff: Diffing tests/general/accumulator
cvs diff: Diffing tests/general/structure_reuse
cvs diff: Diffing tests/hard_coded
cvs diff: Diffing tests/hard_coded/exceptions
cvs diff: Diffing tests/hard_coded/purity
cvs diff: Diffing tests/hard_coded/sub-modules
cvs diff: Diffing tests/hard_coded/typeclasses
cvs diff: Diffing tests/invalid
cvs diff: Diffing tests/invalid/purity
cvs diff: Diffing tests/misc_tests
cvs diff: Diffing tests/recompilation
cvs diff: Diffing tests/tabling
cvs diff: Diffing tests/term
cvs diff: Diffing tests/valid
cvs diff: Diffing tests/warnings
cvs diff: Diffing tools
cvs diff: Diffing trace
cvs diff: Diffing util
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list