[m-dev.] for review: use bitsets in quantification

Simon Taylor stayl at cs.mu.OZ.AU
Mon Nov 6 19:26:21 AEDT 2000


> > +:- typeclass enum(T) where [
> > +	func to_int(T) = int,
> > +	func from_int(int) = T
> > +].
> 
> The documentation and interface here suggests that only types which
> are isomorphic to `int' are allowed, and that `from_int' can't fail.
> But I think the intent is that this typeclass should be allowed
> for types which are isomorphic to a subrange of `int'.
> The `from_int' function should be semidet, and you should
> document when it should fail.
 
> It would also be useful to have instances for `char' and `bool'.

Done.

 
> > @@ -410,10 +437,13 @@
> >  
> >  :- pragma c_header_code("
> >  	#include <limits.h>
> > +
> > +	#define ML_BITS_PER_INT		(sizeof(MR_Integer) * CHAR_BIT)
> >  ").
> 
> Technically that is making an assumption which standard C does not guarantee,
> namely that integers have no p, since integer types

The code there is just copied from the original definition of
int__bits_per_int.
 
> > +% This module provides an ADT for storing sets of integers.
> > +% If the integers are closely grouped, this representation will be
> > +% much more compact than that provided by set.m.
> 
> The comment here doesn't explain why it is called "sparse" bitset.
 
> I think it would be worth outlining roughly how sparse_bitsets are
> represented, to give programmers a better idea of how it will perform.
> It would also be nice to document asymptotic complexities for the
> different operations.

Done.
 
> > +:- module sparse_bitset.
> ...
> > +	% `set__equal(SetA, SetB' is true iff `SetA' and `SetB'
> > +	% contain the same elements.
> > +:- pred equal(sparse_bitset(T), sparse_bitset(T)).
> > +:- mode equal(in, in) is semidet.
> 
> s/set__//
 
> Likewise in other parts of this module.

Fixed.
 
> > +:- func list_to_set(list(T)) = sparse_bitset(T) <= enum(T).
> > +:- pragma type_spec(list_to_set/1, T = var(_)).
> > +:- pragma type_spec(list_to_set/1, T = int).
> 
> Do these pragmas need to go in the interface? If so, would it be
> better to put them all in a group at the end, rather than putting them
> with the declaration of each operation?

They do need to go in the interface so that type specialization
can be done without enabling inter-module optimization.
I've moved them to the end.

> > +        % `member(X, Set)' is true iff `X' is a member of `Set'.
> > +:- pred member(T, sparse_bitset(T)) <= enum(T).
> > +:- mode member(in, in) is semidet.
> 
> Probably you should use `contains(sparse_bitset(T), T)' rather than
> `member'; for consistency with lists, `member' should be reserved for
> procedures that nondeterministically produce all the possible members.

OK. I've also added that predicate to the other set modules.
 
> > +	% `insert_list(Set0, X)' returns the union of `Set0' and the set
> > +	% containing only the members of `X'.
> > +:- func delete_list(sparse_bitset(T), list(T)) = sparse_bitset(T) <= enum(T).
> 
> The documentation there is wrong (cut-and-paste error, obviously).

Fixed.

> > +	% There's a lot of code duplication between this
> > +	% and fold/3 below. The main difference is that
> > +	% fold traverses the set in sorted order, whereas
> > +	% to_sorted_list traverses the set in reverse order
> > +	% to avoid having to reverse the resulting list.
> > +to_sorted_list(sparse_bitset(A)) =
> > +		sparse_bitset__to_sorted_list_2(A).
> 
> It may be worth defining a generic `foldr' that traverses the elements
> in reverse order, and using that.

Done.
 
> > +bits_for_index(Index, Offset, Bits) :-
> > +	% Need to use `div' and `mod' rather than `//' and `rem'
> > +	% to handle negative values correctly.
> > +	Offset = int__floor_to_multiple_of_bits_per_int(Index),
> 
> It's a bit hard to see how the comment there relates to the code.

Removed.

> > +	% The bit pattern will often look like a pointer,
> > +	% so allocate the pairs using GC_malloc_atomic()
> > +	% to avoid unnecessary memory retention.
> > +	% Doing this slows down the compiler by about 1%,
> 
> Do you know why?

I'm not sure. One possibility is that because the garbage
collector uses separate free lists for atomic and non-atomic
objects, there is a significant amount of memory stuck on the
atomic free list after quantification is run that can't be used
by other parts of the compiler. That could cause the collector
to expand the heap or collect more frequently.

> Better make that `impure unsafe_perform_io', otherwise the compiler
> will do duplicate call elimination to eliminate your newlines.
> 
> A much better approach, that avoids the need for unsafe_perform_io,
> would be to just package the data that you want displayed into a
> struct and throw that as an exception.

Done.
 
> Could you please post a relative diff when you've addressed those issues?


Estimated hours taken: 14

Use bitsets to store the sets of variables in quantification.
This change reduces the time taken by `mmc -C make_hlds' by 7-8%.

library/sparse_bitset.m:
	An ADT for storing sets of integers.

library/set.m:
library/set_ordlist.m:
library/set_unordlist.m:
library/set_bbbtree.m:
library/bintree_set.m:
	Add a predicate contains/2, which is like member/2 but
	only has the `contains(in, in)' mode. For some set representations
	it is difficult to code a predicate which does both the
	`member(out, in)' and `member(in, in)' modes efficiently.

library/enum.m:
	Contains a typeclass `enum/1' describing types which
	can be converted to and from integers.

library/term.m:
	Add an instance declaration for `enum(var(T))'.

library/char.m:
	Add an instance declaration for `enum(character)'.

library/bool.m:
	Add an instance declaration for `enum(bool)'.

library/int.m:
	Add an instance declaration for `enum(int)'.

	Add a function for finding the largest multiple of
	bits_per_int which is less than a given number, for
	use by sparse_bitset.m.

compiler/quantification.m:
	Use `sparse_bitset(prog_var)' rather than `set(prog_var)'
	for all the sets of variables used while quantifying a goal,
	but arrange things so that it is simple use `set(prog_var)'
	when debugging.

NEWS:
	Document the new modules.

tests/hard_coded/Mmakefile:
tests/hard_coded/test_sparse_bitset.m:
tests/hard_coded/bitset_tester.m:
	Add some tests for predicates and functions in
	sparse_bitset.m which are not used in quantification.m.
	Test storing negative integers in sparse_bitsets.

--- bintree_set.m	2000/11/06 06:29:49	1.1
+++ bintree_set.m	2000/11/06 06:31:31
@@ -78,11 +78,18 @@
 :- mode bintree_set__member(in, in) is semidet.
 :- mode bintree_set__member(out, in) is nondet.
 
-	% `bintree_set_member(X, Set)' is true iff `X' is a member of `Set'.
+	% `bintree_set__is_member(X, Set)' is true
+	% iff `X' is a member of `Set'.
 
 :- pred bintree_set__is_member(T, bintree_set(T)).
 :- mode bintree_set__is_member(in, in) is semidet.
 
+	% `bintree_set__contains(Set, X)' is true
+	% iff `X' is a member of `Set'.
+
+:- pred bintree_set__contains(bintree_set(T), T).
+:- mode bintree_set__contains(in, in) is semidet.
+
 	% `bintree_set__insert(Set0, X, Set)' is true iff `Set' is the union of
 	% `Set0' and the set containing only `X'.
 
@@ -210,6 +217,9 @@
 
 bintree_set__is_member(E, S) :-
 	bintree__search(S, E, _).
+
+bintree_set__contains(S, E) :-
+	bintree_set__is_member(E, S).
 
 %--------------------------------------------------------------------------%
 
--- bool.m	2000/11/05 12:17:28	1.1
+++ bool.m	2000/11/06 06:58:25
@@ -1,5 +1,5 @@
 %---------------------------------------------------------------------------%
-% Copyright (C) 1996-1997 The University of Melbourne.
+% Copyright (C) 1996-1997,2000 The University of Melbourne.
 % This file may only be copied under the terms of the GNU Library General
 % Public License - see the file COPYING.LIB in the Mercury distribution.
 %---------------------------------------------------------------------------%
@@ -17,7 +17,7 @@
 
 :- interface.
 
-:- import_module list.
+:- import_module enum, list.
 
 %-----------------------------------------------------------------------------%
 
@@ -28,18 +28,25 @@
 
 :- type bool ---> no ; yes.
 
+:- instance enum(bool).
+
@@ -47,9 +54,25 @@
 
 :- implementation.
 
+:- instance enum(bool) where [
+	func(to_int/1) is bool_to_int,
+	func(from_int/1) is int_to_bool
+].
+
+:- func bool_to_int(bool) = int.
+bool_to_int(yes) = 1.
+bool_to_int(no) = 0.
+
+:- func int_to_bool(int) = bool.
+int_to_bool(Int) = ( Int = 0 -> no ; yes ).
+
--- builtin.m	2000/11/04 01:08:31	1.1
+++ builtin.m	2000/11/04 01:09:53
@@ -171,7 +171,8 @@
 	% depending on wheither X is =, <, or > Y in the
 	% standard ordering.
 :- pred compare(comparison_result, T, T).
-	% This mode must be first -- compiler/higher_order.m depends on it.
+	% Note to implementors: this mode must be first --
+	% compiler/higher_order.m depends on it.
 :- mode compare(uo, in, in) is det.
 :- mode compare(uo, ui, ui) is det.
 :- mode compare(uo, ui, in) is det.
--- enum.m	2000/11/05 12:17:16	1.1
+++ enum.m	2000/11/06 06:56:11
@@ -12,28 +12,16 @@
 % of information.
 %
 %-----------------------------------------------------------------------------%
-%-----------------------------------------------------------------------------%
+
 :- module enum.
 
 :- interface.
 
+	% For all instances the following must hold:
+	%	all [X, Int] (Int = to_int(X) => X = from_int(Int)).
 :- typeclass enum(T) where [
 	func to_int(T) = int,
-	func from_int(int) = T
-].
-
-:- instance enum(int).
-
-%-----------------------------------------------------------------------------%
-%-----------------------------------------------------------------------------%
-
-:- implementation.
-
-:- import_module std_util.
-
-:- instance enum(int) where [
-	func(to_int/1) is id,
-	func(from_int/1) is id
+	func from_int(int) = T is semidet
 ].
 
 %-----------------------------------------------------------------------------%
--- int.m	2000/11/06 03:26:40	1.2
+++ int.m	2000/11/06 03:27:20
@@ -22,6 +22,10 @@
 
 :- interface.
 
+:- import_module enum.
+
+:- instance enum(int).
+
 	% less than
 :- pred int < int.
 :- mode in  < in is semidet.
@@ -284,7 +288,12 @@
 %-----------------------------------------------------------------------------%
 
 :- implementation.
-:- import_module require.
+:- import_module require, std_util.
+
+:- instance enum(int) where [
+	func(to_int/1) is id,
+	func(from_int/1) is id
+].
 
 % Most of the arithmetic and comparison operators are recognized by
 % the compiler as builtins, so we don't need to define them here.
--- set.m	2000/11/05 12:17:16	1.1
+++ set.m	2000/11/05 12:33:03
@@ -104,30 +100,31 @@
 :- pred set__is_member(T, set(T), bool).
 :- mode set__is_member(in, in, out) is det.
 
+	% `set__contains(Set, X)' is true iff `X' is a member of `Set'.
+
+:- pred set__contains(set(T), T).
+:- mode set__contains(in, in) is semidet.
+
@@ -306,6 +330,9 @@
 
 set__is_member(X, Set, Result) :-
 	set_ordlist__is_member(X, Set, Result).
+
+set__contains(Set, X) :-
+	set_ordlist__contains(Set, X).
 
 set__delete_list(Set0, List, Set) :-
 	set_ordlist__delete_list(Set0, List, Set).
--- set_bbbtree.m	2000/11/05 12:17:16	1.1
+++ set_bbbtree.m	2000/11/05 12:23:54
@@ -54,6 +54,12 @@
 :- pred set_bbbtree__is_member(T, set_bbbtree(T), bool).
 :- mode set_bbbtree__is_member(in, in, out) is det.
 
+	% `set_bbbtree__contains(Set, X)' is true iff `X' is a member of `Set'.
+	% O(lg n).
+
+:- pred set_bbbtree__contains(set_bbbtree(T), T).
+:- mode set_bbbtree__contains(in, in) is semidet.
+
 
 	% `set_bbbtree__least(Set, X)' is true iff `X' is smaller than all
 	% the other members of `Set'.
@@ -336,6 +342,9 @@
 		Result = (=),
 		X = V
 	).
+
+set_bbbtree__contains(Set, X) :-
+	set_bbbtree__member(X, Set).
 
 %------------------------------------------------------------------------------%
 
--- set_ordlist.m	2000/11/05 12:17:16	1.1
+++ set_ordlist.m	2000/11/05 12:45:42
@@ -87,20 +84,23 @@
 :- pred set_ordlist__is_member(T, set_ordlist(T), bool).
 :- mode set_ordlist__is_member(in, in, out) is det.
 
+	% `set_ordlist__contains(Set, X)' is true iff `X' is a member of `Set'.
+
+:- pred set_ordlist__contains(set_ordlist(T), T).
+:- mode set_ordlist__contains(in, in) is semidet.
+
@@ -248,6 +272,9 @@
 	;
 		R = no
 	).
+
+set_ordlist__contains(S, E) :-
+	set_ordlist__member(E, S).
 
 set_ordlist__delete_list(S0, D, S) :-
 	list__sort_and_remove_dups(D, DS),
--- set_unordlist.m	2000/11/05 12:17:16	1.1
+++ set_unordlist.m	2000/11/05 12:19:44
@@ -84,6 +84,12 @@
 :- pred set_unordlist__is_member(T, set_unordlist(T), bool).
 :- mode set_unordlist__is_member(in, in, out) is det.
 
+	% `set_unordlist__contains(Set, X)' is true iff
+	% `X' is a member of `Set'.
+
+:- pred set_unordlist__contains(set_unordlist(T), T).
+:- mode set_unordlist__contains(in, in) is semidet.
+
 	% `set_unordlist__insert(Set0, X, Set)' is true iff `Set' is the union
 	% of `Set0' and the set containing only `X'.
 
@@ -223,6 +229,9 @@
 	;
 		R = no
 	).
+
+set_unordlist__contains(S, E) :-
+	set_unordlist__member(E, S).
 
 set_unordlist__delete_list(S, [], S).
 set_unordlist__delete_list(S0, [X | Xs], S) :-

--- sparse_bitset.m	2000/11/05 12:17:00	1.3
+++ sparse_bitset.m	2000/11/06 08:09:51
@@ -8,14 +8,33 @@
 % Stability: medium.
 %
 % This module provides an ADT for storing sets of integers.
-% If the integers are closely grouped, this representation will be
-% much more compact than that provided by set.m. Union, intersection
-% and difference are much faster. Converting to and from lists is slower.
+% If the integers stored are closely grouped, a sparse_bitset
+% is much more compact than the representation provided by set.m,
+% and the operations will be much faster.
 %
+%
+% Efficiency notes:
+%
+% A sparse bitset is represented as a sorted list of pairs of integers.
+% For a pair `Offset - Bits', `Offset' is a multiple of `int__bits_per_int'.
+% The bits of `Bits' describe which of the elements of the range
+% `Offset' .. `Offset + bits_per_int - 1' are in the set.
+% Pairs with the same value of `Offset' are merged.
+% Pairs for which `Bits' is zero are removed.
+%
+% The values of `Offset' in the list need not be contiguous multiples
+% of `bits_per_int', hence the name _sparse_ bitset.
+%
+% A sparse_bitset is suitable for storing sets of integers which
+% can be represented using only a few `Offset - Bits' pairs.
 % In the worst case, where the integers stored are not closely
 % grouped, a sparse_bitset will take more memory than an
 % ordinary set, but the operations should not be too much slower.
 %
+% In the asymptotic complexities of the operations below,
+% `rep_size(Set)' is the number of pairs needed to represent `Set',
+% and `card(Set)' is the number of elements in `Set'.
+%
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 :- module sparse_bitset.
@@ -33,120 +52,129 @@
 :- mode empty(in) is semidet.
 :- mode empty(out) is det.
 
-	% `set__equal(SetA, SetB' is true iff `SetA' and `SetB'
+	% `equal(SetA, SetB' is true iff `SetA' and `SetB'
 	% contain the same elements.
+	% Takes O(min(rep_size(SetA), rep_size(SetB))) time.
 :- pred equal(sparse_bitset(T), sparse_bitset(T)).
 :- mode equal(in, in) is semidet.
 
-	% `set_ordlist__list_to_set(List)' returns a set
+	% `list_to_set(List)' returns a set
 	% containing only the members of `List'.
-	% In the worst case this will take O(N^2) time
+	% In the worst case this will take O(length(List)^2) time
 	% and space. If the elements of the list are closely
-	% grouped, it will be closer to O(N).
+	% grouped, it will be closer to O(length(List)).
 :- func list_to_set(list(T)) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(list_to_set/1, T = var(_)).
-:- pragma type_spec(list_to_set/1, T = int).
 
 	% `sorted_list_to_set(List)' returns a set containing
 	% only the members of `List'.
 	% `List' must be sorted.
-	% Takes O(N) time and space.
+	% Takes O(length(List)) time and space.
 :- func sorted_list_to_set(list(T)) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(sorted_list_to_set/1, T = var(_)).
-:- pragma type_spec(sorted_list_to_set/1, T = int).
 
-	% `set_ordlist__to_sorted_list(Set, List)' returns a list
+	% `to_sorted_list(Set, List)' returns a list
 	% containing all the members of `Set', in sorted order.
-	% Takes O(NlogN) time and O(N) space.
+	% Takes O(card(Set)) time and space.
 :- func to_sorted_list(sparse_bitset(T)) = list(T) <= enum(T).
-:- pragma type_spec(to_sorted_list/1, T = var(_)).
-:- pragma type_spec(to_sorted_list/1, T = int).
 
 	% `make_singleton_set(Elem)' returns a set
 	% containing just the single element `Elem'.
 :- func make_singleton_set(T) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(make_singleton_set/1, T = var(_)).
-:- pragma type_spec(make_singleton_set/1, T = int).
 
-	% `subset(SetA, SetB)' is true iff `SetA' is a subset of `SetB'.
+	% `subset(Subset, Set)' is true iff `Subset' is a subset of `Set'.
+	% Same as `intersect(Set, Subset, Subset)', but may be more efficient.
 :- pred subset(sparse_bitset(T), sparse_bitset(T)).
 :- mode subset(in, in) is semidet.
 
-	% `superset(SetA, SetB)' is true iff `SetA' is a superset of `SetB'.
+	% `superset(Superset, Set)' is true iff `Superset' is a
+	% superset of `Set'.
+	% Same as `intersect(Superset, Set, Set)', but may be more efficient.
 :- pred superset(sparse_bitset(T), sparse_bitset(T)).
 :- mode superset(in, in) is semidet.
 
-        % `member(X, Set)' is true iff `X' is a member of `Set'.
-:- pred member(T, sparse_bitset(T)) <= enum(T).
-:- mode member(in, in) is semidet.
-:- pragma type_spec(member/2, T = var(_)).
-:- pragma type_spec(member/2, T = int).
-
-	% `insert(Set0, X)' returns the union
-	% of `Set0' and the set containing only `X'.
+        % `contains(Set, X)' is true iff `X' is a member of `Set'.
+	% Takes O(rep_size(Set)) time.
+:- pred contains(sparse_bitset(T), T) <= enum(T).
+:- mode contains(in, in) is semidet.
+
+	% `insert(Set, X)' returns the union
+	% of `Set' and the set containing only `X'.
+	% Takes O(rep_size(Set)) time and space.
 :- func insert(sparse_bitset(T), T) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(insert/2, T = var(_)).
-:- pragma type_spec(insert/2, T = int).
 
-	% `insert_list(Set0, X)' returns the union of `Set0' and the set
+	% `insert_list(Set, X)' returns the union of `Set' and the set
 	% containing only the members of `X'.
+	% Same as `union(Set, list_to_set(X))', but may be more efficient.
 :- func insert_list(sparse_bitset(T), list(T)) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(insert_list/2, T = var(_)).
-:- pragma type_spec(insert_list/2, T = int).
 
-	% `delete(Set0, X)' returns the difference
-	% of `Set0' and the set containing only `X'.
+	% `delete(Set, X)' returns the difference
+	% of `Set' and the set containing only `X'.
+	% Takes O(rep_size(Set)) time and space.
 :- func delete(sparse_bitset(T), T) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(delete/2, T = var(_)).
-:- pragma type_spec(delete/2, T = int).
 
-	% `delete_list(Set0, X)' returns the difference of `Set0' and the set
+	% `delete_list(Set, X)' returns the difference of `Set' and the set
 	% containing only the members of `X'.
+	% Same as `difference(Set, list_to_set(X))', but may be more efficient.
 :- func delete_list(sparse_bitset(T), list(T)) = sparse_bitset(T) <= enum(T).
-:- pragma type_spec(delete_list/2, T = var(_)).
-:- pragma type_spec(delete_list/2, T = int).
 
-	% `remove(Set0, X)' returns the difference
-	% of `Set0' and the set containing only `X',
-	% failing if `Set0' does not contain `X'.
+	% `remove(Set, X)' returns the difference
+	% of `Set' and the set containing only `X',
+	% failing if `Set' does not contain `X'.
+	% Takes O(rep_size(Set)) time and space.
 :- func remove(sparse_bitset(T), T) = sparse_bitset(T) <= enum(T).
 :- mode remove(in, in) = out is semidet.
 
-	% `remove_list(Set0, X)' returns the difference of `Set0'
+	% `remove_list(Set, X)' returns the difference of `Set'
 	% and the set containing all the elements of `X',
 	% failing if any element of `X' is not in `Set0'.
+	% Same as
+	%	`subset(list_to_set(X), Set), difference(Set, list_to_set(X))',
+	% but may be more efficient.
 :- func remove_list(sparse_bitset(T), list(T)) = sparse_bitset(T) <= enum(T).
 :- mode remove_list(in, in) = out is semidet.
 
 	% `remove_least(Set0, X, Set)' is true iff `X' is the
 	% least element in `Set0', and `Set' is the set which
 	% contains all the elements of `Set0' except `X'.
+	% Takes O(1) time and space.
 :- pred remove_least(sparse_bitset(T), T, sparse_bitset(T)) <= enum(T).
 :- mode remove_least(in, out, out) is semidet.
 
 	% `union(SetA, SetB)' returns the union of `SetA' and `SetB'. 
 	% The efficiency of the union operation is not sensitive
 	% to the argument ordering.
+	% Takes O(rep_size(SetA) + rep_size(SetB)) time and space.
 :- func union(sparse_bitset(T), sparse_bitset(T)) = sparse_bitset(T).
 
 	% `intersect(SetA, SetB)' returns the intersection of
 	% `SetA' and `SetB'. The efficiency of the intersection
 	% operation is not sensitive to the argument ordering.
+	% Takes O(rep_size(SetA) + rep_size(SetB)) time and
+	% O(min(rep_size(SetA)), rep_size(SetB)) space.
 :- func intersect(sparse_bitset(T), sparse_bitset(T)) = sparse_bitset(T).
 
 	% `difference(SetA, SetB)' returns the set containing all the
 	% elements of `SetA' except those that occur in `SetB'.
+	% Takes O(rep_size(SetA) + rep_size(SetB)) time and
+	% O(rep_size(SetA)) space.
 :- func difference(sparse_bitset(T), sparse_bitset(T)) = sparse_bitset(T).
 
 	% `count(Set)' returns the number of elements in `Set'.
+	% Takes O(card(Set)) time.
 :- func count(sparse_bitset(T)) = int <= enum(T).
 
-	% `fold(Func, Set, Start)' calls Func with each element
+	% `foldl(Func, Set, Start)' calls Func with each element
 	% of `Set' (in sorted order) and an accumulator
 	% (with the initial value of `Start'), and returns
 	% the final value.
-	% Takes O(NlogN) time.
-:- func fold(func(T, U) = U, sparse_bitset(T), U) = U <= enum(T).
+	% Takes O(card(Set)) time.
+:- func foldl(func(T, U) = U, sparse_bitset(T), U) = U <= enum(T).
+
+	% `foldr(Func, Set, Start)' calls Func with each element
+	% of `Set' (in reverse sorted order) and an accumulator
+	% (with the initial value of `Start'), and returns
+	% the final value.
+	% Takes O(card(Set)) time.
+:- func foldr(func(T, U) = U, sparse_bitset(T), U) = U <= enum(T).
 
 %-----------------------------------------------------------------------------%
 
@@ -157,46 +185,30 @@
 
 :- pred list_to_set(list(T), sparse_bitset(T)) <= enum(T).
 :- mode list_to_set(in, out) is det.
-:- pragma type_spec(list_to_set/2, T = var(_)).
-:- pragma type_spec(list_to_set/2, T = int).
 
 :- pred sorted_list_to_set(list(T), sparse_bitset(T)) <= enum(T).
 :- mode sorted_list_to_set(in, out) is det.
-:- pragma type_spec(sorted_list_to_set/2, T = var(_)).
-:- pragma type_spec(sorted_list_to_set/2, T = int).
 
 :- pred to_sorted_list(sparse_bitset(T), list(T)) <= enum(T).
 :- mode to_sorted_list(in, out) is det.
-:- pragma type_spec(to_sorted_list/2, T = var(_)).
-:- pragma type_spec(to_sorted_list/2, T = int).
 
 	% Note: set.m contains the reverse mode of this predicate,
 	% but it is difficult to implement both modes using
 	% the representation in this module.
 :- pred singleton_set(sparse_bitset(T), T) <= enum(T).
 :- mode singleton_set(out, in) is det.
-:- pragma type_spec(singleton_set/2, T = var(_)).
-:- pragma type_spec(singleton_set/2, T = int).
 
 :- pred insert(sparse_bitset(T), T, sparse_bitset(T)) <= enum(T).
 :- mode insert(in, in, out) is det.
-:- pragma type_spec(insert/3, T = var(_)).
-:- pragma type_spec(insert/3, T = int).
 
 :- pred insert_list(sparse_bitset(T), list(T), sparse_bitset(T)) <= enum(T).
 :- mode insert_list(in, in, out) is det.
-:- pragma type_spec(insert_list/3, T = var(_)).
-:- pragma type_spec(insert_list/3, T = int).
 
 :- pred delete(sparse_bitset(T), T, sparse_bitset(T)) <= enum(T).
 :- mode delete(in, in, out) is det.
-:- pragma type_spec(delete/3, T = var(_)).
-:- pragma type_spec(delete/3, T = int).
 
 :- pred delete_list(sparse_bitset(T), list(T), sparse_bitset(T)) <= enum(T).
 :- mode delete_list(in, in, out) is det.
-:- pragma type_spec(delete_list/3, T = var(_)).
-:- pragma type_spec(delete_list/3, T = int).
 
 :- pred remove(sparse_bitset(T), T, sparse_bitset(T)) <= enum(T).
 :- mode remove(in, in, out) is semidet.
@@ -214,6 +226,69 @@
 :- mode difference(in, in, out) is det.
 
 %-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- interface.
+
+:- pragma type_spec(list_to_set/1, T = var(_)).
+:- pragma type_spec(list_to_set/1, T = int).
+
+:- pragma type_spec(sorted_list_to_set/1, T = var(_)).
+:- pragma type_spec(sorted_list_to_set/1, T = int).
+
+:- pragma type_spec(to_sorted_list/1, T = var(_)).
+:- pragma type_spec(to_sorted_list/1, T = int).
+
+:- pragma type_spec(make_singleton_set/1, T = var(_)).
+:- pragma type_spec(make_singleton_set/1, T = int).
+
+:- pragma type_spec(contains/2, T = var(_)).
+:- pragma type_spec(contains/2, T = int).
+
+:- pragma type_spec(insert/2, T = var(_)).
+:- pragma type_spec(insert/2, T = int).
+
+:- pragma type_spec(insert_list/2, T = var(_)).
+:- pragma type_spec(insert_list/2, T = int).
+
+:- pragma type_spec(delete/2, T = var(_)).
+:- pragma type_spec(delete/2, T = int).
+
+:- pragma type_spec(delete_list/2, T = var(_)).
+:- pragma type_spec(delete_list/2, T = int).
+
+:- pragma type_spec(foldr/3, T = int).
+:- pragma type_spec(foldr/3, T = var(_)).
+
+:- pragma type_spec(foldl/3, T = int).
+:- pragma type_spec(foldl/3, T = var(_)).
+
+:- pragma type_spec(list_to_set/2, T = var(_)).
+:- pragma type_spec(list_to_set/2, T = int).
+
+:- pragma type_spec(sorted_list_to_set/2, T = var(_)).
+:- pragma type_spec(sorted_list_to_set/2, T = int).
+
+:- pragma type_spec(to_sorted_list/2, T = var(_)).
+:- pragma type_spec(to_sorted_list/2, T = int).
+
+:- pragma type_spec(singleton_set/2, T = var(_)).
+:- pragma type_spec(singleton_set/2, T = int).
+
+:- pragma type_spec(insert/3, T = var(_)).
+:- pragma type_spec(insert/3, T = int).
+
+:- pragma type_spec(insert_list/3, T = var(_)).
+:- pragma type_spec(insert_list/3, T = int).
+
+:- pragma type_spec(delete/3, T = var(_)).
+:- pragma type_spec(delete/3, T = int).
+
+:- pragma type_spec(delete_list/3, T = var(_)).
+:- pragma type_spec(delete_list/3, T = int).
+
+%-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
 :- implementation.
@@ -250,99 +325,95 @@
 
 %-----------------------------------------------------------------------------%
 
-	% There's a lot of code duplication between this
-	% and fold/3 below. The main difference is that
-	% fold traverses the set in sorted order, whereas
-	% to_sorted_list traverses the set in reverse order
-	% to avoid having to reverse the resulting list.
-to_sorted_list(sparse_bitset(A)) =
-		sparse_bitset__to_sorted_list_2(A).
-
-:- func to_sorted_list_2(bitset_impl) = list(T) <= enum(T).
-:- pragma type_spec(to_sorted_list_2/1, T = int).
-:- pragma type_spec(to_sorted_list_2/1, T = var(_)).
-
-to_sorted_list_2([]) = [].
-to_sorted_list_2([Data | Rest]) =
-	extract_bits(Data ^ offset, Data ^ bits, to_sorted_list_2(Rest)).
-
-	% Find the locations of the 1 bits in an int, and
-	% add them onto the front of the list.
-:- func extract_bits(int, int, list(T)) = list(T) <= enum(T).
-:- pragma type_spec(extract_bits/3, T = int).
-:- pragma type_spec(extract_bits/3, T = var(_)).
-
-extract_bits(Offset, Bits, Ints0) = Ints :-
-	Size = bits_per_int,
-	Ints = extract_bits_2(Offset, Bits, Size, Ints0).
+to_sorted_list(Set) = foldr(func(Elem, Acc0) = [Elem | Acc0], Set, []).
 
-	% Do a binary search for the 1 bits in an int.
-:- func extract_bits_2(int, int, int, list(T)) = list(T) <= enum(T).
-:- pragma type_spec(extract_bits_2/4, T = int).
-:- pragma type_spec(extract_bits_2/4, T = var(_)).
+%-----------------------------------------------------------------------------%
 
-extract_bits_2(Offset, Bits, Size, Ints0) = Ints :-
-	( Bits = 0 ->
-		Ints = Ints0
-	; Size = 1 ->
-		Elem = from_int(Offset),
-		Ints = [Elem | Ints0]
-	;
-		HalfSize = unchecked_right_shift(Size, 1),
-		Mask = mask(HalfSize),
+foldl(F, sparse_bitset(Set), Acc0) = foldl_2(F, Set, Acc0).
 
-		% Extract the high-order half of the bits.
-		HighBits = Mask /\ unchecked_right_shift(Bits, HalfSize), 
-		Ints1 = extract_bits_2(Offset + HalfSize,
-				HighBits, HalfSize, Ints0),
-		
-		% Extract the low-order half of the bits.
-		LowBits = Mask /\ Bits,
-		Ints = extract_bits_2(Offset, LowBits, HalfSize, Ints1)
-	).
+:- func foldl_2(func(T, U) = U, bitset_impl, U) = U <= enum(T).
+:- pragma type_spec(foldl_2/3, T = int).
+:- pragma type_spec(foldl_2/3, T = var(_)).
 
-%-----------------------------------------------------------------------------%
+foldl_2(_, [], Acc) = Acc.
+foldl_2(F, [H | T], Acc0) = Acc :-
+	Acc1 = fold_bits(low_to_high, F, H ^ offset, H ^ bits, Acc0),
+	Acc = foldl_2(F, T, Acc1).
 
-fold(F, sparse_bitset(Set), Acc0) = fold_2(F, Set, Acc0).
+foldr(F, sparse_bitset(Set), Acc0) = foldr_2(F, Set, Acc0).
 
-:- func fold_2(func(T, U) = U, bitset_impl, U) = U <= enum(T).
+:- func foldr_2(func(T, U) = U, bitset_impl, U) = U <= enum(T).
+:- pragma type_spec(foldr_2/3, T = int).
+:- pragma type_spec(foldr_2/3, T = var(_)).
 
-fold_2(_, [], Acc) = Acc.
-fold_2(F, [Data | Rest], Acc0) =
-	fold_2(F, Rest, fold_bits(F, Data ^ offset, Data ^ bits, Acc0)).
+	% We don't just use list__foldr here because the
+	% overhead of allocating the closure for fold_bits
+	% is significant for the compiler's runtime,
+	% so it's best to avoid that even if
+	% `--optimize-higher-order' is not set.
+foldr_2(_, [], Acc) = Acc.
+foldr_2(F, [H | T], Acc0) = Acc :-
+	Acc1 = foldr_2(F, T, Acc0),
+	Acc = fold_bits(high_to_low, F, H ^ offset, H ^ bits, Acc1).
 
-:- func fold_bits(func(T, U) = U, int, int, U) = U <= enum(T).
+:- func fold_bits(fold_direction, func(T, U) = U, int, int, U) = U <= enum(T).
+:- pragma type_spec(fold_bits/5, T = int).
+:- pragma type_spec(fold_bits/5, T = var(_)).
 
-fold_bits(F, Offset, Bits, Acc0) = Acc :-
+fold_bits(Dir, F, Offset, Bits, Acc0) = Acc :-
 	Size = bits_per_int,
-	Acc = fold_bits_2(F, Offset, Bits, Size, Acc0).
+	Acc = fold_bits_2(Dir, F, Offset, Bits, Size, Acc0).
+
+:- type fold_direction
+	--->	low_to_high
+	;	high_to_low
+	.
 
 	% Do a binary search for the 1 bits in an int.
-:- func fold_bits_2(func(T, U) = U, int, int, int, U) = U <= enum(T).
+:- func fold_bits_2(fold_direction, func(T, U) = U,
+		int, int, int, U) = U <= enum(T).
+:- pragma type_spec(fold_bits_2/6, T = int).
+:- pragma type_spec(fold_bits_2/6, T = var(_)).
 
-fold_bits_2(F, Offset, Bits, Size, Acc0) = Acc :-
+fold_bits_2(Dir, F, Offset, Bits, Size, Acc0) = Acc :-
 	( Bits = 0 ->
 		Acc = Acc0
 	; Size = 1 ->
-		Elem = from_int(Offset),
-		Acc = F(Elem, Acc0)
+		( Elem = from_int(Offset) ->
+			Acc = F(Elem, Acc0)
+		;
+			% We only apply `from_int/1' to integers returned
+			% by `to_int/1', so it should never fail.
+			error("sparse_bitset.m: `enum__from_int/1' failed")
+		)
 	;
 		HalfSize = unchecked_right_shift(Size, 1),
 		Mask = mask(HalfSize),
 		
 		% Extract the low-order half of the bits.
 		LowBits = Mask /\ Bits,
-		Acc1 = fold_bits_2(F, Offset, LowBits, HalfSize, Acc0),
 
 		% Extract the high-order half of the bits.
 		HighBits = Mask /\ unchecked_right_shift(Bits, HalfSize), 
-		Acc = fold_bits_2(F, Offset + HalfSize,
-				HighBits, HalfSize, Acc1)
+
+		(
+			Dir = low_to_high,
+			Acc1 = fold_bits_2(Dir, F, Offset, LowBits,
+					HalfSize, Acc0),
+			Acc = fold_bits_2(Dir, F, Offset + HalfSize, HighBits,
+					HalfSize, Acc1)
+		;
+			Dir = high_to_low,
+			Acc1 = fold_bits_2(Dir, F, Offset + HalfSize, HighBits,
+					HalfSize, Acc0),
+			Acc = fold_bits_2(Dir, F, Offset, LowBits,
+					HalfSize, Acc1)
+		)
 	).
 
 %-----------------------------------------------------------------------------%
 
-count(Set) = fold((func(_, Acc) = Acc + 1), Set, 0).
+count(Set) = foldl((func(_, Acc) = Acc + 1), Set, 0).
 
 %-----------------------------------------------------------------------------%
 
@@ -376,7 +447,7 @@
 delete_list(Set, List) = difference(Set, list_to_set(List)).
 
 remove(Set0, Elem) = Set :-
-	member(Elem, Set0),
+	contains(Set0, Elem),
 	Set = delete(Set0, Elem).
 
 remove_list(Set0, Elems) = Set :-
@@ -391,7 +462,13 @@
 	Bits0 = First ^ bits,
 	Offset = First ^ offset,
 	Bit = find_least_bit(Bits0),
-	Elem = from_int(Offset + Bit),
+	( Elem0 = from_int(Offset + Bit) ->
+		Elem = Elem0
+	;
+		% We only apply `from_int/1' to integers returned
+		% by `to_int/1', so it should never fail.
+		error("sparse_bitset.m: `enum__from_int/1' failed")
+	),
 	Bits = clear_bit(Bits0, Bit),
 	( Bits = 0 ->
 		Set = Rest
@@ -429,7 +506,8 @@
 
 %-----------------------------------------------------------------------------%
 
-list_to_set(List) = sparse_bitset(list_to_set_2(List, [])).
+list_to_set(List) =
+	sparse_bitset(list_to_set_2(List, [])).
 
 	% Each pass over the input list selects out the elements which
 	% belong in the same bitset_elem as the first element.
@@ -444,7 +522,8 @@
 list_to_set_2([H | T], List0) = List :-
 	bits_for_index(enum__to_int(H), Offset, Bits0),
 	list_to_set_3(T, Offset, Bits0, Bits, [], Rest),
-	List1 = insert_bitset_elem(make_bitset_elem(Offset, Bits), List0),
+	List1 = insert_bitset_elem(make_bitset_elem(Offset, Bits),
+			List0),
 	List = list_to_set_2(Rest, List1).
 
 	% Go through the list picking out the elements
@@ -519,25 +598,25 @@
 
 %-----------------------------------------------------------------------------%
 
-subset(Subset, Set) :- intersect(Set, Subset, Set).
+subset(Subset, Set) :- intersect(Set, Subset, Subset).
 
-superset(SuperSet, Set) :- subset(Set, SuperSet).
+superset(Superset, Set) :- subset(Set, Superset).
 
 %-----------------------------------------------------------------------------%
 
-member(Elem, sparse_bitset(Set)) :-
-	member_2(enum__to_int(Elem), Set).
+contains(sparse_bitset(Set), Elem) :-
+	contains_2(Set, enum__to_int(Elem)).
 
-:- pred member_2(int, bitset_impl).
-:- mode member_2(in, in) is semidet.
+:- pred contains_2(bitset_impl, int).
+:- mode contains_2(in, in) is semidet.
 
-member_2(Index, [Data | Rest]) :-
+contains_2([Data | Rest], Index) :-
 	Offset = Data ^ offset,
 	Index >= Offset,
 	( Index < Offset + bits_per_int ->
 		get_bit(Data ^ bits, Index - Offset) \= 0
 	;		
-		member_2(Index, Rest)
+		contains_2(Rest, Index)
 	).
 
 %-----------------------------------------------------------------------------%
@@ -681,8 +760,6 @@
 	% to avoid unnecessary memory retention.
 	% Doing this slows down the compiler by about 1%,
 	% but in a library module it's better to be safe.
-	% On the other hand, the bit patterns probably cause
-	% no more memory retention than unboxed floats.
 :- pragma c_code(make_bitset_elem(A::in, B::in) = (Pair::out),
 		[will_not_call_mercury, thread_safe],
 "{

--- bitset_tester.m	2000/11/05 13:31:55	1.1
+++ bitset_tester.m	2000/11/06 03:24:46
@@ -8,6 +8,12 @@
 
 :- type bitset_tester(T).
 
+:- type bitset_error(T)
+	--->	one_argument(string,
+			bitset_tester(T), bitset_tester(T))
+	;	two_arguments(string, bitset_tester(T),
+			bitset_tester(T), bitset_tester(T)).
+
 :- func init = bitset_tester(T).
 :- func insert(bitset_tester(T), T) = bitset_tester(T) <= enum(T).
 :- func insert_list(bitset_tester(T), list(T)) =
@@ -45,12 +51,12 @@
 
 :- func count(bitset_tester(T)) = int <= enum(T).
 
-:- func fold(func(T, U) = U, bitset_tester(T), U) = U <= enum(T).
+:- func foldl(func(T, U) = U, bitset_tester(T), U) = U <= enum(T).
 
 :- pred empty(bitset_tester(T)).
 :- mode empty(in) is semidet.
 
-:- pred member(T::in, bitset_tester(T)::in) is semidet <= enum(T).
+:- pred contains(bitset_tester(T)::in, T::in) is semidet <= enum(T).
 
 :- pred init(bitset_tester(T)::out) is det.
 :- pred singleton_set(bitset_tester(T)::out, T::in) is det <= enum(T).
@@ -77,7 +83,7 @@
 
 :- implementation.
 
-:- import_module bool, list, int, require, set, std_util.
+:- import_module bool, exception, list, int, require, set, std_util.
 :- import_module sparse_bitset.
 
 :- type bitset_tester(T) == pair(sparse_bitset(T), set__set(T)).
@@ -177,19 +183,19 @@
 
 %-----------------------------------------------------------------------------%
 
-member(Var, SetA - SetB) :-
-	( member(Var, SetA) -> InSetA = yes ; InSetA = no),
-	( set__member(Var, SetB) -> InSetB = yes ; InSetB = no),
+contains(SetA - SetB, Var) :-
+	( contains(SetA, Var) -> InSetA = yes ; InSetA = no),
+	( set__contains(SetB, Var) -> InSetB = yes ; InSetB = no),
 	( InSetA = InSetB ->
 		InSetA = yes
 	;
-		error("member failed")
+		error("contains failed")
 	).
 
 %-----------------------------------------------------------------------------%
 
-fold(F, SetA - SetB, Acc0) = Acc :-
-	AccA = fold(F, SetA, Acc0),
+foldl(F, SetA - SetB, Acc0) = Acc :-
+	AccA = foldl(F, SetA, Acc0),
 	AccB = fold(F, SetB, Acc0),
 	( AccA = AccB ->
 		Acc = AccA
@@ -290,26 +296,7 @@
 	( BitSetSet1 = BitSet1, BitSet = BitSetSet ->
 		true
 	;
-		unsafe_perform_io(io__write_string("Error in ")),
-		unsafe_perform_io(io__write_string(Op)),
-		unsafe_perform_io(io__write_string(":\n")),
-		unsafe_perform_io(io__write_string("Set1 : ")),
-		unsafe_perform_io(io__write(Set1)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("BitSet1 : ")),
-		unsafe_perform_io(io__write(BitSetSet1)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("Result Set: ")),
-		unsafe_perform_io(io__write(Set)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("Result BitSet: ")),
-		unsafe_perform_io(io__write(BitSetSet)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		error("bitset failed")
+		throw(one_argument(Op, Tester1, Tester))
 	).
 
 :- func check2(string, bitset_tester(T), bitset_tester(T),
@@ -317,7 +304,8 @@
 
 check2(Op, Tester1, Tester2, Tester) = Result :-
 	Tester1 = BitSet1 - Set1,
-	BitSetSet1 = sparse_bitset__sorted_list_to_set(set__to_sorted_list(Set1)),
+	BitSetSet1 =
+		sparse_bitset__sorted_list_to_set(set__to_sorted_list(Set1)),
 	Tester2 = BitSet2 - Set2,
 	BitSetSet2 = sorted_list_to_set(
 		set__to_sorted_list(Set2)),
@@ -328,67 +316,7 @@
 	( BitSetSet1 = BitSet1, BitSetSet2 = BitSet2, BitSet = BitSetSet ->
 		Result = Tester
 	;
-		unsafe_perform_io(io__write_string("Error in ")),
-		unsafe_perform_io(io__write_string(Op)),
-		unsafe_perform_io(io__write_string(":\n")),
-
-		unsafe_perform_io(io__write_string("Set1 : ")),
-		unsafe_perform_io(io__write(Set1)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("BitSet1 : ")),
-		unsafe_perform_io(io__write(BitSet1)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("Set2 : ")),
-		unsafe_perform_io(io__write(Set2)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("BitSet2 : ")),
-		unsafe_perform_io(io__write(BitSet2)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("Result Set: ")),
-		unsafe_perform_io(io__write(Set)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__write_string("Result BitSet: ")),
-		unsafe_perform_io(io__write(BitSet)),
-		unsafe_perform_io(io__nl),
-		unsafe_perform_io(io__nl),
-		error("bitset failed")
+		throw(two_arguments(Op, Tester1, Tester2, Tester))
 	).
-
-%-----------------------------------------------------------------------------%
-
-:- import_module io.
-
-:- pred unsafe_perform_io(pred(io__state, io__state)).
-:- mode unsafe_perform_io(pred(di, uo) is det) is det.
-:- mode unsafe_perform_io(pred(di, uo) is cc_multi) is det.
-
-:- pragma c_code(
-unsafe_perform_io(P::(pred(di, uo) is det)),
-	may_call_mercury,
-"{
-	call_io_pred_det(P);
-}").
-:- pragma c_code(
-unsafe_perform_io(P::(pred(di, uo) is cc_multi)),
-	may_call_mercury,
-"{
-	call_io_pred_cc_multi(P);
-}").
-
-:- pred call_io_pred(pred(io__state, io__state), io__state, io__state).
-:- mode call_io_pred(pred(di, uo) is det, di, uo) is det.
-:- mode call_io_pred(pred(di, uo) is cc_multi, di, uo) is cc_multi.
-
-:- pragma export(call_io_pred(pred(di, uo) is det, di, uo),
-		"call_io_pred_det").
-:- pragma export(call_io_pred(pred(di, uo) is cc_multi, di, uo),
-		"call_io_pred_cc_multi").
-
-call_io_pred(P) --> P.
 
 %-----------------------------------------------------------------------------%
--- test_bitset.m	2000/11/06 07:45:04	1.1
+++ test_bitset.m	2000/11/06 07:45:22
@@ -89,10 +89,10 @@
 	io__write_int(count(Set2)),
 	io__nl,
 
-	io__write_string("testing fold\n"),
+	io__write_string("testing foldl\n"),
 	{ Sum = (func(Elem, Acc) = Elem + Acc) },
-	{ Result1 = fold(Sum, Set1, 0) },
-	{ Result2 = fold(Sum, Set2, 0) },
+	{ Result1 = foldl(Sum, Set1, 0) },
+	{ Result2 = foldl(Sum, Set2, 0) },
 	( { Write = yes } ->
 		io__write_string("Sum of List1 = "),
 		io__write_int(Result1),
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list