[m-rev.] Version types for review
Ralph Becket
rafe at cs.mu.OZ.AU
Fri Sep 3 15:20:47 AEST 2004
I've tidied up the version types and would like to include them in the
upcoming release. I can either put them in the standard library or
extras - any opinions? Do things need to pass in Java, .NET and with
agc to go in the library? If so, I can sort those details out.
I'll add some tests once these pass review.
-- Ralph
%-----------------------------------------------------------------------------%
% version_types.m
% Ralph Becket <rafe at cs.mu.oz.au>
% Fri Jul 9 15:31:16 EST 2004
% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
%
% (This module only provides documentation describing general properties
% of version types.)
%
% Version types are efficient pure implementations of typically imperative
% structures, subject to the following caveat: efficient access is only
% guaranteed for the "latest" version of a given structure. An older version
% incurrs an access cost proportional to the number of its descendants.
%
% For example, if A0 is a version array, and A1 is created by updating A0,
% and A2 is created by updating A1, ..., and An is created by updating An-1,
% then accesses to An cost O(1) (assuming no further versions of the array
% have been created from An), but accesses to A0 cost O(n).
%
% Most of these data structures come with impure, unsafe means to "rewind"
% to an earlier version, restoring that version's O(1) access times, but
% leaving later versions undefined (i.e. only do this if you are discarding
% all later versions of the structure.)
%
%-----------------------------------------------------------------------------%
:- module version_types.
:- interface.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Copyright (C) 2004 The University of Melbourne.
% This file may only be copied under the terms of the GNU Library General
% Public License - see the file COPYING.LIB in the Mercury distribution.
% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
%-----------------------------------------------------------------------------%
% version_array.m
% Ralph Becket <rafe at cs.mu.oz.au>
% Wed Jan 21 15:44:04 EST 2004
%
% (See the header comments in version_types.m for an explanation of version
% types.)
%
% This module implements version arrays. A version array provides O(1)
% access and update for the "latest" version of the array. "Older"
% versions of the array incurr an O(k) penalty on accesses where k is
% the number of updates that have been made since.
%
% The advantage of version arrays is that in the common, singly threaded,
% case, they are almost as fast as unique arrays, but can be treated as
% ordinary ground values rather than unique values.
%
% Version arrays are zero based.
%
%-----------------------------------------------------------------------------%
:- module version_array.
:- interface.
:- import_module int.
:- import_module list.
:- type version_array(T).
% empty_array returns the empty array.
%
:- func empty = version_array(T).
% new(N, X) returns an array of size N with each item initialised
% to X.
%
:- func new(int, T) = version_array(T).
% A synonym for new/2.
%
:- func init(int, T) = version_array(T).
% version_array(Xs) returns an array constructed from the items in the list
% Xs.
%
:- func version_array(list(T)) = version_array(T).
% A ^ elem(I) = X iff the Ith member of A is X (the first item has
% index 0).
%
:- func version_array(T) ^ elem(int) = T.
% lookup(A, I) = A ^ elem(I).
%
:- func lookup(version_array(T), int) = T.
% (A ^ elem(I) := X) is a copy of array A with item I updated to be
% X. An exception is thrown if I is out of bounds.
%
:- func (version_array(T) ^ elem(int) := T) = version_array(T).
% Version of the above suitable for use with state variables.
%
:- pred set(int::in, T::in, version_array(T)::in, version_array(T)::out)
is det.
% size(A) = N if A contains N items (i.e. the valid indices for A
% range from 0 to N - 1).
%
:- func size(version_array(T)) = int.
% max(Z) = size(A) - 1.
%
:- func max(version_array(T)) = int.
% resize(A, N, X) returns a new array whose items from
% 0..min(size(A), N - 1) are taken from A and whose items
% from min(size(A), N - 1)..(N - 1) (if any) are initialised
% to X.
%
:- func resize(version_array(T), int, T) = version_array(T).
% Version of the above suitable for use with state variables.
%
:- pred resize(int::in, T::in, version_array(T)::in, version_array(T)::out)
is det.
% list(A) = Xs where Xs is the list of items in A
% (i.e. A = version_array(Xs)).
%
:- func list(version_array(T)) = list(T).
% foldl(F, A, X) is equivalent to list.foldl(F, list(A), Xs).
%
:- func foldl(func(T1, T2) = T2, version_array(T1), T2) = T2.
% foldr(F, A, X) is equivalent to list.foldr(F, list(A), Xs).
%
:- func foldr(func(T1, T2) = T2, version_array(T1), T2) = T2.
% copy(A) is a copy of array A. Access to the copy is O(1).
%
:- func copy(version_array(T)) = version_array(T).
% unsafe_rewind(A) produces a version of A for which all accesses are
% O(1). Invoking this predicate renders A and all later versions undefined
% that were derived by performing individual updates. Only use this when
% you are absolutely certain there are no live references to A or later
% versions of A.
%
:- func unsafe_rewind(version_array(T)) = version_array(T).
% A version of the above suitable for use with state variables.
%
:- pred unsafe_rewind(version_array(T)::in, version_array(T)::out) is det.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% The first implementation of version arrays used nb_references.
% This incurred three memory allocations for every update. This
% version works at a lower level, but only performs one allocation
% per update.
%-----------------------------------------------------------------------------%
:- implementation.
:- import_module array.
:- import_module require.
%-----------------------------------------------------------------------------%
init(N, X) = version_array.new(N, X).
%-----------------------------------------------------------------------------%
version_array([]) = version_array.empty.
version_array([X | Xs]) =
version_array_2(1, Xs, version_array.new(1 + length(Xs), X)).
:- func version_array_2(int, list(T), version_array(T)) = version_array(T).
version_array_2(_, [], VA) = VA.
version_array_2(I, [X | Xs], VA) =
version_array_2(I + 1, Xs, VA ^ elem(I) := X).
%-----------------------------------------------------------------------------%
VA ^ elem(I) =
( if get_if_in_range(VA, I, X)
then X
else func_error("version_array.elem: index out of range")
).
lookup(VA, I) = VA ^ elem(I).
%-----------------------------------------------------------------------------%
(VA0 ^ elem(I) := X) =
( if set_if_in_range(VA0, I, X, VA)
then VA
else func_error("version_array.'elem :=': index out of range")
).
set(I, X, VA, VA ^ elem(I) := X).
%-----------------------------------------------------------------------------%
max(VA) = size(VA) - 1.
%-----------------------------------------------------------------------------%
copy(VA) =
( if size(VA) = 0 then VA
else resize(VA, size(VA), VA ^ elem(0))
).
%-----------------------------------------------------------------------------%
list(VA) = foldr(list.cons, VA, []).
%-----------------------------------------------------------------------------%
foldl(F, VA, Acc) = foldl_2(F, VA, Acc, 0, size(VA)).
:- func foldl_2(func(T1, T2) = T2, version_array(T1), T2, int, int) = T2.
foldl_2(F, VA, Acc, Lo, Hi) =
( if Lo < Hi then foldl_2(F, VA, F(VA ^ elem(Lo), Acc), Lo + 1, Hi)
else Acc
).
%-----------------------------------------------------------------------------%
foldr(F, VA, Acc) = foldr_2(F, VA, Acc, size(VA) - 1).
:- func foldr_2(func(T1, T2) = T2, version_array(T1), T2, int) = T2.
foldr_2(F, VA, Acc, Hi) =
( if 0 =< Hi then foldr_2(F, VA, F(VA ^ elem(Hi), Acc), Hi - 1)
else Acc
).
%-----------------------------------------------------------------------------%
unsafe_rewind(VA, unsafe_rewind(VA)).
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Sordid stuff below this point...
:- pragma foreign_type("C", version_array(T), "struct va *")
where equality is eq_version_array,
comparison is cmp_version_array.
:- pred eq_version_array(version_array(T)::in, version_array(T)::in)
is semidet.
eq_version_array(VAa, VAb) :-
N = max(VAa),
N = max(VAb),
eq_version_array_2(N, VAa, VAb).
:- pred eq_version_array_2(int::in,
version_array(T)::in, version_array(T)::in) is semidet.
eq_version_array_2(I, VAa, VAb) :-
( if I >= 0 then
VAa ^ elem(I) = VAb ^ elem(I),
eq_version_array_2(I - 1, VAa, VAb)
else
true
).
:- pred cmp_version_array(comparison_result::uo,
version_array(T)::in, version_array(T)::in) is det.
cmp_version_array(R, VAa, VAb) :-
N = min(max(VAa), max(VAb)),
cmp_version_array_2(N, VAa, VAb, R).
:- pred cmp_version_array_2(int::in,
version_array(T)::in, version_array(T)::in, comparison_result::uo)
is det.
cmp_version_array_2(I, VAa, VAb, R) :-
( if I >= 0 then
compare(R0, VAa ^ elem(I), VAb ^ elem(I)),
( if R0 = (=)
then cmp_version_array_2(I - 1, VAa, VAb, R)
else R = R0
)
else
R = (=)
).
:- pragma foreign_proc("C", version_array.empty = (VA::out),
[will_not_call_mercury, promise_pure],
"
VA = va_new_empty();
").
:- pragma foreign_proc("C", version_array.new(N::in, X::in) = (VA::out),
[will_not_call_mercury, promise_pure],
"
VA = va_new(N, X);
").
:- pragma foreign_proc("C",
resize(VA0::in, N::in, X::in) = (VA::out),
[will_not_call_mercury, promise_pure],
"
VA = va_resize(VA0, N, X);
").
resize(N, X, VA, resize(VA, N, X)).
:- pragma foreign_proc("C", size(VA::in) = (N::out),
[will_not_call_mercury, promise_pure],
"
N = va_size(VA);
").
:- pred get_if_in_range(version_array(T)::in, int::in, T::out) is semidet.
:- pragma foreign_proc("C", get_if_in_range(VA::in, I::in, X::out),
[will_not_call_mercury, promise_pure],
"
SUCCESS_INDICATOR = va_get(VA, I, &X);
").
:- pred set_if_in_range(version_array(T)::in, int::in, T::in,
version_array(T)::out) is semidet.
:- pragma foreign_proc("C", set_if_in_range(VA0::in, I::in, X::in, VA::out),
[will_not_call_mercury, promise_pure],
"
SUCCESS_INDICATOR = va_set(VA0, I, X, &VA);
").
:- pragma foreign_proc("C", unsafe_rewind(VA0::in) = (VA::out),
[will_not_call_mercury, promise_pure],
"
VA = va_rewind(VA0);
").
:- pragma foreign_decl("C", "
/*
** If index is -1 then value is undefined and rest is the latest
** array value.
**
** Otherwise value is the overwritten value at index and rest is
** a pointer to the next version in the chain.
*/
struct va {
MR_Integer index; /* -1 for latest, >= 0 for older */
MR_Word value; /* Valid if index >= 0 */
union {
MR_ArrayPtr array;/* Valid if index == -1 */
struct va *next; /* Valid if index >= 0 */
} rest;
};
/*
** Constructs a new empty version array.
*/
struct va *
va_new_empty(void);
/*
** Constructs a new populated version array.
*/
struct va *
va_new(MR_Integer, MR_Word);
/*
** Resizes a version array, populating new items with the
** given default value. The result is always a `latest'
** version.
*/
struct va *
va_resize(struct va *, MR_Integer, MR_Word);
/*
** Returns the number of items in a version array.
*/
MR_Integer
va_size(struct va *);
/*
** If I is in range then va_get(VA, I, &X) sets X to the Ith item
** in VA (counting from zero) and returns MR_TRUE. Otherwise it
** returns MR_FALSE.
*/
int
va_get(struct va *, MR_Integer, MR_Word *);
/*
** If I is in range then va_set(VA0, I, X, VA) sets VA to be VA0
** updated with the Ith item as X (counting from zero) and
returns MR_TRUE. Otherwise it returns MR_FALSE.
*/
int
va_set(struct va *, MR_Integer, MR_Word, struct va **);
/*
** `Rewinds' a version array, invalidating all extant successors
** including the argument.
*/
struct va*
va_rewind(struct va *);
").
:- pragma foreign_code("C", "
#define va_latest_version(VA) ((VA)->index == -1)
struct va *
va_new_empty(void) {
struct va *VA = MR_GC_NEW(struct va);
VA->index = -1;
VA->value = (MR_Word) NULL;
VA->rest.array = (MR_ArrayPtr) MR_GC_NEW_ARRAY(MR_Word, 1);
VA->rest.array->size = 0;
return VA;
}
struct va *
va_new(MR_Integer N, MR_Word X) {
MR_Integer i;
struct va *VA = MR_GC_NEW(struct va);
VA->index = -1;
VA->value = (MR_Word) NULL;
VA->rest.array = (MR_ArrayPtr) MR_GC_NEW_ARRAY(MR_Word, N + 1);
VA->rest.array->size = N;
for(i = 0; i < N; i++) {
VA->rest.array->elements[i] = X;
}
return VA;
}
struct va *
va_resize(struct va *VA0, MR_Integer N, MR_Word X) {
MR_Integer i;
MR_Integer size_VA0 = va_size(VA0);
MR_Integer min = (N <= size_VA0 ? N : size_VA0);
struct va *VA = MR_GC_NEW(struct va);
VA->index = -1;
VA->value = (MR_Word) NULL;
VA->rest.array = (MR_ArrayPtr) MR_GC_NEW_ARRAY(MR_Word, N + 1);
VA->rest.array->size = N;
for(i = 0; i < min; i++) {
(void) va_get(VA0, i, &VA->rest.array->elements[i]);
}
for(i = min; i < N; i++) {
VA->rest.array->elements[i] = X;
}
return VA;
}
MR_Integer
va_size(struct va *VA) {
while(!va_latest_version(VA)) {
VA = VA->rest.next;
}
return VA->rest.array->size;
}
int
va_get(struct va *VA, MR_Integer I, MR_Word *Xptr) {
while(!va_latest_version(VA)) {
if(I == VA->index) {
*Xptr = VA->value;
return MR_TRUE;
}
VA = VA->rest.next;
}
if(0 <= I && I < VA->rest.array->size) {
*Xptr = VA->rest.array->elements[I];
return MR_TRUE;
} else {
return MR_FALSE;
}
}
int
va_set(struct va *VA0, MR_Integer I, MR_Word X, struct va **VAptr) {
struct va *VA1 = MR_GC_NEW(struct va);
if(va_latest_version(VA0)) {
if(I < 0 || I >= VA0->rest.array->size) {
return MR_FALSE;
}
VA1->index = -1;
VA1->value = (MR_Word) NULL;
VA1->rest.array = VA0->rest.array;
VA0->index = I;
VA0->value = VA0->rest.array->elements[I];
VA0->rest.next = VA1;
VA1->rest.array->elements[I] = X;
} else {
if(I >= va_size(VA0)) {
return MR_FALSE;
}
VA1->index = I;
VA1->value = X;
VA1->rest.next = VA0;
}
*VAptr = VA1;
return MR_TRUE;
}
struct va*
va_rewind(struct va *VA) {
MR_Integer I;
MR_Word X;
if(va_latest_version(VA)) {
return VA;
}
I = VA->index;
X = VA->value;
VA = va_rewind(VA->rest.next);
VA->index = I;
VA->value = X;
return VA;
}
").
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% version_store.m
% Ralph Becket <rafe at cs.mu.oz.au>
% Thu Jan 22 12:01:19 EST 2004
% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
%
% (See the header comments in version_types.m for an explanation of version
% types.)
%
% A version store - this operates with similar efficiency to an ordinary
% store, but "older" versions of the store are still available, but work
% less efficiently as more updates are made to the "latest" version of
% the store. Operations on the "latest" version are always O(1).
%
% The advantage of version stores is that they are ordinary ground terms
% and can therefore be nested and so forth without the need for complicated
% insts.
%
%-----------------------------------------------------------------------------%
:- module version_store.
:- interface.
:- type version_store(S).
:- type mutvar(T, S).
% Construct a new version store. This is distinguised from other
% version stores by its existentially quantified type. This means
% the compiler can automatically detect any attempt to use a
% mutvar with the wrong version store.
%
:- some [S] func new = version_store(S).
% new_mutvar(X, Mutvar, VS0, VS) adds a new mutvar with value reference X
% to the version store.
%
:- pred new_mutvar(T::in, mutvar(T, S)::out,
version_store(S)::in, version_store(S)::out) is det.
% new_cyclic_mutvar(F, Mutvar, VS0, VS) adds a new mutvar with value
% reference F(Mutvar) to the version store. This can be used to
% construct cyclic terms.
%
:- pred new_cyclic_mutvar((func(mutvar(T, S)) = T)::in, mutvar(T, S)::out,
version_store(S)::in, version_store(S)::out) is det.
% copy_mutvar(Mutvar, NewMutvar, VS0, VS) constructs NewMutvar
% with the value reference as Mutvar.
%
:- pred copy_mutvar(mutvar(T, S)::in, mutvar(T, S)::out,
version_store(S)::in, version_store(S)::out) is det.
% VS ^ elem(Mutvar) returns the element referenced by Mutvar in
% the version store.
%
:- func version_store(S) ^ elem(mutvar(T, S)) = T.
% lookup(VS, Mutvar) = VS ^ elem(Mutvar).
%
:- func lookup(version_store(S), mutvar(T, S)) = T.
% A version of the above suitable for use with state variables.
%
:- pred get_mutvar(mutvar(T, S)::in, T::out,
version_store(S)::in, version_store(S)::out) is det.
% ( VS ^ elem(Mutvar) := X ) updates the version store so that
% Mutvar now refers to value X.
%
:- func ( version_store(S) ^ elem(mutvar(T, S)) := T ) = version_store(S).
% set(VS, Mutvar, X) = ( VS ^ elem(Mutvar) := X ).
%
:- func set(version_store(S), mutvar(T, S), T) = version_store(S).
% A version of the above suitable for use with state variables.
%
:- pred set_mutvar(mutvar(T, S)::in, T::in,
version_store(S)::in, version_store(S)::out) is det.
% unsafe_rewind(VS) produces a version of VS for which all accesses are
% O(1). Invoking this predicate renders VS and all later versions
% undefined that were derived by performing individual updates. Only use
% this when you are absolutely certain there are no live references to VS
% or later versions of VS.
%
:- func unsafe_rewind(version_store(T)) = version_store(T).
% VS version of the above suitable for use with state variables.
%
:- pred unsafe_rewind(version_store(T)::in, version_store(T)::out) is det.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
:- implementation.
:- import_module counter.
:- import_module int.
:- import_module list.
:- import_module std_util.
:- import_module version_array.
% Index 0 of the version_store contains the counter used to assign
% new version store mutvars. A mutvar is just an index into the
% version_store.
%
:- type version_store(S) ---> version_store(version_array(univ)).
:- type mutvar(T, S) ---> mutvar(int).
:- type some_version_store_type ---> some_version_store_type.
%-----------------------------------------------------------------------------%
new = version_store(VA) `with_type` version_store(some_version_store_type) :-
% 256 is just a magic number. The version_store is resized by
% doubling if necessary when adding a new mutvar. Index 0 of
% the version_store holds a counter for allocating new mutvars.
%
VA = version_array.new(256, univ(counter.init(1) `with_type` counter)).
%-----------------------------------------------------------------------------%
new_mutvar(X, Mutvar, VS0, VS) :-
new_cyclic_mutvar(func(_) = X, Mutvar, VS0, VS).
%-----------------------------------------------------------------------------%
new_cyclic_mutvar(F, Mutvar, VS0, VS) :-
Counter0 = VS0 ^ elem(mutvar(0)),
counter.allocate(I, Counter0, Counter),
Mutvar = mutvar(I),
Size0 = size(VS0),
VS1 = ( if I >= Size0 then resize(VS0, Size0 + Size0)
else VS0 ),
VS = (( VS1 ^ elem(mutvar(0)) := Counter )
^ elem(Mutvar ) := F(Mutvar) ).
:- func size(version_store(S)) = int.
size(version_store(VA)) = size(VA).
:- func resize(version_store(S), int) = version_store(S).
resize(version_store(VA), N) = version_store(resize(VA, N, univ(unit))).
%-----------------------------------------------------------------------------%
copy_mutvar(Mutvar0, Mutvar, VS0, VS) :-
X = VS0 ^ elem(Mutvar0),
new_mutvar(X, Mutvar, VS0, VS).
%-----------------------------------------------------------------------------%
version_store(VA) ^ elem(mutvar(I)) = X :-
UnivX = lookup(VA, I),
det_univ_to_type(UnivX, X).
lookup(VS, Mutvar) = VS ^ elem(Mutvar).
get_mutvar(Mutvar, VS ^ elem(Mutvar), VS, VS).
%-----------------------------------------------------------------------------%
( version_store(VA) ^ elem(mutvar(I)) := X ) =
version_store(VA ^ elem(I) := univ(X)).
set(VS, Mutvar, X) = ( VS ^ elem(Mutvar) := X ).
set_mutvar(Mutvar, X, VS, VS ^ elem(Mutvar) := X).
%-----------------------------------------------------------------------------%
unsafe_rewind(version_store(VA)) = version_store(unsafe_rewind(VA)).
unsafe_rewind(VS, unsafe_rewind(VS)).
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Copyright (C) 2001-2002 The University of Melbourne
% This file may only be copied under the terms of the GNU Library General
% Public License - see the file COPYING.LIB in the Mercury distribution.
% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
%-----------------------------------------------------------------------------%
% version_bitmap.m
% Ralph Becket <rafe at cs.mu.oz.au>
%
% (See the header comments in version_types.m for an explanation of version
% types.)
%
% Version bitmaps: an implementation of bitmaps using version arrays.
%
% The advantage of version bitmaps is that in the common, singly threaded,
% case, they are almost as fast as unique bitmaps, but can be treated as
% ordinary ground values rather than unique values.
%
%-----------------------------------------------------------------------------%
:- module version_bitmap.
:- interface.
:- import_module int, bool.
:- type version_bitmap.
% new(N, B) creates a version_bitmap of size N (indexed 0 .. N-1)
% setting each bit if B = yes and clearing each bit if B = no.
% An exception is thrown if N is negative.
%
:- func new(int, bool) = version_bitmap.
% Returns the number of bits in a version_bitmap.
%
:- func num_bits(version_bitmap) = int.
% set(BM, I), clear(BM, I) and flip(BM, I) set, clear and flip
% bit I in BM respectively. An exception is thrown if I is out
% of range.
%
:- func set(version_bitmap, int) = version_bitmap.
:- func clear(version_bitmap, int) = version_bitmap.
:- func flip(version_bitmap, int) = version_bitmap.
% Versions of the above suitable for use with state variables.
%
:- pred set(int::in, version_bitmap::in, version_bitmap::out) is det.
:- pred clear(int::in, version_bitmap::in, version_bitmap::out) is det.
:- pred flip(int::in, version_bitmap::in, version_bitmap::out) is det.
% is_set(BM, I) and is_clear(BM, I) succeed iff bit I in BM
% is set or clear respectively.
%
:- pred is_set(version_bitmap, int).
:- mode is_set(in, in) is semidet.
:- pred is_clear(version_bitmap, int).
:- mode is_clear(in, in) is semidet.
% get(BM, I) returns `yes' if is_set(BM, I) and `no' otherwise.
%
:- func get(version_bitmap, int) = bool.
% Create a new copy of a version_bitmap.
%
:- func copy(version_bitmap) = version_bitmap.
% Set operations; the second argument is altered in all cases.
%
:- func complement(version_bitmap) = version_bitmap.
:- func union(version_bitmap, version_bitmap) = version_bitmap.
:- func intersect(version_bitmap, version_bitmap) = version_bitmap.
:- func difference(version_bitmap, version_bitmap) = version_bitmap.
% resize(BM, N, B) resizes version_bitmap BM to have N bits; if N is
% smaller than the current number of bits in BM then the excess
% are discarded. If N is larger than the current number of bits
% in BM then the new bits are set if B = yes and cleared if
% B = no.
%
:- func resize(version_bitmap, int, bool) = version_bitmap.
% Version of the above suitable for use with state variables.
%
:- pred resize(int::in, bool::in, version_bitmap::in, version_bitmap::out)
is det.
% unsafe_rewind(B) produces a version of B for which all accesses are
% O(1). Invoking this predicate renders B and all later versions undefined
% that were derived by performing individual updates. Only use this when
% you are absolutely certain there are no live references to B or later
% versions of B.
%
:- func unsafe_rewind(version_bitmap) = version_bitmap.
% A version of the above suitable for use with state variables.
%
:- pred unsafe_rewind(version_bitmap::in, version_bitmap::out) is det.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
:- implementation.
:- import_module version_array, exception, require.
% A version_bitmap is represented as an array of ints where each int stores
% int.bits_per_int bits. The first element of the array (index 0)
% is used to hold the number of bits in the version_bitmap. This avoids
% having to create a new version_bitmap cell on each update.
%
% NOTE: the `filler' bits in the last element of the array *must*
% be clear (i.e. zero). This makes the set operations simpler to
% implement.
%
:- type version_bitmap == version_array(int).
%-----------------------------------------------------------------------------%
new(N, B) = BM :-
( if N < 0 then
throw(software_error("version_bitmap.new: negative size"))
else
X = initializer(B),
BM0 = (version_array.init(num_ints_required(N), X) ^ elem(0) := N),
BM = clear_filler_bits(BM0)
).
%-----------------------------------------------------------------------------%
resize(BM0, N, B) = BM :-
( if N =< 0 then
BM = new(N, B)
else
X = initializer(B),
NumInts = num_ints_required(N),
BM1 = version_array.resize(BM0, NumInts, X),
% Now we need to ensure that bits N, N+1, N+2, ... up to
% the word boundary are initialized properly.
%
int.min(num_bits(BM0), N, M),
Offset = int_offset(M - 1),
Mask = bitsmask(M - 1), % For bits we need to preserve.
Bits = \(Mask) /\ X, % Bits we need to fill in.
BM2 = (( BM1
^ elem(0) := N )
^ elem(Offset) := (BM1 ^ elem(Offset) /\ Mask) \/ Bits),
BM = clear_filler_bits(BM2)
).
resize(N, B, BM, resize(BM, N, B)).
%-----------------------------------------------------------------------------%
:- func clear_filler_bits(version_bitmap) = version_bitmap.
clear_filler_bits(BM0) = BM :-
N = num_bits(BM0),
( if N > 0 then
Last = int_offset(N - 1), % Offset of last bit.
Ksam = bitsmask(N - 1), % Masks off the filler bits.
BM = BM0 ^ elem(Last) := BM0 ^ elem(Last) /\ Ksam
else
BM = BM0
).
%-----------------------------------------------------------------------------%
:- func initializer(bool) = int.
initializer(no) = 0.
initializer(yes) = \(0).
%-----------------------------------------------------------------------------%
num_bits(BM) = BM ^ elem(0).
%-----------------------------------------------------------------------------%
:- pred in_range(version_bitmap, int).
:- mode in_range(in, in) is semidet.
in_range(BM, I) :- 0 =< I, I < num_bits(BM).
%-----------------------------------------------------------------------------%
set(BM, I) =
( if in_range(BM, I)
then BM ^ elem(int_offset(I)) :=
BM ^ elem(int_offset(I)) \/ bitmask(I)
else throw(software_error("version_bitmap.set: out of range"))
).
clear(BM, I) =
( if in_range(BM, I)
then BM ^ elem(int_offset(I)) :=
BM ^ elem(int_offset(I)) /\ \bitmask(I)
else throw(software_error("version_bitmap.clear: out of range"))
).
flip(BM, I) =
( if in_range(BM, I)
then BM ^ elem(int_offset(I)) :=
BM ^ elem(int_offset(I)) `xor` bitmask(I)
else throw(software_error("version_bitmap.flip: out of range"))
).
%-----------------------------------------------------------------------------%
set(I, BM, set(BM, I)).
clear(I, BM, clear(BM, I)).
flip(I, BM, flip(BM, I)).
%-----------------------------------------------------------------------------%
is_set(BM, I) :-
( if in_range(BM, I)
then BM ^ elem(int_offset(I)) /\ bitmask(I) \= 0
else throw(software_error("version_bitmap.is_set: out of range"))
).
is_clear(BM, I) :-
( if in_range(BM, I)
then BM ^ elem(int_offset(I)) /\ bitmask(I) = 0
else throw(software_error("version_bitmap.is_clear: out of range"))
).
%-----------------------------------------------------------------------------%
get(BM, I) = ( if is_clear(BM, I) then no else yes ).
%-----------------------------------------------------------------------------%
copy(BM) = version_array.copy(BM).
%-----------------------------------------------------------------------------%
complement(BM) =
clear_filler_bits(complement_2(BM ^ elem(0) - 1, BM)).
:- func complement_2(int, version_bitmap) = version_bitmap.
complement_2(WordI, BM) =
( if WordI =< 0
then BM
else complement_2(WordI - 1, BM ^ elem(WordI) := \(BM ^ elem(WordI)))
).
%-----------------------------------------------------------------------------%
union(BMa, BMb) =
( if num_bits(BMa) > num_bits(BMb) then
zip(int_offset(num_bits(BMb) - 1), (\/), BMb, version_bitmap.copy(BMa))
else
zip(int_offset(num_bits(BMa) - 1), (\/), BMa, BMb)
).
%-----------------------------------------------------------------------------%
intersect(BMa, BMb) =
( if num_bits(BMa) > num_bits(BMb) then
zip(int_offset(num_bits(BMb) - 1), (/\), BMb, version_bitmap.copy(BMa))
else
zip(int_offset(num_bits(BMa) - 1), (/\), BMa, BMb)
).
%-----------------------------------------------------------------------------%
difference(BMa, BMb) =
( if num_bits(BMa) > num_bits(BMb) then
zip(int_offset(num_bits(BMb) - 1), Xor, BMb, version_bitmap.copy(BMa))
else
zip(int_offset(num_bits(BMa) - 1), Xor, BMa, BMb)
)
:-
Xor = ( func(X, Y) = (X `xor` Y) ).
%-----------------------------------------------------------------------------%
% Applies a function to every corresponding element between +ve I
% and 1 inclusive, destructively updating the second version_bitmap.
%
:- func zip(int, func(int, int) = int, version_bitmap, version_bitmap) =
version_bitmap.
zip(I, Fn, BMa, BMb) =
( if I > 0 then
zip(I - 1, Fn, BMa, BMb ^ elem(I) := Fn(BMb ^ elem(I), BMa ^ elem(I)))
else
BMb
).
%-----------------------------------------------------------------------------%
% The size of the version_array required to hold an N-bit version_bitmap.
%
:- func num_ints_required(int) = int.
% We add the 1 on because version_arrays of size N are indexed 0 .. N - 1.
%
num_ints_required(N) = 1 + ( if N > 0 then int_offset(N) else 0 ).
%-----------------------------------------------------------------------------%
% The version_array index containing the given bit.
%
:- func int_offset(int) = int.
% We add the extra 1 on because elem(0) is used to store the number of
% bits in the version_bitmap; the data are stored in the following
% elements.
%
int_offset(I) = 1 + int.quot_bits_per_int(I).
%-----------------------------------------------------------------------------%
% Construct the bitmask for a given bit in a word.
%
% E.g. assuming int.bits_per_int = 8 and I = 11 then
% bitmask(I) = 2'00001000
%
:- func bitmask(int) = int.
% NOTE: it would be nicer to use /\ with a bitmask here rather
% than rem. Do modern back-ends do the decent thing here if
% int.bits_per_int is the expected power of two?
%
bitmask(I) = 1 `unchecked_left_shift` int.rem_bits_per_int(I).
%-----------------------------------------------------------------------------%
% Construct the bitmask for all the bits up to and including
% the given bit in a word.
%
% E.g. assuming int.bits_per_int = 8 and I = 11 then
% bitmask(I) = 2'00001111
%
:- func bitsmask(int) = int.
bitsmask(I) = BitsMask :-
BitMask = bitmask(I),
BitsMask = BitMask \/ (BitMask - 1).
%-----------------------------------------------------------------------------%
unsafe_rewind(BM) = version_array.unsafe_rewind(BM).
unsafe_rewind(BM, version_bitmap.unsafe_rewind(BM)).
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
% Copyright (C) 2001, 2003 The University of Melbourne
% This file may only be copied under the terms of the GNU Library General
% Public License - see the file COPYING.LIB in the Mercury distribution.
% vim: ft=mercury ts=4 sw=4 et tw=0 wm=0
%-----------------------------------------------------------------------------%
%
% File: version_hash_table.m
% Main author: rafe
% Stability: low
%
% (See the header comments in version_types.m for an explanation of version
% types.)
%
% Version hash tables. The "latest" version of the hash table
% provides roughly the same performance as the unique hash table
% implementation. "Older" versions of the hash table are still
% accessible, but will incurr a growing performance penalty as
% more updates are made to the hash table.
%
%-----------------------------------------------------------------------------%
:- module version_hash_table.
:- interface.
:- import_module int, assoc_list, float, string, char.
:- type version_hash_table(K, V).
:- type hash_pred(K) == ( pred(K, int, int) ).
:- inst hash_pred == ( pred(in, out, out) is det ).
% new(HashFunc, N, MaxOccupancy)
% constructs a new hash table with initial size 2 ^ N that is
% doubled whenever MaxOccupancy is achieved; elements are
% indexed using HashFunc.
%
% HashFunc must compute two *independent* hashes for a given
% key - that is, one hash should not be a function of the other.
% Otherwise poor performance will result.
%
% N must be greater than 1.
% MaxOccupancy must be in (0.0, 1.0).
%
% XXX Values too close to the limits may cause bad things
% to happen.
%
:- func new(hash_pred(K)::in(hash_pred), int::in, float::in) =
(version_hash_table(K, V)::out) is det.
% new_default(HashFn) constructs a hash table with default size and
% occupancy arguments.
%
:- func new_default(hash_pred(K)::in(hash_pred)) =
(version_hash_table(K, V)::out) is det.
% Retrieve the hash_pred associated with a hash table.
%
% :- func hash_pred(version_hash_table(K, V)) = hash_pred(K).
% Default hash_preds for ints and strings and everything (buwahahaha!)
%
:- pred int_double_hash `with_type` hash_pred(int) `with_inst` hash_pred.
:- pred string_double_hash `with_type` hash_pred(string) `with_inst` hash_pred.
:- pred char_double_hash `with_type` hash_pred(char) `with_inst` hash_pred.
:- pred float_double_hash `with_type` hash_pred(float) `with_inst` hash_pred.
:- pred generic_double_hash `with_type` hash_pred(T) `with_inst` hash_pred.
% Returns the number of buckets in a hash table.
%
:- func num_buckets(version_hash_table(K, V)) = int.
% Returns the number of occupants in a hash table.
%
:- func num_occupants(version_hash_table(K, V)) = int.
% Insert key-value binding into a hash table; if one is
% already there then the previous value is overwritten.
%
:- func set(version_hash_table(K, V), K, V) = version_hash_table(K, V).
% A version of the above suitable for use with state variables.
%
:- pred set(K::in, V::in,
version_hash_table(K, V)::in, version_hash_table(K, V)::out)
is det.
% Field update for hash tables.
% HT ^ elem(K) := V is equivalent to set(HT, K, V).
%
:- func 'elem :='(K, version_hash_table(K, V), V) = version_hash_table(K, V).
% Insert a key-value binding into a hash table. An
% exception is thrown if a binding for the key is already
% present.
%
:- func det_insert(version_hash_table(K, V), K, V) = version_hash_table(K, V).
% A version of the above suitable for use with state variables.
%
:- pred det_insert(K::in, V::in,
version_hash_table(K, V)::in, version_hash_table(K, V)::out)
is det.
% Change a key-value binding in a hash table. An
% exception is thrown if a binding for the key does not
% already exist.
%
:- func det_update(version_hash_table(K, V), K, V) = version_hash_table(K, V).
% A version of the above suitable for use with state variables.
%
:- pred det_update(K::in, V::in,
version_hash_table(K, V)::in, version_hash_table(K, V)::out)
is det.
% Delete the entry for the given key, leaving the hash table
% unchanged if there is no such entry.
%
:- func delete(version_hash_table(K, V), K) = version_hash_table(K, V).
% A version of the above suitable for use with state variables.
%
:- pred delete(K::in,
version_hash_table(K, V)::in, version_hash_table(K, V)::out)
is det.
% Lookup the value associated with the given key. An exception
% is raised if there is no entry for the key.
%
:- func lookup(version_hash_table(K, V), K) = V.
% Field access for hash tables.
% HT ^ elem(K) is equivalent to lookup(HT, K).
%
:- func version_hash_table(K, V) ^ elem(K) = V.
% Like lookup, but just fails if there is no entry for the key.
%
:- func search(version_hash_table(K, V), K) = V is semidet.
:- pred search(version_hash_table(K, V)::in, K::in, V::out) is semidet.
% Convert a hash table into an association list.
%
:- func to_assoc_list(version_hash_table(K, V)) = assoc_list(K, V).
% Fold a function over the key-value bindings in a hash table.
%
:- func fold(func(K, V, T) = T, version_hash_table(K, V), T) = T.
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
:- implementation.
:- import_module math, bool, exception, list, require, std_util, array.
:- import_module version_array.
:- type version_hash_table(K, V)
---> ht(
num_buckets :: int,
num_occupants :: int,
max_occupants :: int,
hash_func :: hash_func(K),
buckets :: buckets(K, V)
).
:- type hash_func(K) == (func(K) = {int, int}).
:- type buckets(K, V) == version_array(bucket(K, V)).
:- type bucket(K, V)
---> empty
; full(K, V).
%-----------------------------------------------------------------------------%
% THE HASHING SCHEME
%
% The user provided hashing function computes two independent
% hashes H1 and H2 for a given key K.
%
% We calculate D = 2*H2 + 1 (to ensure that D is non-zero and
% odd and is therefore coprime to the number of buckets) and
% probe the table where the Kth probe examines slot
%
% (H1 + K * H2) `mod` num_buckets
%
% The search is guaranteed to terminate because table occupancy
% must be less than 1.0.
%-----------------------------------------------------------------------------%
new(HashPred, N, MaxOccupancy) = HT :-
( if N =< 1 then
throw(software_error("version_hash_table__new: N =< 1"))
else if N >= int__bits_per_int then
throw(software_error(
"version_hash_table__new: N >= int__bits_per_int"))
else if MaxOccupancy =< 0.0 ; 1.0 =< MaxOccupancy then
throw(software_error(
"version_hash_table__new: MaxOccupancy not in (0.0, 1.0)"))
else
NumBuckets = 1 << N,
MaxOccupants = ceiling_to_int(float(NumBuckets) * MaxOccupancy),
HashFunc = (func(X) = {I, J} :- HashPred(X, I, J)),
VArray = version_array.init(NumBuckets, empty),
HT = ht(NumBuckets, 0, MaxOccupants, HashFunc, VArray)
).
%-----------------------------------------------------------------------------%
% These numbers are picked out of thin air.
%
new_default(HashPred) = new(HashPred, 7, 0.9).
%-----------------------------------------------------------------------------%
% find_slot(HT, K) looks up key K in hash table HT and
% returns the index for the entry K in H. This is either the
% first free slot identified (K is not in the table, but here
% is where it would go) or the slot for K (K is in the table
% and this is its slot).
%
% Whether or not K is actually in the table can be decided
% by checking to see whether its bit in the vbitmap is set
% or clear.
%
:- func find_slot(version_hash_table(K, V), K) = int.
find_slot(HT, K) = H :-
{Hash1, Hash2} = (HT ^ hash_func)(K),
H0 = Hash1 mod HT ^ num_buckets,
% Have to ensure it's odd and non-zero.
Delta = Hash2 + Hash2 + 1,
H = find_slot_2(HT, K, H0, Delta).
:- func find_slot_2(version_hash_table(K, V), K, int, int) = int.
find_slot_2(HT, K, H0, Delta) = H :-
B = HT ^ buckets ^ elem(H0),
(
B = empty,
H = H0
;
B = full(Key, _Value),
( if K = Key then
H = H0
else
H1 = (H0 + Delta) mod HT ^ num_buckets,
H = find_slot_2(HT, K, H1, Delta)
)
).
%-----------------------------------------------------------------------------%
set(HT0, K, V) = HT :-
% If this is the first entry in the hash table, then we use it to
% set up the hash table (the version_arrays are currently empty because
% we need values to initialise them with).
%
H = find_slot(HT0, K),
B = HT0 ^ buckets ^ elem(H),
(
B = empty,
HT =
( if HT0 ^ num_occupants = HT0 ^ max_occupants then
set(expand(HT0), K, V)
else
(( HT0 ^ num_occupants := HT0 ^ num_occupants + 1 )
^ buckets ^ elem(H) := full(K, V) )
)
;
B = full(_, _),
HT = ( HT0 ^ buckets ^ elem(H) := full(K, V) )
).
'elem :='(K, HT, V) = set(HT, K, V).
set(K, V, HT, set(HT, K, V)).
%-----------------------------------------------------------------------------%
search(HT, K, search(HT, K)).
search(HT, K) = V :-
H = find_slot(HT, K),
HT ^ buckets ^ elem(H) = full(K, V).
%-----------------------------------------------------------------------------%
det_insert(HT0, K, V) = HT :-
H = find_slot(HT0, K),
B = HT0 ^ buckets ^ elem(H),
(
B = full(_, _),
error("version_hash_table__det_update: key already present")
;
B = empty,
HT =
( if HT0 ^ num_occupants = HT0 ^ max_occupants then
set(expand(HT0), K, V)
else
(( HT0 ^ num_occupants := HT0 ^ num_occupants + 1 )
^ buckets ^ elem(H) := full(K, V) )
)
).
det_insert(K, V, HT, det_insert(HT, K, V)).
%-----------------------------------------------------------------------------%
det_update(HT0, K, V) = HT :-
H = find_slot(HT0, K),
B = HT0 ^ buckets ^ elem(H),
(
B = empty,
error("version_hash_table__det_update: key not found")
;
B = full(_, _),
HT = ( HT0 ^ buckets ^ elem(H) := full(K, V) )
).
det_update(K, V, HT, det_update(HT, K, V)).
%-----------------------------------------------------------------------------%
lookup(HT, K) =
( if V = search(HT, K)
then V
else func_error("version_hash_table__lookup: key not found")
).
HT ^ elem(K) = lookup(HT, K).
%-----------------------------------------------------------------------------%
delete(HT, K) =
HT ^ buckets ^ elem(find_slot(HT, K)) := empty.
delete(K, HT, delete(HT, K)).
%-----------------------------------------------------------------------------%
to_assoc_list(HT) =
fold_up(cons_k_v(HT ^ buckets), 0, HT ^ num_buckets - 1, []).
:- func cons_k_v(version_array(bucket(K, V)), int, assoc_list(K, V)) =
assoc_list(K, V).
cons_k_v(Bs, I, KVs) =
( if Bs ^ elem(I) = full(K, V)
then [K - V | KVs]
else KVs
).
%-----------------------------------------------------------------------------%
% Hash tables expand by doubling in size.
%
:- func expand(version_hash_table(K, V)) = version_hash_table(K, V).
expand(HT0) = HT :-
HT0 = ht(NBs0, _NOs, MOs0, HF, Bs0),
NBs = NBs0 + NBs0,
MOs = MOs0 + MOs0,
Bs1 = version_array.init(NBs, empty),
HT1 = ht(NBs, 0, MOs, HF, Bs1),
HT = fold_up(reinsert_k_v(Bs0), 0, NBs0 - 1, HT1).
:- func reinsert_k_v(buckets(K, V), int, version_hash_table(K, V)) =
version_hash_table(K, V).
reinsert_k_v(Bs, I, HT) =
( if Bs ^ elem(I) = full(K, V)
then HT ^ elem(K) := V
else HT
).
%-----------------------------------------------------------------------------%
% There are almost certainly better ones out there...
%
% NOTE that H1 \= N since neither of H1 or H2 should be a function
% of the other under machine integer arithmetic.
%
int_double_hash(N, H1, H2) :-
H1 = N * N,
H2 = N `xor` (N + N).
%-----------------------------------------------------------------------------%
% There are almost certainly better ones out there...
%
string_double_hash(S, H1, H2) :-
H1 = string__hash(S),
H2 = string__foldl(func(C, N) = char__to_int(C) + N, S, 0).
%------------------------------------------------------------------------------%
% There are almost certainly better ones out there...
%
float_double_hash(F, H1, H2) :-
H1 = float__hash(F),
H2 = float__hash(F * F).
%------------------------------------------------------------------------------%
% There are almost certainly better ones out there...
%
char_double_hash(C, H1, H2) :-
int_double_hash(char__to_int(C), H1, H2).
%-----------------------------------------------------------------------------%
% This, again, is straight off the top of my head.
%
generic_double_hash(T, Ha, Hb) :-
( if dynamic_cast(T, Int) then
int_double_hash(Int, Ha, Hb)
else if dynamic_cast(T, String) then
string_double_hash(String, Ha, Hb)
else if dynamic_cast(T, Float) then
float_double_hash(Float, Ha, Hb)
else if dynamic_cast(T, Char) then
char_double_hash(Char, Ha, Hb)
else if dynamic_cast(T, Univ) then
generic_double_hash(univ_value(Univ), Ha, Hb)
else if dynamic_cast_to_array(T, Array) then
{Ha, Hb} =
array__foldl(
( func(X, {HA0, HB0}) = {HA, HB} :-
generic_double_hash(X, HXA, HXB),
double_munge(HXA, HA0, HA, HXB, HB0, HB)
),
Array,
{0, 0}
)
else
deconstruct(T, FunctorName, Arity, Args),
string_double_hash(FunctorName, Ha0, Hb0),
double_munge(Arity, Ha0, Ha1, Arity, Hb0, Hb1),
list__foldl2(
( pred(U::in, HA0::in, HA::out, HB0::in, HB::out) is det :-
generic_double_hash(U, HUA, HUB),
double_munge(HUA, HA0, HA, HUB, HB0, HB)
),
Args,
Ha1, Ha,
Hb1, Hb
)
).
%-----------------------------------------------------------------------------%
:- func munge_factor_a = int.
munge_factor_a = 5.
:- func munge_factor_b = int.
munge_factor_b = 3.
:- pred double_munge(int::in, int::in, int::out, int::in, int::in, int::out)
is det.
double_munge(X, Ha0, Ha, Y, Hb0, Hb) :-
Ha = munge(munge_factor_a, Ha0, X),
Hb = munge(munge_factor_b, Hb0, Y).
:- func munge(int, int, int) = int.
munge(N, X, Y) =
(X `unchecked_left_shift` N) `xor`
(X `unchecked_right_shift` (int__bits_per_int - N)) `xor`
Y.
%-----------------------------------------------------------------------------%
fold(Fn, HT, X) = fold_up(apply_k_v(Fn, HT ^ buckets), 0, HT ^ num_buckets, X).
:- func apply_k_v(func(K, V, T) = T, buckets(K, V), int, T) = T.
apply_k_v(Fn, Bs, I, A) =
( if Bs ^ elem(I) = full(K, V)
then Fn(K, V, A)
else A
).
%-----------------------------------------------------------------------------%
% XXX To go into array.m
%
% dynamic_cast/2 won't work for arbitrary arrays since array/1 is
% not a ground type (that is, dynamic_cast/2 will work when the
% target type is e.g. array(int), but not when it is array(T)).
%
:- some [T2] pred dynamic_cast_to_array(T1::in, array(T2)::out) is semidet.
dynamic_cast_to_array(X, A) :-
% If X is an array then it has a type with one type argument.
%
[ArgTypeDesc] = type_args(type_of(X)),
% Convert ArgTypeDesc to a type variable ArgType.
%
(_ `with_type` ArgType) `has_type` ArgTypeDesc,
% Constrain the type of A to be array(ArgType) and do the
% cast.
%
dynamic_cast(X, A `with_type` array(ArgType)).
%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list