[m-dev.] For review: bitmap module.

Ralph Becket rbeck at microsoft.com
Sat Feb 3 03:34:16 AEDT 2001


>From Fergus Henderson on 02/02/2001 03:39:36
> On 01-Feb-2001, Ralph Becket <rbeck at microsoft.com> wrote:
> > % bitmap.m
> > % Ralph Becket <rbeck at microsoft.com>
> 
> There should be a copyright notice (also for hash_table.m).

Done.

> The module is quite similar to the `sparse_bitset' module.
> It might make more sense to name this one `bitset' or `dense_bitset'
> rather than `bitmap'.

Hmm, I'm not so sure.  `bitset' to me sounds like the domain {yes, no}.
I think `sparse_bitset' is a misnomer since it's really an ADT for
sparse int sets.

I quite like bitmap since it's fairly unambiguous, but if there are
strong feelings otherwise I don't mind changing the name.

> Also, it would be useful to provide set operations such as
> union, intersection, etc.

Done.

I've called the intersection function `intersect' in line with
the rest of the std library, although I prefer `intersection' - 
`intersect' sounds too imperative to me.

> You should also define `inst bitmap == array' and `inst uniq_bitmap ==
> uniq_array'.  That is needed in case someone defines a type which
> contains a bitmap and they want to define an inst for their type.

Done.

> I suggest also providing
> 
> 	:- func get(bitmap, int) = bool.

Done.

> Use `%---...---%' rather than `% ---...--- %'.

Is this really important?  I can certainly do that, but I *like*
the spaces.  Use of any `%[ -]-----....-----[ -]%' separators seems
to be a matter of preference at the mo.

> And fix your mailer so that it doesn't wrap long lines.

Ah, well if anybody knows how to make Outlook do that thing I'd be
very grateful to hear it :)  You would not believe the amount of sick
hackery required just to get it to optionally include replied-to
e-mail messages with chevrons in the right place...

> > :- func num_ints_required(int) = int.
> > 
> > num_ints_required(N) = 1 + ((N + int__bits_per_int - 1) //
int__bits_per_int).
> 
> You could use int__quot_bits_per_int there.
> 
> > :- func bitmask(int) = int.
> > 
> >     % NOTE: it would be nicer to use /\ with a bitmask here rather
> >     % than rem.  Do modern back-ends do the decent thing here if
> >     % int__bits_per_int is the expected power of two?
> >     %
> > bitmask(I) = 1 `unchecked_left_shift` (I `rem` int__bits_per_int).
> 
> I suggest defining int__rem_bits_per_int,
> similar to the int__quot_bits_per_int used by sparse_bitset.m.

Both good ideas.  Done.

> > :- func int_offset(int) = int.
> > 
> > int_offset(I) = 1 + (I // int__bits_per_int).
> 
> Use `int__quot_bits_per_int' here.
> 
> Probably worth a comment to explain the `1 +'.
> (You explained the reason at the definition of the bitmap
> type, but I think it's probably worth mentioning again here.)

Done.


> You should write that as
> 
>     set(BM, I) =
> 	( if in_range(BM, I) then
> 	    BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) \/
bitmask(I)
> 	else
> 	    throw("bitmap__set: out of range")
> 	).
> 
> Otherwise with `--no-fully-strict'
> the range check might get optimized away.

Done.  (Don't know why I didn't do that in the first place...)

> The double range check -- one for bitmap, one for array --
> is unfortunate.  It may be worth unsafe versions of the
> array indexing procedures, that omit the range checks,
> and using them here.

Well, array__elem is unsafe if you compile with array bounds
checking turned off.

What I've done is supply unsafe__{set,clear,flip,is_set,is_clear}
that do no range checking.

Enough has changed that here's the whole thing again (sorry, if I
was more organised I'd be able to offer relative diffs...)


%
----------------------------------------------------------------------------
%
% bitmap.m
% Copyright (C) 2001 Ralph Becket <rbeck at microsoft.com>
% Thu Feb  1 14:38:31 GMT 2001
% vim: ts=4 sw=4 et tw=0 wm=0 ff=unix ft=mercury
%
% THIS MODULE IS HEREBY DONATED TO THE MERCURY PROJECT TO BE
% RELEASED UNDER WHATEVER LICENCE IS DEEMED APPROPRIATE BY
% THE MERCURY PROJECT TEAM.
%
% Efficient bitmap implementation.
%
%
----------------------------------------------------------------------------
%

:- module bitmap.

:- interface.

:- import_module array, int, bool.



:- type bitmap.

:- inst bitmap    == array.
:- inst uniq_bitmap == uniq_array.
:- mode bitmap_ui == array_ui.
:- mode bitmap_di == array_di.
:- mode bitmap_uo == array_uo.

    % new(N, B) creates a bitmap of size N (indexed 0 .. N-1)
    % setting each bit if B = yes and clearing each bit if B = no.
    % An exception is thrown if N is negative.
    %
:- func new(int, bool) = bitmap.
:- mode new(in, in) = bitmap_uo is det.

    % Returns the number of bits in a bitmap.
    %
:- func num_bits(bitmap) = int.
:- mode num_bits(bitmap_ui) = out is det.
:- mode num_bits(in) = out is det.

    % set(BM, I), clear(BM, I) and flip(BM, I) set, clear and flip
    % bit I in BM respectively.
    %
:- func set(bitmap, int) = bitmap.
:- mode set(bitmap_di, in) = bitmap_uo is det.

:- func clear(bitmap, int) = bitmap.
:- mode clear(bitmap_di, in) = bitmap_uo is det.

:- func flip(bitmap, int) = bitmap.
:- mode flip(bitmap_di, in) = bitmap_uo is det.

    % Unsafe versions of the above: if the index is out of range
    % then behaviour is undefined and bad things are likely to happen.
    %
:- func unsafe_set(bitmap, int) = bitmap.
:- mode unsafe_set(bitmap_di, in) = bitmap_uo is det.

:- func unsafe_clear(bitmap, int) = bitmap.
:- mode unsafe_clear(bitmap_di, in) = bitmap_uo is det.

:- func unsafe_flip(bitmap, int) = bitmap.
:- mode unsafe_flip(bitmap_di, in) = bitmap_uo is det.

    % is_set(BM, I) and is_clear(BM, I) succeed iff bit I in BM
    % is set or clear respectively.
    %
:- pred is_set(bitmap, int).
:- mode is_set(bitmap_ui, in) is semidet.
:- mode is_set(in, in) is semidet.

:- pred is_clear(bitmap, int).
:- mode is_clear(bitmap_ui, in) is semidet.
:- mode is_clear(in, in) is semidet.

    % Unsafe versions of the above: if the index is out of range
    % then behaviour is undefined and bad things are likely to happen.
    %
:- pred unsafe_is_set(bitmap, int).
:- mode unsafe_is_set(bitmap_ui, in) is semidet.
:- mode unsafe_is_set(in, in) is semidet.

:- pred unsafe_is_clear(bitmap, int).
:- mode unsafe_is_clear(bitmap_ui, in) is semidet.
:- mode unsafe_is_clear(in, in) is semidet.

    % get(BM, I) returns `yes' if is_set(BM, I) and `no' otherwise.
    %
:- func get(bitmap, int) = bool.
:- mode get(bitmap_ui, in) = out is det.

    % Create a new copy of a bitmap.
    %
:- func copy(bitmap) = bitmap.
:- mode copy(bitmap_ui) = bitmap_uo is det.

    % Set operations; the second argument is altered in all cases.
    %
:- func complement(bitmap) = bitmap.
:- mode complement(bitmap_ui) = bitmap_uo is det.

:- func union(bitmap, bitmap) = bitmap.
:- mode union(bitmap_ui, bitmap_di) = bitmap_uo is det.

:- func intersect(bitmap, bitmap) = bitmap.
:- mode intersect(bitmap_ui, bitmap_di) = bitmap_uo is det.

:- func difference(bitmap, bitmap) = bitmap.
:- mode difference(bitmap_ui, bitmap_di) = bitmap_uo is det.

    % resize(BM, N, B) resizes bitmap BM to have N bits; if N is
    % smaller than the current number of bits in BM then the excess
    % are discarded.  If N is larger than the current number of bits
    % in BM then the new bits are set if B = yes and cleared if
    % B = no.
    %
:- func resize(bitmap, int, bool) = bitmap.
:- mode resize(bitmap_di, in, in) = bitmap_uo is det.

%
----------------------------------------------------------------------------
%
%
----------------------------------------------------------------------------
%

:- implementation.

:- import_module exception.

    % A bitmap is represented as an array of ints where each int stores
    % int__bits_per_int bits.  The first element of the array (index 0)
    % is used to hold the number of bits in the bitmap.  This avoids
    % having to create a new bitmap cell on each update.
    %
    % NOTE: the `filler' bits in the last element of the array *must*
    % be clear (i.e. zero).  This makes the set operations simpler to
    % implement.

:- type bitmap == array(int).

%
----------------------------------------------------------------------------
%

new(N, B) = BM :-
    ( if N < 0 then
        throw("bitmap__new: negative size")
      else
        X    = initializer(B),
        BM0  = (array__init(num_ints_required(N), X) ^ elem(0) := N),
        BM   = clear_filler_bits(BM0)
    ).

%
----------------------------------------------------------------------------
%

resize(BM0, N, B) = BM :-
    ( if N =< 0 then
        BM      = new(N, B)
      else
        X       = initializer(B),
        NumInts = num_ints_required(N),
        BM1     = array__resize(BM0, NumInts, X),

            % Now we need to ensure that bits N, N+1, N+2, ... up to
            % the word boundary are initialized properly.
            %
        int__min(num_bits(BM0), N, M),
        Offset  = int_offset(M - 1),
        Mask    = bitsmask(M - 1),          % For bits we need to preserve.
        Bits    = \(Mask) /\ X,             % Bits we need to fill in.
        BM2     = (( BM1
                        ^ elem(0)      := N )
                        ^ elem(Offset) := (BM1 ^ elem(Offset) /\ Mask) \/
Bits),
        BM      = clear_filler_bits(BM2)
    ).

%
----------------------------------------------------------------------------
%

:- func clear_filler_bits(bitmap) = bitmap.
:- mode clear_filler_bits(bitmap_di) = bitmap_uo is det.

clear_filler_bits(BM0) = BM :-
    N = num_bits(BM0),
    ( if N > 0 then
        Last = int_offset(N - 1),       % Offset of last bit.
        Ksam = bitsmask(N - 1),         % Masks off the filler bits.
        BM   = BM0 ^ elem(Last) := BM0 ^ elem(Last) /\ Ksam
      else
        BM   = BM0
    ).

%
----------------------------------------------------------------------------
%

:- func initializer(bool) = int.

initializer(no)  = 0.
initializer(yes) = \(0).

%
----------------------------------------------------------------------------
%

num_bits(BM) = BM ^ elem(0).

%
----------------------------------------------------------------------------
%

:- pred in_range(bitmap, int).
:- mode in_range(bitmap_ui, in) is semidet.
:- mode in_range(in, in) is semidet.

in_range(BM, I) :- 0 =< I, I < num_bits(BM).

%
----------------------------------------------------------------------------
%

set(BM, I) =
    ( if in_range(BM, I)
      then BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) \/
bitmask(I)
      else throw("bitmap__set: out of range")
    ).

clear(BM, I) =
    ( if in_range(BM, I)
      then BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) /\
bitmask(I)
      else throw("bitmap__clear: out of range")
    ).

flip(BM, I) =
    ( if in_range(BM, I)
      then BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) `xor`
bitmask(I)
      else throw("bitmap__flip: out of range")
    ).

%
----------------------------------------------------------------------------
%

unsafe_set(BM, I) =
    BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) \/ bitmask(I).

unsafe_clear(BM, I) =
    BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) /\ bitmask(I).

unsafe_flip(BM, I) =
    BM ^ elem(int_offset(I)) := BM ^ elem(int_offset(I)) `xor` bitmask(I).

%
----------------------------------------------------------------------------
%

is_set(BM, I) :-
    ( if in_range(BM, I)
      then BM ^ elem(int_offset(I)) /\ bitmask(I) \= 0
      else throw("bitmap__is_set: out of range")
    ).

is_clear(BM, I) :-
    ( if in_range(BM, I)
      then BM ^ elem(int_offset(I)) /\ bitmask(I) = 0
      else throw("bitmap__is_clear: out of range")
    ).

%
----------------------------------------------------------------------------
%

unsafe_is_set(BM, I) :-
    BM ^ elem(int_offset(I)) /\ bitmask(I) \= 0.

unsafe_is_clear(BM, I) :-
    BM ^ elem(int_offset(I)) /\ bitmask(I) = 0.

%
----------------------------------------------------------------------------
%

get(BM, I) = ( if is_clear(BM, I) then no else yes ).

%
----------------------------------------------------------------------------
%

copy(BM) = array__copy(BM).

%
----------------------------------------------------------------------------
%

complement(BM) =
    clear_filler_bits(complement_2(BM ^ elem(0) - 1, BM)).



:- func complement_2(int, bitmap) = bitmap.
:- mode complement_2(in, bitmap_di) = bitmap_uo is det.

complement_2(WordI, BM) =
    ( if WordI =< 0
      then BM
      else complement_2(WordI - 1, BM ^ elem(WordI) := \(BM ^ elem(WordI)))
    ).

%
----------------------------------------------------------------------------
%

union(BMa, BMb) =
    ( if num_bits(BMa) > num_bits(BMb) then
        zip(int_offset(num_bits(BMb) - 1), (\/), BMb, bitmap__copy(BMa))
      else
        zip(int_offset(num_bits(BMa) - 1), (\/), BMa, BMb)
    ).

%
----------------------------------------------------------------------------
%

intersect(BMa, BMb) =
    ( if num_bits(BMa) > num_bits(BMb) then
        zip(int_offset(num_bits(BMb) - 1), (/\), BMb, bitmap__copy(BMa))
      else
        zip(int_offset(num_bits(BMa) - 1), (/\), BMa, BMb)
    ).

%
----------------------------------------------------------------------------
%

difference(BMa, BMb) =
    ( if num_bits(BMa) > num_bits(BMb) then
        zip(int_offset(num_bits(BMb) - 1), Xor, BMb, bitmap__copy(BMa))
      else
        zip(int_offset(num_bits(BMa) - 1), Xor, BMa, BMb)
    )
 :-
    Xor = ( func(X, Y) = (X `xor` Y) ).

%
----------------------------------------------------------------------------
%

    % Applies a function to every corresponding element between +ve I
    % and 1 inclusive, destructively updating the second bitmap.
    %
:- func zip(int, func(int, int) = int, bitmap, bitmap) = bitmap.
:- mode zip(in, func(in, in) = out is det, bitmap_ui, bitmap_di) = bitmap_uo
            is det.

zip(I, Fn, BMa, BMb) =
    ( if I > 0 then
        zip(I - 1, Fn, BMa, BMb ^ elem(I) := Fn(BMb ^ elem(I), BMa ^
elem(I)))
      else
        BMb
    ).

%
----------------------------------------------------------------------------
%

    % The size of the array required to hold an N-bit bitmap.
    %
:- func num_ints_required(int) = int.

    % We add the 1 on because arrays of size N are indexed 0 .. N - 1.
    %
num_ints_required(N) = 1 + ( if N > 0 then int_offset(N) else 0 ).

%
----------------------------------------------------------------------------
%

    % The array index containing the given bit.
    %
:- func int_offset(int) = int.

    % We add the extra 1 on because elem(0) is used to store the number
    % of bits in the bitmap; the data are stored in the following elements.
    %
int_offset(I) = 1 + int__quot_bits_per_int(I).

%
----------------------------------------------------------------------------
%

    % Construct the bitmask for a given bit in a word.
    %
    % E.g. assuming int__bits_per_int = 8 and I = 11 then
    % bitmask(I) = 2'00001000
    %
:- func bitmask(int) = int.

    % NOTE: it would be nicer to use /\ with a bitmask here rather
    % than rem.  Do modern back-ends do the decent thing here if
    % int__bits_per_int is the expected power of two?
    %
bitmask(I) = 1 `unchecked_left_shift` int_rem_bits_per_int(I).

%
----------------------------------------------------------------------------
%

    % Construct the bitmask for all the bits up to and including
    % the given bit in a word.
    %
    % E.g. assuming int__bits_per_int = 8 and I = 11 then
    % bitmask(I) = 2'00001111
    %
:- func bitsmask(int) = int.

bitsmask(I) = BitsMask :-
    BitMask  = bitmask(I),
    BitsMask = BitMask \/ (BitMask - 1).

%
----------------------------------------------------------------------------
%

    % XXX To go in int.m
    %
    % int_rem_bits_per_int(I) = I `rem` int__bits_per_int.
    %
    % XXX This assumes that int__bits_per_int is a power of two.
    %
:- func int_rem_bits_per_int(int) = int.

int_rem_bits_per_int(I) =
    I /\ (int__bits_per_int - 1).

%
----------------------------------------------------------------------------
%
%
----------------------------------------------------------------------------
%


--
Ralph Becket      |      MSR Cambridge      |      rbeck at microsoft.com 

--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list