[m-rev.] for review: fix some error reporting in the bitmap module

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri Dec 29 02:23:41 AEDT 2017



On Thu, 28 Dec 2017 07:07:36 -0500 (EST), Julien Fischer <jfischer at opturion.com> wrote:
> -BM ^ bits(FirstBit, NumBits) =
> +BM ^ bits(FirstBit, NumBits) = Word :-
>       ( if
>           FirstBit >= 0,
>           NumBits >= 0,
>           NumBits =< int.bits_per_int,
>           in_range_rexcl(BM, FirstBit + NumBits)
>       then
> -        BM ^ unsafe_bits(FirstBit, NumBits)
> +        Word = BM ^ unsafe_bits(FirstBit, NumBits)
>       else if
>           ( NumBits < 0
>           ; NumBits > int.bits_per_int
> @@ -587,7 +600,7 @@ BM ^ bits(FirstBit, NumBits) =
>           throw_bitmap_error("bitmap.bits: number of bits must be between " ++
>               "0 and `int.bits_per_int'.")
>       else
> -        throw_bounds_error(BM, "bitmap.bits", FirstBit)
> +        throw_bit_bounds_error(BM, "bitmap.bits", FirstBit)
>       ).

Here and in lots of other places, the code specifies the predicate name
as a string. Using $pred would be easier to maintain, and would specify
the func/pred distinction and the arity automatically.

For the operations that have both a function and a predicate form,
having one form just call the other would lead to $pred referring
to the wrong form whenever the user calls the forwarding form.
This could be fixed by *both* forms just calling an internal
predicate (which would be marked for inlining) passing $pred
as an argument.

A style issue: having the predicate name *between* the bitmap
and the index on that bitmap looks *really* strange to me.

> +(!.BM ^ bits(FirstBit, NumBits) := Bits) = !:BM :-
>       ( if
>           FirstBit >= 0,
>           NumBits >= 0,
>           NumBits =< int.bits_per_int,
> -        in_range_rexcl(BM, FirstBit + NumBits)
> +        in_range_rexcl(!.BM, FirstBit + NumBits)
>       then
> -        BM ^ unsafe_bits(FirstBit, NumBits) := Bits
> +        !BM ^ unsafe_bits(FirstBit, NumBits) := Bits
>       else if
>           ( NumBits < 0
>           ; NumBits > int.bits_per_int
> @@ -652,7 +665,7 @@ extract_bits_from_byte_index(ByteIndex, FirstBitIndex,
>               "bitmap.'bits :=': number of bits must be between " ++
>               "0 and `int.bits_per_int'.")
>       else
> -        throw_bounds_error(BM, "bitmap.'bits :='", FirstBit)
> +        throw_bit_bounds_error(!.BM, "bitmap.'bits :='", FirstBit)
>       ).

A suggestion unrelated to the current change: we could avoid
this "unknown error" error by testing first for NumBits < 0,
then for FirstBit not being in range, and then for the last bit
not being in range, generating separate error messages in
in all three (not just two) cases, and doing the nonerror case
in the last else case.

There are other procedures to which this also applies.

> +throw_byte_bounds_error(BM, Pred, ByteIndex) :-
> +    ( if BM ^ num_bits = 0 then
> +        UB = -1
> +    else
> +        UB = BM ^ num_bits / bits_per_byte
> +    ),
> +    string.format(
> +        "%s: byte index %d is out of bounds [0, %d].",
> +        [s(Pred), i(ByteIndex), i(UB)], Msg),
> +    throw_bitmap_error(Msg).

I think that final ] can look wrong. For example, ...

> --- a/tests/hard_coded/bitmap_bytes.exp
> +++ b/tests/hard_coded/bitmap_bytes.exp
> @@ -0,0 +1,19 @@
> +Bitmap: 
> +^ byte(-1): bitmap.byte: byte index -1 is out of bounds [0, -1].
> +^ byte(0): bitmap.byte: byte index 0 is out of bounds [0, -1].
> +^ byte(1): bitmap.byte: byte index 1 is out of bounds [0, -1].
> +
> +Bitmap: 00000000.00000000.00000000
> +^ byte(-1): bitmap.byte: byte index -1 is out of bounds [0, 3].
> +^ byte(0): 0
> +^ byte(1): 0
> +^ byte(2): 0
> +^ byte(3): bitmap.byte: byte index 3 is out of bounds [0, 3].

... having 3 being out of bounds in [0, 3] looks wrong,
while it would be understandable if the bounds were [0, 3).

The issue here is: what should the error message say if the
bitmap contains the first bit of the selected byte, but is missing
some later bits? In that case, both of the obvious options
(saying that selected byte is in bounds with ']', and saying that it is
out of bounds with ')' ) will surprise some people at least some of the time.
And you cannot introduce new notation such as "[0, 3>" in an
error message and expect to be understood :-(

I think the least astonishing thing would be a message
that showed the bounds as [0, 3) in this case (i.e. it would show
a byte index as being in bounds only if there was a whole byte
at that index), with an additional few words if there is a *partial*
byte at the given index.

The diff is otherwise fine.

Zoltan.


More information about the reviews mailing list