[m-rev.] for discussion: undefined behaviours in int.m

Julien Fischer jfischer at opturion.com
Wed Oct 19 02:10:00 AEDT 2016


Hi Peter,

The following addresses some, but not all, of your discussion points.

In summary, I have no objections to adding additional overflowing detecting
versions of int operations now; there's also a larger discussion that needs
to be had about support for numeric types in Mercury in general.

On Wed, 12 Oct 2016, Peter Wang wrote:

> Here are some possible changes before I work on the implementation.

Bigger picture: what should the default overflow checking on ints be and how
controllable should it be?  The following comment has been at the head of the
int module for a long time:

     The behaviour of a computation for which overflow occurs is undefined.
     (In the current implementation, the predicates and functions in this
     module do not check for overflow, and the results you get are those
     delivered by the C compiler.  However, future implementations
     might check for overflow.)

I guess on the principle that the behaviour of +, - etc may change in the
future, it might be worth adding both "checked_{plus,minus,times} and
"unchecked_{plus,minus,times}" now.

Perhaps we should consider adding a 'checked' scope that operates similarly to
C#'s 'checked' keyword?

...

> diff --git a/library/int.m b/library/int.m
> index ae28d7b..42b18d7 100644
> --- a/library/int.m
> +++ b/library/int.m
> @@ -42,149 +44,200 @@
>
>     % Less than or equal.
>     %
> :- pred (int::in) =< (int::in) is semidet.
>
>     % Greater than or equal.
>     %
> :- pred (int::in) >= (int::in) is semidet.
>
>     % Absolute value.
> +    % Throws a `math.overflow_error' exception if an overflow occurs.
>     %
> :- func abs(int) = int.
> :- pred abs(int::in, int::out) is det.

I think we should deprecate the predicate version of this.
(There's no corresponding predicate for abs in the integer and float modules.)

For the purposes of backwards compatibility, I suggest leaving the semantics of
abs/1 as-is (for now) and adding both checked_abs/1 and unchecked_abs/1.

> New exception type, only one use so far.  None of the other
> "simply-named" functions in the module check for overflow, though,
> so it does seem a bit out of place.
> _______________________

The math module is not really an ideal place to define it. It's not the ideal
place for domain_error/1 to be defined either, since the math module gets
imported in both the int and integer modules solely to access that type.

At some point, I was planning to add the following exception types to the float
module:

     :- type invalid_operation_error ---> ...
     :- type division_by_zero_error ---> ...
     :- type overflow_error ---> ...
     :- type underflow_error ---> ...
     :- type inexact_error ---> ...

These correspond to the five kinds of exception defined in IEEE 754-2008.
Since some of those errors are applicable to other numeric types and since we
could (potentially) add another floating point type (e.g. one way of getting
rid of the .spf grades is to have single precision floats), I suggest we add a
new standard library module ('numeric' or 'number) that defines things that are
common to all / many numeric modules.

...

> +    % negative_abs(X) = -abs(X) except that overflow does not occur for any X.
> +    %
> +:- func negative_abs(int) = int.

This is generally named 'nabs' in the literature.

...

>     % Exponentiation.
>     % pow(X, Y, Z): Z is X raised to the Yth power.
>     % Throws a `math.domain_error' exception if Y is negative.
> +    % The behaviour is undefined if an overflow occurs.
>     %
> :- func pow(int, int) = int.
> :- pred pow(int::in, int::in, int::out) is det.
>
>     % Base 2 logarithm.
>     % log2(X) = N is the least integer such that 2 to the power N
>     % is greater than or equal to X.
>     % Throws a `math.domain_error' exception if X is not positive.
>     %
> :- func log2(int) = int.
> :- pred log2(int::in, int::out) is det.
>
>     % Addition.
> +    % The behaviour is undefined if overflow occurs.

The only reason it is "undefined" is because in the C grades you can't
necessarily assume two's complement arithmetic; I'm not aware of Mercury
running on any system that doesn't use two's complement arithmetic**, so
perhaps we should just require that Mercury ints use two's complement
representation and define these operations to wrap.

(** in the event of there being such a system we would just have to
implement the wrapping behaviour.)

...

> +    % Checked addition, subtraction, multiplication.
> +    % These predicates fail if overflow occurs.
> +    %
> +:- pred checked_add(int::in, int::in, int::out) is semidet.
> +:- pred checked_sub(int::in, int::in, int::out) is semidet.
> +:- pred checked_mul(int::in, int::in, int::out) is semidet.
> +
> _______________________
> Or semidet_add, etc.?
> _______________________

In general, I've tired to avoid adding things named "semidet_*" except where
existing names force us into that usage.

I think we should just follow the prevalent naming convention in th
library:

     :- func det_checked_add(int, int) = int.
     ...
     :- pred checked_add(int::in, int::in, int::out) is semidet.
     ...

...

>     % Left shift.
>     % X << Y returns X "left shifted" by Y bits.
>     % To be precise, if Y is negative, the result is
>     % X div (2^(-Y)), otherwise the result is X * (2^Y).
>     %
> +    % The behaviour is undefined if the result cannot be represented.
> +    %
> :- func (int::in) << (int::in) = (int::uo) is det.
> _______________________
> Left shift of a negative value is undefined.
> (Currently even shifting by zero triggers undefined behaviour.)
>
> Shifting into the sign bit is undefined.
>
> Perhaps we should throw exceptions, as << is the slow operation?
> _______________________

Just to be clear: you mean throw an exception if X is negative?

...

> @@ -233,20 +287,21 @@
> :- pred min_int(int::out) is det.
>
>     % bits_per_int is the number of bits in an int on this machine.
>     %
> :- func bits_per_int = int.
> :- pred bits_per_int(int::out) is det.
>
>     % fold_up(F, Low, High, !Acc) <=> list.foldl(F, Low .. High, !Acc)
>     %
>     % NOTE: fold_up/5 is undefined if High = max_int.
> +    % XXX use checked arithmetic
>     %
> :- pred fold_up(pred(int, T, T), int, int, T, T).
> :- mode fold_up(pred(in, in, out) is det, in, in, in, out) is det.
> :- mode fold_up(pred(in, mdi, muo) is det, in, in, mdi, muo) is det.
> :- mode fold_up(pred(in, di, uo) is det, in, in, di, uo) is det.
> :- mode fold_up(pred(in, array_di, array_uo) is det, in, in,
>     array_di, array_uo) is det.
> :- mode fold_up(pred(in, in, out) is semidet, in, in, in, out)
>     is semidet.
> :- mode fold_up(pred(in, mdi, muo) is semidet, in, in, mdi, muo)
> _______________________
> We can define these predicates for all values of Low and High.
> _______________________

They should be defined for all value of Low and High.
Ditto for all_true_in_range.

Julien.


More information about the reviews mailing list