[m-rev.] for discussion: undefined behaviours in int.m

Peter Wang novalazy at gmail.com
Thu Oct 20 15:39:01 AEDT 2016


On Wed, 19 Oct 2016 02:10:00 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
> 
> Hi Peter,
> 
> The following addresses some, but not all, of your discussion points.
> 
> In summary, I have no objections to adding additional overflowing detecting
> versions of int operations now; there's also a larger discussion that needs
> to be had about support for numeric types in Mercury in general.

Hi Julien,

> On Wed, 12 Oct 2016, Peter Wang wrote:
> 
> > Here are some possible changes before I work on the implementation.
> 
> Bigger picture: what should the default overflow checking on ints be and how
> controllable should it be?  The following comment has been at the head of the
> int module for a long time:
> 
>      The behaviour of a computation for which overflow occurs is undefined.
>      (In the current implementation, the predicates and functions in this
>      module do not check for overflow, and the results you get are those
>      delivered by the C compiler.  However, future implementations
>      might check for overflow.)
> 
> I guess on the principle that the behaviour of +, - etc may change in the
> future, it might be worth adding both "checked_{plus,minus,times} and
> "unchecked_{plus,minus,times}" now.
> 
> Perhaps we should consider adding a 'checked' scope that operates similarly to
> C#'s 'checked' keyword?
> 
> ...

It's a pretty specific feature that doesn't generalise, and the list of
"checked" functions might not be obvious.

Not quite the same thing, but perhaps in future the compiler could
optionally warn about uses of unchecked operations.  It might be a
generalisation of pragma obsolete, like "annotations" in some languages?
The user could label predicates as "unchecked" or "unsafe" or anything
they like, and the compiler would warn wherever those predicates are used.
A pragma or scope would allow the programmer to suppress warnings for
some uses, asserting that integer overflow is impossible here, array
indexing without bounds checking is safe here, etc.

> > diff --git a/library/int.m b/library/int.m
> > index ae28d7b..42b18d7 100644
> > --- a/library/int.m
> > +++ b/library/int.m
> > @@ -42,149 +44,200 @@
> >
> >     % Less than or equal.
> >     %
> > :- pred (int::in) =< (int::in) is semidet.
> >
> >     % Greater than or equal.
> >     %
> > :- pred (int::in) >= (int::in) is semidet.
> >
> >     % Absolute value.
> > +    % Throws a `math.overflow_error' exception if an overflow occurs.
> >     %
> > :- func abs(int) = int.
> > :- pred abs(int::in, int::out) is det.
> 
> I think we should deprecate the predicate version of this.
> (There's no corresponding predicate for abs in the integer and float modules.)

Ok.

> 
> For the purposes of backwards compatibility, I suggest leaving the semantics of
> abs/1 as-is (for now) and adding both checked_abs/1 and unchecked_abs/1.

Ok.

> 
> > New exception type, only one use so far.  None of the other
> > "simply-named" functions in the module check for overflow, though,
> > so it does seem a bit out of place.
> > _______________________
> 
> The math module is not really an ideal place to define it. It's not the ideal
> place for domain_error/1 to be defined either, since the math module gets
> imported in both the int and integer modules solely to access that type.
> 
> At some point, I was planning to add the following exception types to the float
> module:
> 
>      :- type invalid_operation_error ---> ...
>      :- type division_by_zero_error ---> ...
>      :- type overflow_error ---> ...
>      :- type underflow_error ---> ...
>      :- type inexact_error ---> ...
> 
> These correspond to the five kinds of exception defined in IEEE 754-2008.
> Since some of those errors are applicable to other numeric types and since we
> could (potentially) add another floating point type (e.g. one way of getting
> rid of the .spf grades is to have single precision floats), I suggest we add a
> new standard library module ('numeric' or 'number) that defines things that are
> common to all / many numeric modules.
> 
> ...

Sounds ok.

> 
> > +    % negative_abs(X) = -abs(X) except that overflow does not occur for any X.
> > +    %
> > +:- func negative_abs(int) = int.
> 
> This is generally named 'nabs' in the literature.
> 

Ok.

> 
> >     % Exponentiation.
> >     % pow(X, Y, Z): Z is X raised to the Yth power.
> >     % Throws a `math.domain_error' exception if Y is negative.
> > +    % The behaviour is undefined if an overflow occurs.
> >     %
> > :- func pow(int, int) = int.
> > :- pred pow(int::in, int::in, int::out) is det.
> >
> >     % Base 2 logarithm.
> >     % log2(X) = N is the least integer such that 2 to the power N
> >     % is greater than or equal to X.
> >     % Throws a `math.domain_error' exception if X is not positive.
> >     %
> > :- func log2(int) = int.
> > :- pred log2(int::in, int::out) is det.
> >
> >     % Addition.
> > +    % The behaviour is undefined if overflow occurs.
> 
> The only reason it is "undefined" is because in the C grades you can't
> necessarily assume two's complement arithmetic; I'm not aware of Mercury
> running on any system that doesn't use two's complement arithmetic**, so
> perhaps we should just require that Mercury ints use two's complement
> representation and define these operations to wrap.
> 
> (** in the event of there being such a system we would just have to
> implement the wrapping behaviour.)

I think the reason it is undefined is the comment at the top of the
module: reserving the right to check for overflow in the future.

I have read that C compilers take advantage of the undefined behaviour
of signed integer overflow to allow optimisations that would be invalid
with wrapping.  I don't know how much we would miss out on by using
unsigned operations to get defined overflow behaviour.  I think that
wrapping on signed ints is not that useful, though.

> > +    % Checked addition, subtraction, multiplication.
> > +    % These predicates fail if overflow occurs.
> > +    %
> > +:- pred checked_add(int::in, int::in, int::out) is semidet.
> > +:- pred checked_sub(int::in, int::in, int::out) is semidet.
> > +:- pred checked_mul(int::in, int::in, int::out) is semidet.
> > +
> > _______________________
> > Or semidet_add, etc.?
> > _______________________
> 
> In general, I've tired to avoid adding things named "semidet_*" except where
> existing names force us into that usage.
> 
> I think we should just follow the prevalent naming convention in th
> library:
> 
>      :- func det_checked_add(int, int) = int.
>      ...
>      :- pred checked_add(int::in, int::in, int::out) is semidet.
>      ...

Ok.

> >     % Left shift.
> >     % X << Y returns X "left shifted" by Y bits.
> >     % To be precise, if Y is negative, the result is
> >     % X div (2^(-Y)), otherwise the result is X * (2^Y).
> >     %
> > +    % The behaviour is undefined if the result cannot be represented.
> > +    %
> > :- func (int::in) << (int::in) = (int::uo) is det.
> > _______________________
> > Left shift of a negative value is undefined.
> > (Currently even shifting by zero triggers undefined behaviour.)
> >
> > Shifting into the sign bit is undefined.
> >
> > Perhaps we should throw exceptions, as << is the slow operation?
> > _______________________
> 
> Just to be clear: you mean throw an exception if X is negative?
> 
> ...

I think that's what I meant.


I have a related proposal: add unsigned_int (or 'uint') to the library,
and only the library to begin with.  The standard library would use it
to implement sparse_bitset, hash functions, perhaps the PRNG, avoiding
the undefined behaviours of signed ints.  It would be useful to expose
for users as well even without other support, which could be added
incrementally (if ever).

I've attached an incomplete module.

Peter
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: unsigned_int.m
URL: <http://lists.mercurylang.org/archives/reviews/attachments/20161020/933f606d/attachment.ksh>


More information about the reviews mailing list