[m-rev.] for discussion: undefined behaviours in int.m

Peter Wang novalazy at gmail.com
Wed Oct 12 18:01:41 AEDT 2016


Hi,

Here are some possible changes before I work on the implementation.

Peter

diff --git a/library/int.m b/library/int.m
index ae28d7b..42b18d7 100644
--- a/library/int.m
+++ b/library/int.m
@@ -42,149 +44,200 @@
 
     % Less than or equal.
     %
 :- pred (int::in) =< (int::in) is semidet.
 
     % Greater than or equal.
     %
 :- pred (int::in) >= (int::in) is semidet.
 
     % Absolute value.
+    % Throws a `math.overflow_error' exception if an overflow occurs.
     %
 :- func abs(int) = int.
 :- pred abs(int::in, int::out) is det.
_______________________
New exception type, only one use so far.  None of the other
"simply-named" functions in the module check for overflow, though,
so it does seem a bit out of place.
_______________________

+    % negative_abs(X) = -abs(X) except that overflow does not occur for any X.
+    %
+:- func negative_abs(int) = int.
+
     % Maximum.
     %
 :- func max(int, int) = int.
 :- pred max(int::in, int::in, int::out) is det.
 
     % Minimum.
     %
 :- func min(int, int) = int.
 :- pred min(int::in, int::in, int::out) is det.
 
     % Exponentiation.
     % pow(X, Y, Z): Z is X raised to the Yth power.
     % Throws a `math.domain_error' exception if Y is negative.
+    % The behaviour is undefined if an overflow occurs.
     %
 :- func pow(int, int) = int.
 :- pred pow(int::in, int::in, int::out) is det.
 
     % Base 2 logarithm.
     % log2(X) = N is the least integer such that 2 to the power N
     % is greater than or equal to X.
     % Throws a `math.domain_error' exception if X is not positive.
     %
 :- func log2(int) = int.
 :- pred log2(int::in, int::out) is det.
 
     % Addition.
+    % The behaviour is undefined if overflow occurs.
     %
 :- func int + int = int.
 :- mode in  + in  = uo  is det.
 :- mode uo  + in  = in  is det.
 :- mode in  + uo  = in  is det.
 
+    % Addition.
+    % The behaviour is undefined if overflow occurs.
+    %
 :- func plus(int, int) = int.
 
-    % Multiplication.
-    %
-:- func (int::in) * (int::in) = (int::uo) is det.
-:- func times(int, int) = int.
-
     % Subtraction.
+    % The behaviour is undefined if overflow occurs.
     %
 :- func int - int = int.
 :- mode in  - in  = uo  is det.
 :- mode uo  - in  = in  is det.
 :- mode in  - uo  = in  is det.
 
+    % Subtraction.
+    % The behaviour is undefined if overflow occurs.
+    %
 :- func minus(int, int) = int.
 
+    % Multiplication.
+    % The behaviour is undefined if overflow occurs.
+    %
+:- func (int::in) * (int::in) = (int::uo) is det.
+:- func times(int, int) = int.
+
+    % Checked addition, subtraction, multiplication.
+    % These predicates fail if overflow occurs.
+    %
+:- pred checked_add(int::in, int::in, int::out) is semidet.
+:- pred checked_sub(int::in, int::in, int::out) is semidet.
+:- pred checked_mul(int::in, int::in, int::out) is semidet.
+
_______________________
Or semidet_add, etc.?
_______________________

     % Flooring integer division.
     % Truncates towards minus infinity, e.g. (-10) div 3 = (-4).
     %
     % Throws a `math.domain_error' exception if the right operand is zero.
     % See the comments at the top of math.m to find out how to disable
     % domain checks.
     %
+    % The behaviour is undefined if the result cannot be represented.
+    %
 :- func div(int::in, int::in) = (int::uo) is det.
_______________________
div(min_int, -1) overflows two's complement representation
_______________________
 
     % Truncating integer division.
     % Truncates towards zero, e.g. (-10) // 3 = (-3).
     % `div' has nicer mathematical properties for negative operands,
     % but `//' is typically more efficient.
     %
     % Throws a `math.domain_error' exception if the right operand is zero.
     % See the comments at the top of math.m to find out how to disable
     % domain checks.
     %
+    % The behaviour is undefined if the result cannot be represented.
+    %
 :- func (int::in) // (int::in) = (int::uo) is det.
 
     % (/)/2 is a synonym for (//)/2 to bring Mercury into line with
     % the common convention for naming integer division.
     %
 :- func (int::in) / (int::in) = (int::uo) is det.
 
-    % unchecked_quotient(X, Y) is the same as X // Y, but the behaviour
-    % is undefined if the right operand is zero.
+    % checked_quotient(X, Y, Quotient) is the same as Quotient = X // Y,
+    % but fails if the right operand Y is zero or if the quotient cannot be
+    % represented.
+    %
+:- pred checked_quotient(int::in, int::in, int::uo) is semidet.
+
+    % unchecked_quotient(X, Y) is the same as X // Y, but the behaviour is
+    % undefined if the right operand is zero or if the quotient cannot be
+    % represented.
     %
 :- func unchecked_quotient(int::in, int::in) = (int::uo) is det.
 
     % Modulus.
     % X mod Y = X - (X div Y) * Y
     %
+    % The behaviour is undefined if overflow occurs.
+    %
 :- func (int::in) mod (int::in) = (int::uo) is det.
 
     % Remainder.
     % X rem Y = X - (X // Y) * Y
     % `mod' has nicer mathematical properties for negative X,
     % but `rem' is typically more efficient.
     %
     % Throws a `math.domain_error' exception if the right operand is zero.
     % See the comments at the top of math.m to find out how to disable
     % domain checks.
     %
 :- func (int::in) rem (int::in) = (int::uo) is det.

+    % checked_rem(X, Y, Remainder) is the same as Remainder = X rem Y,
+    % but fails if the right operand Y is zero.
+    %
+:- pred checked_rem(int::in, int::in, int::uo) is semidet.
+

     % unchecked_rem(X, Y) is the same as X rem Y, but the behaviour
-    % is undefined if the right operand is zero.
+    % is undefined if the right operand is zero, or if the quotient X/Y
+    % cannot be represented.
     %
 :- func unchecked_rem(int::in, int::in) = (int::uo) is det.
_______________________
unchecked_rem(min_int, -1) is not okay because C's % operator may be
implemented with the same instruction as division.

We will specifically allow rem(min_int, -1) and checked_rem(min_int, -1)
because they are already the slow and safe operations.
_______________________

     % Left shift.
     % X << Y returns X "left shifted" by Y bits.
     % To be precise, if Y is negative, the result is
     % X div (2^(-Y)), otherwise the result is X * (2^Y).
     %
+    % The behaviour is undefined if the result cannot be represented.
+    %
 :- func (int::in) << (int::in) = (int::uo) is det.
_______________________
Left shift of a negative value is undefined.
(Currently even shifting by zero triggers undefined behaviour.)

Shifting into the sign bit is undefined.

Perhaps we should throw exceptions, as << is the slow operation?
_______________________

     % unchecked_left_shift(X, Y) is the same as X << Y
     % except that the behaviour is undefined if Y is negative,
-    % or greater than or equal to the result of `bits_per_int/1'.
+    % or greater than or equal to the result of `bits_per_int - 1'.
     % It will typically be implemented more efficiently than X << Y.
     %
+    % The behaviour is undefined if the result cannot be represented.
+    %
 :- func unchecked_left_shift(int::in, int::in) = (int::uo) is det.
_______________________
The change to the domain of Y is because shifting into the sign bit is
undefined behaviour in C, and unchecked_left_shift is designed to map to
C's << operator.  It seems recent C++ allows shifting into the sign bit
(DR1457), perhaps C will follow?
_______________________

     % Right shift.
     % X >> Y returns X "arithmetic right shifted" by Y bits.
     % To be precise, if Y is negative, the result is
     % X * (2^(-Y)), otherwise the result is X div (2^Y).
     %
+    % The behaviour is undefined if the result cannot be represented.
+    %
 :- func (int::in) >> (int::in) = (int::uo) is det.
_______________________
Right shift by negative value same as left shift by positive value.

Perhaps we should throw exceptions, as >> is the slow operation?
_______________________
 
     % unchecked_right_shift(X, Y) is the same as X >> Y
     % except that the behaviour is undefined if Y is negative,
     % or greater than or equal to the result of `bits_per_int/1'.
     % It will typically be implemented more efficiently than X >> Y.
     %
+    % XXX also undefined if X < 0
+    %
 :- func unchecked_right_shift(int::in, int::in) = (int::uo) is det.
_______________________
Due to mapping to C's >> operator.
_______________________

@@ -205,20 +258,21 @@
 
     % Bitwise complement.
     %
 :- func \ (int::in) = (int::uo) is det.
 
     % Unary plus.
     %
 :- func + (int::in) = (int::uo) is det.
 
     % Unary minus.
+    % The behaviour is undefined if an overflow occurs.
     %
 :- func - (int::in) = (int::uo) is det.
 
     % is/2, for backwards compatibility with Prolog.
     %
 :- pred is(T, T) is det.
 :- mode is(uo, di) is det.
 :- mode is(out, in) is det.
 :- pragma obsolete(is/2).
 
@@ -233,20 +287,21 @@
 :- pred min_int(int::out) is det.
 
     % bits_per_int is the number of bits in an int on this machine.
     %
 :- func bits_per_int = int.
 :- pred bits_per_int(int::out) is det.
 
     % fold_up(F, Low, High, !Acc) <=> list.foldl(F, Low .. High, !Acc)
     %
     % NOTE: fold_up/5 is undefined if High = max_int.
+    % XXX use checked arithmetic
     %
 :- pred fold_up(pred(int, T, T), int, int, T, T).
 :- mode fold_up(pred(in, in, out) is det, in, in, in, out) is det.
 :- mode fold_up(pred(in, mdi, muo) is det, in, in, mdi, muo) is det.
 :- mode fold_up(pred(in, di, uo) is det, in, in, di, uo) is det.
 :- mode fold_up(pred(in, array_di, array_uo) is det, in, in,
     array_di, array_uo) is det.
 :- mode fold_up(pred(in, in, out) is semidet, in, in, in, out)
     is semidet.
 :- mode fold_up(pred(in, mdi, muo) is semidet, in, in, mdi, muo)
_______________________
We can define these predicates for all values of Low and High.
_______________________


@@ -408,20 +463,21 @@
     % nondet_int_in_range(Low, High, I):
     %
     % On successive successes, set I to every integer from Low to High.
     %
 :- pred nondet_int_in_range(int::in, int::in, int::out) is nondet.
 
     % all_true_in_range(P, Low, High):
     % True iff P is true for every integer in Low to High.
     %
     % NOTE: all_true_in_range/3 is undefined if High = max_int.
+    % XXX overflow
     %
 :- pred all_true_in_range(pred(int)::in(pred(in) is semidet),
     int::in, int::in) is semidet.
_______________________
We can define this predicate for all values of Low and High.
_______________________


More information about the reviews mailing list