[m-rev.] for discussion: undefined behaviours in int.m

Julien Fischer jfischer at opturion.com
Fri Oct 21 02:56:26 AEDT 2016


Hi Peter,

On Thu, 20 Oct 2016, Peter Wang wrote:

> On Wed, 19 Oct 2016 02:10:00 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:

>> On Wed, 12 Oct 2016, Peter Wang wrote:
>>
>>> Here are some possible changes before I work on the implementation.
>>
>> Bigger picture: what should the default overflow checking on ints be and how
>> controllable should it be?  The following comment has been at the head of the
>> int module for a long time:
>>
>>      The behaviour of a computation for which overflow occurs is undefined.
>>      (In the current implementation, the predicates and functions in this
>>      module do not check for overflow, and the results you get are those
>>      delivered by the C compiler.  However, future implementations
>>      might check for overflow.)
>>
>> I guess on the principle that the behaviour of +, - etc may change in the
>> future, it might be worth adding both "checked_{plus,minus,times} and
>> "unchecked_{plus,minus,times}" now.
>>
>> Perhaps we should consider adding a 'checked' scope that operates similarly to
>> C#'s 'checked' keyword?
>>
>> ...
>
> It's a pretty specific feature that doesn't generalise, and the list of
> "checked" functions might not be obvious.
>
> Not quite the same thing, but perhaps in future the compiler could
> optionally warn about uses of unchecked operations.  It might be a
> generalisation of pragma obsolete, like "annotations" in some languages?
> The user could label predicates as "unchecked" or "unsafe" or anything
> they like, and the compiler would warn wherever those predicates are used.
> A pragma or scope would allow the programmer to suppress warnings for
> some uses, asserting that integer overflow is impossible here, array
> indexing without bounds checking is safe here, etc.

Like what we do for impurity?

...

>>> +    % The behaviour is undefined if overflow occurs.
>>
>> The only reason it is "undefined" is because in the C grades you can't
>> necessarily assume two's complement arithmetic; I'm not aware of Mercury
>> running on any system that doesn't use two's complement arithmetic**, so
>> perhaps we should just require that Mercury ints use two's complement
>> representation and define these operations to wrap.
>>
>> (** in the event of there being such a system we would just have to
>> implement the wrapping behaviour.)
>
> I think the reason it is undefined is the comment at the top of the
> module: reserving the right to check for overflow in the future.

Maybe, I'm not sure what the original author had in mind!

> I have read that C compilers take advantage of the undefined behaviour
> of signed integer overflow to allow optimisations that would be invalid
> with wrapping.  I don't know how much we would miss out on by using
> unsigned operations to get defined overflow behaviour.  I think that
> wrapping on signed ints is not that useful, though.

Is there any reason not to require 2's complement integer representation for
Mercury ints?  Both C# and Java require this (so that's what you're getting for
those backends anyway).  In C grades, we are assuming that we have a 2's
complement representation in a number of spots anyway.

(On a related issue: does anyone have an objection to use requiring that
Mercury floats be IEEE 754 single or double precision values?)

>>>     % Left shift.
>>>     % X << Y returns X "left shifted" by Y bits.
>>>     % To be precise, if Y is negative, the result is
>>>     % X div (2^(-Y)), otherwise the result is X * (2^Y).
>>>     %
>>> +    % The behaviour is undefined if the result cannot be represented.
>>> +    %
>>> :- func (int::in) << (int::in) = (int::uo) is det.
>>> _______________________
>>> Left shift of a negative value is undefined.
>>> (Currently even shifting by zero triggers undefined behaviour.)
>>>
>>> Shifting into the sign bit is undefined.
>>>
>>> Perhaps we should throw exceptions, as << is the slow operation?
>>> _______________________
>>
>> Just to be clear: you mean throw an exception if X is negative?
>>
>> ...
>
> I think that's what I meant.

Fine by me then.

> I have a related proposal: add unsigned_int (or 'uint') to the library,
> and only the library to begin with.  The standard library would use it
> to implement sparse_bitset, hash functions, perhaps the PRNG, avoiding
> the undefined behaviours of signed ints.  It would be useful to expose
> for users as well even without other support, which could be added
> incrementally (if ever).

I have no objections to adding 'uint'.

I have been working on similar support for various fixed sized int types as a
library, see: <https://github.com/juliensf/mercury-inttypes>.  Based on that, I
think you would need to extend the language to at least support 'uint' literals
as well -- their absence makes things quite awkward.

> I've attached an incomplete module.

Let me complete it a little for you: the C# and Java implementations
are in the int32 module in my library ;-)

Julien.


More information about the reviews mailing list