[m-rev.] for discussion: undefined behaviours in int.m

Peter Wang novalazy at gmail.com
Fri Oct 21 17:53:29 AEDT 2016


On Fri, 21 Oct 2016 02:56:26 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
> 
> Hi Peter,
> 
> On Thu, 20 Oct 2016, Peter Wang wrote:
> >
> > Not quite the same thing, but perhaps in future the compiler could
> > optionally warn about uses of unchecked operations.  It might be a
> > generalisation of pragma obsolete, like "annotations" in some languages?
> > The user could label predicates as "unchecked" or "unsafe" or anything
> > they like, and the compiler would warn wherever those predicates are used.
> > A pragma or scope would allow the programmer to suppress warnings for
> > some uses, asserting that integer overflow is impossible here, array
> > indexing without bounds checking is safe here, etc.
> 
> Like what we do for impurity?
> 

Not like impurity.

Like pragma obsolete, we could mark predicates which do not check for
overflow, e.g.

    :- pragma unchecked((+)/2).

If the user code makes a call to int.+/2

    :- func twice(int) = int.
    twice(X) = X + X.

the compiler can optionally warn the user that twice/1 calls int.+/2,
which may overflow.  The user can get rid of the warning by replacing
the call to + with a call to det_checked_add, or suppress the warning
because twice/1 cannot overflow in practice:

    :- pragma no_warn_unchecked(twice/1).
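
As a rough sketch of what a checked addition has to do underneath, here is a C version. The name checked_add and its signature are assumptions made for illustration; they are not existing Mercury runtime or library code. The range test is done before the addition, so the signed overflow itself is never executed.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (name and signature are illustrative only).
   Stores a + b in *result and returns true, or returns false without
   performing the addition if the sum would not fit in an int. */
static bool checked_add(int a, int b, int *result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow: leave *result untouched */
    }
    *result = a + b;    /* safe: the sum is known to be in range */
    return true;
}
```

A det_checked_add-style wrapper would presumably throw an exception where this sketch returns false.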

The user may also wish to be told about uses of other predicates that
require some care (usually named "unsafe" already), so the next step would
be to allow arbitrary labels/annotations to be attached to predicates or
functions, e.g.

    % in int.m
    :- pragma annotate((+)/2, [unchecked]).

    % in array.m
    :- pragma annotate(unsafe_lookup/3, [unsafe]).

Then the user can ask for warnings about uses of so-labelled predicates,
e.g. --warn-on-use=unchecked --warn-on-use=unsafe

And suppress some warnings:

    :- pragma no_warn(twice/1, [unchecked]).

pragma obsolete would just be a shorthand for the "obsolete" annotation.
You could suppress warnings about calls to obsolete predicates.

Perhaps predicates which use the current input/output streams could have
been annotated with "implicit_streams", as a quick version of the
recently added style check?

Well, it's just an idea.


> 
> > I have read that C compilers take advantage of the undefined behaviour
> > of signed integer overflow to allow optimisations that would be invalid
> > with wrapping.  I don't know how much we would miss out on by using
> > unsigned operations to get defined overflow behaviour.  I think that
> > wrapping on signed ints is not that useful, though.
> 
> Is there any reason not to require 2's complement integer representation for
> Mercury ints?  Both C# and Java require this (so that's what you're getting for
> those backends anyway).  In C grades, we are assuming that we have a 2's
> complement representation in a number of spots anyway.

We'd need to use unsigned integer operations to get _defined_ overflow
behaviour, or rely on compiler switches like gcc -fwrapv or gcc
-fno-strict-overflow.  Otherwise, the C compiler can assume that signed
integer overflow does not happen, and generate code accordingly.

I don't know if there are any performance implications for us.
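
To make the unsigned-operations option concrete, a minimal C sketch follows. The helper name wrapping_add is an assumption for illustration, not code from the Mercury runtime. The addition is carried out on unsigned values, where overflow is defined to wrap modulo 2^N. Converting an out-of-range result back to int is implementation-defined rather than undefined; gcc documents it as reduction modulo 2^N, which gives the two's complement result.

```c
#include <limits.h>

/* Hypothetical helper (name is illustrative only).
   Adds two ints with wrap-around instead of undefined behaviour:
   the sum is computed on unsigned operands, then converted back.
   The final conversion is implementation-defined when the value is
   out of range for int; two's complement wrapping is assumed here. */
static int wrapping_add(int a, int b)
{
    return (int) ((unsigned int) a + (unsigned int) b);
}
```

With this, wrapping_add(INT_MAX, 1) yields INT_MIN on such implementations without relying on -fwrapv.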


> > I have a related proposal: add unsigned_int (or 'uint') to the library,
> > and only the library to begin with.  The standard library would use it
> > to implement sparse_bitset, hash functions, perhaps the PRNG, avoiding
> > the undefined behaviours of signed ints.  It would be useful to expose
> > for users as well even without other support, which could be added
> > incrementally (if ever).
> 
> I have no objections to adding 'uint'.

Great, it looks like there are no objections.

> 
> I have been working on similar support for various fixed sized int types as a
> library, see: <https://github.com/juliensf/mercury-inttypes>.  Based on that, I
> think you would need to extend the language to at least support 'uint' literals
> as well -- their absence makes things quite awkward.

I saw that.  The difference is that uint is defined to be the same width
as int, and you can always "cast" between int and uint.  The absence of
uint literals should not be quite as awkward as for the fixed sized ints.

> > I've attached an incomplete module.
> 
> Let me complete it a little for you: the C# and Java implementations
> are in the int32 module in my library ;-)

Thanks ;)

Peter

