[m-dev.] undefined behaviour

Peter Wang novalazy at gmail.com
Wed Sep 28 17:30:39 AEST 2016


Hi,

I have been trying out Address Sanitizer and Undefined Behaviour
Sanitizer.  So far the main warnings have to do with left shifts.

    1. the left operand is negative

    2. the right operand is equal to bits_per_int-1

A lot of warnings are thrown up for sparse_bitset.m, which uses ints as
bit vectors.  Other UB are from hashing functions and lookup switches,
again treating ints as bit vectors.

e.g. left operand negative

    !:HashVal = !.HashVal `xor` (!.HashVal `unchecked_left_shift` 5),

e.g. right operand equal to bits_per_int-1

    if ((((mercury__char_scalar_common_5[0])[(((mercury__char__Char_2 - (MR_Integer) 48)) >> (MR_Integer) 6)])
	& (((MR_Integer) 1 << ((((mercury__char__Char_2 - (MR_Integer) 48)) & (MR_Integer) 63))))))

All of these basically want unsigned shifts instead.
How shall we do that?


unchecked_left_shift is currently defined as:

    % unchecked_left_shift(X, Y) is the same as X << Y
    % except that the behaviour is undefined if Y is negative,
    % or greater than or equal to the result of `bits_per_int/1'.

With the current C implementation, it would need to be amended to be
undefined if X < 0, or Y < 0, or Y >= bits_per_int-1.


I think we should provide checked arithmetic predicates in int.m, e.g.

    :- pred checked_abs(int::in, int::out) is semidet.
    :- pred checked_add(int::in, int::in, int::out) is semidet.
    :- pred checked_mul(int::in, int::in, int::out) is semidet.

They can be implemented with gcc/clang builtin functions like
__builtin_add_overflow, falling back to slower algorithms where
necessary.

Peter

P.S. I added the following to Mmake.params to enable the sanitizers,
building with mmake --use-mmc-make.  Plain mmake requires more
variables.  Only tested hlc.gc.

EXTRA_MCFLAGS += --cflag -fsanitize=address,undefined --ld-flag -fsanitize=address,undefined --ld-flag -pthread


More information about the developers mailing list