[m-dev.] for review: change semantics of integer shifts

Fergus Henderson fjh at cs.mu.OZ.AU
Mon Mar 15 12:10:53 AEDT 1999


On 15-Mar-1999, Simon Taylor <stayl at cs.mu.OZ.AU> wrote:
> 
> Change the bit shift operators to perform checking to avoid
> undefined behaviour.
...
> --- NEWS	1999/03/14 23:31:32	1.2
> +++ NEWS	1999/03/14 23:51:24
> @@ -34,3 +34,11 @@
>    See the "Interactive query commands" subsection of the "Debugger commands"
>    section of the "Debugging" chapter of the Mercury User's Guide for details.
>  
> +* The semantics of `int:>>/2' and `int:<</2' has changed.
> +  The results are undefined only if the left operand of
> +  `int:>>/2' is negative, or if both operands of `int:<</2' are negative.
> +  The new functions `int:unchecked_left_shift/2' and
> +  `int:unchecked_right_shift/2' implement the old semantics,
> +  using the C shift operators.

I think it would be much better if the results of `>>' and `<<' were always
defined, even for the case where both operands are negative.  For `int',
the word size and whether or not overflow are checked should both be
implementation-defined, but apart from that, if you're using ordinary
functions from module `int', there should be no undefined behaviour.
(Of course, using `unchecked_*' functions is another story...)

Also, I don't think it's quite right to say that the semantics of `>>'
and `<<' have changed.  Really we've just clarified the previously
unclear documentation, and fixed the implementation so that it correctly
implements the documentation.  If you say that the semantics have changed,
people might think that there is some compatibility issue, but there isn't.
So I suggest just saying that the implementations of `>>' and `<<' have
been changed to handle negative shift counts and shifts counts greater
than the word size, and that for efficiency we also provide unchecked
versions that (like the previous implementations) do not check this.

>  	% left shift
> +	% The result is undefined if both operands are negative.
> +	% If the right operand is negative, a right shift is performed.
> +	% If the right operand is greater than or equal to the result of
> +	% `int__bits_per_int/1', 0 is returned.
>  :- func int << int = int.
>  :- mode in  << in  = uo  is det.

As explained above, I think this should be reworded to define the
semantics in all cases.

> +	% left shift
> +	% The result is undefined if the right operand is greater
> +	% than or equal to the result of `int__bits_per_int/1'.
> +:- func unchecked_left_shift(int, int) = int.
> +:- mode unchecked_left_shift(in, in) = uo is det.

s/is greater/is negative, or is greater/

>  	% (arithmetic) right shift
> +	% The result is undefined if the left operand is negative.
> +	% if the right operand is negative, a left shift is performed.
> +	% If the right operand is greater than or equal to the result of
> +	% `int__bits_per_int/1', 0 is returned.
>  :- func int >> int = int.
>  :- mode in  >> in  = uo  is det.

As explained above, I think this should be reworded to define the
semantics in all cases.

> +	% (arithmetic) right shift
> +	% The result is undefined if either operand is negative,
> +	% or if the right operand is greater than or equal to the
> +	% result of `int__bits_per_int/1'.
> +:- func unchecked_right_shift(int, int) = int.
> +:- mode unchecked_right_shift(in, in) = uo is det.

The result should not be undefined if the left operand is negative.

I know that in ANSI C the result of `>>' is implementation defined (not
undefined) if the left operand is negative.  But that is only because
ANSI C was supposed to work efficiently even on machines using one's
complement arithmetic.  All modern machines use two's complement, and
so, I believe, all C implementations on these machines define `>>' for
negative left operands in the same way.  Thus there is no good reason for
Mercury to make `>>' or even `unchecked_right_shift' have undefined
behaviour in the case where the left hand operand is negative.

Instead, I would suggest just adding some test cases to the test suite
to check the behaviour of `>>' and/or `unchecked_left_shift' with
negative left operands.  These tests will ensure that in the unlikely
event that we should ever try to port Mercury to a C compiler which
defines the semantics of `>>' for negative left operations differently,
we will discover the problem.

This is the same technique we used for ensuring that Mercury's '//'
integer division can be safely implemented using C's '/' operator,
even though the current C standard does not define the behaviour of '/'
for negative operands.  (Helpfully, the forthcoming new C standard C9X
does define the behaviour of '/' in that case.)

> +
> +	% XXX This definition is here for bootstrapping.
> +	% After the corresponding change to the compiler is installed,
> +	% the compiler ignore this clause, and will internally replace
> +	% the body of unchecked_left_shift with a `recursive' call.
> +	% See make_hlds:add_builtin.
> +unchecked_left_shift(X, Y) = X << Y.

s/compiler ignore/compiler will ignore/

> +X << Y = Z :-
> +	(
> +		int__bits_per_int(IntBits),
> +		Y > IntBits
> +	->
> +		Z = 0
> +	;
> +		Y >= 0
> +	->
> +		Z = unchecked_left_shift(X, Y)
> +	;
> +		Z = unchecked_right_shift(X, - Y)
> +	).

The test `Y > IntBits' should be `>=' not `>'.
This code should also handle the case where Y < IntBits.

Thus I suggest

	X << Y = Z :-
		int__bits_per_int(IntBits),
		( Y >= 0 ->
			( Y >= IntBits ->
				Z = 0
			;
				Z = unchecked_left_shift(X, Y)
			)
		;
			( Y =< -IntBits ->
				Z = (if X >= 0 then 0 else -1)
			;
				Z = unchecked_right_shift(X, -Y)
			)
		).

> +X >> Y = Z :-
> +	(
> +		int__bits_per_int(IntBits),
> +		Y > IntBits
> +	->
> +		Z = 0
> +	;
> +		Y >= 0
> +	->
> +		Z = unchecked_right_shift(X, Y)
> +	;
> +		Z = unchecked_left_shift(X, - Y)
> +	).

Likewise.  Also, arithmetic right shift should preserve the sign bit,
so for a right-shift of >= IntBits, if X < 0 then the result should
be -1 (all bits set) rather than zero.

	X >> Y = Z :-
		int__bits_per_int(IntBits),
		( Y >= 0 ->
			( Y >= IntBits ->
				Z = (if X >= 0 then 0 else -1)
			;
				Z = unchecked_right_shift(X, Y)
			)
		;
			( Y =< -IntBits ->
				Z = 0
			;
				Z = unchecked_left_shift(X, -Y)
			)
		).

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.



More information about the developers mailing list