[mercury-users] Tagged datatypes and boxing
Richard A. O'Keefe
ok at hermes.otago.ac.nz
Tue Sep 28 09:02:12 AEST 1999
The debate is between me and Peter Scachte.
> Open-endedness is precisely what I want at this stage in the discussion.
> The whole point is that I don't *know* what everyone will find appropriate
> for constraining variables.
>
> type smallprime = X:integer where 0 < X & X < 1024 & prime(X).
>
> Yes, why not?
I suspect we are talking about different things here. I believe you
are just talking about admissible values for a type, and are
suggesting that it should be possible to specify types in a more
expressive way than is possible with only regular types. This is
something Lee Naish has advocated for quite a while, and, details
aside, seems like a good idea to me. The devil is in the details,
though.
Agreed. To be honest, the ad hoc syntax is mine, but the *idea* is
exactly Lee Naish's idea.
What I was referring to was just *representation*.
Your use of Pascal syntax in your example confused me,
because subranges in Pascal have *nothing* to do with representation.
Come to that, they don't have anything to do with representation in
Ada either. If I write
subtype Nat8 is Integer range 0 .. 2**8 - 1;
then what I get is a subset of the *values* of Integer (0 to 255)
with *exactly the same representation* as Integer. You can change
that with an entirely separate declaration
for Nat8'Size use 8;
which only makes sense *given* the range bounds.
You could probably
represent the primes between 0 and 1024 with 7 or 8 bits, but you'd be
crazy to do so.
*BUT* the compiler would be free to exploit the bounds 0..1023 (as
1024 is not a prime) to represent the primes between 0 and 1024 with
TEN bits.
As you observe, you could represent the odd integers
between 1 and 511 in 8 bits, but as soon as you tried to do math with
them you'd be sorry you did.
Let's face it, you will deeply regret *ANY* packing if you try to do
much, even loads and stores, with the packed data. There is, after
all, a reason packing fell out of disfavour in Pascal: the code
bloats up and slows down like you wouldn't believe.
You *don't* do arithmetic with odd integers between 1 and 511
any more than modern machines do arithmetic with bytes. When you
fetch a byte (or other packed value) from memory, it is widened to
a more convenient format (typically 32 bits); when you store a
byte (or other packed value) into memory, it is narrowed again
(and may be bounds checked).
This is partly the point of the type/subtype distinction.
Suppose I have
:- integer subtype (X: odd8 :- 0 < X < 512, X mod 2 =:= 1).
and
X, Y, Z: odd8
What is the type of X+Y+Z? Is it odd8? No way! It's *integer*.
(A compiler that does interval analysis, and such compilers exist,
would realise that the subtype is some subset of [3,1533], which
is clearly not odd8.) So the implementation would be
from_odd8(N) where 0 <= N, N < 256 -> N*2+1.
to_odd8(N) where 0 < N, N < 512, X mod 2 = 1 -> (N-1) div 2.
and
(S:odd8) = X+Y+Z
would be implemented as
S = to_odd8(from_odd8(X) + from_odd8(Y) + from_odd8(Z)).
If the compiler will not, for example, pack -500..+500 into 10
bits, which requires nearly as much implementation complexity
as the odd8 example, then the *programmer* will have to.
You could represent the numbers between
1024 and 1040 in 4 bits, but even that seems rather dubious if you're
going to be doing much math with them.
Once again, you *DON'T* do calculations on packed forms. Pascal
didn't, modern hardware can't. Even on PDP-10s, the hardware could
only do 18-bit or 36-bit integer arithmetic.
This is why I suggested the
old Pascal-style subranges for specifying small integer
representation: it has all the expressiveness you'd want for that.
Once again, it _didn't_ have that expressiveness in Pascal.
Pascal subrange notation merely describes a set of values.
To get packing, you have to use the 'packed' keyword, which
applies to record and array types. And there are all sorts of
limitations on packed fields/array elements in Pascal, for example
(and for good reason) they cannot be passed as 'var' parameters.
Ada *does* allow its equivalent, but expressly involves conversion
to general representation on entry and back to packed form on exit.
Even C notionally doesn't do arithmetic *directly* on anything
smaller than an 'int'.
Even if you argue for open-ended typing information, I still think it
makes sense to have a specialized syntax for specifying a
representation.
My point was that the "specialised syntax" can *be* that subset of
the open-ended typing information which the compiler is capable
of recognising as useful for the purpose. One compiler might
recognise nothing; another might recognise constant bounds; another
might recognise modular information. Why raise the limitations of
some compilers to the level of special-purpose straitjacket syntax?
--------------------------------------------------------------------------
mercury-users mailing list
post: mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the users
mailing list