[mercury-users] Tagged datatypes and boxing

Fergus Henderson fjh at cs.mu.OZ.AU
Fri Sep 24 00:27:05 AEST 1999


On 23-Sep-1999, Ralph Becket <rbeck at microsoft.com> wrote:
> 
> I was very surprised when my compression program that *shouldn't* be doing
> any structure creation turned out to be doing just that in spades.
> 
> A swift application of mprof later and it turns out that the culprits are a
> couple of predicates in the inner loop that return tagged integers (there are
> only two possible tags in each case).  I had naively expected the compiler to
> steal a bit for this purpose, but of course that's unreasonable.  So the
> compiler ended up boxing these things and, in the absence of compile-time GC,
> the program ends up putting all this stuff on the heap.

Actually the compiler doesn't do that optimization even when it is
reasonable, e.g. if the type in question is `char' or something else
with sufficient bits free.
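
To make the trade-off concrete, here is a minimal C sketch of the
bit-stealing representation Ralph had in mind.  It is only an illustration
under stated assumptions, not what the Mercury compiler generates: the tag
names, the `word' typedef, and the helper functions are all hypothetical,
and the payload is assumed to fit in 31 bits so that one bit of a machine
word is free for the tag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags for a two-constructor type wrapping a small integer. */
enum tag { TAG_LITERAL = 0, TAG_MATCH = 1 };

/* One machine word: large enough to hold either a pointer or a tagged int. */
typedef uintptr_t word;

/* Pack the tag into the lowest bit and the payload into the bits above it.
   No heap cell is allocated; the whole value lives in a register. */
static word make_tagged(enum tag t, uint32_t payload)
{
    assert(payload <= (UINT32_MAX >> 1));   /* assumption: 31-bit payload */
    return ((word)payload << 1) | (word)t;
}

/* Recover the tag from the lowest bit. */
static enum tag get_tag(word w)
{
    return (enum tag)(w & 1u);
}

/* Recover the payload by shifting the tag bit away. */
static uint32_t get_payload(word w)
{
    return (uint32_t)(w >> 1);
}
```

With the boxed representation, each such value instead costs a heap cell
holding the integer plus a tagged pointer to that cell, which is where the
allocation in the inner loop comes from.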

> Typically I rarely require all 32 bits (or whatever) that I have in an int
> and I'd like to be able to tell the compiler that it can use the rest for
> tagging.  Call it a `short' or somesuch.  24 bits of payload should be
> ample.
> 
> In fact, I have another suggestion.  Way back when in my systems days, it was
> a source of constant aggravation that the C standard didn't specify the number
> of bits/bytes occupied by the various integer types.  There are cases where
> this sort of knowledge really matters.  I think I would like to see the
> following types in Mercury: int8, int16, int24, int32, int64, and let plain
> int be implementation dependent (e.g. 30 bits + 2 for tagging).  Use of the
> intXX types would be subject to the standard mod 2^XX behaviour, while
> the int library could specify upper and lower bounds and the number of bits
> used as it does at present.

Yes, we (the Mercury developers) discussed this a while ago, and came
to exactly the same conclusions.  However, this is not that high on
our list of priorities.  Probably it will wait until after we have
a typeclass-ified standard library, since otherwise the various overloadings
for `+' would often require explicit type declarations to resolve,
which is a bit of a pain if you're using type inference.
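
For readers unsure what "mod 2^XX behaviour" would mean for a signed
fixed-width type, here is a rough C sketch for the 24-bit case.  The names
int24_wrap and int24_add are hypothetical, and two's-complement wraparound
is an assumption about the proposal rather than anything that has been
specified for Mercury.

```c
#include <stdint.h>

/* Reduce an arbitrary integer to the signed 24-bit range
   [-2^23, 2^23 - 1], wrapping mod 2^24. */
static int32_t int24_wrap(int64_t x)
{
    /* Keep the low 24 bits; unsigned conversion is well defined in C. */
    uint32_t low = (uint32_t)((uint64_t)x & 0xFFFFFFu);

    /* If bit 23 is set the value is negative: subtract 2^24. */
    return (low & 0x800000u) ? (int32_t)low - 0x1000000 : (int32_t)low;
}

/* Hypothetical 24-bit addition: compute in a wider type, then wrap. */
static int32_t int24_add(int32_t a, int32_t b)
{
    return int24_wrap((int64_t)a + (int64_t)b);
}
```

Under these assumptions, adding 1 to the largest 24-bit value (8388607)
wraps around to the smallest (-8388608), just as it would for a 32-bit int
at its own limits.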

> I'm aware that this would complicate the compiler somewhat, but I have a
> funny feeling that it would pay all sorts of dividends, not least in the
> optimisation department.

It would in fact complicate not only the compiler but also the runtime
system: in particular the RTTI support needed for the library predicates
std_util__deconstruct, std_util__arg, store__arg_ref, and store__set_arg,
as well as the debugger and the accurate garbage collector.
So it's a non-trivial task.  I agree that it would be a good idea.
It's just a question of how we prioritize our limited resources.

Cheers,
	Fergus.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.


