[m-rev.] diff: improve boxing and unboxing in IL backend.

Tyson Dowd trd at cs.mu.OZ.AU
Fri Aug 3 05:24:24 AEST 2001


On 03-Aug-2001, Fergus Henderson <fjh at cs.mu.OZ.AU> wrote:
> On 26-Jul-2001, Tyson Dowd <trd at cs.mu.OZ.AU> wrote:
> > Simplify boxing, unboxing and casts in the IL backend.
> > 
> > compiler/mlds_to_il.m:
> > 	Previously we generated calls to out-of-line procedures written
> > 	in C++ (and then they forwarded to ones written in IL) for
> > 	various reasons mostly to do with churn of the .NET
> > 	implementation of boxing.
> > 
> > 	Now we generate much better code:
> > 		- "unbox" and "ldobj" for unboxing
> > 		- "box" for boxing
> > 		- "castclass" for casts between reference types
> > 		- nothing for casts to the same type
> > 		- for casting from mercury types to value types we
> > 		  unload from the MR_Word, and unbox the value
> 
> That's wrong for the --high-level-data case.
> 
> > Index: compiler/mlds_to_il.m
> > ===================================================================
> > RCS file: /home/mercury1/repository/mercury/compiler/mlds_to_il.m,v
> > retrieving revision 1.55
> > diff -u -r1.55 mlds_to_il.m
> > --- compiler/mlds_to_il.m	2001/07/20 14:14:02	1.55
> > +++ compiler/mlds_to_il.m	2001/07/26 10:00:15
> > @@ -1966,53 +1966,82 @@
> >  
> >  		% if we are casting from an unboxed type, we should box
> >  		% it first.
> > -		% XXX should also test the cast-to type, to handle the
> > -		% cases where it is unboxed.
> >  unaryop_to_il(cast(Type), Rval, Instrs) -->
> >  	DataRep =^ il_data_rep,
> >  	{ ILType = mlds_type_to_ilds_type(DataRep, Type) },
> > -	{ 
> > -		Rval = const(Const),
> > -		RvalType = rval_const_to_type(Const),
> > -		RvalILType = mlds_type_to_ilds_type(DataRep, RvalType),
> > -		not already_boxed(RvalILType)
> > -	->
> > -		Instrs = node([call(convert_to_object(RvalILType)),
> > -			castclass(ILType)])
> > +	{ rval_to_type(Rval, RvalType) },
> 
> That change is very dangerous, since rval_to_type/2 is basically broken.
> It only handles a subset of all rvals.  For the others it calls error/1.
> 
> Previously, the code here only called rval_const_to_type/1, which is safe.
> 
> Your change introduces a latent bug.  I don't know off-hand if the bug
> will be actually exercised in the current compiler, but even if not,
> it may be triggered later by e.g. bug fixes to the MLDS code generator
> that cause it to generate more casts.

I'm not quite context switched on this one, but I'll answer as best I
can.

I'd prefer to handle everything in a uniform manner.
I'd prefer to be forced to fix the missing cases of rval_to_type
(although not today) than to go with the old approach that generated
sub-optimal code a lot of the time.

> 
> > +		( already_boxed(RvalILType) ->
> > +			( RvalType = mercury_type(_, user_type) ->
> > +				% XXX we should look into a nicer way to
> > +				% generate MLDS so we don't need to do this
> > +				Instrs = tree__list([
> > +					comment_node(
> > +						"loading out of an MR_Word"),
> > +					instr_node(ldc(int32, i(0))),
> > +					instr_node(ldelem(
> > +						il_generic_simple_type)),
> > +					comment_node(
> > +						"turning a cast into an unbox"),
> > +					convert_from_object(ILType)
> > +				])
> 
> I'm pretty sure that code is wrong for --high-level-data.

If it were every triggered, it would be. 

But --high-level-data should be loading from fields using field rvals,
not cast.

I think this only comes up in low-level-data.

But I am open to suggestions to improve it -- however we need to get the
same behaviour because this was what caused the bug in calculator.m
-- we were unloading characters from MR_Words incorrectly.

> > +unaryop_to_il(box(UnboxedType), _, Instrs) -->
> >  	DataRep =^ il_data_rep,
> > -	{ ILType = mlds_type_to_ilds_type(DataRep, Type) },
> > -	{ already_boxed(ILType) ->
> > -		Instrs = node([isinst(il_generic_type)])
> > +	{ UnboxedILType = mlds_type_to_ilds_type(DataRep, UnboxedType) },
> > +	{ already_boxed(UnboxedILType) ->
> > +			% It is already boxed, so we don't need
> > +			% to do anything.
> > +			% It would be good if we didn't generate 
> > +			% such code, but it's no big deal
> > +		Instrs = empty
> 
> What do you mean by the last sentence in the comment?
> 
> If you mean that we shouldn't represent conversions like this using the
> MLDS box/1 rval, then I disagree.  All type conversions in the MLDS should
> be explicit, IMHO, (it would not be a good idea to allow implicit upcasts
> in the MLDS) so we need some way of representing such conversions.
> And that is what the MLDS box/1 rval is for: representing conversions
> to mlds__generic_type.
> 
> Maybe the problem is that the box/1 and unbox/1 rvals weren't documented
> well.  The names suggest that they always allocate a new value on the
> heap, but that is not the case.  I've just committed a patch to mlds.m
> which improves the documentation for those.  I used the names box/1 and unbox/1
> since they were concise and conveyed the general intention, but maybe they
> are a little misleading -- suggestions for alternative names would be welcome.

I think I was just a little mislead, my comment is unjustified.

Tyson.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list