[m-dev.] diff: Compiler support for stack_layouts

Zoltan Somogyi zs at cs.mu.oz.au
Mon Oct 27 19:09:04 AEDT 1997


> This diff is the compiler support for stack layouts. 
> 
> This support is not yet complete, but is functional enough to
> be useful.
> 
> Could someone review this please?

> Index: compiler/base_type_layout.m
> -		base_type_layout__get_next_label(LayoutInfo0, NextLabel),
> -		base_type_layout__incr_next_label(LayoutInfo0, LayoutInfo1),
>  		Pseudo0 = yes(const(data_addr_const(data_addr(TypeModule,
>  			base_type(info, TypeName, Arity))))),
> +		Label1 = Label0 + 1,

Why is arithmetic being done here? Why don't you use an access predicate?

> Index: compiler/code_gen.m
> ===================================================================
> +	( GC_Method = accurate ->
> +		code_info__get_total_stackslot_count(StackSize0, CodeInfo, _),
> +		( MaybeSuccipSlot = yes(_) -> 
> +			StackSize = StackSize0 + 1 
> +		;
> +			StackSize = StackSize0
> +		),
> +		code_util__make_proc_label(ModuleInfo, PredId, ProcId,
> +			ProcLabel),
> +		continuation_info__add_proc_info(proc(PredId, ProcId),
> +			ProcLabel, StackSize, CodeModel, MaybeSuccipSlot,
> +			ContInfo1, ContInfo)
> +	;
> +		ContInfo = ContInfo1
> +	),

You should not recalculate the stack frame size. The results may be incorrect
if some slots are used by tracing info.

> -% XXX What about redirected labels? Since we just traverse the call
> -% instructions looking for return addresses, it's possible that we will 
> -% store the information about a code address that actually occurs
> -% outside the procedure.
> -% We might want to check that code addresses are inside the procedure,
> -% and keep seperate any that aren't until we can find out where they
> -% really belong. Then again, maybe they aren't needed at all, since this
> -% information is likely to be the same elsewhere.
> -% Check this out - it's likely that tredirections aren't introduced
> -% until optimization.

In case you are still wondering, call return redirections are done by
jumpopt.m. Disabling the forwarding of returns to different procedures
would be quite easy.

> +	%
> +	% Information for any procedure, includes information about the
> +	% procedure itself, and any internal labels within it.
> +	%
> +:- type entry_layout_info

The name somewhat misleading here; it should probably have "proc" somewhere
in it.

> +		entry_layout_info(
> +			proc_label,	% the proc label
> +			int,		% number of stack slots
> +			code_model,	% which stack is used
> +			maybe(int),	% location of succip on stack
> +			map(code_addr, internal_layout_info) 
> +					% info for each internal label

I think the key type should be label, not code_addr. It does not make sense
to me to have layout information about e.g. do_redo. Am I missing something?

> +:- type internal_layout_info
> +	--->
> +		internal_layout_info(
> +			code_addr,	% what label is this associated with
> +			maybe(continuation_label_info)
> +		).

Ditto. Also, why is the key repeated here?

Ditto for continuation_info.

> +	%
> +	% Information for a label that is a continuation.
> +	%
> +	% Different calls can assign slightly
> +	% different liveness annotations to the labels after the call.
> +	% (Two different paths of computation can leave different
> +	% information live).
> +	% We take the intersection of these annotations.  Intersecting
> +	% is easy if we represent the live values and type infos as
> +	% sets.

Please give an example how two different computation paths can leave different
information alive and yet still end up at the same label. Offhand, I can't
think of any.

State that the reason why taking the intersection is correct is that if
something is not live on one path, the code following the label is guaranteed
not to depend on it.

> +	{ Proc = c_procedure(_, _, _, PredProcId, Instrs) },
> +        continuation_info__process_instructions(PredProcId, Instrs),
> +	continuation_info__process_llds(Procs).

spaces => tab

> +	%
> +	% Process the list of instructions for this proc, adding
> +	% all internal label information to the continuation_info.
> +	%
> +continuation_info__process_instructions(PredProcId, Instructions) -->
> +
> +		% Get all the continutation info from the call instructions

Spelling error.

> +		% Insert all labels into the internals
> +	{ list__filter_map(GetAllInternalLabels, Instructions, Labels) },
> +	continuation_info__add_non_continuation_labels(Labels),
> +	continuation_info__get_internal_info(InternalInfo),
> +	continuation_info__add_internal_info_to_proc(PredProcId, InternalInfo)
       .

The comment is misleading

> +	%
> +	% Add a the info for this proc (an entry_layout) to the
> +	% continuation_info.
> +	%

a the

> +continuation_info__add_proc_info(PredProcId, ProcLabel, StackSize, CodeMode
       l,
> +		SuccipLocation, ContInfo0, ContInfo) :-
> +		
> +	map__init(InternalMap),
> +	EntryLayout = entry_layout_info(ProcLabel, StackSize, CodeModel,
> +		SuccipLocation, InternalMap),
> +	continuation_info__add_proc_layout(PredProcId, EntryLayout,
> +		ContInfo0, ContInfo).

Why is the internal map empty here?

> +continuation_info__add_non_continuation_labels(Labels) -->
> +	continuation_info__get_internal_info(InternalInfo0),
> +	{ list__foldl(continuation_info__maybe_add_label, Labels, InternalInfo
       0, InternalInfo) },
> +	continuation_info__set_internal_info(InternalInfo).

Long line.

> +	%
> +	% Add a label to the internals, if it isn't already there.
> +	%
> +:- pred continuation_info__maybe_add_label(label,
> +		map(code_addr, internal_layout_info),
> +		map(code_addr, internal_layout_info)).
> +:- mode continuation_info__maybe_add_label(in, in, out) is det.
> +continuation_info__maybe_add_label(Label, InternalMap0, InternalMap) :-
> +	( map__contains(InternalMap0, label(Label)) ->
> +		InternalMap = InternalMap0
> +	;
> +		Internal = internal_layout_info(label(Label), no),
> +		map__det_insert(InternalMap0, label(Label), 
> +			Internal, InternalMap)
> +	).

ensure_label_is_present would be a better predicate name.

> +	%
>  	% Process a single continuation label.
> -
> +	%
>  :- pred continuation_info__process_internal_info(pair(code_addr,
>  		list(liveinfo)), continuation_info, continuation_info).
>  :- mode continuation_info__process_internal_info(in, in, out) is det.

Process how?

> +	%
> +	% Merge the continuation label information of two labels.
> +	%
> +	% If there are two continuation infos to be merged, we take
> +	% the intersection -- we can't have data that is live if you
> +	% return to this label from one piece of code that isn't live
> +	% if you return from another.
> +	%

If the stuff after -- is correct, then you don't need the intersections,
because then the two pairs of sets are always equal.

> +continuation_info__add_proc_layout(PredProcId, EntryLayout, 
> +		ContInfo0, ContInfo) :-
> +	ContInfo0 = continuation_info(ProcLayoutMap0, B),
> +	map__det_insert(ProcLayoutMap0, PredProcId, EntryLayout, ProcLayoutMap
       ),
> +	ContInfo = continuation_info(ProcLayoutMap, B).

B is not a meaningful name. Similarly for As below.

> -:- pred continuation_info__set_internal_info(list(internal_layout_info),
> -		continuation_info, continuation_info).
> +	%
> +	% Initialize the internal info.
> +	%
> +:- pred continuation_info__initialize_internal_info(
> +	continuation_info, continuation_info).
> +:- mode continuation_info__initialize_internal_info(in, out) is det.
> +
> +continuation_info__initialize_internal_info(ContInfo0, ContInfo) :-
> +	ContInfo0 = continuation_info(A, _),
> +	map__init(Internals),
> +	ContInfo = continuation_info(A, Internals).

Why do you need to initialize the second field independently of the first?

> --- llds.m	1997/10/12 13:32:31	1.211
> +++ llds.m	1997/10/21 01:54:18
> @@ -425,7 +426,8 @@
>  	;	float_const(float)
>  	;	string_const(string)
>  	;	code_addr_const(code_addr)
> -	;	data_addr_const(data_addr).
> +	;	data_addr_const(data_addr)
> +	;	label_entry(label).

The argument here provides no information that is associated with the label.
Why?

>  :- type data_addr
>  	--->	data_addr(string, data_name).
> @@ -433,8 +435,11 @@
>  
>  :- type data_name
>  	--->	common(int)
> -	;	base_type(base_data, string, arity).
> +	;	base_type(base_data, string, arity)
>  			% base_data, type name, type arity
> +	;	general(string).	
> +			% any sort of data, string provides a name
> +			% for it

I don't like this approach; we should define a specific data_name for each
use. In fact, in a similar case, I was thinking that there are arguments
in favor of removing the c_code LLDS instruction altogether (e.g. the fact
that tracing is now done via c_code means that a program compiled with
tracing cannot benefit from value numbering).

> --- llds_out.m	1997/10/12 13:32:34	1.59
> +++ llds_out.m	1997/10/22 02:44:56
> @@ -2665,6 +2696,10 @@
>  	% we need to cast them here to avoid type errors
>  	io__write_string("(const Word *) &"),
>  	output_data_addr(BaseName, VarName).
> +output_rval_const(label_entry(Label)) -->
> +	io__write_string("ENTRY("),
> +	output_label(Label),
> +	io__write_string(")").

Have you considered forwarding this case to the existing code that writes
out labels e.g. for return addresses in calls?

> Index: compiler/notes/compiler_design.html
> ===================================================================
> RCS file: /home/staff/zs/imp/mercury/compiler/notes/compiler_design.html,v
> retrieving revision 1.7
> diff -u -r1.7 compiler_design.html
> --- compiler_design.html	1997/09/23 16:48:43	1.7
> +++ compiler_design.html	1997/09/29 04:18:58
> @@ -618,6 +618,10 @@
>    the functors of a given type. The base_type_layout and base_type_functors
>    structures of each declared type constructor are added to the LLDS.
>  
> +<li> stack_layout.m generates the stack_layout structures that for
> +  accurate garbage collection. Tables are created from the data
> +  collected in continuation_info.m.
> +

that for

> % Data Stucture: stack_layouts
> %
> % For each procedure,
> % 	mercury_data__stack_layout__mercury__<proc_label>
> % containing:
> %	code address		(Code *) - address of entry
> % 	number of stack slots	(Integer) 
> % 	code_model		(Integer) actually, type MR_Code_Model 
> % 					0 = DET, 1 = NONDET
> % 	succip stack location	(Integer) actually, type MR_LIVE_LVAL

Is there a special value meaning no succip in stack frame? Or is any out
of range integer supposed to mean this?

> % For each internal label in a procedure (that is a continuation point)

Not all internal labels are continuation points. In addition, for tracing
you want info about the entry and exit points, which are in general not
continuation points.

> % 	mercury_data__stack_layout__mercury__<proc_label>_i<label number>
> % containing:
> %	code address		(Code *) - address of label
> %	procedure info		(Word *) - pointer to procedure stack layout
> %	number of live vars	(Integer)
> %	live data locations and (Word *) - pointer to vector of 
> %		types			MR_LIVE_LVAL and MR_LIVE_TYPE pairs
> %	type parameters		(Word *) - pointer to vector of 
> %					MR_LIVE_LVAL
> %
> % If the number of live vars is 0, there could be two explanations. The
> % continuation label might actually have no live data, or (more likely)
> % it isn't a continuation label at all.
> %
> % If you need to know the live variables at non-continuation labels,
> % this code will not be sufficient.

But the entry and exit point info is what Erwan needs.

You should document why you don't need a slot to indicate how many type
params there are, and that that part of the vector is not yet present.

> :- module stack_layout.
> 
> :- interface.
> 
> :- import_module hlds_module.
> 
> :- pred stack_layout__generate_llds(module_info, list(c_module)).
> :- mode stack_layout__generate_llds(in, out) is det.
> 
> :- implementation.
> 
> :- import_module llds, globals, options, continuation_info, llds_out.
> :- import_module base_type_layout.
> :- import_module assoc_list, bool, string, int, list, map, std_util, require
       .
> :- import_module set.
> 
> :- type stack_layout_info 	--->	
> 	stack_layout_info(
> 		string,		% module name
> 		bool,		% are 2-bit tags available?

If not, does this mean no tags or 3-bit tags?

> 		int,		% next available label 

I don't think this comment can be right.

> 		string,		% string representation of the proc_label
> 				% being currently examined

It is the proc, not the proc_label that is being examined.

> :- pred stack_layout__construct_entry_layout(proc_label, int, code_model, 
> 		maybe(int), stack_layout_info, stack_layout_info).
> :- mode stack_layout__construct_entry_layout(in, in, in, in, in, out) is det
       .
> stack_layout__construct_entry_layout(ProcLabel, StackSlots, CodeModel, 
> 		SuccipLoc) -->
> 	{
> 		SuccipLoc = yes(Location0)
> 	->
> 		Location = Location0
> 	;
> 			% XXX
> 			% Assume a dummy location of -1 if there is
> 			% no succip on the stack. The runtime system
> 			% might be able to work around this, depending
> 			% upon what it is using the stack layouts for.
> 		Location = -1
> 	},

What is the XXX for?

"Assume" is the wrong word here.

> 	{
> 		CodeModel = model_det,
> 		SuccipLval = stackvar(Location)
> 	;
> 		CodeModel = model_semi,
> 		SuccipLval = stackvar(Location)
> 	;
> 		CodeModel = model_non,
> 		SuccipLval = framevar(Location)
> 	},
> 	{ stack_layout__construct_lval(SuccipLval, SuccipRval) },

I suggest stack_layout__construct_lval => stack_layout__represent_lval

and similarly in other pred names.

> 	{ StackSlotsRval = const(int_const(StackSlots)) },
> 	{ CodeAddrRval = const(code_addr_const(label(local(ProcLabel)))) },
> 	stack_layout__construct_code_model(CodeModel, CodeModelRval),
> 	{ MaybeRvals = [yes(CodeAddrRval), yes(StackSlotsRval), 
> 		yes(CodeModelRval), yes(SuccipRval)] },
> 	stack_layout__get_module_name(ModuleName),
> 
> 	{ llds_out__get_proc_label(ProcLabel, no, LabelString0) },
> 	{ string__append("stack_layout__mercury__", LabelString0, 
> 		LabelString) },

Why the "no" in llds_out__get_proc_label if you stick on the prefix yourself?

> :- pred stack_layout__construct_internal_layout(internal_layout_info,  
> 		stack_layout_info, stack_layout_info).
> :- mode stack_layout__construct_internal_layout(in, in, out) is det.
> stack_layout__construct_internal_layout(Internal) -->
> 	{ Internal = internal_layout_info(CodeAddr, ContinuationLabelInfo) },
> 	{
> 		ContinuationLabelInfo = yes(continuation_label_info(
> 			LiveLvalSet0, _TVars))
> 	->
> 		LiveLvalSet = LiveLvalSet0
> 	;
> 		% We record no live values here. This might not be
> 		% true, however this label is not being used as a
> 		% continuation, so it shouldn't be relied upon.
> 		
> 		set__init(LiveLvalSet)
> 	},
> 		
> 		% XXX Need to do something with TVars?!?

I presume this is part of what you mean by "not yet complete".

> stack_layout__construct_liveval_pairs(LiveLvals, Rval) -->
> 	list__map_foldl(stack_layout__construct_liveval_pair, LiveLvals, 
> 		RvalsList),
> 	{ list__condense(RvalsList, Rvals) },
> 	stack_layout__get_next_label(Label),
> 	stack_layout__incr_next_label,
> 	{ Rval = create(0, Rvals, no, Label) }.
> 
> 	% Construct a pair of (lval, live_value_type) representations.
> 
> :- pred stack_layout__construct_liveval_pair(pair(lval, live_value_type),
> 		list(maybe(rval)), stack_layout_info, stack_layout_info).
> :- mode stack_layout__construct_liveval_pair(in, out, in, out) is det.
> 
> stack_layout__construct_liveval_pair(Lval - LiveValueType, Rvals) -->
> 	{ stack_layout__construct_lval(Lval, Rval0) },
> 	stack_layout__construct_live_value_type(LiveValueType, Rval1),
> 	{ Rvals = [yes(Rval0), yes(Rval1)] }.

Don't these create rvals lead to an array of pointers, one per live value,
to be constructed for each label, instead of a 2D array?

> stack_layout__construct_live_value_type(var(Type, _Inst), Rval) -->
> 	stack_layout__get_next_label(Label0),
> 	{ base_type_layout__construct_pseudo_type_info(Type, Rval0, Label0,
> 		Label1) },
> 	stack_layout__set_next_label(Label1),
> 		% XXX hack - don't yet write out insts
> 	{ Rval1 = const(int_const(-1)) },
> 	stack_layout__get_next_label(Label2),
> 	stack_layout__incr_next_label,
> 	{ Rval = create(0, [yes(Rval0), yes(Rval1)], no, Label2) }.

Why don't you invent representations for ground and free, and require
that every inst be one of those two for the time being?

I don't like the mixing of get, set and incr, and they are not label
numbers in any case.

> 	% Some things in this module are encoded using a low tag.
> 	% This is not done using the normal compiler mkword, but by
> 	% doing the bit shifting here.
> 	%
> 	% This allows us to use more than the usual 2 or 3 bits, but
> 	% we have to use low tags and cannot tag pointers this way.

Why do you want to use more than 2 or 3 bits, and how?

> 	% Construct a represntation of  the code model.

Spelling.

> :- pred stack_layout__construct_code_model(code_model, rval, stack_layout_in
       fo, 
> 		stack_layout_info).
> :- mode stack_layout__construct_code_model(in, out, in, out) is det.
> stack_layout__construct_code_model(CodeModel, Rval) -->
> 	(
> 		{ CodeModel = model_det },
> 		{ Rval = const(int_const(0)) }
> 	;
> 		{ CodeModel = model_semi },
> 		{ Rval = const(int_const(0)) }
> 	;
> 		{ CodeModel = model_non },
> 		{ Rval = const(int_const(1)) }
> 	).
> 
> :- pred stack_layout__code_model(code_model::in, int::out) is det.
> stack_layout__code_model(model_det, 0).
> stack_layout__code_model(model_semi, 0).
> stack_layout__code_model(model_non, 1).

These look like duplicate code.

Zoltan.



More information about the developers mailing list