[m-dev.] for review: nondet pragma C codes (part 2 of 2)

Fergus Henderson fjh at cs.mu.oz.au
Fri Jan 9 20:01:31 AEDT 1998


On 09-Jan-1998, Zoltan Somogyi <zs at cs.mu.oz.au> wrote:
> 
> +:- mode mercury_output_pragma_c_code(in, in, in, in, in, in, di, uo) is det.
...
>  		mercury_output_pragma_c_body_code(Code)
>  	;
>  		{ Pragma = c_code(MayCallMercury, Pred, PredOrFunc, Vars,
> -			VarSet, C_CodeString) }, 
> +			VarSet, PragmaCode) }, 

I'd prefer `Pragma_C_Code' (or perhaps `Pragma_C_Code_Components')
rather than `PragmaCode'.

>  	% to modecheck a pragma_c_code, we just modecheck the proc for 
>  	% which it is the goal.
> -modecheck_goal_expr(pragma_c_code(IsRecursive, C_Code, PredId, _ProcId0, Args0,
> -		ArgNameMap, OrigArgTypes, ExtraPragmaInfo), GoalInfo, Goal) -->
> +modecheck_goal_expr(pragma_c_code(IsRecursive, PredId, _ProcId0, Args0,
> +		ArgNameMap, OrigArgTypes, PragmaCode), GoalInfo, Goal) -->

Ditto.

> -	{ Pragma = pragma_c_code(IsRecursive, C_Code, PredId, ProcId, Args0,
> -			ArgNameMap, OrigArgTypes, ExtraPragmaInfo) },
> +	{ Pragma = pragma_c_code(IsRecursive, PredId, ProcId, Args0,
> +			ArgNameMap, OrigArgTypes, PragmaCode) },
>  	{ handle_extra_goals(Pragma, ExtraGoals, GoalInfo, Args0, Args,
> -				InstMap0, ModeInfo, Goal) },
> +			InstMap0, ModeInfo, Goal) },

Ditto.

> +opt_util__can_instr_branch_away(pragma_c(_, Components, _, _), BranchAway) :-
> +	(
> +		list__member(Component, Components),
> +		Component = pragma_c_raw_code(_)
> +	->
> +		BranchAway = yes
> +	;
> +		BranchAway = no
> +	).

A comment here please?

Also it might be better to have a separate det procedure
can_component_branch_away(Component, Bool).

> +:- pred opt_util__touches_nondet_ctrl_component(pragma_c_component, bool).
> +:- mode opt_util__touches_nondet_ctrl_component(in, out) is det.
> +
> +opt_util__touches_nondet_ctrl_component(pragma_c_inputs(_), no).
> +opt_util__touches_nondet_ctrl_component(pragma_c_outputs(_), no).
> +opt_util__touches_nondet_ctrl_component(pragma_c_raw_code(_), no).
> +opt_util__touches_nondet_ctrl_component(pragma_c_user_code(_, _), yes).

Can you explain why raw_code doesn't touch nondet_ctrl_components
but user_code does?

Also (this is not related to this diff, but still)
it would be helpful if the documentation for opt_util__touches_nondet_ctrl
explained what a "nondet stack control" was.

> -polymorphism__process_goal_expr(pragma_c_code(IsRecursive, C_Code, PredId,
> -		ProcId, ArgVars0, ArgNames0, OrigArgTypes0, ExtraInfo),
> +polymorphism__process_goal_expr(pragma_c_code(IsRecursive, PredId, ProcId,
> +		ArgVars0, ArgInfo0, OrigArgTypes0, PragmaCode),

Another occurrence of `PragmaCode'.

> +% The code we generate for nondet pragma_c_code assumes that this code is
> +% the only thing between the procedure prolog and epilog; such pragma_c_codes
> +% therefore cannot be inlined. The code of the procedure is of one of the
> +% following two forms:
> +%
> +% form 1:

It would be better to name them, e.g.

	"form 1 (shared code):"
	"form 2 (duplicated code):"

> +% <proc entry label and comments>
> +% <mkframe including space for the save struct>

, with redoip set to do_fail.

...
> +%	#define SUCCEED()	goto callsuccesslabel
> +%	#define SUCCEED_LAST()	goto calllastsuccesslabel
> +%	#define FAIL()		fail()

s/SUCCEED/MR_SUCCEED/g
s/FAIL/MR_FAIL/g

> +% form 2:
> +% <proc entry label and comments>
> +% <mkframe including space for the save struct>
> +% <#define MR_ORDINARY_SLOTS>
> +% <--- boundary between prolog and code generated here --->
> +% <set redoip to point to &&xxx_i1>
> +% <code for entry to a disjunction and first disjunct>
> +% {
> +%	<declaration of one local variable for each input and output arg>
> +%	<declaration of one local variable to point to save struct>
> +%	<assignment of input values from registers to local variables>
> +%	<assignment to save struct pointer>
> +%	save_registers(); /* see notes (1) and (2) below */
> +%	#define SUCCEED()	goto callsuccesslabel
> +%	#define SUCCEED_LAST()	goto calllastsuccesslabel
> +%	#define FAIL()		fail()
> +%	{ <the user-written call c code> }
> +%	GOTO_LABEL(xxx_i2)
> +% callsuccesslabel: /* see note (4) below */
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed()
> +% calllastsuccesslabel: /* see note (4) below */
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed_discard()
> +% 	#undef SUCCEED
> +% 	#undef SUCCEED_LAST
> +% 	#undef FAIL
> +% }
> +% Define_label(xxx_i1)
> +% <code for entry to a later disjunct>
> +% {
> +%	<declaration of one local variable for each output arg>
> +%	<declaration of one local variable to point to save struct>
> +%	<assignment to save struct pointer>
> +%	save_registers(); /* see notes (1) and (2) below */
> +%	#define SUCCEED()	goto retrysuccesslabel
> +%	#define SUCCEED_LAST()	goto retrylastsuccesslabel
> +%	#define FAIL()		fail()
> +%	{ <the user-written retry c code> }
> +%	GOTO_LABEL(xxx_i2)
> +% retrysuccesslabel: /* see note (4) below */
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed()
> +% retrylastsuccesslabel: /* see note (4) below */
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed_discard()
> +% 	#undef SUCCEED
> +% 	#undef SUCCEED_LAST
> +% 	#undef FAIL
> +% }
> +% Define_label(xxx_i2)
> +% {
> +%	<declaration of one local variable for each output arg>
> +%	<declaration of one local variable to point to save struct>
> +%	<assignment to save struct pointer>
> +%	#define SUCCEED()	goto sharedsuccesslabel
> +%	#define SUCCEED_LAST()	goto sharedlastsuccesslabel
> +%	#define FAIL()		fail()
> +%	{ <the user-written shared c code> }
> +% sharedsuccesslabel:
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed()
> +% sharedlastsuccesslabel: /* see note (4) below */
> +%	restore_registers(); /* see notes (1) and (3) below */
> +%	<assignment of the output values from local variables to registers>
> +%	succeed_discard()
> +% 	#undef SUCCEED
> +% 	#undef SUCCEED_LAST
> +% 	#undef FAIL
> +% }
> +% <--- boundary between code generated here and epilog --->
> +% <#undef MR_ORDINARY_SLOTS>

This code does not preserve the values of the local variables which hold
the output arguments when jumping to the shared code.

> @@ -83,22 +255,59 @@
>  %	through C back to Mercury.  In that case, we need to
>  %	keep the value of `hp' that was set by the recursive
>  %	invocation of Mercury.  The Mercury calling convention
> +%	guarantees that when calling det or demi code, the values

s/demi/semidet/

> +% (4)	These labels and the code following them can be optimized away

s/can be/can hopefully be/ ;-)

> +:- pred pragma_c_gen__ordinary_pragma_c_code(code_model::in,
> +	may_call_mercury::in, pred_id::in, proc_id::in, list(var)::in,
> +	list(maybe(pair(string, mode)))::in, list(type)::in,
> +	string::in, term__context::in, code_tree::out,
> +	code_info::in, code_info::out) is det.
> +
> +pragma_c_gen__ordinary_pragma_c_code(CodeModel, MayCallMercury,
> +		PredId, ProcId, ArgVars, ArgInfo, OrigArgTypes,
> +		C_Code, Context, Code) -->
>  	% First we need to get a list of input and output arguments
>  	code_info__get_pred_proc_arginfo(PredId, ProcId, ArgInfos),
> -	{ make_c_arg_list(ArgVars, Names, OrigArgTypes, ArgInfos, Args) },
> +	{ make_c_arg_list(ArgVars, ArgInfo, OrigArgTypes, ArgInfos, Args) },

I'm not very happy with the use of the names "ArgInfo" and "ArgInfos"
to mean two quite different things.  How about renaming "ArgInfo"
as "Pragma_C_Arg_Info" or something like that?

> -	( { MayCallMercury = will_not_call_mercury } ->
> -		{ SaveVarsCode = empty }
> -	;
> +	( { MayCallMercury = may_call_mercury } ->
>  		% the C code might call back Mercury code
>  		% which clobbers the succip
>  		code_info__succip_is_used,
> @@ -109,10 +318,11 @@
>  		{ get_c_arg_list_vars(OutArgs, OutArgs1) },
>  		{ set__list_to_set(OutArgs1, OutArgsSet) },
>  		call_gen__save_variables(OutArgsSet, SaveVarsCode)
> +	;
> +		{ SaveVarsCode = empty }
>  	),

Any particular reason for that change?

I personally find

	( condition ->
		short code
	;
		start of long code
		...
		lots of lines
		...
		end of long code
	)

to be a lot easier to read than than

	( condition ->
		start of long code
		...
		lots of lines
		...
		end of long code
	;
		short code
	)

because it reduces the distance between the code
and the condition, avoiding this problem:


	--- top of screen --------------------
		...
		end of long code
	;
		short code	% when does this code get called?
				% you need to scroll up to find out...
	)

> @@ -156,45 +373,333 @@
>  
>  		pragma_acquire_regs(OutArgs, Regs)
>  	),
> -	place_pragma_output_args_in_regs(OutArgs, Regs, Outputs),
> +	place_pragma_output_args_in_regs(OutArgs, Regs, OutputDescs),
>  
> -	( { MayCallMercury = will_not_call_mercury } ->
> -		{ Wrapped_C_Code = C_Code }
> +	{ C_Code_Comp = pragma_c_user_code(Context, C_Code) },
> +	{ MayCallMercury = will_not_call_mercury ->
> +		WrappedComp = [C_Code_Comp]

I think `C_Code_Component' would be clearer here.

> +pragma_c_gen__nondet_pragma_c_code(CodeModel, MayCallMercury,
> +	{ SaveStructDecl = pragma_c_struct_ptr_decl(StructName, "LOCALS") },
> +	{ string__format("\tLOCALS = (struct %s *) (

Hmm, `LOCALS' should have a prefix.  Perhaps the prefix should be `MC_'
rather than `MR_' or `ML_' since it is generated by the Mercury Compiler
rather than being part of the Mercury Runtime or Mercury Library.
Ditto for SUCCEED(), SUCCEED_LAST() and FAIL(), I guess.

> +	code_info__get_maybe_trace_info(MaybeTraceInfo),
> +	( { MaybeTraceInfo = yes(TraceInfo) } ->
> +		trace__generate_event_code(disj([disj(1)]), TraceInfo,
> +			FirstTraceCode),
> +		trace__generate_event_code(disj([disj(2)]), TraceInfo,
> +			LaterTraceCode)

Hmm, the fact that the "LaterTraceCode" can get entered multiple times
may cause the debugger major confusion.  It's not the same as a disjunction.
Perhaps this should have its own trace code?

> +	(
> +		{
> +			Treat = duplicate
> +		;
> +			Treat = automatic,
> +			string__length(Shared, Len),
> +			Len < 1024
> +		}

A comment here please.

Also, I think counting the number of semicolons would be a better heuristic.

> +pragma_c_gen__struct_name(ModuleName, PredName, Arity, ProcId, StructName) :-
> +	proc_id_to_int(ProcId, ProcNum),
> +	string__int_to_string(Arity, ArityStr),
> +	string__int_to_string(ProcNum, ProcNumStr),
> +	string__append_list(["mercury_save__", ModuleName, "__",
> +		PredName, "__", ArityStr, "_", ProcNumStr], StructName).

You need to call llds_out__name_mangle for ModuleName and PredName.

> +	% All the strings in this type are accompanied by the context
> +	% of their appearance in the source code. These contexts are
> +	% used to tell the C compiler where the included C code comes from,
> +	% to allow it to generate error messages that refer to the original
> +	% appearance of the code in the Mercury program.
> +:- type pragma_code

I think the name should be `pragma_c_code', `pragma_c_code_info',
or something like that.

> +	--->	ordinary(		% This is a C definition of a model_det
> +					% or model_semi procedure. (We also
> +					% allow model_non, until everyone has
> +					% had time to adapt to the new way
> +					% of handling model_non pragmas.

Missing ')'.

> +			string,		% The C code of the procedure.
> +			term__context
> +		)
> +	;	nondet(			% This is a C definition of a model_non
> +					% procedure.

> +			string,		% The info saved for the time when
> +			term__context,	% backtracking reenters this procedure
> +					% is stored in a C struct. This arg
> +					% contains the field declarations.

> +			string,		% Gives the code to be executed when
> +			term__context,	% the procedure is called for the first 
> +					% time. This code may access the input
> +					% variables.

> +			string,		% Gives the code to be executed when
> +			term__context,	% control backtracks into the procedure.
> +					% This code may not access the input
> +					% variables.

> +			pragma_shared_code_treatment,
> +					% How should the shared code be
> +					% treated during code generation.

> +			string,		% Shared code that is executed after
> +			term__context	% both the previous code fragments.
> +					% May not access the input variables.
> +		).

I think it would be better if you inserted blank lines at the places
shown, or formatted the comments differently, to show more clearly
the groupings.

Another alternative would be to use

	:- type c_code_fragment == pair(string, term__context).

> +:- type pragma_shared_code_treatment
> +	--->	duplicate
> +	;	share
> +	;	automatic.

There should be a comment here (perhaps just a pointer to documentation
elsewhere).

> --- prog_io_pragma.m	1997/12/22 09:56:18	1.10
> +	    (
> +		C_CodeTerm = term__functor(term__string(C_Code), [], Context)
> +	    ->
> +	        parse_pragma_c_code(ModuleName, MayCallMercury, PredAndVarsTerm,
> +	    	    ordinary(C_Code, Context), VarSet, Result)
> +	    ;
> +		Result = error("invalid `:- pragma c_code' declaration -- expecting either `may_call_mercury' or `will_not_call_mercury', and a string for C code",
> +		    C_CodeTerm)
> +	    )

Hmm, does the error message come out looking OK?
Why does it say it is expecting <this> *and* <that>?
That's not all it is expecting, after all you have to
specify the predicate name and mode too.
Is this the "wrong number of arguments" case?
If so, then I think the error message should say just that.

> +		                Result = error("invalid sixth argument in `:- pragma c_code' declaration -- expecting `shared_code(<code>')",

Shouldn't that message say "... expecting `common_code(<code>)'",
or alternatively "... expecting `shared_code(<code>)' or
`duplicate_code(<code>)' or `common_code(<code>)'"?
If you only list one, list the one which leaves it up to the compiler
to make the space/time trade-off.

> +:- pred parse_pragma_keyword(string, term, string, term__context).
> +:- mode parse_pragma_keyword(in, in, out, out) is semidet.
> +
> +parse_pragma_keyword(ExpectedKeyword, Term, StringArg, StartContext) :-
> +	Term = term__functor(term__atom(ExpectedKeyword), [Arg], _),
> +	Arg = term__functor(term__string(StringArg), [], StartContext).
> +% 	EndContext = term__context(File, EndLine),
> +% 	AddOneIfNewline = lambda([Char::in, Count0::in, Count::out] is det, (
> +% 		( Char = '\n' ->
> +% 			Count is Count0 + 1
> +% 		;
> +% 			Count = Count0
> +% 		)
> +% 	)),
> +% 	string__foldl(AddOneIfNewline, StringArg, 0, LinesInString),
> +% 	StartLine is EndLine - LinesInString - 1,
> +% 	StartContext = term__context(File, StartLine).

I think you can safely delete the commented-out code now.

> +#define	mkpragmaframe(prednm, numslots, structname, redoip)	\

Please document this.

> cvs diff: Diffing tests
> cvs diff: Diffing tests/benchmarks
> cvs diff: Diffing tests/general
> cvs diff: Diffing tests/hard_coded
> cvs diff: Diffing tests/invalid
> cvs diff: Diffing tests/misc_tests
> cvs diff: Diffing tests/term
> cvs diff: Diffing tests/valid
> cvs diff: Diffing tests/warnings

Some test cases would be a good idea.

Here's a few things to test:
	tests/hard_coded:
	- test both `may_call_mercury' and `will_not_call_mercury'
	- test all of `duplicate', `shared', and `common'
	tests/invalid:
	- a nondet-style pragma c_code declaration for a predicate
	  declared with determinism cc_multi
	- both a `pragma inline' declaration and a nondet-style pragma c_code
	  for the same procedure
	- both nondet-style and ordinary-style `pragma c_code' declarations
	  for the same procedure

-- 
Fergus Henderson <fjh at cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3         |     -- the last words of T. S. Garp.



More information about the developers mailing list