[m-dev.] diff: fix setjmp() bug in MLDS->C back-end

Fergus Henderson fjh at cs.mu.OZ.AU
Sun Dec 10 05:52:31 AEDT 2000


This one fixes a long-standing bug in the MLDS->C back-end.
I'll commit this one as soon as it passes a bootcheck.

----------

Estimated hours taken: 8

Fix a bug with the handling of commits in the high-level C back-end.

It was implementing commits via calls to setjmp()/longjmp(),
but ANSI/ISO C says that longjmp() is allowed to clobber the values
of any non-volatile local variables in the function that called setjmp()
which have been modified between the setjmp() and the longjmp().

The fix is that whenever we generate a commit, we put it in its own
nested function, with the local variables (e.g. `succeeded',
plus any outputs from the goal that we're committing over)
remaining in the containing function.  This ensures that
none of the variables which get modified between the setjmp()
and the longjmp() and which get referenced after the longjmp()
are local variables in the function containing the setjmp().

[The obvious alternative of declaring the local variables in the
function containing setjmp() as `volatile' doesn't work, since
the assignments to those output variables may be deep in some
function called indirectly from the goal that we're committing
across, and assigning to a volatile-qualified variable via a
non-volatile pointer is undefined behaviour.  The only way to
make it work would be to be to declare *every* output argument
that we pass by reference as `volatile T *'.  But that would
impose distributed fat and would make interoperability difficult.]

compiler/options.m:
	Add a new option `--put-commits-in-own-function'.

compiler/handle_options.m:
	If the target is high-level C, set the
	`--put-commits-in-own-function' option.

compiler/ml_code_util.m:
	Add a routine for accessing the new option.

compiler/ml_code_gen.m:
	If the new option is set, generate each commit in its own
	nested function.

tests/hard_coded/Mmakefile:
tests/hard_coded/setjmp_test.m:
tests/hard_coded/setjmp_test.exp:
	A regression test.
	Previous versions of the compiler generated code for this test
	case that did the wrong thing on *some* systems.

Workspace: /home/pgrad/fjh/ws/hg3
Index: compiler/handle_options.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/handle_options.m,v
retrieving revision 1.97
diff -u -d -r1.97 handle_options.m
--- compiler/handle_options.m	2000/11/17 17:47:12	1.97
+++ compiler/handle_options.m	2000/12/09 17:12:40
@@ -294,6 +294,15 @@
 		[]
 	),
 
+	% Generating high-level C code requires putting each commit
+	% in its own function, to avoid problems with setjmp() and
+	% non-volatile local variables.
+	( { Target = c } ->
+		option_implies(highlevel_code, put_commit_in_own_func, bool(yes))
+	;
+		[]
+	),
+
 	% --high-level-code disables the use of low-level gcc extensions
 	option_implies(highlevel_code, gcc_non_local_gotos, bool(no)),
 	option_implies(highlevel_code, gcc_global_registers, bool(no)),
Index: compiler/ml_code_gen.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/ml_code_gen.m,v
retrieving revision 1.68
diff -u -d -r1.68 ml_code_gen.m
--- compiler/ml_code_gen.m	2000/11/23 04:32:43	1.68
+++ compiler/ml_code_gen.m	2000/12/09 18:42:23
@@ -659,12 +659,9 @@
 % BUGS:
 %	- XXX parameter passing problem for abstract equivalence types
 %         that are defined as float (or anything which doesn't map to `Word')
-%	- XXX setjmp() and volatile: local variables in functions that
-%	      call setjmp() need to be declared volatile
-%	- XXX problem with unboxed float on DEC Alphas.
 %
 % TODO:
-%	- XXX define compare & unify preds for array and RTTI types
+%	- XXX define compare & unify preds RTTI types
 %	- XXX need to generate correct layout information for closures
 %	      so that tests/hard_coded/copy_pred works.
 %	- XXX fix ANSI/ISO C conformance of the generated code (i.e. port to lcc)
@@ -682,12 +679,19 @@
 %	- support accurate GC
 %
 % POTENTIAL EFFICIENCY IMPROVEMENTS:
-%	- generate better code for switches
-%	- allow inlining of `pragma c_code' goals (see inlining.m)
+%	- optimize unboxed float on DEC Alphas.
+%	- generate better code for switches:
+%		- optimize switches so that the recursive case comes first
+%		  (see switch_gen.m).
+%		- apply the reverse tag test optimization
+%		  for types with two functors (see unify_gen.m)
+%		- binary search switches
+%		- lookup switches
 %	- generate local declarations for the `succeeded' variable;
 %	  this would help in nondet code, because it would avoid
 %	  the need to access the outermost function's `succeeded'
 %	  variable via the environment pointer
+%	  (be careful about the interaction with setjmp(), though)
 
 %-----------------------------------------------------------------------------%
 
@@ -1479,14 +1483,22 @@
 		%		<succeeded = Goal>
 		% 	===>
 		%		bool succeeded;
+		%	#ifdef NONDET_COPY_OUT
+		%		<local var decls>
+		%	#endif
+		%	#ifdef PUT_COMMIT_IN_OWN_FUNC
+		%	    /*
+		%	    ** to avoid problems with setjmp() and non-volatile
+		%	    ** local variables, we need to put the call to
+		%	    ** setjmp() in its own nested function
+		%	    */
+		%	    void commit_func()
+		%	    {
+		%       #endif
 		%		MR_COMMIT_TYPE ref;
 		%		void success() {
-		%			succeeded = TRUE;
 		%			MR_DO_COMMIT(ref);
 		%		}
-		%	#ifdef NONDET_COPY_OUT
-		%		<local var decls>
-		%	#endif
 		%		MR_TRY_COMMIT(ref, {
 		%			<Goal && success()>
 		%			succeeded = FALSE;
@@ -1496,6 +1508,10 @@
 		%	#endif
 		%			succeeded = TRUE;
 		%		})
+		%	#ifdef PUT_COMMIT_IN_OWN_FUNC
+		%	    }
+		%	    commit_func();
+		%       #endif
 
 		ml_gen_maybe_make_locals_for_output_args(GoalInfo,
 			LocalVarDecls, CopyLocalsToOutputArgs,
@@ -1540,10 +1556,12 @@
 				[SetSuccessTrue]), Context)) },
 		{ TryCommitStatement = mlds__statement(TryCommitStmt,
 			MLDS_Context) },
-
-		{ MLDS_Decls = list__append([CommitRefDecl,
-			SuccessFunc | LocalVarDecls], GoalStaticDecls) },
-		{ MLDS_Statements = [TryCommitStatement] },
+		{ CommitFuncLocalDecls = [CommitRefDecl, SuccessFunc |
+			GoalStaticDecls] },
+		maybe_put_commit_in_own_func(CommitFuncLocalDecls,
+			[TryCommitStatement], Context,
+			CommitFuncDecls, MLDS_Statements),
+		{ MLDS_Decls = LocalVarDecls ++ CommitFuncDecls },
 
 		ml_gen_info_set_var_lvals(OrigVarLvalMap)
 
@@ -1552,19 +1570,33 @@
 		%	model_non in det context: (using try_commit/do_commit)
 		%		<do Goal>
 		%	===>
+		%	#ifdef NONDET_COPY_OUT
+		%		<local var decls>
+		%	#endif
+		%	#ifdef PUT_COMMIT_IN_NESTED_FUNC
+		%	    /*
+		%	    ** to avoid problems with setjmp() and non-volatile
+		%	    ** local variables, we need to put the call to
+		%	    ** setjmp() in its own nested functions
+		%	    */
+		%	    void commit_func()
+		%	    {
+		%       #endif
 		%		MR_COMMIT_TYPE ref;
 		%		void success() {
 		%			MR_DO_COMMIT(ref);
 		%		}
-		%	#ifdef NONDET_COPY_OUT
-		%		<local var decls>
-		%	#endif
 		%		MR_TRY_COMMIT(ref, {
+		%			<Goal && success()>
+		%		}, {
 		%	#ifdef NONDET_COPY_OUT
 		%			<copy local vars to output args>
 		%	#endif
-		%			<Goal && success()>
-		%		}, {})
+		%		})
+		%	#ifdef PUT_COMMIT_IN_NESTED_FUNC
+		%	    }
+		%	    commit_func();
+		%       #endif
 
 		ml_gen_maybe_make_locals_for_output_args(GoalInfo,
 			LocalVarDecls, CopyLocalsToOutputArgs,
@@ -1605,15 +1637,99 @@
 			ml_gen_block([], CopyLocalsToOutputArgs, Context)) },
 		{ TryCommitStatement = mlds__statement(TryCommitStmt,
 			MLDS_Context) },
-
-		{ MLDS_Decls = list__append([CommitRefDecl,
-			SuccessFunc | LocalVarDecls], GoalStaticDecls) },
-		{ MLDS_Statements = [TryCommitStatement] },
+		{ CommitFuncLocalDecls = [CommitRefDecl, SuccessFunc |
+			GoalStaticDecls] },
+		maybe_put_commit_in_own_func(CommitFuncLocalDecls,
+			[TryCommitStatement], Context,
+			CommitFuncDecls, MLDS_Statements),
+		{ MLDS_Decls = LocalVarDecls ++ CommitFuncDecls },
 
 		ml_gen_info_set_var_lvals(OrigVarLvalMap)
 	;
 		% no commit required
 		ml_gen_goal(CodeModel, Goal, MLDS_Decls, MLDS_Statements)
+	).
+
+	% maybe_put_commit_in_own_func(Defns0, Stmts0, Defns, Stmts):
+	% if the --put-commit-in-own-func option is set, put
+	% the commit in its own function.  This is needed for
+	% the high-level C back-end, to handle problems with
+	% setjmp()/longjmp() clobbering non-volatile local variables.
+	%
+	% Detailed explanation:
+	%   For the high-level C back-end, we implement commits using
+	%   setjmp()/longjmp().  Unfortunately for us, ANSI/ISO C says
+	%   that longjmp() is allowed to clobber the values of any
+	%   non-volatile local variables in the function that called
+	%   setjmp() which have been modified between the setjmp()
+	%   and the longjmp().
+	%
+	%   To avoid this, whenever we generate a commit, we put
+	%   it in its own nested function, with the local variables
+	%   (e.g. `succeeded', plus any outputs from the goal that
+	%   we're committing over) remaining in the containing function.
+	%   This ensures that none of the variables which get modified
+	%   between the setjmp() and the longjmp() and which get
+	%   referenced after the longjmp() are local variables in the
+	%   function containing the setjmp().
+	%
+	%   [The obvious alternative of declaring the local variables in
+	%   the function containing setjmp() as `volatile' doesn't work,
+	%   since the assignments to those output variables may be deep
+	%   in some function called indirectly from the goal that we're
+	%   committing across, and assigning to a volatile-qualified
+	%   variable via a non-volatile pointer is undefined behaviour.
+	%   The only way to make it work would be to be to declare
+	%   *every* output argument that we pass by reference as
+	%   `volatile T *'.  But that would impose distributed fat and
+	%   would make interoperability difficult.]
+	%
+:- pred maybe_put_commit_in_own_func(mlds__defns, mlds__statements,
+		prog_context, mlds__defns, mlds__statements,
+		ml_gen_info, ml_gen_info).
+:- mode maybe_put_commit_in_own_func(in, in, in, out, out, in, out) is det.
+
+maybe_put_commit_in_own_func(CommitFuncLocalDecls, TryCommitStatements,
+		Context, MLDS_Decls, MLDS_Statements) -->
+	ml_gen_info_put_commit_in_own_func(PutCommitInOwnFunc),
+	( { PutCommitInOwnFunc = yes } ->
+		%
+		% Generate the `void commit_func() { ... }' wrapper
+		% around the main body that we generated above
+		%
+		ml_gen_new_func_label(no, CommitFuncLabel, 
+			CommitFuncLabelRval),
+		/* push nesting level */
+		{ CommitFuncBody = ml_gen_block(CommitFuncLocalDecls,
+			TryCommitStatements, Context) },
+		/* pop nesting level */
+		ml_gen_nondet_label_func(CommitFuncLabel, Context,
+			CommitFuncBody, CommitFunc),
+		%
+		% Generate the call to `commit_func();'
+		%
+		ml_gen_info_use_gcc_nested_functions(UseNestedFuncs),
+		( { UseNestedFuncs = yes } ->
+			{ ArgRvals = [] },
+			{ ArgTypes = [] }
+		;
+			ml_get_env_ptr(EnvPtrRval),
+			{ ArgRvals = [EnvPtrRval] },
+			{ ArgTypes = [mlds__generic_env_ptr_type] }
+		),
+		{ RetTypes = [] },
+		{ Signature = mlds__func_signature(ArgTypes, RetTypes) },
+		{ CallOrTailcall = call },
+		{ CallStmt = call(Signature, CommitFuncLabelRval, no,
+			ArgRvals, [], CallOrTailcall) },
+		{ CallStatement = mlds__statement(CallStmt,
+			mlds__make_context(Context)) },
+		% Package it all up
+		{ MLDS_Statements = [CallStatement] },
+		{ MLDS_Decls = [CommitFunc] }
+	;
+		{ MLDS_Statements = TryCommitStatements },
+		{ MLDS_Decls = CommitFuncLocalDecls }
 	).
 
 	%
Index: compiler/ml_code_util.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/ml_code_util.m,v
retrieving revision 1.31
diff -u -d -r1.31 ml_code_util.m
--- compiler/ml_code_util.m	2000/11/23 04:32:44	1.31
+++ compiler/ml_code_util.m	2000/12/09 17:14:11
@@ -490,9 +490,13 @@
 :- pred ml_gen_info_get_globals(globals, ml_gen_info, ml_gen_info).
 :- mode ml_gen_info_get_globals(out, in, out) is det.
 
+	% lookup the --gcc-nested-functions option
 :- pred ml_gen_info_use_gcc_nested_functions(bool, ml_gen_info, ml_gen_info).
 :- mode ml_gen_info_use_gcc_nested_functions(out, in, out) is det.
 
+	% lookup the --put-commit-in-nested-func option
+:- pred ml_gen_info_put_commit_in_own_func(bool, ml_gen_info, ml_gen_info).
+:- mode ml_gen_info_put_commit_in_own_func(out, in, out) is det.
 
 	% Generate a new label number for use in label statements.
 	% This is used to give unique names to the case labels generated
@@ -1813,6 +1817,11 @@
 	ml_gen_info_get_globals(Globals),
 	{ globals__lookup_bool_option(Globals, gcc_nested_functions,
 		UseNestedFuncs) }.
+
+ml_gen_info_put_commit_in_own_func(PutCommitInNestedFunc) -->
+	ml_gen_info_get_globals(Globals),
+	{ globals__lookup_bool_option(Globals, put_commit_in_own_func,
+		PutCommitInNestedFunc) }.
 
 ml_gen_info_get_globals(Globals) -->
 	=(Info),
Index: compiler/options.m
===================================================================
RCS file: /home/mercury1/repository/mercury/compiler/options.m,v
retrieving revision 1.303
diff -u -d -r1.303 options.m
--- compiler/options.m	2000/12/08 06:50:14	1.303
+++ compiler/options.m	2000/12/09 16:37:47
@@ -174,6 +174,7 @@
 		;	gcc_nested_functions
 		;	det_copy_out
 		;	nondet_copy_out
+		;	put_commit_in_own_func
 		;	unboxed_float
 		;       unboxed_enums
 		;       unboxed_no_tag_types
@@ -600,6 +601,7 @@
 	gcc_nested_functions	-	bool(no),
 	det_copy_out		-	bool(no),
 	nondet_copy_out		-	bool(no),
+	put_commit_in_own_func	-	bool(no),
 	unboxed_float           -       bool(no),
 	unboxed_enums           -       bool(yes),
 	unboxed_no_tag_types    -       bool(yes)
@@ -991,6 +993,7 @@
 long_option("gcc-nested-functions",	gcc_nested_functions).
 long_option("det-copy-out",		det_copy_out).
 long_option("nondet-copy-out",		nondet_copy_out).
+long_option("put-commit-in-own-func",	put_commit_in_own_func).
 long_option("unboxed-float",		unboxed_float).
 long_option("unboxed-enums",		unboxed_enums).
 long_option("unboxed-no-tag-types",	unboxed_no_tag_types).
@@ -1934,6 +1937,16 @@
 %		"\tSpecify whether to handle output arguments for nondet",
 %		"\tprocedures using pass-by-value rather than pass-by-reference.",
 %		"\tThis option is ignored if the `--high-level-code' option is not enabled.",
+% The --put-commit-in-own-func option is not documented because
+% it is enabled automatically (by handle_options) in the situations
+% where it is needed; the user should never need to set it.
+%		"--put-commit-in-own-func",
+%		"\tPut each commit in its own C function.",
+%		"\tThis option only affects the MLDS back-ends.",
+%		"\tIt is needed for the high-level C back-end,",
+%		"\twhere commits are implemented via setjmp()/longjmp(),",
+%		"\tsince longjmp() may clobber any non-volatile local vars",
+%		"\tin the function that called setjmp().",
 		"--gc {none, conservative, accurate}",
 		"--garbage-collection {none, conservative, accurate}",
 		"\t\t\t\t(`.gc' grades use `--gc conservative',",
Index: tests/hard_coded/Mmakefile
===================================================================
RCS file: /home/mercury1/repository/tests/hard_coded/Mmakefile,v
retrieving revision 1.104
diff -u -d -r1.104 Mmakefile
--- tests/hard_coded/Mmakefile	2000/11/12 05:51:14	1.104
+++ tests/hard_coded/Mmakefile	2000/12/09 18:34:27
@@ -99,6 +99,7 @@
 	rnd \
 	rtc_bug \
 	rtti_strings \
+	setjmp_test \
 	shift_test \
 	solve_quadratic \
 	space \
Index: tests/hard_coded/setjmp_test.exp
===================================================================
RCS file: setjmp_test.exp
diff -N setjmp_test.exp
--- /dev/null	Thu Mar 30 14:06:13 2000
+++ setjmp_test.exp	Sun Dec 10 05:34:13 2000
@@ -0,0 +1 @@
+X = 2
Index: tests/hard_coded/setjmp_test.m
===================================================================
RCS file: setjmp_test.m
diff -N setjmp_test.m
--- /dev/null	Thu Mar 30 14:06:13 2000
+++ setjmp_test.m	Sun Dec 10 05:33:56 2000
@@ -0,0 +1,29 @@
+% This is a regression test:
+% the compiler used to generate code for this in the hlc.gc grade
+% which had undefined behaviour according to ANSI/ISO C,
+% and which would in practice fail on some systems
+% (e.g. alpha*-dec-osf3.2 with egcs-1.1.2).
+
+:- module setjmp_test.
+:- interface.
+:- import_module io.
+
+:- pred main(io__state::di, io__state::uo) is cc_multi.
+
+:- implementation.
+
+:- import_module std_util.
+
+main -->
+	(if { p(X) } then
+		print("X = "), print(X), nl
+	;
+		print("no")
+	).
+
+:- pragma inline(p/1).
+:- pred p(int).
+:- mode p(out) is nondet.
+p(1) :- semidet_fail.
+p(2) :- semidet_succeed.
+p(3) :- semidet_fail.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list