[m-dev.] Re: pragma fact_table

Fergus Henderson fjh at cs.mu.oz.au
Sat Feb 22 02:12:10 AEDT 1997


David Matthew OVERTON, you wrote:
> 
> Could you please review the following changes.  The diff of
> fact_table.m is relative to the version you saw last week.  Other
> diffs are relative to the latest version in the CVS repository.

That's a significant improvement.
There's still a bit more work to do, though.

I want to see lots of test cases for this.

The changes so far to llds_out.m are fine; you can commit them now if
you want.  For the other changes, I would like to see another diff
after you've addressed my comments below.  (For fact_table.m I'd like
to see a diff from its state as of this diff.)

> +		% XXX This method of hashing floats may not work cross-compiling
> +		% between architectures that have different floating-point
> +		% representations.  Also, problems may arise if the
> +		% implementation of predicates in the `float' and `math'
> +		% modules changes because the generated C code needs to call
> +		% the same C math functions that these library predicates use.

There is also a potential problem if the compilers optimize these two
versions differently.  For example, on many machines floating point
registers may have more accuracy than floating point values stored
in memory, so that the exact end result of a floating point computation
may in fact depend on exactly how the compiler allocates registers.

The problem of the two hashing codes being inconsistent should be fixed
by avoiding the code duplication and only having one piece of code for
computing float hashes.

Hashing floats is something that is generally useful, and we already
have string__hash_string in library/string.m, so it would make sense to
define this float hashing function in library/float.m.  It is probably
easiest if you use the C code version and define the Mercury version to
call the C version using `pragma c_code'. The C code should be in a
`:- pragma c_code("....")' block in library/float.m.
(The alternative of using the Mercury code version and calling that
from C using `pragma export' is also possible but the other way is
likely to be more efficient.)

The cross-compilation issue is a lot more difficult to solve,
and so you should probably just document (with this XXX in the code,
and with an XXX at the top of the file, and also in the user
documentation) that we do not support cross-compilation for
fact_tables which must be indexed on floats.

On a related point, the actual code for the hash function should be
documented.  It's not clear what the code is trying to accomplish with
all that messing about with abs, log, ceil, and exp; obviously
it is computing a hash value, but it's not clear why the particular
code used is one that would make a good hash value.

> +% XXX Because of the way nondet and multidet code is currently generated, it
> +% will not work if the procedure is inlined.  Until this problem is fixed,
> +% modules containing `pragma fact_table' declarations should be compiled
> +% with the `--no-inlining' option.

You should fix this problem by changing inlining.m so that it doesn't
inline such procedures.

> +	% fact tables should always be compiled with `--no-inlining'.  This
> +	% problem should be fixed when model_non pragma C is implemented
> +	% correctly.

That makes it sound like implementing model_non pragma C will automatically
solve the problem.  Please restate in a way that avoids this ambiguity.

> +fact_table_size(FactTableSize) -->
> +	globals__io_lookup_option(fact_table_max_array_size, OptionData),
> +	{ OptionData = int(FactTableSize0) ->
> +		FactTableSize = FactTableSize0
> +	;
> +		error("fact_table_size: invalid option data")
> +	}.

It would be simpler to just use globals__io_lookup_int_option.

> -			globals__io_lookup_bool_option(verbose, Verbose),
> +			globals__io_lookup_bool_option(very_verbose, Verbose),
>  			( { Verbose = yes } ->
> -				io__format("% Read fact %d\n", [i(NumFacts0)])
> +				io__format("%% Read fact %d\n", [i(NumFacts0)])

The variable name should match the option name: call it `VeryVerbose'.

> +#if TAGBITS >= 2
> +	#define FACT_TABLE_MAKE_TAGGED_INDEX(i,t)    (mkbody(i) | mktag(t))
> +	#define FACT_TABLE_MAKE_TAGGED_POINTER(p,t)  (((Word) p) + mktag(t))

You should use `mkword()' for both of these.

Your code for MAKE_TAGGED_POINTER is not portable
if you use it in the initializers of static constants, as you do,
because it casts a pointer to an integeral type.
Using mkword() will solve this.

>  	Info0 = fact_arg_info(Type, IsInput, _IsOutput),
> -	(
> -		Mode = user_defined_mode(qualified("mercury_builtin", "in"), 
> -			[])
> -	->
> +	( mode_is_input(ModuleInfo, Mode) ->

That's not correct.  As is documented in mode_util.m, `mode_is_input'
succeeds for modes such as `list_skel -> list_skel' where the values
are not fully ground.

You need

	mode_get_insts(ModuleInfo, Mode, InitialInst, _FinalInst),
	inst_is_ground(ModuleInfo, InitialInst)

> -	;
> -		Mode = user_defined_mode(qualified("mercury_builtin", "out"), 
> -			[])
> -	->
> +
> +	; mode_is_output(ModuleInfo, Mode) ->

That is also wrong.  You need

	mode_get_insts(ModuleInfo, Mode, InitialInst, FinalInst),
	inst_is_free(ModuleInfo, InitialInst),
	inst_is_ground(ModuleInfo, FinalInst)

It is probably best to add predicates `mode_is_fully_input'
and `mode_is_fully_output' that do this to mode_util.m.

> +fact_table_mode_type([Mode | Modes], ModuleInfo, ModeType) :-
> +	( mode_is_input(ModuleInfo, Mode) ->
> +		ModeType0 = all_in
> +	; mode_is_output(ModuleInfo, Mode) ->
> +		ModeType0 = all_out

Same bug as above.

> -		Mode = user_defined_mode(SymName, []),
> -		SymName = qualified("mercury_builtin", "in"),
> +		mode_is_input(ModuleInfo, Mode),

Ditto.

> -		globals__io_lookup_bool_option(verbose, Verbose),
> +		globals__io_lookup_bool_option(very_verbose, Verbose),
>  		( { Verbose = yes } ->
> -			io__format("% Writing fact %d\n", [i(FactNum)])
> +			io__format("%% Writing fact %d\n", [i(FactNum)])
>  		;
>  			[]

The variable name should be `VeryVerbose'.

> +	% Write out the closing brace of an array.
> +:- pred write_closing_brace(io__output_stream, io__state, io__state).
> +:- mode write_closing_brace(in, di, uo) is det.
> +
> +write_closing_brace(OutputStream) -->
> +	io__write_string(OutputStream, "};\n\n").
> +
> +	% Write out the declaration of a new data array followed by " = {\n"
> +:- pred write_new_data_array(io__output_stream, string, int,
> +		io__state, io__state).
> +:- mode write_new_data_array(in, in, in, di, uo) is det.
> +
> +write_new_data_array(OutputStream, StructName, FactNum) -->
> +	io__format(OutputStream, "struct %s_struct %s%d[] = {\n", 
> +		[s(StructName), s(StructName), i(FactNum)]).

Opening braces come before closing braces.  It would make more sense,
therefore, to swap the order of those two predicates in your source file.

> -:- mode build_hash_table(in, in, in, in, in, in, in, in, in, in, in, in, in,
> +:- mode build_hash_table(in, in, in, in, in, in, in, in, in, in, in, in, in, in,

Please keep lines to less than 80 columns.

> +calculate_hash_table_size(NumEntries, HashTableSize) -->
> +	globals__io_lookup_option(fact_table_hash_percent_full, OptionData),
> +	{ 
> +		OptionData = int(PercentFull),

Use globals__io_lookup_int_option.

> +		0 < PercentFull,
> +		PercentFull =< 100
> +	->
...
> +	;
> +		error("calculate_hash_table_size: invalid option data")
> +	}.

Passing an invalid option should not cause a software error.
This error should be checked for in `postprocess_options'
in handle_options.m.

>  			(
>  				{ Index = fact(I) },
>  				io__format(
>  				    ", FACT_TABLE_MAKE_TAGGED_INDEX(%d,1), ",
>  				    [i(I)])
>  			;
>  				{ Index = hash_table(I, H) },
>  				io__format(
>  				", FACT_TABLE_MAKE_TAGGED_POINTER(&%s%d,2), ",
>  				    [s(H), i(I)])

s/,1/, 1/
s/,2/, 2/

> +			io__write_string(
> +				"0, FACT_TABLE_MAKE_TAGGED_POINTER(0,0), -1 ")

The second `0' on that line should be `NULL'.
(Style-wise, it's always a good idea to use `NULL' rather than `0'
for null pointer constants.)

> +		io__write_strings([
> +			"Error deleting file `",
> +			FileName,
> +			"':\n  ",
> +			ErrorMessage,
> +			".\n"]),

The error message should include the program name (which you
can obtain via io__progname_base), i.e.

			io__progname_base("mc", ProgName),
			io__write_strings([
				Progname, 
				": error deleting file `",
				FileName,
				"':\n  ",
				ErrorMessage,
				".\n"]),

>  fact_table_generate_c_code(PredName, PragmaVars, ProcID, PrimaryProcID, 
...
> +	{ proc_info_inferred_determinism(ProcInfo, Determinism) },

Should you be using `proc_info_interface_determinism' here?

What happens if the declared determinism doesn't match the inferred
determinism?

> +	{
> +		PredName = unqualified(PredNameString)
> +	;
> +		PredName = qualified(_, PredNameString)
> +	},

You should use unqualify_name from prog_util.m to do that.

But, is it right to discard the module name?
Shouldn't the name be fully module qualified?
What happens if there are two fact tables with the same
predicate name in different modules?

> +	/* Mention arguments %s to stop the compiler giving a warning
> +	**

Use the standard comment layout

	/*
	** ...
	** ...
	*/

not

	/* ...
	** ...
	*/

Also please end sentences with a full stop.

> +			/* initialise current_table to the top level hash table
> +			** for this ProcID
> +			*/

Ditto.

> +	( mode_is_input(ModuleInfo, Mode) ->

Same bug as earlier.

> +:- pred get_c_type_from_mercury_type((type)::in, string::out) is semidet.
>  
> +get_c_type_from_mercury_type(term__functor(term__atom("string"), [], _), 
> +		"String").

Hmm... shouldn't you be using `ConstString' rather than `String' here
(and in various other places)?

>  	(
>  		Type = term__functor(term__atom("string"), [], _)
>  	->
> -		Template = "\t\t%s = (String)r%d;\n"
> +		Template = "\t\t%s = (String)%s;\n"
>  	;
>  		Type = term__functor(term__atom("int"), [], _)
>  	->
> -		Template = "\t\t%s = r%d;\n"
> +		Template = "\t\t%s = %s;\n"
>  	;
>  		Type = term__functor(term__atom("float"), [], _)
>  	->
> -		Template = "\t\t%s = word_to_float(r%d);\n"
> +		Template = "\t\t%s = word_to_float(%s);\n"
>  	;
>  		error("generate_arg_input_code: invalid type")
>  	),

I'm sure I've seen code to handle that sort of thing in export.m;
you should reuse it.

In fact, your code duplicates the functionality of the following
predicates in export.m:

	export.m:				fact_table.m:

	argloc_to_type				get_reg_name
	export__term_to_type_string		get_c_type_from_mercury_type
	convert_type_to_mercury			generate_arg_output_code
	convert_type_from_mercury		generate_arg_input_code

You should do something to avoid all this code duplication.

> +	( mode_is_input(ModuleInfo, Mode) ->

Same bug as earlier.

make_hlds.m:
> +	globals__io_get_args_method(ArgsMethod),
> +	fact_table_generate_c_code(SymName, PragmaVars, ProcID, PrimaryProcID,
> +		ProcInfo, ArgTypes, ArgsMethod, Module0,
> +		C_ProcCode, C_ExtraCode),

It would be simpler to call globals__io_get_args_method from
inside fact_table_generate_c_code; that's one less parameter to pass.

modules.m:
> +		( { FactDeps \= [] } ->
> +			io__write_strings(DepStream, [
> +				" \\\n\t$(", ModuleName, ".fact_tables)\n\n",
> +				"$(", ModuleName, ".fact_tables:%=%.o) : $(",
> +				ModuleName, ".fact_tables)\n\n",
> +				"$(", ModuleName, ".fact_tables:%=%.c) : ",
> +				ModuleName, ".o\n"

That won't work with `--no-assume-gmake'.  Please fix it so that it does.
(It's OK to keep that code, but if so you need to conditionalize it
on the setting of the assume_gmake option.)
I'd like to try to minimize the number of dependencies on GNU Make.

options.m:
> +		;	fact_table_max_array_size
> +				% maximum number of elements in a single 
> +				% fact table data array
> +
> +		;	fact_table_hash_percent_full
> +				% how full the fact table hash tables should
> +				% be allowed to get, given as an integer
> +				% percentage.
> +

You should add documentation to the help message in options.m and to
doc/user_guide.texi; until fact tables are ready for prime time, this
documentation should be disabled by commenting it out (`@c' is the
comment prefix in TexInfo), but it's still better to add it now, even
though it will be commented out.

Cheers,
	Fergus.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3         |     -- the last words of T. S. Garp.



More information about the developers mailing list