[m-rev.] for review: debugging tabling procedures

Zoltan Somogyi zs at cs.mu.OZ.AU
Fri Nov 8 18:31:55 AEDT 2002


On 08-Nov-2002, Fergus Henderson <fjh at cs.mu.OZ.AU> wrote:
> > 	Make tabling pointer variables their natural type.
> 
> Don't you need a similar modification to mlds_to_c.m?

I don't think so. mlds_to_c.m doesn't mention the types of tabling pointers.
ml_code_gen.m says they are mlds__generic_type and ml_unify_gen.m casts them
to a type supplied by the caller. All of that should continue to work.
I will test the diff in hlc.gc though.

The change in the type from MR_Word to MR_TrieNode in the LLDS backend
was to allow type safety when the debugger's data structures refer to tabling
pointers. This is not a concern for the MLDS backend.

> > trace/mercury_trace_util.[ch]:
> > 	Add a new function MR_trace_is_integer that reads in signed integers.
> 
> This function doesn't work for MININT.
> Also it doesn't check for overflow.

It doesn't need to. These are only for debugging tabling; I don't think
I have caused that function to be called by any number greater than 100.
I don't expect to have to exceed this limit in the future either. Other users
may, but the limitation about argument types is more likely to be a problem
in practice than MININT. In fact, the new mdb command is inherently limited,
because once you allow arguments of user-defined types, printing even a single
entry in the call table can generate an arbitrary amount of output. Users
therefore are naturally forced to use reasonably small test cases. Of course
those can be very useful.

> At very least these flaws should be documented in XXX comments.

Done.

> s/k1/(k1)/
> s/k2/(k2)/
> 
> Likewise elsewhere in this file.

Done.

> > +typedef	struct {
> > +	MR_Integer			*MR_ctai_values;
> > +	int				MR_ctai_value_next;
> > +	int				MR_ctai_cur_index;
> > +	MR_Integer			MR_ctai_cur_value;
> > +} MR_Int_Table_Arg_Values;
> 
> Some comments here would be helpful.

Done.

> This coding style -- with use of gotos, fall-through, and "continue" --
> is not the easiest to grok.  Please consider rewriting it so that it
> has a less convoluted control flow.

It implements backtracking across an unbounded number of domains without
using an unbounded number of loops, so there is no way to write it without
*something* being convoluted. However, I have simplified it a bit further
and transferred the remaining convolution to data flow, not control flow,
so there are no gotos or continues.

I have a new diff, but interdiff screws up on it, so I have appended
the nontrivial parts of a diff of the old and new diffs on
mercury_trace_internal.c. I haven't tested the new code yet, but I
will tonight.

> Also, this function is very long.  Consider breaking it into sub-routines.

I already did, to the maximum extent I considered practicable and useful.
The intricate loop fits on one 60-line X windows screen. All the other parts
are reasonably simple code, and their content is tightly tied to this mdb
command, so separating them out would replace the readability problem
associated with long functions with the readability problem associated
with tight coupling across function boundaries.

Zoltan.

--- DIFF.fjh0	Fri Nov  8 18:10:36 2002
+++ DIFF.fjh	Fri Nov  8 18:13:26 2002
@@ -1,10 +1,10 @@
-Index: trace/mercury_trace_internal.c
+Index: mercury_trace_internal.c
 ===================================================================
 RCS file: /home/mercury1/repository/mercury/trace/mercury_trace_internal.c,v
 retrieving revision 1.147
-diff -u -r1.147 mercury_trace_internal.c
---- trace/mercury_trace_internal.c	6 Nov 2002 02:02:38 -0000	1.147
-+++ trace/mercury_trace_internal.c	6 Nov 2002 14:56:09 -0000
+diff -u -b -r1.147 mercury_trace_internal.c
+--- mercury_trace_internal.c	6 Nov 2002 02:02:38 -0000	1.147
++++ mercury_trace_internal.c	8 Nov 2002 07:13:24 -0000
-@@ -257,6 +258,67 @@
+@@ -257,6 +258,112 @@
  	const MR_Make_Completer		MR_cmd_arg_completer;
  } MR_Trace_Command_Info;
  
++/*
++** The following data structures describe the information we have about the
++** input arguments of tabled procedures. We use them to decode the call tables
++** of such procedures.
++**
++** We use one MR_Call_Table_Arg structure for each input argument.
++**
++** The step field specifies what data structure the tabling system uses to
++** implement the trie nodes at the level of the call table corresponding to
++** the relevant argument. At the moment, we support only three values of this
++** field, MR_TABLE_STEP_INT, MR_TABLE_STEP_FLOAT and MR_TABLE_STEP_STRING;
++** each of those implicitly selects the corresponding alternative in the
++** arg_values union.
++**
++** The start_node field specifies the start node of the relevant trie. For the
++** first input argument, this will be the tabling pointer variable for the given
++** procedure. For later input arguments, it will be the trie node you reach
++** after following the current values of the previous arguments through the
++** call table.
++**
++** The MR_{Int,Float,String}_Table_Arg_Values structs have the same fields and
++** the same meanings, differing only in the types of the values they store.
++** Each struct is used for one of two things.
++**
++** 1. To describe a value supplied by the user on the mdb command line.
++**    In this case, the only field that matters is the cur_value field.
++**
++** 2. To describe the set of values you can find in a trie node, the one given
++**    by the start_node field, and to specify which is the current one.
++**    In this case, all the fields matter.
++**
++** The code that manipulates these structures distinguishes between the two
++** uses based on argument number.
++**
++** The values array is managed with the macros in mercury_array_macros.h,
++** so its size is given by the value_next field. The cur_index field gives the
++** index of the current value, while the cur_value field gives the current
++** value itself. (The contents of the cur_value field can be deduced from the
++** contents of the other fields with use 2, but not with use 1.)
++**
++** The valid field in the MR_Call_Table_Arg structure gives the validity
++** of the values subfield of its arg_values field; if it is false, then the
++** array is logically considered empty.
++*/
++
 +typedef	struct {
 +	MR_Integer			*MR_ctai_values;
 +	int				MR_ctai_value_next;
@@ -44,6 +89,13 @@
 +	MR_String_Table_Arg_Values	MR_cta_values_string;
 +} MR_Table_Arg_Values;
 +
++typedef struct {
++	MR_Table_Trie_Step		MR_cta_step;
++	MR_TrieNode			MR_cta_start_node;
++	MR_bool				MR_cta_valid;
++	MR_Table_Arg_Values		MR_cta_arg_values;
++} MR_Call_Table_Arg;
++
 +#define	MR_cta_int_values		MR_cta_arg_values.MR_cta_values_int.\
 +					MR_ctai_values
 +#define	MR_cta_int_value_next		MR_cta_arg_values.MR_cta_values_int.\
@@ -71,17 +123,10 @@
 +#define	MR_cta_string_cur_value		MR_cta_arg_values.MR_cta_values_string.\
 +					MR_ctas_cur_value
 +
-+typedef struct {
-+	MR_Table_Trie_Step		MR_cta_step;
-+	MR_TrieNode			MR_cta_start_node;
-+	MR_bool				MR_cta_valid;
-+	MR_Table_Arg_Values		MR_cta_arg_values;
-+} MR_Call_Table_Arg;
-+
  static	void	MR_trace_internal_ensure_init(void);
  static	MR_bool	MR_trace_internal_create_mdb_window(void);
  static	void	MR_trace_internal_init_from_env(void);
-@@ -3417,6 +3506,666 @@
+@@ -3417,6 +3551,739 @@
  }
  
  static MR_Next
@@ -538,6 +583,11 @@
 +		call_table_args[cur_arg].MR_cta_valid = MR_FALSE;
 +	}
 +
++	/*
++	** Set up the values of the input arguments supplied on the command
++	** line, to enable us to print them out in each call table entry.
++	*/
++
 +	for (cur_arg = 0; cur_arg < word_count; cur_arg++) {
 +		MR_bool	success;
 +
@@ -579,12 +629,23 @@
 +	}
 +
 +	if (word_count == num_inputs) {
++		/*
++		** The user specified values for all the input arguments,
++		** so what we print is a single entry, not a table of entries,
++		** and we don't need to loop over all the entries.
++		*/
++
 +		MR_trace_cmd_table_print_tip(proc, num_inputs,
 +			call_table_args, table_cur);
 +		MR_GC_free(call_table_args);
 +		return KEEP_INTERACTING;
 +	}
 +
++	/*
++	** The user left the values of some input arguments unspecified,
++	** so we print a table of entries. Here we print the header.
++	*/
++
 +	switch (MR_sle_eval_method(proc)) {
 +		case MR_EVAL_METHOD_LOOP_CHECK:
 +			fprintf(MR_mdb_out, "loopcheck table for ");
@@ -608,24 +669,61 @@
 +			MR_fatal_error("MR_trace_cmd_table: bad eval method");
 +	}
 +
++	/*
++	** This loop prints the entries in the table.
++	**
++	** If we knew in advance that the user left (say) two input argument
++	** positions unspecified, we could use a loop structure such as:
++	**
++	** 	for value1 in <values in the trie at node start_node[0]>
++	** 		cur_value[1] = value1
++	** 		start_node[1] = follow value1 in start_node[0]
++	** 		for value2 in <values in the trie at node start_node[1]>
++	** 			cur_value[2] = value2
++	** 			start_node[2] = follow value2 in start_node[1]
++	** 			print <fixed args>, cur_value[1], cur_value[2]
++	** 		end for
++	** 	end for
++	**
++	** However, we don't know in advance how many input arguments the user
++	** left unspecified. We therefore simulate the above with a single
++	** loop, which can function as any one of the above nested loops.
++	**
++	** The value of cur_arg controls which one it is simulating at any
++	** given time. Initially, cur_arg grows as we enter each of the above
++	** loops one after another, at each stage recording the set of values
++	** in the current trie node in the values array of the relevant
++	** argument.
++	**
++	** We number the input arguments from 0 to num_inputs-1. When cur_arg
++	** becomes equal to num_inputs, this means that we have values for all
++	** the input arguments, so we print the corresponding call table entry.
++	** We then initiate backtracking: we decrement cur_arg to get the next
++	** value of the last argument. We also do this whenever we run out of
++	** values in any trie.
++	**
++	** We step when we are about to backtrack out of the outermost loop.
++	*/
++
 +	cur_arg = word_count;
 +	num_tips = 0;
 +	for (;;) {
-+		MR_bool	goto_prev_arg;
++		MR_bool	no_more;
++		MR_bool	start_backtrack;
 +
 +		switch (call_table_args[cur_arg].MR_cta_step) {
 +			case MR_TABLE_STEP_INT:
-+				goto_prev_arg = MR_update_int_table_arg_slot(
++				no_more = MR_update_int_table_arg_slot(
 +					&table_cur, &call_table_args[cur_arg]);
 +				break;
 +
 +			case MR_TABLE_STEP_FLOAT:
-+				goto_prev_arg = MR_update_float_table_arg_slot(
++				no_more = MR_update_float_table_arg_slot(
 +					&table_cur, &call_table_args[cur_arg]);
 +				break;
 +
 +			case MR_TABLE_STEP_STRING:
-+				goto_prev_arg = MR_update_string_table_arg_slot(
++				no_more = MR_update_string_table_arg_slot(
 +					&table_cur, &call_table_args[cur_arg]);
 +				break;
 +
@@ -635,31 +733,39 @@
 +					"after check");
 +		}
 +
-+		if (goto_prev_arg) {
-+			goto prev_arg;
-+		}
-+
-+		cur_arg++;
++		if (no_more) {
++			/*
++			** There aren't any more values in the current trie
++			** of input argument cur_arg.
++			*/
 +
-+		if (cur_arg >= num_inputs) {
-+			MR_trace_cmd_table_print_tip(proc, num_inputs,
-+				call_table_args, table_cur);
-+			num_tips++;
-+			goto prev_arg_last;
-+		}
-+
-+		continue;
-+
-+	prev_arg:
-+		call_table_args[cur_arg].MR_cta_valid = MR_FALSE;
-+		/* fall through */
-+
-+	prev_arg_last:
-+		cur_arg--;
-+		table_cur = call_table_args[cur_arg].MR_cta_start_node;
++			start_backtrack = MR_TRUE;
++		} else {
++			/*
++			** There is at least one more value in the current trie
++			** of input argument cur_arg, so go on to the next trie
++			** (if there is one).
++			*/
++
++			cur_arg++;
++
++			if (cur_arg >= num_inputs) {
++				MR_trace_cmd_table_print_tip(proc, num_inputs,
++					call_table_args, table_cur);
++				num_tips++;
++				start_backtrack = MR_TRUE;
++			} else {
++				start_backtrack = MR_FALSE;
++			}
++		}
++
++		if (start_backtrack) {
++			cur_arg--;
++			table_cur = call_table_args[cur_arg].MR_cta_start_node;
 +
-+		if (cur_arg < word_count) {
-+			break;
++			if (cur_arg < word_count) {
++				break;
++			}
 +		}
 +	}
 +
@@ -782,6 +888,8 @@
 +		if (! MR_int_hash_contents(*table_cur_ptr,
 +			&values, &value_next))
 +		{
++			/* there are no values in this trie node */
++			call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +			return MR_TRUE;
 +		}
 +
@@ -795,6 +903,8 @@
 +	if (call_table_arg_ptr->MR_cta_int_cur_index
 +		>= call_table_arg_ptr->MR_cta_int_value_next)
 +	{
++		/* we have already returned all the values in this trie node */
++		call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +		return MR_TRUE;
 +	}
 +
@@ -830,6 +940,8 @@
 +		if (! MR_float_hash_contents(*table_cur_ptr,
 +			&values, &value_next))
 +		{
++			/* there are no values in this trie node */
++			call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +			return MR_TRUE;
 +		}
 +
@@ -843,6 +955,8 @@
 +	if (call_table_arg_ptr->MR_cta_float_cur_index
 +		>= call_table_arg_ptr->MR_cta_float_value_next)
 +	{
++		/* we have already returned all the values in this trie node */
++		call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +		return MR_TRUE;
 +	}
 +
@@ -878,6 +992,8 @@
 +		if (! MR_string_hash_contents(*table_cur_ptr,
 +			&values, &value_next))
 +		{
++			/* there are no values in this trie node */
++			call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +			return MR_TRUE;
 +		}
 +
@@ -891,6 +1007,8 @@
 +	if (call_table_arg_ptr->MR_cta_string_cur_index
 +		>= call_table_arg_ptr->MR_cta_string_value_next)
 +	{
++		/* we have already returned all the values in this trie node */
++		call_table_arg_ptr->MR_cta_valid = MR_FALSE;
 +		return MR_TRUE;
 +	}
 +
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list