[m-rev.] for review: dd_speedtest

Ian MacLarty maclarty at cs.mu.OZ.AU
Thu Aug 4 15:03:38 AEST 2005


On Thu, 4 Aug 2005, Zoltan Somogyi wrote:

> On 03-Aug-2005, Ian MacLarty <maclarty at cs.mu.OZ.AU> wrote:
> > -			help_system		:: help.system
> > +			help_system		:: help.system,
> > +
> > +				% If this following flag is set to yes then
> > +				% user responses will be simulated and will
> > +				% always be `no', except when confirmining a
> > +				% bug in which case the response will be `yes'.
> > +			testing			:: bool
> >  		).
>
> s/confirmining/confirming/
>

Fixed.

> > Index: doc/user_guide.texi
> > ===================================================================
> > RCS file: /home/mercury1/repository/mercury/doc/user_guide.texi,v
> > retrieving revision 1.448
> > diff -u -b -r1.448 user_guide.texi
> > --- doc/user_guide.texi	1 Aug 2005 02:40:05 -0000	1.448
> > +++ doc/user_guide.texi	2 Aug 2005 14:04:56 -0000
> > @@ -3352,17 +3352,31 @@
> >  @sp 1
> >  @table @code
> >  @item dd [-r] [-n@var{nodes}] [-s@var{search-mode}]
> > -@c @item dd [--assume-all-io-is-tabled] [-d@var{depth}]
> > +@c @item dd [--assume-all-io-is-tabled] [-d@var{depth}] [-t]
> > +@c       [--debug [filename]]
> >  @c The --assume-all-io-is-tabled option is for developers only. Specifying it
> >  @c makes an assertion, and if the assertion is incorrect, the resulting
> >  @c behaviour would be hard for non-developers to understand. The option is
> >  @c therefore deliberately not documented.
> > +@c @sp 1
> >  @c The value of the @samp{-d} or @samp{--depth} option determines
> >  @c how much of the annotated trace to build initially.  Subsequent runs
> >  @c will try to add @var{nodes} events to the annotated trace, but initially
> >  @c there is not enough information available to do this.  We do not document
> >  @c this option since it requires an understanding of the internal workings of
> >  @c the declarative debugger.
> > +@c @sp 1
> > +@c The @samp{-t} or @samp{--test} option causes the declarative debugger
> > +@c to simulate a user who answers `no' to all questions, except for
> > +@c `Is this a bug?' questions to which the simulated user answers `yes'.
>
> You should add a sentence here about the intended use.
>

Added:
@c This is useful for benchmarking the declarative debugger.

> > +@c The @samp{--debug} option causes events generated by the declarative
> > +@c debugger to become visible.  This allows the declarative debugger to be
> > +@c debugged.
> > +@c If a filename is provided, the front end of the debugger is not called
> > +@c at all.  Instead a representation of the debugging tree is dumped to
> > +@c the file.
> > +@c @sp 1
> >  Starts declarative debugging using the current event as the initial symptom.
>
> > -@c If a filename is provided, the front end of the debugger is not called
> > -@c at all.  Instead a representation of the debugging tree is dumped to
> > -@c the file, which may help diagnose problems in the debugger itself.
>
> Is there some reason why the second commented out sentence above is no longer
> applicable?
>

Only the second part of the second sentence was removed (i.e. the "which may
help ..." part), because it seemed redundant.

> > +# This script will append data to dd.stats and dd.stdout, so they should be
> > +# deleted first if this behaviour is not desired.
>
> In some of my benchmarking scripts, I handle this problem with code like this:
>
> if test -f TIMES
> then
> 	ci -l TIMES < /dev/null
> 	/bin/rm -f TIMES
> fi
>
> You could do the same for dd.stats and dd.stdout. You don't lose old info,
> but there is no clutter either.
>

Sometimes I want the output to be appended to the same file though, for
example when I run the test several times with different dd options and I
want the results all in the same table.
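
One way to support both habits would be to make the clean-up optional.
The following is only a sketch: the helper `rotate_output` and the
variable `fresh` are hypothetical (neither is in the posted script), and
it renames the old file rather than checking it in with `ci -l`.

```shell
#!/bin/sh
# Hypothetical helper, not part of the posted dd_speedtest script.
# Move an existing output file aside (name.1, name.2, ...) so that a new
# run starts with an empty file without losing the old data.
rotate_output() {
	file="$1"
	if test -f "$file"; then
		n=1
		while test -e "$file.$n"; do
			n=`expr "$n" + 1`
		done
		mv "$file" "$file.$n"
	fi
}

# "fresh" is an assumed variable that a new option would set to "yes".
# By default nothing is moved, so output keeps being appended as now.
if test "${fresh:-no}" = "yes"; then
	rotate_output dd.stats
	rotate_output dd.stdout
fi
```

With `fresh` unset, several runs with different dd options still end up
in the same dd.stats, which is the case described above.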

> > +usage="Usage: dd_speedtest -c cmd [-n num_tests] [-d dd_options]"
> > +cmd=""
> > +limit=6
> > +ddopts="-s divide_and_query -n 50000 -d 1"
> > +
> > +while test $# -gt 0
> > +do
> > +	case $1 in
> > +
> > +	-c|--cmd)
> > +		cmd="$2" ; shift ;;
> > +
> > +	-d|--ddopts)
> > +		ddopts="$2" ; shift ;;
> > +
> > +	-n)
> > +		limit="$2" ; shift ;;
> > +
> > +	-*)
> > +		echo "$0: unknown option \`$1'" 2>&1
> > +		echo $usage
> > +		exit 1 ;;
> > +
> > +	*)	break ;;
> > +	esac
> > +	shift
> > +done
>
> This should probably use getopts. See slide set 1 in the 252 lecture notes.
>

Okay.  See the interdiff at the end.

> > Index: tools/extract_dd_stats.awk
> > ===================================================================
> > RCS file: tools/extract_dd_stats.awk
> > diff -N tools/extract_dd_stats.awk
> > --- /dev/null	1 Jan 1970 00:00:00 -0000
> > +++ tools/extract_dd_stats.awk	3 Aug 2005 13:04:07 -0000
> ...
> > @@ -0,0 +1,126 @@
> > +#
> > +# To run this script do:
> > +# awk -f extract_dd_stats.awk dd.stats
>
> Instead of this, make a file called extract_dd_stats containing
>
> 	#!/bin/sh
> 	awk '
> 	THE CONTENTS OF extract_dd_stats.awk
> 	with any 's escaped by a backslash
> 	and with that comment about invocation modified, of course
> 	' "$@"
>
> since that is easier to invoke.
>

Done.

> > +
> > +BEGIN {
> > +	FS = " = "
> > +	printf("%6s %11s %10s %7s %7s %7s %7s\n", \
> > +		"reexec", "nodes", "ratio", "CPU", \
> > +		"WC", "RSS", "VSZ");
> > +}
> ...
> > +	printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
> > +		reexec, \
> > +		total_nodes, total_nodes/reexec, \
> > +		total_time/num_runs, total_wc_time/num_runs, \
> > +		mem, vsz/1024);
> > +}
>
> With a shell script, you can have an option that, if set, causes the
> information printed in these places to be printed with latex column separators
> ready for inclusion in the paper, but we probably don't need that anymore;
> but add this to your bag of tricks for next time.
>

I originally had it output the table in LaTeX format, but decided against
committing that version to the repository, since LaTeX output is harder to
read.  It would be easy to write an awk script to convert the plain output
to a LaTeX table.
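
As a rough sketch of such a converter (the function name
`dd_stats_to_latex` is made up, and the choice of a seven-column
`\multicolumn` row for the options line is an assumption about how the
table would be laid out):

```shell
#!/bin/sh
# Hypothetical filter, not part of the patch: turn the plain-text table
# printed by extract_dd_stats into rows for a LaTeX tabular environment.
dd_stats_to_latex() {
	awk '
	/^Options = / {
		# Print the dd options as one row spanning all seven columns,
		# escaping underscores (e.g. in divide_and_query) for LaTeX.
		sub(/^Options = /, "");
		gsub(/_/, "\\_");
		printf("\\multicolumn{7}{l}{%s} \\\\\n", $0);
		next;
	}
	NF == 7 {
		# Header or data row: join the fields with column separators.
		row = $1;
		for (i = 2; i <= NF; i++) {
			row = row " & " $i;
		}
		printf("%s \\\\\n", row);
	}
	' "$@"
}
```

It would be used as `extract_dd_stats dd.stats | dd_stats_to_latex`.
Rows whose RSS column reads "Out of Memory" have more than seven
whitespace-separated fields and are skipped by this sketch, so they would
need separate handling.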

> > +function reset_per_run() {
> > +	actual_nodes_for_run = 0;
> > +	nodes_in_first_reexecution = 0;
> > +	start_of_run = 1;
> > +	last_nodes_constructed_in_run = 0;
> > +}
>
> I don't know whether all the versions of awk on our system support functions,
> but if not, that can be fixed later.
>
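
On the portability point: user-defined functions are part of POSIX awk,
but the three-argument form of match() used for the START line is a gawk
extension.  If that turns out to matter, the options could be extracted
with sub() instead.  A minimal sketch (the wrapper name
`start_line_options` is hypothetical and shows only the START rule):

```shell
#!/bin/sh
# Sketch only: pull the dd options out of a "START <options>" line
# without the three-argument match(), which is a gawk extension.
start_line_options() {
	awk '
	/^START / {
		opts = $0;
		sub(/^START /, "", opts);
		printf("Options = %s\n", opts);
	}
	' "$@"
}
```
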
> > @@ -2396,8 +2406,8 @@
> >      fprintf(stderr, "Total CPU time = %.2f\n",
> >          MR_get_user_cpu_miliseconds() / 1000.0);
> >      pid = getpid();
> > -    sprintf(cmdstr, "ps -o pid,rss | grep %i | awk '{print $2}' 1>&2", pid);
> > -    fprintf(stderr, "RSS = ");
> > +    sprintf(cmdstr, "ps -hp %i -o rss,vsz | "
> > +    	"awk '{print \"RSS = \" $1 \"\\nVSZ = \" $2}' 1>&2", pid);
> >      system(cmdstr);
>
> The old code here worked on aral, but the new one doesn't. Neither the old
> nor the new code works on mundula (Solaris).
>

Should I just revert to the old version then?  I don't know a general way to
get this information that will work on all platforms.
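
One candidate, which I have not tried on aral or mundula, is to ask ps
for the columns directly and suppress the header with `-o name=`.  Of the
two columns, `vsz` is in the POSIX list of `-o` names; `rss` is not,
although many ps implementations accept it.  The function names below are
hypothetical:

```shell
#!/bin/sh
# Format a line of the form "<rss> <vsz>" (as printed by ps) into the
# "RSS = ..." and "VSZ = ..." lines that extract_dd_stats looks for.
format_mem() {
	awk 'NF >= 2 { print "RSS = " $1; print "VSZ = " $2 }'
}

# Hypothetical wrapper: "-o name=" prints the column without a header,
# so neither grep nor -h is needed.  Untested on the machines named in
# this thread; "rss" in particular may not be accepted everywhere.
mem_stats() {
	ps -o rss= -o vsz= -p "$1" | format_mem
}
```

The C code would then build the same `ps -o rss= -o vsz= -p <pid>`
command string instead of using `-h`.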

Here's an interdiff:

diff -u browser/declarative_user.m browser/declarative_user.m
--- browser/declarative_user.m	3 Aug 2005 12:59:48 -0000
+++ browser/declarative_user.m	4 Aug 2005 03:53:53 -0000
@@ -119,7 +119,7 @@

 				% If this following flag is set to yes then
 				% user responses will be simulated and will
-				% always be `no', except when confirmining a
+				% always be `no', except when confirming a
 				% bug in which case the response will be `yes'.
 			testing			:: bool
 		).
diff -u doc/user_guide.texi doc/user_guide.texi
--- doc/user_guide.texi	2 Aug 2005 14:04:56 -0000
+++ doc/user_guide.texi	4 Aug 2005 03:55:02 -0000
@@ -3369,6 +3369,7 @@
 @c The @samp{-t} or @samp{--test} option causes the declarative debugger
 @c to simulate a user who answers `no' to all questions, except for
 @c `Is this a bug?' questions to which the simulated user answers `yes'.
+@c This is useful for benchmarking the declarative debugger.
 @c @sp 1
 @c The @samp{--debug} option causes events generated by the declarative
 @c debugger to become visible.  This allows the declarative debugger to be
diff -u tools/dd_speedtest tools/dd_speedtest
--- tools/dd_speedtest	3 Aug 2005 12:55:27 -0000
+++ tools/dd_speedtest	4 Aug 2005 04:41:24 -0000
@@ -26,7 +26,7 @@
 # data for each reexecution of the program performed by the declarative
 # debugger; and dd.stdout which records the output of the debugging session.
 #
-# The script extract_dd_stats.awk in this directory can be used to summarize
+# The script extract_dd_stats in this directory can be used to summarize
 # the data in dd.stats.
 #
 # This script will append data to dd.stats and dd.stdout, so they should be
@@ -38,28 +38,16 @@
 limit=6
 ddopts="-s divide_and_query -n 50000 -d 1"

-while test $# -gt 0
-do
-	case $1 in
-
-	-c|--cmd)
-		cmd="$2" ; shift ;;
-
-	-d|--ddopts)
-		ddopts="$2" ; shift ;;
-
-	-n)
-		limit="$2" ; shift ;;
-
-	-*)
-		echo "$0: unknown option \`$1'" 2>&1
-		echo $usage
-		exit 1 ;;
-
-	*)	break ;;
+while getopts c:n:d: flag; do
+	case $flag in
+	c)  cmd="$OPTARG" ;;
+	d)  ddopts="$OPTARG" ;;
+	n)  limit="$OPTARG" ;;
+	\?) echo $usage; exit 1 ;;
+	*)  echo internal error in getopts; exit 2 ;;
 	esac
-	shift
 done
+shift `expr "$OPTIND" - 1`

 if test "$cmd" == ""; then
 	echo $usage
reverted:
--- tools/extract_dd_stats.awk	3 Aug 2005 13:04:07 -0000
+++ /dev/null	1 Jan 1970 00:00:00 -0000
@@ -1,126 +0,0 @@
-#---------------------------------------------------------------------------#
-# Copyright (C) 2005 The University of Melbourne.
-# This file may only be copied under the terms of the GNU General
-# Public License - see the file COPYING in the Mercury distribution.
-#---------------------------------------------------------------------------#
-#
-# This awk script summarizes the stats generated by the dd_speedtest script.
-#
-# It prints the options passed to the dd command followed by the following
-# fields:
-#
-# reexec = number of reexecutions minus first and last reexecutions.
-# nodes = number of nodes constructed in all reexecutions except first and
-#         last.
-# ratio = nodes/reexec
-# CPU = total CPU time from start of session to end of last reexecution.
-# WC = total time from start of session to end of session.
-# RSS = resident set size at end of last reexecution.
-# VSZ = virtual size of process at end of last reexecution.
-#
-# To run this script do:
-# awk -f extract_dd_stats.awk dd.stats
-#
-
-BEGIN {
-	FS = " = "
-	printf("%6s %11s %10s %7s %7s %7s %7s\n", \
-		"reexec", "nodes", "ratio", "CPU", \
-		"WC", "RSS", "VSZ");
-}
-
-/^START / {
-	reset_per_run();
-	num_runs = 0;
-	total_time = 0;
-	total_wc_time = 0;
-	last_time = 0;
-
-	start = 1;
-	final = 0;
-	during = 0;
-
-	match($0, /START (.*)$/, a);
-	dd_opts = a[1];
-	printf("Options = %s\n", dd_opts);
-}
-/^DURING/ {
-	reset_per_run();
-	num_runs++;
-	if (start != 1) {
-		total_time += last_time;
-	}
-
-	start = 0;
-	final = 0;
-	during = 1;
-
-	last_time = 0;
-}
-/^FINAL/ {
-	reset_per_run();
-	total_time += last_time;
-
-	start = 0;
-	final = 1;
-	during = 0;
-}
-/^END$/ {
-	# discard the first and last re-execution.
-	reexec -= 2;
-	total_nodes = actual_nodes_for_run - nodes_in_first_reexecution \
-		- last_nodes_constructed_in_run;
-
-	if (out_of_memory == 1){
-		mem = "Out of Memory";
-		out_of_memory = 0;
-	} else {
-		mem = sprintf("%.1f", rss/1024);
-	}
-	printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
-		reexec, \
-		total_nodes, total_nodes/reexec, \
-		total_time/num_runs, total_wc_time/num_runs, \
-		mem, vsz/1024);
-}
-
-/Total CPU time/ {last_time = $2}
-
-/Nodes constructed in this run/ {
-	if (start_of_run) {
-		nodes_in_first_reexecution = $2;
-		start_of_run = 0;
-	}
-	actual_nodes_for_run += $2
-	last_nodes_constructed_in_run = $2;
-}
-/Total reexecutions so far/ {
-	reexec = $2
-}
-
-/RSS =/ {
-	rss = $2
-}
-/VSZ =/ {
-	vsz = $2
-}
-/Out of Memory/ {
-	out_of_memory = 1;
-}
-
-/^STARTWCTIME/ {
-	start_wc_time = $2;
-}
-
-/^ENDWCTIME/ {
-	if (during == 1) {
-		total_wc_time += ($2 - start_wc_time);
-	}
-}
-
-function reset_per_run() {
-	actual_nodes_for_run = 0;
-	nodes_in_first_reexecution = 0;
-	start_of_run = 1;
-	last_nodes_constructed_in_run = 0;
-}
only in patch2:
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ tools/extract_dd_stats	4 Aug 2005 04:21:42 -0000
@@ -0,0 +1,129 @@
+#!/bin/sh
+#---------------------------------------------------------------------------#
+# Copyright (C) 2005 The University of Melbourne.
+# This file may only be copied under the terms of the GNU General
+# Public License - see the file COPYING in the Mercury distribution.
+#---------------------------------------------------------------------------#
+#
+# This script summarizes the stats generated by the dd_speedtest script.
+#
+# It prints the options passed to the dd command followed by the following
+# fields:
+#
+# reexec = number of reexecutions minus first and last reexecutions.
+# nodes = number of nodes constructed in all reexecutions except first and
+#         last.
+# ratio = nodes/reexec
+# CPU = total CPU time from start of session to end of last reexecution.
+# WC = total time from start of session to end of session.
+# RSS = resident set size at end of last reexecution.
+# VSZ = virtual size of process at end of last reexecution.
+#
+# To run this script do:
+# extract_dd_stats dd.stats
+#
+
+awk '
+BEGIN {
+	FS = " = "
+	printf("%6s %11s %10s %7s %7s %7s %7s\n", \
+		"reexec", "nodes", "ratio", "CPU", \
+		"WC", "RSS", "VSZ");
+}
+
+/^START / {
+	reset_per_run();
+	num_runs = 0;
+	total_time = 0;
+	total_wc_time = 0;
+	last_time = 0;
+
+	start = 1;
+	final = 0;
+	during = 0;
+
+	match($0, /START (.*)$/, a);
+	dd_opts = a[1];
+	printf("Options = %s\n", dd_opts);
+}
+/^DURING/ {
+	reset_per_run();
+	num_runs++;
+	if (start != 1) {
+		total_time += last_time;
+	}
+
+	start = 0;
+	final = 0;
+	during = 1;
+
+	last_time = 0;
+}
+/^FINAL/ {
+	reset_per_run();
+	total_time += last_time;
+
+	start = 0;
+	final = 1;
+	during = 0;
+}
+/^END$/ {
+	# discard the first and last re-execution.
+	reexec -= 2;
+	total_nodes = actual_nodes_for_run - nodes_in_first_reexecution \
+		- last_nodes_constructed_in_run;
+
+	if (out_of_memory == 1){
+		mem = "Out of Memory";
+		out_of_memory = 0;
+	} else {
+		mem = sprintf("%.1f", rss/1024);
+	}
+	printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
+		reexec, \
+		total_nodes, total_nodes/reexec, \
+		total_time/num_runs, total_wc_time/num_runs, \
+		mem, vsz/1024);
+}
+
+/Total CPU time/ {last_time = $2}
+
+/Nodes constructed in this run/ {
+	if (start_of_run) {
+		nodes_in_first_reexecution = $2;
+		start_of_run = 0;
+	}
+	actual_nodes_for_run += $2
+	last_nodes_constructed_in_run = $2;
+}
+/Total reexecutions so far/ {
+	reexec = $2
+}
+
+/RSS =/ {
+	rss = $2
+}
+/VSZ =/ {
+	vsz = $2
+}
+/Out of Memory/ {
+	out_of_memory = 1;
+}
+
+/^STARTWCTIME/ {
+	start_wc_time = $2;
+}
+
+/^ENDWCTIME/ {
+	if (during == 1) {
+		total_wc_time += ($2 - start_wc_time);
+	}
+}
+
+function reset_per_run() {
+	actual_nodes_for_run = 0;
+	nodes_in_first_reexecution = 0;
+	start_of_run = 1;
+	last_nodes_constructed_in_run = 0;
+}
+' "$@"
only in patch2:
--- tests/debugger/mdb_command_test.inp	1 Aug 2005 02:40:11 -0000	1.44
+++ tests/debugger/mdb_command_test.inp	4 Aug 2005 03:41:58 -0000
@@ -85,7 +85,6 @@
 stats                xyzzy xyzzy xyzzy xyzzy xyzzy
 print_optionals      xyzzy xyzzy xyzzy xyzzy xyzzy
 unhide_events        xyzzy xyzzy xyzzy xyzzy xyzzy
-dd_dd                xyzzy xyzzy xyzzy xyzzy xyzzy
 table                xyzzy xyzzy xyzzy xyzzy xyzzy
 type_ctor            xyzzy xyzzy xyzzy xyzzy xyzzy
 all_type_ctors       xyzzy xyzzy xyzzy xyzzy xyzzy
