[m-rev.] for review: dd_speedtest
Ian MacLarty
maclarty at cs.mu.OZ.AU
Thu Aug 4 15:03:38 AEST 2005
On Thu, 4 Aug 2005, Zoltan Somogyi wrote:
> On 03-Aug-2005, Ian MacLarty <maclarty at cs.mu.OZ.AU> wrote:
> > - help_system :: help.system
> > + help_system :: help.system,
> > +
> > + % If this following flag is set to yes then
> > + % user responses will be simulated and will
> > + % always be `no', except when confirmining a
> > + % bug in which case the response will be `yes'.
> > + testing :: bool
> > ).
>
> s/confirmining/confirming/
>
Fixed.
> > Index: doc/user_guide.texi
> > ===================================================================
> > RCS file: /home/mercury1/repository/mercury/doc/user_guide.texi,v
> > retrieving revision 1.448
> > diff -u -b -r1.448 user_guide.texi
> > --- doc/user_guide.texi 1 Aug 2005 02:40:05 -0000 1.448
> > +++ doc/user_guide.texi 2 Aug 2005 14:04:56 -0000
> > @@ -3352,17 +3352,31 @@
> > @sp 1
> > @table @code
> > @item dd [-r] [-n @var{nodes}] [-s @var{search-mode}]
> > -@c @item dd [--assume-all-io-is-tabled] [-d @var{depth}]
> > +@c @item dd [--assume-all-io-is-tabled] [-d @var{depth}] [-t]
> > +@c [--debug [filename]]
> > @c The --assume-all-io-is-tabled option is for developers only. Specifying it
> > @c makes an assertion, and if the assertion is incorrect, the resulting
> > @c behaviour would be hard for non-developers to understand. The option is
> > @c therefore deliberately not documented.
> > +@c @sp 1
> > @c The value of the @samp{-d} or @samp{--depth} option determines
> > @c how much of the annotated trace to build initially. Subsequent runs
> > @c will try to add @var{nodes} events to the annotated trace, but initially
> > @c there is not enough information available to do this. We do not document
> > @c this option since it requires an understanding of the internal workings of
> > @c the declarative debugger.
> > +@c @sp 1
> > +@c The @samp{-t} or @samp{--test} option causes the declarative debugger
> > +@c to simulate a user who answers `no' to all questions, except for
> > +@c `Is this a bug?' questions to which the simulated user answers `yes'.
>
> You should add a sentence here about the intended use.
>
Added:
@c This is useful for benchmarking the declarative debugger.
> > +@c The @samp{--debug} option causes events generated by the declarative
> > +@c debugger to become visible. This allows the declarative debugger to be
> > +@c debugged.
> > +@c If a filename is provided, the front end of the debugger is not called
> > +@c at all. Instead a representation of the debugging tree is dumped to
> > +@c the file.
> > +@c @sp 1
> > Starts declarative debugging using the current event as the initial symptom.
>
> > -@c If a filename is provided, the front end of the debugger is not called
> > -@c at all. Instead a representation of the debugging tree is dumped to
> > -@c the file, which may help diagnose problems in the debugger itself.
>
> Is there some reason why the second commented out sentence above is no longer
> applicable?
>
Only the second part of the second sentence was removed (i.e. the "which may
help ..." part), because it seemed redundant.
> > +# This script will append data to dd.stats and dd.stdout, so they should be
> > +# deleted first if this behaviour is not desired.
>
> In some of my benchmarking scripts, I handle this problem with code like this:
>
> if test -f TIMES
> then
> ci -l TIMES < /dev/null
> /bin/rm -f TIMES
> fi
>
> You could do the same for dd.stats and dd.stdout. You don't lose old info,
> but there is no clutter either.
>
Sometimes I want the output to be appended to the same file though, for
example when I run the test several times with different dd options and I
want the results all in the same table.
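
Both behaviours could be supported with a small helper along the following lines. This is only a sketch: the function name prepare_output and the mode names "append" and "fresh" are made up for illustration and are not part of dd_speedtest, and it moves old files aside under numbered names rather than checking them in with ci.

```shell
#!/bin/sh
# Hypothetical helper, not part of dd_speedtest: decide what to do with an
# existing output file before a benchmarking run.
#   prepare_output FILE append   leave FILE alone so new data is appended
#   prepare_output FILE fresh    move FILE aside so the run starts clean
prepare_output() {
    file="$1"
    mode="$2"
    if test "$mode" = fresh && test -f "$file"
    then
        # Keep the old data under the first unused numbered name
        # instead of deleting it.
        n=1
        while test -e "$file.old.$n"
        do
            n=`expr $n + 1`
        done
        mv "$file" "$file.old.$n"
    fi
}
```

Calling "prepare_output dd.stats fresh" before a run would then give a clean file without losing the old data, while "append" keeps the current behaviour of building up one table over several runs.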
> > +usage="Usage: dd_speedtest -c cmd [-n num_tests] [-d dd_options]"
> > +cmd=""
> > +limit=6
> > +ddopts="-s divide_and_query -n 50000 -d 1"
> > +
> > +while test $# -gt 0
> > +do
> > + case $1 in
> > +
> > + -c|--cmd)
> > + cmd="$2" ; shift ;;
> > +
> > + -d|--ddopts)
> > + ddopts="$2" ; shift ;;
> > +
> > + -n)
> > + limit="$2" ; shift ;;
> > +
> > + -*)
> > + echo "$0: unknown option \`$1'" 2>&1
> > + echo $usage
> > + exit 1 ;;
> > +
> > + *) break ;;
> > + esac
> > + shift
> > +done
>
> This should probably use getopts. See slide set 1 in the 252 lecture notes.
>
Okay. See the interdiff at the end.
> > Index: tools/extract_dd_stats.awk
> > ===================================================================
> > RCS file: tools/extract_dd_stats.awk
> > diff -N tools/extract_dd_stats.awk
> > --- /dev/null 1 Jan 1970 00:00:00 -0000
> > +++ tools/extract_dd_stats.awk 3 Aug 2005 13:04:07 -0000
> ...
> > @@ -0,0 +1,126 @@
> > +#
> > +# To run this script do:
> > +# awk -f extract_dd_stats.awk dd.stats
>
> Instead of this, make a file called extract_dd_stats containing
>
> #!/bin/sh
> awk '
> THE CONTENTS OF extract_dd_stats.awk
> with any 's escaped by a backslash
> and with that comment about invocation modified, of course
> ' "$@"
>
> since that is easier to invoke.
>
Done.
> > +
> > +BEGIN {
> > + FS = " = "
> > + printf("%6s %11s %10s %7s %7s %7s %7s\n", \
> > + "reexec", "nodes", "ratio", "CPU", \
> > + "WC", "RSS", "VSZ");
> > +}
> ...
> > + printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
> > + reexec, \
> > + total_nodes, total_nodes/reexec, \
> > + total_time/num_runs, total_wc_time/num_runs, \
> > + mem, vsz/1024);
> > +}
>
> With a shell script, you can have an option that, if set, causes the
> information printed in these places to be printed with latex column separators
> ready for inclusion in the paper, but we probably don't need that anymore;
> but add this to your bag of tricks for next time.
>
I had it outputting the table in latex format, but thought that wouldn't be
a good idea if I was going to commit it to the repository, since it's more
difficult to read. It would be easy to write an awk script to convert the
output to a latex table.
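
A rough sketch of such a converter is below. It assumes the seven-column layout printed by the summary script above (reexec, nodes, ratio, CPU, WC, RSS, VSZ); the function name to_latex_rows is hypothetical, and the tabular environment around the rows is left to the caller.

```shell
#!/bin/sh
# Hypothetical filter: turn the summary table into LaTeX table rows.
# Reads the files named as arguments, or standard input if there are none.
to_latex_rows() {
    awk '
    # The header line and the "Options = ..." lines are not table rows.
    $1 == "reexec" || /^Options = / { next }
    # A normal data row has exactly seven whitespace-separated fields.
    # Rows whose RSS column reads "Out of Memory" have more fields and
    # are skipped by this sketch.
    NF == 7 {
        row = $1
        for (i = 2; i <= NF; i++) {
            row = row " & " $i
        }
        # Each row ends with the LaTeX row terminator, two backslashes.
        print row " \\\\"
    }
    ' "$@"
}
```

It could be used as "extract_dd_stats dd.stats | to_latex_rows", which keeps the committed script's output readable and leaves the LaTeX formatting as a separate step.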
> > +function reset_per_run() {
> > + actual_nodes_for_run = 0;
> > + nodes_in_first_reexecution = 0;
> > + start_of_run = 1;
> > + last_nodes_constructed_in_run = 0;
> > +}
>
> I don't know whether all the versions of awk on our system support functions,
> but if not, that can be fixed later.
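
On awk portability: user-defined functions are part of POSIX awk, although very old awks lack them. A separate point is that the three-argument form of match() used in the START rule is a gawk extension. If that turns out to matter, the rule could be written with sub() instead, roughly as follows. This is a sketch only, and the wrapper function print_dd_opts is made up so that the rule can be tried on its own.

```shell
#!/bin/sh
# Hypothetical stand-alone version of the START rule that avoids the
# three-argument match(): copy the line and strip the leading "START ".
print_dd_opts() {
    awk '
    /^START / {
        dd_opts = $0
        sub(/^START /, "", dd_opts)
        printf("Options = %s\n", dd_opts)
    }
    ' "$@"
}
```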
>
> > @@ -2396,8 +2406,8 @@
> > fprintf(stderr, "Total CPU time = %.2f\n",
> > MR_get_user_cpu_miliseconds() / 1000.0);
> > pid = getpid();
> > - sprintf(cmdstr, "ps -o pid,rss | grep %i | awk '{print $2}' 1>&2", pid);
> > - fprintf(stderr, "RSS = ");
> > + sprintf(cmdstr, "ps -hp %i -o rss,vsz | "
> > + "awk '{print \"RSS = \" $1 \"\\nVSZ = \" $2}' 1>&2", pid);
> > system(cmdstr);
>
> The old code here worked on aral, but the new one doesn't. Neither the old
> nor the new code works on mundula (Solaris).
>
Should I just revert to the old version then? I don't know a general way to
get this information that will work on all platforms.
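
One candidate, untested on either machine mentioned above, is to select the process with -p and suppress the header by giving each column an empty title. The vsz specifier is in POSIX; rss is an extension that many ps implementations accept, so the sketch below falls back to "unknown" when ps rejects the request. The function name report_mem is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "RSS = ..." and "VSZ = ..." for process PID,
# in the same "name = value" form as the other statistics.
report_mem() {
    pid="$1"
    # "rss=" and "vsz=" ask ps for the columns with no header line.
    if line=`ps -o rss= -o vsz= -p "$pid" 2>/dev/null` && test -n "$line"
    then
        # Split the two numbers on whitespace.
        set -- $line
        echo "RSS = $1"
        echo "VSZ = $2"
    else
        # This ps does not understand the request; say so rather than guess.
        echo "RSS = unknown"
        echo "VSZ = unknown"
    fi
}
```

The same ps invocation could be placed in the string passed to system(), with the output redirected to stderr as in the existing code.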
Here's an interdiff:
diff -u browser/declarative_user.m browser/declarative_user.m
--- browser/declarative_user.m 3 Aug 2005 12:59:48 -0000
+++ browser/declarative_user.m 4 Aug 2005 03:53:53 -0000
@@ -119,7 +119,7 @@
% If this following flag is set to yes then
% user responses will be simulated and will
- % always be `no', except when confirmining a
+ % always be `no', except when confirming a
% bug in which case the response will be `yes'.
testing :: bool
).
diff -u doc/user_guide.texi doc/user_guide.texi
--- doc/user_guide.texi 2 Aug 2005 14:04:56 -0000
+++ doc/user_guide.texi 4 Aug 2005 03:55:02 -0000
@@ -3369,6 +3369,7 @@
@c The @samp{-t} or @samp{--test} option causes the declarative debugger
@c to simulate a user who answers `no' to all questions, except for
@c `Is this a bug?' questions to which the simulated user answers `yes'.
+@c This is useful for benchmarking the declarative debugger.
@c @sp 1
@c The @samp{--debug} option causes events generated by the declarative
@c debugger to become visible. This allows the declarative debugger to be
diff -u tools/dd_speedtest tools/dd_speedtest
--- tools/dd_speedtest 3 Aug 2005 12:55:27 -0000
+++ tools/dd_speedtest 4 Aug 2005 04:41:24 -0000
@@ -26,7 +26,7 @@
# data for each reexecution of the program performed by the declarative
# debugger; and dd.stdout which records the output of the debugging session.
#
-# The script extract_dd_stats.awk in this directory can be used to summarize
+# The script extract_dd_stats in this directory can be used to summarize
# the data in dd.stats.
#
# This script will append data to dd.stats and dd.stdout, so they should be
@@ -38,28 +38,16 @@
limit=6
ddopts="-s divide_and_query -n 50000 -d 1"
-while test $# -gt 0
-do
- case $1 in
-
- -c|--cmd)
- cmd="$2" ; shift ;;
-
- -d|--ddopts)
- ddopts="$2" ; shift ;;
-
- -n)
- limit="$2" ; shift ;;
-
- -*)
- echo "$0: unknown option \`$1'" 2>&1
- echo $usage
- exit 1 ;;
-
- *) break ;;
+while getopts c:n:d: flag; do
+ case $flag in
+ c) cmd="$OPTARG" ;;
+ d) ddopts="$OPTARG" ;;
+ n) limit="$OPTARG" ;;
+ \?) echo $usage; exit 1 ;;
+ *) echo internal error in getopts; exit 2 ;;
esac
- shift
done
+shift `expr "$OPTIND" - 1`
if test "$cmd" == ""; then
echo $usage
reverted:
--- tools/extract_dd_stats.awk 3 Aug 2005 13:04:07 -0000
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,126 +0,0 @@
-#---------------------------------------------------------------------------#
-# Copyright (C) 2005 The University of Melbourne.
-# This file may only be copied under the terms of the GNU General
-# Public License - see the file COPYING in the Mercury distribution.
-#---------------------------------------------------------------------------#
-#
-# This awk script summarizes the stats generated by the dd_speedtest script.
-#
-# It prints the options passed to the dd command followed by the following
-# fields:
-#
-# reexec = number of reexecutions minus first and last reexecutions.
-# nodes = number of nodes constructed in all reexecutions except first and
-# last.
-# ratio = nodes/reexec
-# CPU = total CPU time from start of session to end of last reexecution.
-# WC = total time from start of session to end of session.
-# RSS = resident set size at end of last reexecution.
-# VSZ = virtual size of process at end of last reexecution.
-#
-# To run this script do:
-# awk -f extract_dd_stats.awk dd.stats
-#
-
-BEGIN {
- FS = " = "
- printf("%6s %11s %10s %7s %7s %7s %7s\n", \
- "reexec", "nodes", "ratio", "CPU", \
- "WC", "RSS", "VSZ");
-}
-
-/^START / {
- reset_per_run();
- num_runs = 0;
- total_time = 0;
- total_wc_time = 0;
- last_time = 0;
-
- start = 1;
- final = 0;
- during = 0;
-
- match($0, /START (.*)$/, a);
- dd_opts = a[1];
- printf("Options = %s\n", dd_opts);
-}
-/^DURING/ {
- reset_per_run();
- num_runs++;
- if (start != 1) {
- total_time += last_time;
- }
-
- start = 0;
- final = 0;
- during = 1;
-
- last_time = 0;
-}
-/^FINAL/ {
- reset_per_run();
- total_time += last_time;
-
- start = 0;
- final = 1;
- during = 0;
-}
-/^END$/ {
- # discard the first and last re-execution.
- reexec -= 2;
- total_nodes = actual_nodes_for_run - nodes_in_first_reexecution \
- - last_nodes_constructed_in_run;
-
- if (out_of_memory == 1){
- mem = "Out of Memory";
- out_of_memory = 0;
- } else {
- mem = sprintf("%.1f", rss/1024);
- }
- printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
- reexec, \
- total_nodes, total_nodes/reexec, \
- total_time/num_runs, total_wc_time/num_runs, \
- mem, vsz/1024);
-}
-
-/Total CPU time/ {last_time = $2}
-
-/Nodes constructed in this run/ {
- if (start_of_run) {
- nodes_in_first_reexecution = $2;
- start_of_run = 0;
- }
- actual_nodes_for_run += $2
- last_nodes_constructed_in_run = $2;
-}
-/Total reexecutions so far/ {
- reexec = $2
-}
-
-/RSS =/ {
- rss = $2
-}
-/VSZ =/ {
- vsz = $2
-}
-/Out of Memory/ {
- out_of_memory = 1;
-}
-
-/^STARTWCTIME/ {
- start_wc_time = $2;
-}
-
-/^ENDWCTIME/ {
- if (during == 1) {
- total_wc_time += ($2 - start_wc_time);
- }
-}
-
-function reset_per_run() {
- actual_nodes_for_run = 0;
- nodes_in_first_reexecution = 0;
- start_of_run = 1;
- last_nodes_constructed_in_run = 0;
-}
only in patch2:
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ tools/extract_dd_stats 4 Aug 2005 04:21:42 -0000
@@ -0,0 +1,129 @@
+#!/bin/sh
+#---------------------------------------------------------------------------#
+# Copyright (C) 2005 The University of Melbourne.
+# This file may only be copied under the terms of the GNU General
+# Public License - see the file COPYING in the Mercury distribution.
+#---------------------------------------------------------------------------#
+#
+# This script summarizes the stats generated by the dd_speedtest script.
+#
+# It prints the options passed to the dd command followed by the following
+# fields:
+#
+# reexec = number of reexecutions minus first and last reexecutions.
+# nodes = number of nodes constructed in all reexecutions except first and
+# last.
+# ratio = nodes/reexec
+# CPU = total CPU time from start of session to end of last reexecution.
+# WC = total time from start of session to end of session.
+# RSS = resident set size at end of last reexecution.
+# VSZ = virtual size of process at end of last reexecution.
+#
+# To run this script do:
+# extract_dd_stats dd.stats
+#
+
+awk '
+BEGIN {
+ FS = " = "
+ printf("%6s %11s %10s %7s %7s %7s %7s\n", \
+ "reexec", "nodes", "ratio", "CPU", \
+ "WC", "RSS", "VSZ");
+}
+
+/^START / {
+ reset_per_run();
+ num_runs = 0;
+ total_time = 0;
+ total_wc_time = 0;
+ last_time = 0;
+
+ start = 1;
+ final = 0;
+ during = 0;
+
+ match($0, /START (.*)$/, a);
+ dd_opts = a[1];
+ printf("Options = %s\n", dd_opts);
+}
+/^DURING/ {
+ reset_per_run();
+ num_runs++;
+ if (start != 1) {
+ total_time += last_time;
+ }
+
+ start = 0;
+ final = 0;
+ during = 1;
+
+ last_time = 0;
+}
+/^FINAL/ {
+ reset_per_run();
+ total_time += last_time;
+
+ start = 0;
+ final = 1;
+ during = 0;
+}
+/^END$/ {
+ # discard the first and last re-execution.
+ reexec -= 2;
+ total_nodes = actual_nodes_for_run - nodes_in_first_reexecution \
+ - last_nodes_constructed_in_run;
+
+ if (out_of_memory == 1){
+ mem = "Out of Memory";
+ out_of_memory = 0;
+ } else {
+ mem = sprintf("%.1f", rss/1024);
+ }
+ printf("%6i %11i %10d %7.1f %7.1f %7s %7.1f\n", \
+ reexec, \
+ total_nodes, total_nodes/reexec, \
+ total_time/num_runs, total_wc_time/num_runs, \
+ mem, vsz/1024);
+}
+
+/Total CPU time/ {last_time = $2}
+
+/Nodes constructed in this run/ {
+ if (start_of_run) {
+ nodes_in_first_reexecution = $2;
+ start_of_run = 0;
+ }
+ actual_nodes_for_run += $2
+ last_nodes_constructed_in_run = $2;
+}
+/Total reexecutions so far/ {
+ reexec = $2
+}
+
+/RSS =/ {
+ rss = $2
+}
+/VSZ =/ {
+ vsz = $2
+}
+/Out of Memory/ {
+ out_of_memory = 1;
+}
+
+/^STARTWCTIME/ {
+ start_wc_time = $2;
+}
+
+/^ENDWCTIME/ {
+ if (during == 1) {
+ total_wc_time += ($2 - start_wc_time);
+ }
+}
+
+function reset_per_run() {
+ actual_nodes_for_run = 0;
+ nodes_in_first_reexecution = 0;
+ start_of_run = 1;
+ last_nodes_constructed_in_run = 0;
+}
+' "$@"
only in patch2:
--- tests/debugger/mdb_command_test.inp 1 Aug 2005 02:40:11 -0000 1.44
+++ tests/debugger/mdb_command_test.inp 4 Aug 2005 03:41:58 -0000
@@ -85,7 +85,6 @@
stats xyzzy xyzzy xyzzy xyzzy xyzzy
print_optionals xyzzy xyzzy xyzzy xyzzy xyzzy
unhide_events xyzzy xyzzy xyzzy xyzzy xyzzy
-dd_dd xyzzy xyzzy xyzzy xyzzy xyzzy
table xyzzy xyzzy xyzzy xyzzy xyzzy
type_ctor xyzzy xyzzy xyzzy xyzzy xyzzy
all_type_ctors xyzzy xyzzy xyzzy xyzzy xyzzy