Addition to string/performance bug fix

Thomas Charles CONWAY conway at cs.mu.oz.au
Mon Jun 30 09:29:00 AEST 1997


Hi

This change fixes the peformance bug I found in io__write which was taking
3 cpu minutes (yes - cpu, not just elapsed time), to write out a term
containing an 85K string.

-- 
ZZ:wq!
^X^C
Thomas Conway               				      conway at cs.mu.oz.au
AD DEUM ET VINUM	  			      Every sword has two edges.

Fix a performance bug which meant that it took about 3 cpu minutes for
hydra to quote an 85K string in order to io__write it out. The bug was
that string__first_char(in, out, out) has a cost of O(n^2) in the length
of the string, and term_io__write_escaped_string called it successively
to escape each character of the string. The fix was to add a predicate
string__foldl which is equivalent to
	string__to_char_list(Str, Chars), list__foldl(Closure, Chars, Acc0, Acc)
except that its implementation avoids the intermediate list.

library/string.m:
	add string__foldl (described above)
	add the local predicate string__unsafe_index which doesn't do a
	range check to make sure that the index lies within the string.
	It is called by string__foldl.

library/term_io.m:
	implement term_io__write_escaped_string in terms of string__foldl
	instead of a loop calling string__first_char. The old version had
	a cost O(n^2), the new version a cost of O(n).

NEWS:
	mention string__foldl


cvs diff: Diffing .
Index: NEWS
===================================================================
RCS file: /home/staff/zs/imp/mercury/NEWS,v
retrieving revision 1.56
diff -u -r1.56 NEWS
--- NEWS	1997/06/30 06:45:40	1.56
+++ NEWS	1997/06/30 23:16:26
@@ -200,3 +200,7 @@
     option-lookup predicates: getopt__lookup_accumulating_option/3 and
     getopt__lookup_maybe_string_option/3.
 
+  - We've added string__foldl to the library. It has the same semantics as
+    (string__to_char_list(String, Chars), list__foldl(Pred, Chars, Acc0, Acc))
+    but is implemented more efficiently.
+
cvs diff: Diffing bindist
cvs diff: Diffing boehm_gc
cvs diff: Diffing boehm_gc/Mac_files
cvs diff: Diffing boehm_gc/cord
cvs diff: Diffing boehm_gc/cord/private
cvs diff: Diffing boehm_gc/include
cvs diff: Diffing boehm_gc/include/private
cvs diff: Diffing bytecode
cvs diff: Diffing compiler
cvs diff: Diffing compiler/notes
cvs diff: Diffing doc
cvs diff: Diffing library
Index: library/string.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/library/string.m,v
retrieving revision 1.92
diff -u -r1.92 string.m
--- string.m	1997/06/02 07:00:18	1.92
+++ string.m	1997/06/30 23:07:08
@@ -190,6 +190,21 @@
 %	Calls error/1 if `Index' is out of range (negative, or greater than or
 %	equal to the length of `String').
 
+:- pred string__foldl(pred(char, T, T), string, T, T).
+:- mode string__foldl(pred(in, in, out) is det, in, in, out) is det.
+:- mode string__foldl(pred(in, di, uo) is det, in, di, uo) is det.
+:- mode string__foldl(pred(in, in, out) is semidet, in, in, out) is semidet.
+:- mode string__foldl(pred(in, in, out) is nondet, in, in, out) is nondet.
+:- mode string__foldl(pred(in, in, out) is multi, in, in, out) is multi.
+%	string__foldl(Closure, String, Acc0, Acc):
+%	`Closure' is an accumulator predicate which is to be called for each
+%	character of the string `String' in turn. The initial value of the
+%	accumulator is `Acc0' and the final value is `Acc'.
+%	(string__foldl is equivalent to
+%		string__to_char_list(String, Chars),
+%		list__foldl(Closure, Chars, Acc0, Acc)
+%	but is implemented more efficiently.)
+
 :- pred string__split(string, int, string, string).
 :- mode string__split(in, in, out, out) is det.
 %	string__split(String, Count, LeftSubstring, RightSubstring):
@@ -432,6 +447,32 @@
 		error("string__index_det: index out of range")
 	).
 
+string__foldl(Closure, String, Acc0, Acc) :-
+	string__length(String, Length),
+	string__foldl2(Closure, String, 0, Length, Acc0, Acc).
+
+:- pred string__foldl2(pred(char, T, T), string, int, int, T, T).
+:- mode string__foldl2(pred(in, in, out) is det, in, in, in, in, out) is det.
+:- mode string__foldl2(pred(in, di, uo) is det, in, in, in, di, uo) is det.
+:- mode string__foldl2(pred(in, in, out) is semidet, in, in, in, in, out)
+		is semidet.
+:- mode string__foldl2(pred(in, in, out) is nondet, in, in, in, in, out)
+		is nondet.
+:- mode string__foldl2(pred(in, in, out) is multi, in, in, in, in, out)
+		is multi.
+
+string__foldl2(Closure, String, N, Max, Acc0, Acc) :-
+	(
+		N >= Max
+	->
+		Acc = Acc0
+	;
+		string__unsafe_index(String, N, Char),
+		call(Closure, Char, Acc0, Acc1),
+		N1 is N + 1,
+		string__foldl2(Closure, String, N1, Max, Acc1, Acc)
+	).
+
 string__left(String, Count, LeftString) :-
 	string__split(String, Count, LeftString, _RightString).
 
@@ -1589,6 +1630,15 @@
 		SUCCESS_INDICATOR = TRUE;
 		Ch = Str[Index];
 	}
+").
+
+/*-----------------------------------------------------------------------*/
+
+:- pred string__unsafe_index(string, int, char).
+:- mode string__unsafe_index(in, in, out) is det.
+
+:- pragma(c_code, string__unsafe_index(Str::in, Index::in, Ch::out), "
+	Ch = Str[Index];
 ").
 
 /*-----------------------------------------------------------------------*/
Index: library/term_io.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/library/term_io.m,v
retrieving revision 1.47
diff -u -r1.47 term_io.m
--- term_io.m	1997/05/28 07:41:30	1.47
+++ term_io.m	1997/06/26 04:35:52
@@ -407,13 +407,8 @@
 	term_io__write_escaped_string(S),
 	io__write_char('"').
 
-term_io__write_escaped_string(S0) -->
-	( { string__first_char(S0, Char, S1) } ->
-		term_io__write_escaped_char(Char),
-		term_io__write_escaped_string(S1)
-	;
-		[]
-	).
+term_io__write_escaped_string(String) -->
+	string__foldl(term_io__write_escaped_char, String).
 
 term_io__quote_single_char(Char) -->
 	term_io__write_escaped_char(Char).
cvs diff: Diffing lp_solve
cvs diff: Diffing lp_solve/lp_examples
cvs diff: Diffing profiler
cvs diff: Diffing runtime
cvs diff: Diffing runtime/machdeps
cvs diff: Diffing samples
cvs diff: Diffing samples/c_interface
cvs diff: Diffing samples/c_interface/c_calls_mercury
cvs diff: Diffing samples/c_interface/cplusplus_calls_mercury
cvs diff: Diffing samples/c_interface/mercury_calls_c
cvs diff: Diffing samples/c_interface/mercury_calls_cplusplus
cvs diff: Diffing samples/diff
cvs diff: Diffing scripts
cvs diff: Diffing tools
cvs diff: Diffing trial
cvs diff: Diffing util



More information about the developers mailing list