[m-dev.] for review: add code to read terms from strings

Fergus Henderson fjh at cs.mu.OZ.AU
Wed May 13 19:51:22 AEST 1998


On 13-May-1998, Peter Schachte <pets at students.cs.mu.oz.au> wrote:
> > Add code to parse terms from strings rather than from streams.
> 
> A little late for this comment, but anyway....
> 
> Wouldn't it be better to create a type class for input-stream-like-things,
> make read et al. work for the type class, and make a string-stream instance
> of this? 

The reason I didn't do that was efficiency.
The parser is a part of the system where efficiency is very important.
Unless/until the compiler is capable of generating specialized instances
of procedures that use type classes, this approach will be unacceptably
inefficient.

Even once we have this specialization, there would still be some
efficiency issues.  In the typeclass, the state type would need
to use di/uo modes, so that io__state can be a valid instance.
This implies that the procedure to get a single character must be
det and return maybe(char) or io__result(char) rather than being
semidet and returning char.  You can't make it semidet because

    (a)	the unique mode analysis isn't smart enough to notice that
	in the case where it fails, the state will be unmodified
and
    (b)	in the case of I/O, there are three possible results,
	namely ok(Char), eof, and error(Error), rather than just
	the first two, as is the case for strings.
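
Concretely, such a typeclass would presumably look something like this
(a sketch only; `char_source' is an invented name, but the method is
named to match the code below):

	:- typeclass char_source(S) where [
		% det and returning io__result(char), for
		% reasons (a) and (b) above
		pred generic_read_char(io__result(char)::out, S::di, S::uo)
			is det
	].

	% io__state could then be made an instance directly, since
	% io__read_char already has exactly this signature:
	:- instance char_source(io__state) where [
		pred(generic_read_char/3) is io__read_char
	].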

Unfortunately, with the current data representations and calling
conventions, this means that you would get a memory allocation (for
the ok(Char) cell) for every character parsed.

Hmm, now that I think about it, I guess that typeclass specialization
*plus* inlining *plus* deforestation could potentially optimize away
this memory allocation.  (When I made this change, I didn't realize
this.)  Your code would start out as

	generic_read_char(Res, S0, S1),
	( Res = ok(Char), ...code for ok...
	; Res = eof, ...code for eof...
	; Res = error(...), ...code for error...
	)

and then after specialization you'd have

	string_read_char(Res, S0, S1),
	( Res = ok(Char), ...code for ok...
	; Res = eof, ...code for eof...
	; Res = error(...), ...code for error...
	)

and then after inlining you'd have

	S0 = string_state(String, Len, Offset, ...),
	( Offset < Len ->
		string__unsafe_index(String, Offset, Char),
		...code to check for newline and construct S1...
		Res = ok(Char)
	;
		Res = eof
	),
	( Res = ok(Char), ...code for ok...
	; Res = eof, ...code for eof...
	; Res = error(...), ...code for error...
	)

and then after deforestation (i.e. merging of adjacent branched
structures and elimination of constructions of unused variables)
you'd have

	S0 = string_state(String, Len, Offset, ...),
	( Offset < Len ->
		string__unsafe_index(String, Offset, Char),
		...code to check for newline and construct S1...
		...code for ok...
	;
		...code for eof...
	)

However, this chain of optimizations is sufficiently complicated
that I'd have to "see it to believe it". 

Note that even this optimized code would still be much less efficient than
it could be.  To improve it, you would also need to avoid the
construction of S1, and instead either use destructive update on S0,
or "unfold" the string_state structure and pass around its fields in
registers rather than passing around a pointer to the struct.
Probably the latter optimization would improve performance the most.
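
Schematically, the unfolded version of that last code fragment would
look something like this (same placeholder style as above):

	% String, Len, Offset and the line number information arrive as
	% separate arguments, so there is no S0 to deconstruct and no S1
	% to construct:
	( Offset < Len ->
		string__unsafe_index(String, Offset, Char),
		...code to check for newline, producing updated
		   line-number output arguments...
		...code for ok...
	;
		...code for eof...
	)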

My original aim was to improve performance.  I did in fact achieve a
50% speedup (on my x86 box at home) on the `count_terms' benchmark
[which just parses the terms in a file and counts them], but only by
omitting the code to compute line numbers.  When I changed the `posn'
type in my diff from just an offset to a triple containing an offset
and some line number information, I lost that speedup, because now
the string parsing code does an allocation for each character again.
Most of the speedup could be regained by doing the argument unfolding
mentioned above, but I didn't want to do that by hand, because it would
be very tedious and time-consuming, and because it would make the code
significantly less maintainable -- and having already duplicated all
that code for string parsing I thought I'd already spent enough time
on such efforts!
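
For reference, the change to `posn' was roughly the following:

	% before: just the current offset in the string
	:- type posn == int.

	% after: the offset plus line number information
	:- type posn
		--->	posn(
				int,	% current line number
				int,	% offset of the start of the line
				int	% current offset in the string
			).

Each character read now constructs a fresh posn triple, which is where
the per-character allocation comes back.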

I do think this sort of exercise is useful for helping us understand
where our current optimizer hits its limitations.

> Or is this another one of those "until we have a debugger we can't really
> use type classes" deals?

No, that wasn't the reason.  The debugger is close enough to working
that if that had been the issue I would have gone ahead anyway, or
waited until the debugger was done.  The reason was efficiency.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.


