[m-dev.] for review: stream I/O

Fergus Henderson fjh at cs.mu.OZ.AU
Tue Oct 3 11:18:49 AEDT 2000


On 02-Oct-2000, Ralph Becket <rbeck at microsoft.com> wrote:
> For example, the current io library does no buffering;

That's not correct: the current IO library is implemented using C's stdio
(fopen(), fputc(), etc.), which does buffering.

> > > I'm not sure I understand.  Did you implement putback_char//1 in C and
> > > get a slowdown or did you use a wrapper supplying putback_char//1 and
> > > get a slowdown?  If it's the latter case, then I'm not surprised:
> > > default and/or wrapper implementations should be expected to be more 
> > > costly than `native' implementations.  If it's the former then I'm very
> > > surprised since it should be no different than what we have in io.m now,
> > > so I'd be interested to learn what was causing the problem.
> > > 
> > Depending on what stream type you are using you select the native method
> > to call from a table of function pointers.  In other words, the typeclass
> > approach written in C.  That extra level of indirection caused a factor
> > of three slowdown.
> 
> So the problem was not in how or where putback_char//1 was implemented,
> but rather the overhead of doing a method lookup rather than a direct
> call.  It seems to me that this is going to bite us no matter what we
> do if we are going to use stream typeclasses - we can't avoid the cost
> of method lookup if we can't decide at compile time which procedure to 
> call.

The advantage of doing things at the Mercury level is that the
Mercury compiler can then specialize the code in the case when
we know at compile time which stream type is being used.

This would give you the best of both worlds.
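
As a rough illustration of the trade-off discussed above, here is a minimal
C sketch contrasting a call through a table of function pointers with a
direct call to the same operation.  All the names in it (mem_stream,
stream_methods, count_chars_generic, count_chars_mem) are hypothetical and
are not taken from the Mercury runtime or from io.m.  The method table only
plays the role of a typeclass dictionary, and the second function shows the
kind of code a specializer could in principle produce when the stream type
is known at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stream, for illustration only. */
typedef struct {
    const char *data;
    size_t pos;
    size_t len;
} mem_stream;

/* "Native" read operation: returns the next char, or -1 at end of stream. */
static int mem_read_char(void *state)
{
    mem_stream *s = state;
    assert(s != NULL);
    return s->pos < s->len ? (unsigned char) s->data[s->pos++] : -1;
}

/* Method table, standing in for a typeclass dictionary. */
typedef struct {
    int (*read_char)(void *state);
} stream_methods;

static const stream_methods mem_stream_methods = { mem_read_char };

/* Generic version: every read is an indirect call through the table. */
static size_t count_chars_generic(const stream_methods *m, void *state)
{
    size_t n = 0;
    while (m->read_char(state) != -1) {
        n++;
    }
    return n;
}

/* Specialized version: the stream type is known, so the call is direct
   and the compiler is free to inline it. */
static size_t count_chars_mem(mem_stream *s)
{
    size_t n = 0;
    while (mem_read_char(s) != -1) {
        n++;
    }
    return n;
}
```

Both functions compute the same result; they differ only in whether the
callee is fixed at compile time.  How large the cost difference is will
depend on the compiler and the platform.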

(However, currently the Mercury compiler doesn't do proper intermodule
optimization if you're using nested modules, so if the stream modules
are implemented as nested modules, that might cause some problems...)

> That the cat program runs three times slower just due to the cost of
> indirecting through a method lookup table staggers me.  It's an IO
> bound problem!  Still, the sensible way to implement cat is not via a
> read_char/write_char loop, but rather to use some more efficient
> bulk IO operations implemented at a lower level (e.g. using write(2)
> under POSIX).

Sure, you can implement cat using io__read_file_as_string (which is
implemented using read()) and io__write_string (which is implemented
using printf() but which nevertheless should still be efficient
for large strings).  But there are many programs, e.g. rot13,
wc, lexers, and so forth, which need to process streams one character
at a time, and so it's important that we get good performance in that
case.  The performance of processing one character at a time will
almost certainly never be as good as processing whole files at
a time, but we still want it to be as fast as possible, and hopefully
competitive with C.
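
For reference, the kind of C baseline that a character-at-a-time program
such as rot13 would be compared against might look like the sketch below.
It is an assumption about what "competitive with C" would be measured
against, not a benchmark from this thread, and the function names
rot13_char and rot13_stream are made up for the example.  It relies on
stdio's getc() and putc(), which work on buffered streams, so each
iteration normally touches a buffer rather than making a system call.

```c
#include <assert.h>
#include <stdio.h>

/* Rotate ASCII letters by 13 places; leave everything else unchanged. */
static int rot13_char(int c)
{
    if (c >= 'a' && c <= 'z') {
        return 'a' + (c - 'a' + 13) % 26;
    }
    if (c >= 'A' && c <= 'Z') {
        return 'A' + (c - 'A' + 13) % 26;
    }
    return c;
}

/* Copy `in' to `out' one character at a time, applying rot13.
   Returns 0 on success, -1 on a read or write error. */
static int rot13_stream(FILE *in, FILE *out)
{
    int c;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        if (putc(rot13_char(c), out) == EOF) {
            return -1;
        }
    }
    return ferror(in) ? -1 : 0;
}
```

A filter built on this would simply call rot13_stream(stdin, stdout) from
main() and turn a non-zero result into a failure exit status.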

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.