[m-dev.] for review: stream I/O

Fergus Henderson fjh at cs.mu.OZ.AU
Mon Oct 2 19:08:20 AEDT 2000


On 01-Oct-2000, Peter Ross <petdr at miscrit.be> wrote:
> Implement generic stream based I/O.

We're making progress, but this still needs work.
See comments below.

A couple of features that I notice are present in the current I/O
library but not in this library are support for associating
human-readable names (e.g.  file names) with streams, for use in error
messages, and counting line numbers for input streams.  The latter
feature could be provided by a new stream type (actually a stream type
constructor/combinator: given a stream type, it would give you back a
new stream type that was the same as the original stream but that also
recorded the line numbers).  This would mean that you only pay for it
if you use it.  But for the former feature, I think you need to
provide some support in the base stream library.

> stream.m:
>     Define the stream typeclass and provide some implementations of
>     routines which work on streams.

There's a design problem with stream__init/stream__destroy --
see below.

> stream.impure.m:
>     Define an impure interface which is useful for interfacing streams
>     provided by the foreign function interface.

I would much prefer to have a pure stream.lowlevel.m that provides
everything that stream.impure.m provides except the impurity.
I wouldn't mind having stream.impure.m so long as stream.lowlevel.m
is there, but I object to having stream.impure.m in the public
interface without stream.lowlevel.m being there, since it would
encourage people to make unnecessary use of impurity.

> stdio.m:
>     Define an instance of the stream typeclass which provides a
>     stdin/stdout stream.
...
> library.m:
>     Include the stream and stdio modules.

The stdio module is an interesting application of streams, but as is
IMHO it isn't complete enough (and hence not stable enough) to be in
the standard library.  In particular, unlike C's <stdio.h>, it doesn't
provide access to stderr or to file streams.  Adding appropriate
support for those is IMHO likely to result in interface changes.

> string.m:
>     Define an instance of the stream typeclass which works on strings.

I agree with Simon Taylor's comments.

> +:- module stdio.
> +
> +:- interface.
> +:- import_module stream, stream__impure.
> +:- import_module io.
> +
> +:- type stdio.
> +
> +	% XXX This should be an abstract type, but currently you aren't
> +	% allowed to have instance declarations that refer to abstract
> +	% types which are defined as equivalence types, and this needs
> +	% to be an equivalence type so that we can use the typeclass
> +	% interface defined in `stream.impure.m'.
> +:- type stdio_stream == impure(stdio).

Why not define it as

	:- type stdio_stream ---> stdio_stream(impure(stdio)).

and then implement the instance definitions as


> +	% Define the high-level operations using the generic operations
> +	% from the stream__impure module.
> +:- instance stream(stdio_stream) where [].
> +:- instance stream__input(stdio_stream) where [
> +	pred(read_character/3) is pure_read_char
> +].
> +
> +:- instance stream__output(stdio_stream) where [
> +	pred(write_character/3) is pure_write_char
> +].

	:- instance stream(stdio_stream) where [].
	:- instance stream__input(stdio_stream) where [
		read_character(C, stdio_stream(S0), stdio_stream(S)) :-
			pure_read_char(C, S0, S)
	].

	:- instance stream__output(stdio_stream) where [
		write_character(C, stdio_stream(S0), stdio_stream(S)) :-
			pure_write_char(C, S0, S)
	].

?

This is only two extra lines of code, and it saves you five lines of
XXX comments ;-)

> +% File: stream.impure.m.
> +% Main author: petdr
> +% Stability: exceptionally low.
> +%
> +% An impure interface for describing streams.
> +%
> +% This file provides a typeclass for people who want to map streams
> +% to a foreign language binding while doing the minimum amount of work.  In
> +% particular you need to write much less foreign language code, since
> +% you only need to implement a few impure predicates with a well defined
> +% interface.
> +%
> +% This file provides throwing exceptions, grabbing error messages,
> +% results packaged into ok/error/eof, and turning C style handle based

s/provides/provides a layer which handles/
s/results packaged/packaging results/

> +:- module stream__impure.
> +
> +:- interface.
> +
> +:- import_module char.
> +
> +:- type impure(S) ---> impure(S).

I think it would be clearer to name this `impure_stream'
rather than just `impure'.

Also you should have some comments here explaining what this type is,
and whether it is a handle or a state type.

> +:- typeclass impure(S) where [
> +		% Did an error occur processing the stream?
> +	semipure pred impure__is_error(S::ui, string::out) is semidet
> +].

Hmm... you haven't provided any way to reset the error indicator,
have you?  It might be a good idea to provide something like ANSI C's
clearerr() function.  Otherwise there is no way to recover once
an error has occurred on a stream, except by closing the stream
and re-opening it.

> +	% Define read/write one character whose signatures obey the
> +	% constraints of the stream type classes.
> +
> +:- pred pure_read_char(stream__result(char),
> +		impure(S), impure(S)) <= impure__input(S).
> +:- mode pure_read_char(out, di, uo) is det.
> +
> +:- pred pure_write_char(char, impure(S), impure(S)) <= impure__output(S).
> +:- mode pure_write_char(in, di, uo) is det.

The comment here doesn't make grammatical sense, and doesn't really
explain what these procedures do.

> +% File: stream.m.
> +% Main author: petdr
> +% Stability: exceptionally low.
> +%
> +% This file provides a typeclass for defining streams in Mercury.
> +% It is completely pure and you are encouraged to use it to write
> +% streams in Mercury.  If however you are a library implementor then you
> +% may want to look at the impure interface described in stream.impure.m

I suggest s/want to/also want to/

> +	% Given a handle to a stream construct a unique stream object
> +	% which can be used to do IO on the stream.
> +	% This object initialises its state from the io__state.
> +:- pred stream__init(S::in, stream(S)::uo, io__state::di, io__state::uo) is det.
> +
> +	% Tie the stream back into the io__state.  Omitting this call
> +	% may allow the compiler to optimize away the stream I/O.
> +:- pred stream__destroy(stream(S)::di, io__state::di, io__state::uo) is det.

There are some nasty semantic problems with these routines.
At very least, the documentation is incomplete.
More likely the interface declared here won't work (i.e. can't be
implemented in a way that is logically sound and that matches
their intended semantics).

The comment for `stream__destroy' explains its purpose, but
doesn't describe its effect.  What are the declarative and
operational semantics of `stream__destroy'?

In the case where there is no call to stream__destroy, the logically
correct behaviour is for there to be no side effects, except those caused
by stream__init, and currently stream__init is documented (and
implemented) in a way that seems to mean that it has no side effects.
So the comment "Omitting this call may allow the compiler to optimize
away the stream I/O" is not really correct -- in fact omitting this
call may result in the implementation *inserting* I/O side effects
that are not in the program's declarative semantics.

One alternative would be to say that each call to stream__init
must be matched by a corresponding call to stream__destroy
and that if the call to stream__destroy is omitted, then the
behaviour is undefined.  After all, we already document that
you get undefined behaviour in the case where you try to write
to a stream that has already been closed, so why not here too?

However, I wouldn't consider that acceptable.  There's a fundamental
difference between the two cases.

For writes to closed files, an implementation can easily ensure that
the behaviour is defined, e.g. throwing an exception, it's just that
doing so comes at a performance or portability cost, so we don't want
to require implementations to do so.  For the same reason, we don't
require array bounds checks for array__unsafe_lookup.  For both writes
to closed files and array__unsafe_lookup, the performance cost of
doing the checks is only a constant, so implementations which need
safety can easily do the checks.  But it may be an important constant,
so implementations which need maximum performance are not required to
do the checks.

In contrast, with this case, there is no easy way for a compiler to
ensure soundness.  Disabling the optimizations in question would
only ensure *unsoundness*.  In general, if a compiler using
separate compilation sees

	foo(S0, S) -->
		stream__init(S0, S1),
		{ write_string("hello\n", S1, S2) },
		bar(S2, S).

then without seeing the body of bar/4, it can't know whether
bar/4 calls stream__terminate, and so it can't know whether
`write_string/3' should have any side effects.

So, in summary, specifying that certain things result in undefined
behaviour is IMHO OK only so long as it is feasible for an
implementation to check whether those things occur and to provide a
sound and well-defined behaviour in those cases, preferably with only
constant-time overhead for the checking.  I don't think that is the
case here with missing calls to stream__terminate.

A related problem is with the ordering of I/O.
The procedures here are all declared `det', but there's
nothing which ensures that you don't try to open two
different output streams to the same file or device
and try to write to them at the same time.  If you do,
then the order of the output will be unspecified,
and so you won't necessarily get the same result each time.
Likewise, there is a problem with reading from a file
at the same time as writing to it.

There's two possible solutions that I can see.
One is to ensure that files/devices get locked when you create
input or output streams to them, and attempts to access a locked file 
from more than one stream at the same time (apart from multiple input
streams, which should be allowed) result in I/O errors.
The other is to make some of the procedures concerned `cc_multi'
rather than `det'.

> +% Predicates which require an output stream.
> +% On failure these predicates will throw an stream_error exception.
> +
> +	% Write one char to the output stream.
> +:- pred stream__write_char(char::in,
> +		stream(S)::di, stream(S)::uo) is det <= stream__output(S).

You document these predicates as throwing (only) stream_error exceptions
on failure, but there is nothing in the documentation for
stream__write_character (which stream__write_char calls)
that says it should only throw stream_error exceptions.

> +:- implementation.
> +
> +:- import_module exception, string.
> +
> +:- type stream(S)
> +	--->	stream(
> +			S,		% the stream state
> +			list(char)	% Putback characters
> +		).
> +
> +stream__init(S, stream(Stream, [])) --> { copy(S, Stream) }.
> +stream__init(S, stream(Stream, [])) :- copy(S, Stream).
> +
> +stream__destroy(_S) --> [].
> +stream__destroy(stream(S, _), S).

This code for stream__destroy could be inlined or specialized and
then optimized away, and then you'd have the same semantic problems
that you get when there's no call to stream__destroy.

> +stream__write_string(String) -->
> +	{ string__to_char_list(String, CharList) },
> +	list__foldl(stream__write_char, CharList).

Why not use `string__foldl'?

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list