[m-dev.] for review: stream I/O

Peter Ross petdr at miscrit.be
Mon Oct 2 20:12:53 AEDT 2000


On Mon, Oct 02, 2000 at 07:08:20PM +1100, Fergus Henderson wrote:
> On 01-Oct-2000, Peter Ross <petdr at miscrit.be> wrote:
> > Implement generic stream based I/O.
> 
> We're making progress, but this still needs work.
> See comments below.
> 
> A couple of features that I notice are present in the current I/O
> library but not in this library are support for associating
> human-readable names (e.g.  file names) with streams, for use in error
> messages, and counting line numbers for input streams.  The latter
> feature could be provided by a new stream type (actually a stream type
> constructor/combinator: given a stream type, it would give you back a
> new stream type that was the same as the original stream but that also
> recorded the line numbers).  This would mean that you only pay for it
> if you use it.  But for the former feature, I think you need to
> provide some support in the base stream library.
> 
I intend to add these features in when the interface settles down.

> > stream.m:
> >     Define the stream typeclass and provide some implementations of
> >     routines which work on streams.
> 
> There's a design problem with stream__init/stream__destroy --
> see below.
> 
> > stream.impure.m:
> >     Define an impure interface which is useful for interfacing streams
> >     provided by the foreign function interface.
> 
> I would much prefer to have a pure stream.lowlevel.m that provides
> everything that stream.impure.m provides except the impurity.
> I wouldn't mind having stream.impure.m so long as stream.lowlevel.m
> is there, but I object to having stream.impure.m in the public
> interface without stream.lowlevel.m being there, since it would
> encourage people to make unnecessary use of impurity.
> 
> > stdio.m:
> >     Define an instance of the stream typeclass which provides a
> >     stdin/stdout stream.
> ...
> > library.m:
> >     Include the stream and stdio modules.
> 
> The stdio module is an interesting application of streams, but as is
> IMHO it isn't complete enough (and hence not stable enough) to be in
> the standard library.  In particular, unlike C's <stdio.h>, it doesn't
> provide access to stderr or to file streams.  Adding appropriate
> support for those is IMHO likely to result in interface changes.
> 
File streams should be a seperate type to the stdio streams.  You will
want a type rofile which is only a member of stream__input and a type
wofile which is only a member of stream__output.

Regards handling stderr I haven't thought enough about it to see if
there is a neat way to do it.  Perhaps there should be a seperate stderr
stream, I don't know yet.

> > string.m:
> >     Define an instance of the stream typeclass which works on strings.
> 
> I agree with Simon Taylor's comments.
> 
Not a problem, I just put it in there as an example.

> > +:- module stdio.
> > +
> > +:- interface.
> > +:- import_module stream, stream__impure.
> > +:- import_module io.
> > +
> > +:- type stdio.
> > +
> > +	% XXX This should be an abstract type, but currently you aren't
> > +	% allowed to have instance declarations that refer to abstract
> > +	% types which are defined as equivalence types, and this needs
> > +	% to be an equivalence type so that we can use the typeclass
> > +	% interface defined in `stream.impure.m'.
> > +:- type stdio_stream == impure(stdio).
> 
> Why not define it as
> 
> 	:- type stdio_stream ---> stdio_stream(impure(stdio)).
> 
> and then implement the instance definitions as
> 
> 
> > +	% Define the high-level operations using the generic operations
> > +	% from the stream__impure module.
> > +:- instance stream(stdio_stream) where [].
> > +:- instance stream__input(stdio_stream) where [
> > +	pred(read_character/3) is pure_read_char
> > +].
> > +
> > +:- instance stream__output(stdio_stream) where [
> > +	pred(write_character/3) is pure_write_char
> > +].
> 
> 	:- instance stream(stdio_stream) where [].
> 	:- instance stream__input(stdio_stream) where [
> 		read_character(C, stdio_stream(S0), stdio_stream(S)) :-
> 			pure_read_char(C, S0, S)
> 	].
> 
> 	:- instance stream__output(stdio_stream) where [
> 		write_character(C, stdio_stream(S0), stdio_stream(S)) :-
> 			pure_write_char(C, S0, S)
> 	].
> 
> ?
> 
> This is only two extra lines of code, and it saves you five lines of
> XXX comments ;-)
> 
Good point.

> > +% File: stream.impure.m.
> > +% Main author: petdr
> > +% Stability: exceptionally low.
> > +%
> > +% An impure interface for describing streams.
> > +%
> > +% This file provides a typeclass for people who want to map streams
> > +% to a foreign language binding while doing the minimum amount of work.  In
> > +% particular you need to write much less foreign language code, since
> > +% you only need to implement a few impure predicates with a well defined
> > +% interface.
> > +%
> > +% This file provides throwing exceptions, grabbing error messages,
> > +% results packaged into ok/error/eof, and turning C style handle based
> 
> s/provides/provides a layer which handles/
> s/results packaged/packaging results/
> 
> > +:- module stream__impure.
> > +
> > +:- interface.
> > +
> > +:- import_module char.
> > +
> > +:- type impure(S) ---> impure(S).
> 
> I think it would be clearer to name this `impure_stream'
> rather than just `impure'.
> 
> Also you should have some comments here explaining what this type is,
> and whether it is a handle or a state type.
> 
> > +:- typeclass impure(S) where [
> > +		% Did an error occur processing the stream?
> > +	semipure pred impure__is_error(S::ui, string::out) is semidet
> > +].
> 
> Hmm... you haven't provided any way to reset the error indicator,
> have you?  It might be a good idea to provide something like ANSI C's
> clearerr() function.  Otherwise there is no way to recover once
> an error has occurred on a stream, except by closing the stream
> and re-opening it.
> 
Document that the is_error predicate is responsible for clearing the
error status of a stream once it has reported an error?

> > +	% Define read/write one character whose signatures obey the
> > +	% constraints of the stream type classes.
> > +
> > +:- pred pure_read_char(stream__result(char),
> > +		impure(S), impure(S)) <= impure__input(S).
> > +:- mode pure_read_char(out, di, uo) is det.
> > +
> > +:- pred pure_write_char(char, impure(S), impure(S)) <= impure__output(S).
> > +:- mode pure_write_char(in, di, uo) is det.
> 
> The comment here doesn't make grammatical sense, and doesn't really
> explain what these procedures do.
> 

Read one character of input.  This read character implementation can be
used in instance declarations for the stream__input type class.

Write one character of input.  This write character implementation can be
used in instance declarations for the stream__output type class.


> > +% File: stream.m.
> > +% Main author: petdr
> > +% Stability: exceptionally low.
> > +%
> > +% This file provides a typeclass for defining streams in Mercury.
> > +% It is completely pure and you are encouraged to use it to write
> > +% streams in Mercury.  If however you are a library implementor then you
> > +% may want to look at the impure interface described in stream.impure.m
> 
> I suggest s/want to/also want to/
> 
> > +	% Given a handle to a stream construct a unique stream object
> > +	% which can be used to do IO on the stream.
> > +	% This object initialises its state from the io__state.
> > +:- pred stream__init(S::in, stream(S)::uo, io__state::di, io__state::uo) is det.
> > +
> > +	% Tie the stream back into the io__state.  Omitting this call
> > +	% may allow the compiler to optimize away the stream I/O.
> > +:- pred stream__destroy(stream(S)::di, io__state::di, io__state::uo) is det.
> 
> There are some nasty semantic problems with these routines.
> At very least, the documentation is incomplete.
> More likely the interface declared here won't work (i.e. can't be
> implemented in a way that is logically sound and that matches
> their intended semantics).
> 
> The comment for `stream__destroy' explains its purpose, but
> doesn't describe its effect.  What are the declarative and
> operational semantics of `stream__destroy'?
> 
> In the case where there is no call to stream__destroy, the logically
> correct behaviour is for there to be no side effects, except those caused
> by stream__init, and currently stream__init is documented (and
> implemented) in a way that seems to mean that it has no side effects.
> So the comment "Omitting this call may allow the compiler to optimize
> away the stream I/O" is not really correct -- in fact omitting this
> call may result in the implementation *inserting* I/O side effects
> that are not in the program's declarative semantics.
> 
> One alternative would be to say that each call to stream__init
> must be matched by a corresponding call to stream__destroy
> and that if the call to stream__destroy is omitted, then the
> behaviour is undefined.  After all, we already document that
> you get undefined behaviour in the case where you try to write
> to a stream that has already been closed, so why not here too?
> 
> However, I wouldn't consider that acceptable.  There's a fundamental
> difference between the two cases.
> 
> For writes to closed files, an implementation can easily ensure that
> the behaviour is defined, e.g. throwing an exception, it's just that
> doing so comes at a performance or portability cost, so we don't want
> to require implementations to do so.  For the same reason, we don't
> require array bounds checks for array__unsafe_lookup.  For both writes
> to closed files and array__unsafe_lookup, the performance cost of
> doing the checks is only a constant, so implementations which need
> safety can easily do the checks.  But it may be an important constant,
> so implementations which need maximum performance are not required to
> do the checks.
> 
> In contrast, with this case, there is no easy way for a compiler to
> ensure soundness.  Disabling the optimizations in question would
> only ensure *unsoundness*.  In general, if a compiler using
> separate compilation sees
> 
> 	foo(S0, S) -->
> 		stream__init(S0, S1),
> 		{ write_string("hello\n", S1, S2) },
> 		bar(S2, S).
> 
> then without seeing the body of bar/4, it can't know whether
> bar/4 calls stream__terminate, and so it can't know whether
> `write_string/3' should have any side effects.
> 
> So, in summary, specifying that certain things result in undefined
> behaviour is IMHO OK only so long as it is feasible for an
> implementation to check whether those things occur and to provide a
> sound and well-defined behaviour in those cases, preferably with only
> constant-time overhead for the checking.  I don't think that is the
> case here with missing calls to stream__terminate.
> 
> A related problem is with the ordering of I/O.
> The procedures here are all declared `det', but there's
> nothing which ensures that you don't try to open two
> different output streams to the same file or device
> and try to write to them at the same time.  If you do,
> then the order of the output will be unspecified,
> and so you won't necessarily get the same result each time.
> Likewise, there is a problem with reading from a file
> at the same time as writing to it.
> 
> There's two possible solutions that I can see.
> One is to ensure that files/devices get locked when you create
> input or output streams to them, and attempts to access a locked file 
> from more than one stream at the same time (apart from multiple input
> streams, which should be allowed) result in I/O errors.
> The other is to make some of the procedures concerned `cc_multi'
> rather than `det'.
> 
> > +% Predicates which require an output stream.
> > +% On failure these predicates will throw an stream_error exception.
> > +
> > +	% Write one char to the output stream.
> > +:- pred stream__write_char(char::in,
> > +		stream(S)::di, stream(S)::uo) is det <= stream__output(S).
> 
> You document these predicates as throwing (only) stream_error exceptions
> on failure, but there is nothing in the documentation for
> stream__write_character (which stream__write_char calls)
> that says it should only throw stream_error exceptions.
> 
> > +:- implementation.
> > +
> > +:- import_module exception, string.
> > +
> > +:- type stream(S)
> > +	--->	stream(
> > +			S,		% the stream state
> > +			list(char)	% Putback characters
> > +		).
> > +
> > +stream__init(S, stream(Stream, [])) --> { copy(S, Stream) }.
> > +stream__init(S, stream(Stream, [])) :- copy(S, Stream).
> > +
> > +stream__destroy(_S) --> [].
> > +stream__destroy(stream(S, _), S).
> 
> This code for stream__destroy could be inlined or specialized and
> then optimized away, and then you'd have the same semantic problems
> that you get when there's no call to stream__destroy.
> 
> > +stream__write_string(String) -->
> > +	{ string__to_char_list(String, CharList) },
> > +	list__foldl(stream__write_char, CharList).
> 
> Why not use `string__foldl'?
> 
Because I like lists a lot better ;), I will change that.

Pete
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list