[m-rev.] io.m latest full diff

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Jan 22 13:56:08 AEDT 2004


On 22-Jan-2004, James Goddard <goddardjames at yahoo.com> wrote:
>  --- Fergus Henderson <fjh at cs.mu.OZ.AU> wrote: 
> > > +		/*
> > > +		** read_char():
> > > +		**	Reads one character in from a text input file using the
> > > +		**	default charset decoding.  Returns -1 at end of file.
> > > +		*/
> > > +		public int read_char() {
> > > +			int c;
> > > +			if (input == null) {
> > > +				throw new java.lang.RuntimeException(
> > > +					"read_char_code may only be called" +
> > > +					" on text input streams");
> > 
> > This code here, and quite a bit of code elsewhere, is going to break if
> > text mode operations are attempted on a binary file.  The implementations
> > of io__read_binary and io__write_binary do exactly that.
> > 
> > Fixing this problem would require a major redesign, so I suggest that
> > for now, you just document the problem with XXX comments, both at the
> > definition of the MercuryFile structure and also at the definition
> > of io__read_binary/io__write_binary.
> 
> I think that io__write_binary will actually succeed..

Well, having io__write_binary succeed is not much help if
the result can't be read in with io__read_binary.

> The put() and write()
> functions work for both text and binary streams, using the default text
> encoding for text streams and stripping away all but the lower 8 bits for each
> character for binary streams.  Whether this is the behaviour we want in this
> case is another matter.

Stripping away all but the lower 8 bits would have two problems:
(1) it would not be possible to read/write terms that use Unicode
characters which do not fit into 8 bits;
(2) even for those that do fit into 8 bits, the encoding of those
characters might be different from the platform's native encoding
for "char", so the resulting binary files would not necessarily be
interoperable with Mercury code compiled for the C back-end.
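
To illustrate both problems, here is a minimal Java sketch.  The class
and the helpers truncateToLowBytes() and widen() are hypothetical names
made up for this example, not code from the patch, and UTF-8 is only an
assumed stand-in for "the platform's native encoding".

```java
import java.nio.charset.StandardCharsets;

public class LowByteTruncation {
    // Hypothetical helper: keep only the lower 8 bits of each char,
    // mimicking the binary-stream behaviour described above.
    public static byte[] truncateToLowBytes(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) (s.charAt(i) & 0xFF);
        }
        return out;
    }

    // Hypothetical helper: widen each byte back to a char on input.
    public static String widen(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Problem (1): U+03BB does not fit into 8 bits, so the
        // round trip does not give back the original character.
        String lambda = "\u03BB";
        boolean survives = widen(truncateToLowBytes(lambda)).equals(lambda);
        System.out.println("U+03BB survives round trip: " + survives);

        // Problem (2): U+00E9 fits into 8 bits, but the single byte
        // written (0xE9) differs from what an encoder for another
        // charset (here UTF-8, as an assumption) would produce.
        String eAcute = "\u00E9";
        byte[] low = truncateToLowBytes(eAcute);
        byte[] utf8 = eAcute.getBytes(StandardCharsets.UTF_8);
        System.out.println("low-byte length: " + low.length
            + ", UTF-8 length: " + utf8.length);
    }
}
```

The first case loses information outright; the second keeps the
information but produces bytes that a reader assuming a different
encoding would misinterpret.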

> I could merge read_char() and read_byte() into a single read() function, so
> that you could actually call read_char_code on binary streams, (and
> read_byte_val on text streams) and it would behave in a symmetrical manner. 
> Would this be appropriate and/or solve the problem?

I'm not sure.

Actually, thinking about it some more, I think I have found a bug in
the current implementation of io__read_binary: it doesn't read in the
trailing newline written by io__write_binary (or rather, it reads it in
but then calls putback_char to put it back).
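
To make the symptom concrete, here is a rough Java sketch of the same
situation.  It is not the Mercury implementation: readTerm() is a
hypothetical stand-in for io__read that reads up to the full stop and
pushes back the character after it, and readBinary() is a hypothetical
wrapper that additionally consumes the newline.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

public class TrailingNewline {
    // Hypothetical stand-in for io__read: reads characters up to and
    // including the full stop, then looks at the next character and
    // pushes it back.  Returns null if no terminated term is found.
    public static String readTerm(PushbackReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '.') {
            sb.append((char) c);
        }
        if (c == -1) {
            return null;
        }
        int next = in.read();
        if (next != -1) {
            in.unread(next);    // the newline stays in the stream
        }
        return sb.toString();
    }

    // Hypothetical wrapper: after a successful read, also consume the
    // newline that the writer put after the full stop.
    public static String readBinary(PushbackReader in) throws IOException {
        String term = readTerm(in);
        if (term == null) {
            return null;
        }
        if (in.read() != '\n') {
            throw new IOException("read_binary: missing newline");
        }
        return term;
    }

    public static void main(String[] args) throws IOException {
        // Without the extra step, the newline is still pending.
        PushbackReader a = new PushbackReader(new StringReader("foo.\nX"));
        readTerm(a);
        System.out.println(a.read() == '\n');   // prints "true"

        // With the extra step, the next read sees the following data.
        PushbackReader b = new PushbackReader(new StringReader("foo.\nX"));
        readBinary(b);
        System.out.println((char) b.read());    // prints "X"
    }
}
```

The leftover newline matters whenever whatever follows the term on the
stream is read by something that does not skip leading whitespace.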

Hence the following patch.

----------

Estimated hours taken: 0.5
Branches: main

library/io.m:
	Fix a bug in io__read_binary, where it was leaving the trailing
	newline written by io__write_binary unread.

Index: io.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/io.m,v
retrieving revision 1.315
diff -u -d -u -r1.315 io.m
--- io.m	2 Dec 2003 10:02:05 -0000	1.315
+++ io.m	22 Jan 2004 02:55:14 -0000
@@ -3914,14 +3914,26 @@
 	% (not really binary!)
 	io__binary_input_stream(Stream),
 	io__read(Stream, ReadResult),
-	{ io__convert_read_result(ReadResult, Result) }.
-
-:- pred io__convert_read_result(io__read_result(T), io__result(T)).
-:- mode io__convert_read_result(in, out) is det.
-
-io__convert_read_result(ok(T), ok(T)).
-io__convert_read_result(eof, eof).
-io__convert_read_result(error(Error, _Line), error(io_error(Error))).
+	(
+		{ ReadResult = ok(T) },
+		% We've read the newline and the trailing full stop.
+		% Now skip the newline after the full stop.
+		io__read_char(Stream, NewLineRes),
+		( { NewLineRes = error(Error) } ->
+			{ Result = error(Error) }
+		; { NewLineRes = ok('\n') } ->
+			{ Result = ok(T) }
+		;
+			{ Result = error(io_error(
+				"io.read_binary: missing newline")) }
+		)
+	;
+		{ ReadResult = eof },
+		{ Result = eof }
+	;
+		{ ReadResult = error(ErrorMsg, _Line) },
+		{ Result = error(io_error(ErrorMsg)) }
+	).
 
 %-----------------------------------------------------------------------------%
 
-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.