[m-rev.] For review: Java implementation of IO library

James Goddard goddardjames at yahoo.com
Wed Jan 14 14:08:42 AEDT 2004


> Please put in some comments somewhere about the representation of
> text files.  The representation of binary files is pretty straight-forward,
> but for text files it is not obvious, because Java characters are (16-bit)
> Unicode, and text files are sequences of characters, and there's several
> different commonly-used mappings between sequences of 16-bit Unicode
> characters and sequences of bytes, and also several different commonly-used
> representations for the newline character.

Ok, all text file handling is now done via Readers/Writers, which do all this
stuff automatically.  I've got them set to use the default charset
encoding/decoding atm, so that the files produced will be compatible.

> > +		public int getc() {
> > +			int c;

> 
> It's not clear whether this function is supposed to get a single byte,
> or a single Unicode character.  I suppose it gets a single byte, but in
> that case the name getc() is potentially a bit misleading, so there should
> at very least be some comments here explaining what this is supposed to do.
> Also it would help to have a comment about what this does when it reaches
> end-of-file.

I've added that comment, and split the function into two methods: read_char()
and read_byte() which are for text and binary files respectively.

> 
> > +		public void ungetc(int c) {
> 
> Again, it's not clear whether this function is supposed to put back a single
> byte, or a single Unicode character (which might not fit in a single byte).
> 
> If it is supposed to put back a single byte, then you should check that the
> value of `c' is in range (or document that it is the responsibility of the
> caller to check this).

ungetc() takes an int, which is enough room to hold a Unicode character, so
there's no need to check this.


> > +				} else {
> > +					randomaccess.write(s.getBytes());
> 
> The documentation for getBytes says "The behavior of this method when
> this string cannot be encoded in the default charset is unspecified."
> That is not good.  At very least there should be an XXX here.
> But it would be better to throw an exception if the string cannot
> be encoded.

The new implementation only uses getBytes() for binary files, which should be
ok since we don't expect people to call write(String) on a binary file.  I've
added that XXX just in case.  Would it be better just to throw an exception if
they try to write text to a binary file?

> > +	} catch (java.lang.Exception e) {
> > +		Stream = null;
> > +		StreamId = -1;
> > +		ResultCode = -1;
> > +	}
> > +").
> 
> The exception here should not be thrown away,
> it may contain useful information.
> 
> I think it needs to be assigned to MR_io_exception,
> so that io__get_system_error will report it properly.

Done.


> > +:- pragma foreign_proc("Java",
> > +	io__progname(_Default::in, PrognameOut::out, _IO0::di, _IO::uo),
> > +	[will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> > +"
> > +	PrognameOut = ""java "" + mercury.runtime.JavaInternal.progname;
> > +").
> 
> I don't think that should prepend "java ".

Ok, it's gone.

> > +:- pragma foreign_proc("Java",
> > +	io__call_system_code(Command::in, Status::out, Msg::out,
> > +			_IO0::di, _IO::uo),
> > +	[will_not_call_mercury, promise_pure, tabled_for_io],
> > +"
> > +	try {
> > +		java.lang.Process command = java.lang.Runtime.getRuntime().
> > +				exec(Command);
> 
> Does this do the right thing with regard to stdin/stdout/stderr
> for the invoked process?
>

It does now!

> Please make sure that there is a test case for that, e.g. a test that
> invokes "sort" as a subcommand (I suggest that because there's a "sort"
> command on both Windows and on Posix systems), and has a .inp file
> containing some input to sort and a corresponding .exp file containing
> the expected sorted output.

Done.

> The documentation for io__make_temp should be updated to say that the
> directory used is system-dependent, and should say how this is chosen
> for Java.

Done.  I've also enforced the use of the environmental variable java.io.tmpdir.

> I suggest using "mtmp" rather than "mercury", for consistency with the
> C back-end.

Done.

> > +			Prefix = Prefix + ""xxx"";
>
> I suggest "tmp" rather than "xxx".
> Also, it would help to have a comment here
>
>			// Java's createTempFile requires length >= 3
>
> > +		} else if (Prefix.length() > 5) {
> > +			Prefix = Prefix.substring(0, 5);
>
> Likewise a comment here would help

Done and done.

> > +		if (file.exists()) {
> > +			if (file.delete()) {
> > +				RetVal = 0;
> > +				RetStr = """";
> > +			} else {
> > +				RetVal = -1;
> > +				RetStr = ""remove_file failed"";
> > +			}
> > +		} else {
> > +			RetVal = -1;
> > +			RetStr = ""remove failed: No such file or directory"";
>
> s/remove/remove_file/
> 
> Is it really necessary to call file.exists()?

No, not really.  I've removed that check.

I'll post a relative diff soon.

James

http://personals.yahoo.com.au - Yahoo! Personals
New people, new possibilities. FREE for a limited time.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list