[m-rev.] For review: Java implementation of IO library

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Jan 8 20:30:04 AEDT 2004


On 08-Jan-2004, James Goddard <goddardjames at yahoo.com> wrote:
> Implement some library procedures for the Java back end.
> 
> library/io.m:
> 	Implement the following procedures:
...
> 		io__read_char_code/4
> 		io__read_byte_val/4
> 		io__putback_char/4
> 		io__putback_byte/4
> 		io__write_byte/3
> 		io__write_bytes/3
...
> 		io__write_char/4

Please put in some comments somewhere about the representation of
text files.  The representation of binary files is pretty straight-forward,
but for text files it is not obvious, because Java characters are (16-bit)
Unicode, and text files are sequences of characters, and there's several
different commonly-used mappings between sequences of 16-bit Unicode
characters and sequences of bytes, and also several different commonly-used
representations for the newline character.

See <http://java.sun.com/docs/books/tutorial/i18n/text/convertintro.html>
for more about Unicode encoding/decoding in Java.

> +		public int getc() {
> +			int c;
> +			if (mode == OUTPUT) {
> +				throw new java.lang.RuntimeException(
> +					""Attempted to read output stream"");
> +			}
> +			if (pushback.empty()) {
> +				try {
> +					if (input != null) {
> +						c = input.read();
> +					} else {
> +						c = randomaccess.read();
> +					}

It's not clear whether this function is supposed to get a single byte,
or a single Unicode character.  I suppose it gets a single byte, but in
that case the name getc() is potentially a bit misleading, so there should
at very least be some comments here explaining what this is supposed to do.
Also it would help to have a comment about what this does when it reaches
end-of-file.

> +		public void ungetc(int c) {
> +			if (mode == OUTPUT) {
> +				throw new java.lang.RuntimeException(
> +				""Attempted to unget char to output stream"");
> +			}
> +			if (c == '\\n') {
> +				line_number--;
> +			}
> +
> +			pushback.push(new Integer(c));
> +		}

Again, it's not clear whether this function is supposed to put back a single
byte, or a single Unicode character (which might not fit in a single byte).

If it is supposed to put back a single byte, then you should check that the
value of `c' is in range (or document that it is the responsibility of the
caller to check this).

> +		public void putc(int c) {

Similar issues here.

> +		public void write(java.lang.String s) {
> +			if (mode == INPUT) {
> +				throw new java.lang.RuntimeException(
> +					""Attempted to write to input stream"");
> +			}
> +			for (int i = 0; i < s.length(); i++) {
> +				if (s.charAt(i) == '\\n') {
> +					line_number++;
> +				}
> +			}
> +			
> +			try {
> +				if (output != null) {
> +					output.write(s.getBytes());
> +				} else {
> +					randomaccess.write(s.getBytes());

The documentation for getBytes says "The behavior of this method when
this string cannot be encoded in the default charset is unspecified."
That is not good.  At very least there should be an XXX here.
But it would be better to throw an exception if the string cannot
be encoded.

> @@ -5387,6 +5877,35 @@
...
> +:- pragma foreign_proc("Java", 
> +	io__read_char_code(File::in, CharCode::out, _IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure],
> +"
> +	CharCode = File.getc();
> +").

I don't think that will do the right thing on Windows.
It just reads a single byte, but on Windows a newline may
be represented as two bytes "\r\n".

> +:- pragma foreign_proc("Java",
> +	io__putback_char(File::in, Character::in, _IO0::di, _IO::uo),
> +	[may_call_mercury, promise_pure],
> +"
> +	File.ungetc(Character);
> +").

Similar issues here with representation of Unicode characters and newlines.

> +:- pragma foreign_proc("Java",
> +	io__write_bytes(Message::in, _IO0::di, _IO::uo),
> +	[may_call_mercury, promise_pure, thread_safe, tabled_for_io],
> +"{
> +	mercury_current_binary_output.write(Message);
> +}").

And here.

> +:- pragma foreign_proc("Java",
> +	io__write_string(Stream::in, Message::in, _IO0::di, _IO::uo),
> +	[may_call_mercury, promise_pure, thread_safe, tabled_for_io], 
> +"
> +	Stream.write(Message);
> +").

And here.

> +:- pragma foreign_proc("Java",
> +	io__write_char(Stream::in, Character::in, _IO0::di, _IO::uo),
> +	[may_call_mercury, promise_pure, thread_safe, tabled_for_io],
> +"
> +	Stream.putc(Character);
> +").

And here.

> +:- pragma foreign_proc("Java",
> +	io__do_open_text(FileName::in, Mode::in, ResultCode::out,
> +		StreamId::out, Stream::out, _IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	try {
> +		Stream = mercury_open(FileName, Mode);
> +		StreamId = Stream.id;
> +		ResultCode = 0;
> +	} catch (java.lang.Exception e) {
> +		Stream = null;
> +		StreamId = -1;
> +		ResultCode = -1;
> +	}
> +").

The exception here should not be thrown away,
it may contain useful information.

I think it needs to be assigned to MR_io_exception,
so that io__get_system_error will report it properly.

> +:- pragma foreign_proc("Java",
> +	io__do_open_binary(FileName::in, Mode::in, ResultCode::out,
> +		StreamId::out, Stream::out, _IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	try {
> +		Stream = mercury_open(FileName, Mode);
> +		StreamId = Stream.id;
> +		ResultCode = 0;
> +	} catch (java.lang.Exception e) {
> +		Stream = null;
> +		StreamId = -1;
> +		ResultCode = -1;
> +	}
> +").

Likewise.

> @@ -6426,6 +7238,76 @@
>  
> +:- pragma foreign_proc("Java",
> +	io__progname(_Default::in, PrognameOut::out, _IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	PrognameOut = ""java "" + mercury.runtime.JavaInternal.progname;
> +").

I don't think that should prepend "java ".

> +:- pragma foreign_proc("Java",
> +	io__call_system_code(Command::in, Status::out, Msg::out,
> +			_IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure, tabled_for_io],
> +"
> +	try {
> +		java.lang.Process command = java.lang.Runtime.getRuntime().
> +				exec(Command);
> +		command.waitFor();
> +		Status	= command.exitValue();
> +		Msg	= null;
> +	} catch (java.lang.Exception e) {
> +		Status	= 127;
> +		Msg	= e.getMessage();
> +	}
> +").

Does this do the right thing with regard to stdin/stdout/stderr
for the invoked process?

Please make sure that there is a test case for that, e.g. a test that
invokes "sort" as a subcommand (I suggest that because there's a "sort"
command on both Windows and on Posix systems), and has a .inp file
containing some input to sort and a corresponding .exp file containing
the expected sorted output.

> +:- pragma foreign_proc("Java",
> +	io__getenv(Var::in, Value::out),
> +	[will_not_call_mercury, tabled_for_io],
> +"
> +	// Note that this approach only works for environmental variables that
> +	// are recognized by Java and hava a special format.
> +	// (eg os.name, user.timezone etc)
> +	// To add custom environmental variables, they must be set at the
> +	// command line with java's -D option
> +	// (ie java -DENV_VARIABLE_NAME=$ENV_VARIABLE_NAME ClassName)
> +	// XXX Perhaps a better approach would be to determine the OS at
> +	// runtime and then Runtime.exec() the equivalent of 'env'?
> +
> +	try {
> +		Value = java.lang.System.getProperty(Var);
> +		succeeded = (Value != null);
> +	} catch (java.lang.Exception e) {
> +		Value = null;
> +		succeeded = false;
> +	}
> +").

Ugh.  Java sucks.

I guess this will do for now.

> +:- pragma foreign_proc("Java",
> +        io__make_temp(FileName::out, _IO0::di, _IO::uo),
> +        [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	try {
> +		FileName = java.io.File.createTempFile(""mercury"", null).
> +				getName();
> +	} catch (java.lang.Exception e) {
> +		throw new RuntimeException(e.getMessage());
> +	}
> +").

The documentation for io__make_temp should be updated to say that the
directory used is system-dependent, and should say how this is chosen
for Java.

I suggest using "mtmp" rather than "mercury", for consistency with the
C back-end.

> +:- pragma foreign_proc("Java",
> +        io__make_temp(Dir::in, Prefix::in, FileName::out, _IO0::di, _IO::uo),
> +        [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	try {
> +		if (Prefix.length() < 3) {
> +			Prefix = Prefix + ""xxx"";

I suggest "tmp" rather than "xxx".
Also, it would help to have a comment here

			// Java's createTempFile requires length >= 3

> +		} else if (Prefix.length() > 5) {
> +			Prefix = Prefix.substring(0, 5);

Likewise a comment here would help, e.g.

			// The documentation for io__make_temp says that
			// we should only use the first five characters
			// of Prefix.

> +:- pragma foreign_proc("Java",
> +	io__remove_file_2(FileName::in, RetVal::out, RetStr::out,
> +		_IO0::di, _IO::uo),
> +	[will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
> +"
> +	try {
> +		java.io.File file = new java.io.File(FileName);
> +
> +		if (file.exists()) {
> +			if (file.delete()) {
> +				RetVal = 0;
> +				RetStr = """";
> +			} else {
> +				RetVal = -1;
> +				RetStr = ""remove_file failed"";
> +			}
> +		} else {
> +			RetVal = -1;
> +			RetStr = ""remove failed: No such file or directory"";

s/remove/remove_file/

Is it really necessary to call file.exists()?
Why not just call file.delete()?
(If there is a good reason for that, please document it.)


The rest of this change looks fine.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list