[m-rev.] For review: Java implementation of IO library

James Goddard goddardjames at yahoo.com
Fri Jan 9 14:12:54 AEDT 2004


> > +		((java.util.Date) Time).setTime(
> It would be nicer to avoid the casts:
> 
Done.

> Also, the call to setTime(0) is unnecessary; you might as well delete that.
> 
Done.

> According to the docs, lastModified() returns 0 if the file does not exist
> or if an I/O error occurs.  You should check the return value from
> lastModified(), and if it is zero, Status should be set to 0, not 1.
>
Done.

> > +	// The Java implementation can distinguish between regular files and
> > +	// directorys, and for everything else it just returns unknown.
> 
> s/ys/ies/
> 
Done.

> > +	/* stdin, stdout, stderr treated as input/output streams,
> > +	** all files to use random access r+w
> 
> That's OK as a quick hack to get things working, I guess, but I think it's
> not adequate in the long run.  If you open all files r+w, it will result
> in permission errors when trying to read a read-only file, won't it?
> So there should at least be an XXX comment there.
> 
I'm not sure what I was thinking when I did that, I think I was trying to make
it more flexible, but you make a good point.  I'll change it to only allow one
mode (read or write) on each stream.

> Please also check that it is possible to open files which are not seekable,
> e.g. pipes, and if it is not, then at least put an XXX comment.
> 
Sure, I'll look into it. Off the top of my head, Java doesn't know about pipes.
If you try to seek stdin, stdout, or stderr at the moment it throws an
exception, but the message needs to be more helpful.

> > +		public MR_MercuryFileStruct(java.lang.String file) {
> > +			try {
> > +				randomaccess	= new java.io.RandomAccessFile(
> > +						file, ""rws"");
> 
> Please put a comment here explaining the meaning of "rws".  Also, there
> should be a comment explaining why "rws" was chosen rather than "rw" or
> "rwd".
I've done this, but I'll be changing this implementation anyway. (See above)
Also, I'd like your opinion on this. 'rws' means write all data synchronously
to the underlying device. 'rwd' means only write the content, without updating
the file's metadata (which requires an extra IO operation). Which would you
prefer? The reason I use either of them is that RandomAccessFile does not have
an explicit flush() method, so adding this option is safer. (albeit slower)

> > +			} catch (java.lang.Exception e) {
> > +				throw new RuntimeException(e.getMessage());
> > +			}
> 
> Please put a comment here explaining why this is not just "throw e".
> 
Done.

> > +				if (append) {
> > +					while (true) {
> > +						int c = randomaccess.read();
> > +						if (c == '\\n') {
> > +							line_number++;
> > +						} else if (c == -1) {
> > +							break;
> > +						}
> > +					}
> > +				// XXX hopefully at this point we're at the
> > +				// end of the file, ready to append.
> > +				}
> 
> That looks inefficient.  Better to just seek to the end of file,
> i.e. seek(SEEK_END, 0).
> 
> The behaviour of the line number for files opened in append mode should
> be documented in the interface section of io.  The Java back-end should
> do the same thing as the C back-end in that regard.
> 
> I think for the C back-end we only count lines written after the file
> was opened.  This is more efficient than trying to count the number of
> lines in the file.
> 
Ok cool, I didn't realise it was that simple =)

> Why does putc() reset the pushback?
> (And likewise for write().)
> 
That was because I wasn't sure what the behaviour should be when combining
ungetc()s with putc()s.. I envisioned doing something horrible with my stack.
But I'll be changing this so that you can't read and write to the same stream
anyway.
This brings up another point - What happens if you do a series of ungetc()s
followed by a seek?  Should I:
- reset the stack? (my favourite)
- leave it in? (so the next getc gets the char you ungot at the previous seek
location)
- have multiple stacks at different points in the file?
- or not use a stack at all, but in fact insert all the ungot characters into
the file stream, pushing everything after them along (this would be horribly
slow and wouldn't work for read-only files, but *could* be done with
temporaries or big arrays or somesuch)

> Since the Mercury interface has an explicit flush operation,
> wouldn't it be better to use "rw" rather than "rws", and
> rely on the user calling flush appropriately if needed?
> 
> There's obviously an efficiency versus reliability trade-off here;
> but is there any reason why we should do it differently for Java
> than for C?
> 
The reason I did it this way is that RandomAccessFile does not support the
flush() method. The alternative would be to use FileOutputStream, which does
have flush(), but then seeks would not be possible, since FileOutputStream
relies on FileChannel for such operations, which was not introduced until Java
v1.4 and would result in messy compilation errors on older systems. 
RandomAccessFile has been around since JDK 1.0.


> > +		if (openmode.charAt(0) == 'r') {
> > +			return new MR_MercuryFileStruct(filename);
> > +		}
> > +		if (openmode.charAt(0) == 'w') {
> > +			return new MR_MercuryFileStruct(filename);
> > +		}
> 
> Those two operations have different semantics.
> If openmode = "r", then the operation should succeed so long as the
> file is readable, even if it is read-only.  If openmode = "w", 
> the operation should succeed so long as the file is writable,
> even if it is write-only.
> 
I'll change this, as discussed above.

James

http://personals.yahoo.com.au - Yahoo! Personals
New people, new possibilities. FREE for a limited time.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list