[m-rev.] for review: read Mercury files using read_file_as_string

Peter Wang novalazy at gmail.com
Sun Jul 28 11:35:07 AEST 2019


On Sat, 27 Jul 2019 17:47:23 +1000 (AEST), Julien Fischer <jfischer at opturion.com> wrote:
> 
> Hi Zoltan,
> 
> On Fri, 26 Jul 2019, Zoltan Somogyi wrote:
> 
> > This diff, which is for review by anyone, gets about a 2% speedup
> > on my laptop. However, it does raise two issues.
> >
> > The first is that after this change, the compiler cannot pass the invalid/null_char
> > test case. As the attached diff shows, instead of reporting null chars in the items
> > in which they occur, it reports simply that there is a null char in the file, and
> > does not even attempt to parse it. We could improve on this by reporting e.g.
> > the line numbers of all the lines that contain NULs, but we could never reproduce
> > the current expected output of invalid/null_char. The question is, is the speedup
> > worth this complication in this very rare situation? And if yes, would the above
> > mentioned improvement be useful to anyone? (It isn't to me.)
> 
> I don't think the less accurate error message is significant; the
> compiler doesn't pass that test in non-C grades anyway.  (Because the
> target languages are fine with null chars in strings.)
> 
> IIRC, that test was not added because of any specfic issue with the
> compiler but as part of change by Simon to prevent null chars being used
> in Mercury generally.

I think it's fine just to comment out the lines containing NUL bytes.

> > The second is that the speedup could be slightly higher if read_file_as_string
> > returned the max offset in the string it returns, which it has to know in order to
> > allocate the right amount of memory. While I could write a new variant of
> > the C version of read_file_as_string that does that, maybe called read_file_as_
> > string_and_length,
> 
> Don't use length, since that's ambiguous.
> 
> > I could not do the same for the Java and Erlang versions.  Would
> > anyone like to volunteer?
> 
> I suggest you add the C implementation and a fallbck Mercury implementation
> (as we have with read_file_as_string).  I can add the Java version.
> My Erlang is probably not up to the task of adding it for that language :-(

There should be no need for specific versions for the non-C backends,
as the string representation in those backends include the length in
code units.

Peter


More information about the reviews mailing list