[m-rev.] for review: more specific errors when file sizes exceed maximum buffer sizes

Julien Fischer jfischer at opturion.com
Mon Oct 28 13:33:25 AEDT 2019



On Mon, 28 Oct 2019, Peter Wang wrote:

> On Mon, 28 Oct 2019 13:08:26 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
>>
>> Hi Peter,
>>
>> On Mon, 28 Oct 2019, Peter Wang wrote:
>>
>>> On Fri, 25 Oct 2019 16:02:28 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
>>>>
>>>> More specific errors when file sizes exceed maximum buffer sizes.
>>>>
>>>> Predicates such as read_file_as_string and read_binary_input_as_bitmap use
>>>> buffers that are backed by Mercury arrays.  Since Mercury arrays are indexed
>>>> using the int type, it is possible to cause the buffer index to overflow when
>>>> reading large files on 32-bit platforms.  Detect this situation and throw an
>>>> exception that is specific to it.  Currently, the behaviour we get depends on
>>>> how much the overflowing index wraps around.
>>>>
>>>> As part of the above, fix the XXXs where we were casting 64-bit stream file
>>>> sizes to ints; stream file sizes are now always returned as int64s.
>>>>
>>>> Julien.
>>>>
>>>> diff --git a/library/io.m b/library/io.m
>>>> index 91c1a72..7b63ed4 100644
>>>> --- a/library/io.m
>>>> +++ b/library/io.m
>>>> @@ -200,6 +200,9 @@
>>>>       % Returns an error if the file contains a null character, because
>>>>       % null characters are not allowed in Mercury strings.
>>>>       %
>>>> +    % Throws an exception if the number of characters in the stream exceeds the
>>>> +    % maximum index of an array on this platform.
>>>> +    %
>>>
>>> Number of characters may be interpreted as code points,
>>
>> Number of characters *is* interpreted as code points because the text
>> directly above (not shown in the diff) says exactly that.
>
> The limit should be related to the number of code units, though.

Hmmm, it actually depends on the backend.

     :- type buffer
         --->    buffer(array(char)).

     :- pragma foreign_type(c, buffer, "char *", [can_pass_as_mercury_type]).

So code units for C, code points (at the moment) for the other backends.
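
To make the distinction concrete, here is a minimal C sketch of how the two
counts differ for the same UTF-8 buffer.  It is not taken from library/io.m;
the function names are hypothetical and it assumes the buffer holds
well-formed, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of UTF-8 code units (bytes) before the terminating NUL.
   This is what indexes a "char *" buffer. */
size_t utf8_code_units(const char *s)
{
    assert(s != NULL);
    return strlen(s);
}

/* Number of code points, assuming s is well-formed UTF-8:
   count every byte that is not a continuation byte (10xxxxxx). */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

For a string containing non-ASCII characters the code unit count is strictly
larger than the code point count, so a limit expressed in code points can be
reached later than the buffer index limit on the C backend.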

> On second thought, why throw an exception instead of returning an error?

Doing so would break the existing interface; there isn't currently a
slot to return such an error.

Julien.

