[m-rev.] for review: escape characters in strings returned by deconstruct.functor/4

Julien Fischer jfischer at opturion.com
Wed Jul 11 01:38:40 AEST 2018


On Tue, 10 Jul 2018, Zoltan Somogyi wrote:

> On Wed, 11 Jul 2018 00:56:50 +1000 (AEST), Julien Fischer <jfischer at opturion.com> wrote:
>> diff --git a/NEWS b/NEWS
>> index cb97697..3fdc989 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -525,6 +525,10 @@ Changes to the Mercury standard library:
>>     - get_uint8/1, unsafe_get_uint8/1
>>     - set_uint8/4, unsafe_set_uint8/4
>>
>> +* The functor/4 predicate in the deconstruct module has been modified so
>> +  that for character and string data any control characters in the functor
>> +  representation are escaped.
>
> Add a comma before "any".

Done.

>> diff --git a/runtime/mercury_string.c b/runtime/mercury_string.c
>> index 4c11f3f..a061509 100644
>> --- a/runtime/mercury_string.c
>> +++ b/runtime/mercury_string.c
>> @@ -91,6 +91,129 @@ MR_make_string(MR_AllocSiteInfoPtr alloc_id, const char *fmt, ...)
>>       return result;
>>   }
>>
>> +// The code for this function should be kept in sync with that of the
>> +// quote_string predicates in library/term_io.m.
>> +MR_bool
>> +MR_escape_string_quote(MR_String *ptr, const char * string)
>> +{
>> +    MR_Integer pos = 0;
>> +    size_t  num_code_units = 0;
>> +    MR_Char ch;
>> +    MR_bool must_escape = MR_FALSE;
>> +
>> +    // Check if we need to add character escapes to the string.
>> +    //
>> +    while ((ch = MR_utf8_get_next((MR_String)string, &pos)) > 0) {
>
> Space after cast.

Done.

>
>> +    // Check that there the string's encoding was valid.
>> +    //
>
> "there"?
>
> And why the extra empty comment line?

Both deleted.

>
>> +    if (ch < 0) {
>> +        *ptr = NULL;
>> +        return MR_FALSE;
>> +    }
>> +
>> +    if (must_escape) {
>> +
>
> Delete blank line.

Done.

>
>> +        while ((ch = MR_utf8_get_next((MR_String)string, &pos)) > 0) {
>
> Space after cast.

Done.

>> +// True if c is a Unicode control code point, i.e. U+0000..U+001f,
>> +// U+007f..U+009f.
>> +
>> +#define MR_is_control(c) ((0x00 <= (unsigned)(c) && (unsigned)(c) <= 0x1f) || \
>> +                          (0x7f <= (unsigned)(c) && (unsigned)(c) <= 0x9f))
>> +
>
> Space after casts.

Done (and also for the casts in the surrounding macros that similarly
lack them).

>> diff --git a/tests/hard_coded/deconstruct_arg.exp b/tests/hard_coded/deconstruct_arg.exp
>> index 7f4ef4f..49ed6e9 100644
>> --- a/tests/hard_coded/deconstruct_arg.exp
>> +++ b/tests/hard_coded/deconstruct_arg.exp
>> @@ -376,6 +376,104 @@ deconstruct deconstruct: functor 'Ω' arity 0
>>   deconstruct limited deconstruct 3 of 'Ω'
>>   functor 'Ω' arity 0 []
>>
>> +deconstruct functor: ""/0
>> +deconstruct argument 0 of "" doesn't exist
>> +deconstruct argument 1 of "" doesn't exist
>> +deconstruct argument 2 of "" doesn't exist
>> +deconstruct argument 'moo' doesn't exist
>> +deconstruct argument 'mooo!' doesn't exist
>> +deconstruct argument 'packed1' doesn't exist
>> +deconstruct argument 'packed2' doesn't exist
>> +deconstruct argument 'packed3' doesn't exist
>> +deconstruct deconstruct: functor "" arity 0
>> +[]
>> +deconstruct limited deconstruct 3 of ""
>> +functor "" arity 0 []
>
> Before the next change to this test case, we should make its output
> less verbose by reporting simply "no argument exists" if all the
> argument lookup tests fail for an arity-zero term.

Ok, I'll go ahead and do that separately.  (Since there will be
at least on more change functor/4 and friends upcoming.)

> Apart from the minor issues above, the diff seems fine.

Thanks for that.

Julien.


More information about the reviews mailing list