[m-rev.] for review: add some unicode support to Mercury
Julien Fischer
juliensf at csse.unimelb.edu.au
Fri Jul 21 14:29:51 AEST 2006
On Fri, 21 Jul 2006, Ian MacLarty wrote:
> On Wed, Jul 19, 2006 at 04:01:57PM +1000, Peter Moulder wrote:
>> On Wed, Jul 05, 2006 at 12:24:25AM +1000, Ian MacLarty wrote:
>>
>>> +The sequence @samp{\x} introduces
>>> a hexadecimal escape; it must be followed by a sequence of hexadecimal
>>> digits and then a closing backslash. It is replaced
>>> with the character whose character code is identified by the hexadecimal
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> I suggest changing to `byte whose value', to clarify that e.g. \xa0\ is
>> replaced by just one byte rather than being equivalent to \u00a0.
>>
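To make the distinction above concrete, here is a small sketch in Python (used purely for illustration; it is not Mercury and not part of the patch). It compares the UTF-8 encoding of the code point U+00A0 with the single byte 0xA0. The helper name is_valid_utf8 is made up for this example.

```python
# U+00A0 (NO-BREAK SPACE) encoded as UTF-8 occupies two bytes: 0xC2 0xA0.
code_point_utf8 = "\u00a0".encode("utf-8")

# A lone byte 0xA0 has the bit pattern 10xxxxxx, i.e. it is a UTF-8
# continuation byte and does not form a complete character by itself.
single_byte = bytes([0xA0])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(code_point_utf8.hex(), is_valid_utf8(code_point_utf8))
print(single_byte.hex(), is_valid_utf8(single_byte))
```

So if \xa0\ does yield a single byte, as suggested above, a string consisting of just that escape is not valid UTF-8, whereas \u00a0 yields a valid two-byte sequence.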
>>> + % Return the number of unicode characters in a UTF-8 encoded string.
>>
>> I suggest explicitly stating that the result is undefined (or
>> unspecified or similar) if the given string isn't valid utf-8. (The
>> existing documentation gives the false impression that it counts only
>> valid, complete unicode characters.)
>>
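One common way to count the characters in a UTF-8 string is to count the bytes that are not continuation bytes (bytes of the form 10xxxxxx). The sketch below, again in Python for illustration only, assumes that approach; whether the Mercury predicate under review works this way is an assumption, and count_code_points is a hypothetical name.

```python
def count_code_points(data: bytes) -> int:
    """Count bytes that are not UTF-8 continuation bytes (10xxxxxx).

    For valid UTF-8 this equals the number of code points. For invalid
    input it still returns a number, but that number does not correspond
    to a count of valid, complete characters.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


valid = "r\u00e9sum\u00e9".encode("utf-8")  # 8 bytes, 6 code points
stray_continuation = b"\xa0"                # continuation byte with no lead byte
truncated_lead = b"\xc3"                    # lead byte with no continuation

print(len(valid), count_code_points(valid))
print(count_code_points(stray_continuation))
print(count_code_points(truncated_lead))
```

With this particular counting strategy, a stray continuation byte counts as zero characters and a truncated lead byte counts as one, which is why documenting the result as undefined for invalid input seems the safer wording.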
>>> +++ tests/hard_coded/unicode.m 4 Jul 2006 10:01:15 -0000
>> ...
>>> +utf8_strings = [
>>> + "\u0003",
>>> + "\U00000003",
>>> + "\u0394", % delta
>>> + "\u03A0", % pi
>>> + "\uFFFF",
>>> + "\U0010FFFF",
>>> + "\U000ABCDE",
>>> + "r\u00E9sum\u00E9", % "resume" with accents
>>> + "abc123"
>>> +].
>>
>> It would be nice to add "\u005cu0041" as an example (0x5c = backslash),
>> and similarly "\x5c\u0041", "\x5c\\u0041", "\\u0041" and "u0041".
>> It would be good for some of these examples to use lowercase hex digits.
>>
>> Otherwise looks fine to me.
>>
>
> Here's the new diff and CVS log (the interdiff is almost as big as the
> diff, so I'm just posting the diff).
>
> I'll post the new unicode module as a separate change.
>
> Estimated hours taken: 6
> Branches: main
>
> Add escape sequences for encoding unicode characters in Mercury string
A small point: shouldn't it be "Unicode" rather than "unicode" in the
reference manual and other documentation?
Julien.