[m-dev.] Unicode support in Mercury

Ian MacLarty maclarty at csse.unimelb.edu.au
Mon May 9 19:50:20 AEST 2011


2011/5/9 Matt Giuca <matt.giuca at gmail.com>:
> I feel like languages have two choices: either provide an 8-bit clean
> string type (e.g., C, Lua, Go, PHP, Ruby), or provide an abstract
> Unicode string type where the user doesn't need to be aware of the
> representation (e.g., Java, Python).

I don't think this is true for Java.  Java's String.length() returns
the number of UTF-16 code units in the string, not the number of code
points (for that there is codePointCount).  Mercury's approach seems to
me to be the same as Java's, except that Java uses UTF-16, which makes
it less likely for length to return a different value from
codePointCount: the two differ only when the string contains characters
outside the Basic Multilingual Plane.
See http://download.oracle.com/javase/6/docs/api/java/lang/String.html
and http://download.oracle.com/javase/6/docs/api/java/lang/Character.html#unicode.
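
To make the difference concrete, here is a small sketch in Java.  The
class name LengthDemo and its helper methods are made up for
illustration; only length, codePointCount and getBytes are from the
standard library.  The UTF-8 byte count is included only as a rough
stand-in for what a code-unit length would report under a UTF-8
representation.

```java
import java.nio.charset.StandardCharsets;

public class LengthDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    public static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes when the string is encoded as UTF-8 (illustrative
    // stand-in for a code-unit length over a UTF-8 representation).
    public static int utf8Units(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String bmp = "caf\u00e9";           // all characters in the BMP
        String astral = "a\uD835\uDD38b";   // U+1D538 needs a surrogate pair

        for (String s : new String[] { bmp, astral }) {
            System.out.println(codeUnits(s) + " code units, "
                + codePoints(s) + " code points, "
                + utf8Units(s) + " UTF-8 bytes");
        }
    }
}
```

For the BMP-only string the UTF-16 length and the code point count
agree, while the UTF-8 byte count is already larger; for the string
with a character outside the BMP all three differ.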

Ian.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at csse.unimelb.edu.au
Administrative Queries: owner-mercury-developers at csse.unimelb.edu.au
Subscriptions:          mercury-developers-request at csse.unimelb.edu.au
--------------------------------------------------------------------------


