[mercury-users] Re: [m-dev.] Character type class

Thomas Conway conway at cs.mu.OZ.AU
Mon Jan 31 15:59:39 AEDT 2000


On Mon, Jan 31, 2000 at 02:07:48PM EST, Michael Day wrote:
> 
> > In practical terms, Unicode supports all the major language groups.
> > FWIW, Unicode is *not* adequate for one of our clients - they want
> > to make sgml/xml databases of ancient documents including the ancient
> > Japanese script, Sanskrit, etc, etc, *none* of which are supported
> > by Unicode.
> 
> So support for character types such as UNICODE would be great, but it must
> be relatively straightforward for the user to implement their own instance
> of the character type class and take advantage of the existing character
> operations...

There are two issues at work in this area: container size and encoding.
ASCII is nice because each code fits in a single container. In all the
other encodings (except perhaps UTF-32, the ISO one), some codes
require more than one container. The problems arise when you want
to ask for the length of a "string" of characters: if there is not
a one-to-one correspondence between codes and the containers that hold
them, then you need to be able to interpret the encoding.
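
To make the length problem concrete, here is a minimal sketch. It assumes
UTF-8 with 8-bit containers, an encoding picked purely for illustration,
and it models the containers as a list of ints. The names utf8_len,
code_count and is_lead are hypothetical, not part of any library.

```mercury
:- module utf8_len.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int, list.

    % Count codes rather than containers: every byte that is not a
    % UTF-8 continuation byte (bit pattern 10xxxxxx) starts a new code.
:- func code_count(list(int)) = int.
code_count(Bytes) = list.length(Leads) :-
    list.filter(is_lead, Bytes, Leads).

:- pred is_lead(int::in) is semidet.
is_lead(Byte) :-
    Byte /\ 0xC0 \= 0x80.

main(!IO) :-
    % "cafe" with an acute accent on the e: 4 codes in 5 containers.
    Bytes = [0x63, 0x61, 0x66, 0xC3, 0xA9],
    io.write_int(list.length(Bytes), !IO),   % number of containers
    io.nl(!IO),
    io.write_int(code_count(Bytes), !IO),    % number of codes
    io.nl(!IO).
```

The container count needs no knowledge of the encoding, while the code
count has to inspect every container.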

Now, in Mercury you *could* handle this with:
	:- type string(CharType) == array(CharType).
But this is wasteful to varying extents, since each array element
occupies a full machine word (an 8-bit char takes up 64 bits on the
Alpha, for example).
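
As a rough sketch of what the array representation buys you, assuming
the standard array module, and using the hypothetical names gen_string
and code_length (with plain ints standing in for CharType):

```mercury
:- module gen_string.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module array, list.

    % One array slot per code, so the encoding never needs interpreting.
:- type gen_string(CharType) == array(CharType).

    % With one slot per code, the length is just the array size.
:- func code_length(gen_string(CharType)) = int.
code_length(Str) = array.size(Str).

main(!IO) :-
    % Four codes, each stored in its own (word-sized) array slot.
    Str = array.from_list([0x63, 0x61, 0x66, 0xE9]),
    io.write_int(code_length(Str), !IO),
    io.nl(!IO).
```

Length becomes a constant-time lookup, at the cost of a whole word per
code.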

It would help if Mercury supported packed arrays for chars (and
floats while we're about it).

Thomas
-- 
 Thomas Conway )O+     Every sword has two edges.
     Mercurian            <conway at cs.mu.oz.au>