[mercury-users] Character type class

Richard A. O'Keefe ok at hermes.otago.ac.nz
Tue Feb 1 09:24:50 AEDT 2000


Michael wrote
	So, the only sensible way to relate two characters is by their
	position in the code table, ignoring any locale information?

Yes.  The really central point here is that locale-dependent string orders
are all about *writing systems*, and there is no one-to-one correspondence
between the "things" that are assigned character codes and the elements of
a writing system.  This goes both ways:
    ng is a single element of the Māori writing system,
       with a definite order (n... < ng... < o...),
       but is two coded characters
    é (in English) is two elements of the writing system (a letter e
       and a stress accent) but is a single coded character.
and then there are positional variants of letters, where the things have
no defined order in the writing system at all, because they cannot occur
in the same context (final vs non-final sigma in Greek, long s vs short
s in old-fashioned English, the initial, medial, final, and separate forms
of most Arabic letters) but the character codes must have a definite order.
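
To make the mismatch concrete, here is a rough sketch in Python (chosen only
for illustration; none of this is Mercury library code).  The name maori_key
is hypothetical, and the function assumes lower-case input and treats "ng" as
the only multi-character letter, which is a simplification of what a real
locale-aware collation would have to do.

```python
import unicodedata

# One coded character, two writing-system elements: U+00E9 decomposes
# (under NFD) into the letter e followed by a combining acute accent.
E_ACUTE = "\u00e9"
E_ACUTE_PARTS = unicodedata.normalize("NFD", E_ACUTE)

# Positional variants still get distinct, ordered codes:
# final sigma is U+03C2, non-final sigma is U+03C3.
FINAL_SIGMA = "\u03c2"
SIGMA = "\u03c3"


def maori_key(word):
    """Hypothetical sort key in which "ng" counts as one letter.

    Each writing-system element becomes a pair.  An ordinary letter c
    maps to (c, 0) and the digraph "ng" maps to ("n", 1), so that
    n... < ng... < o... as in the writing system.
    Assumes lower-case input; ignores every other collation rule.
    """
    key = []
    i = 0
    while i < len(word):
        if word.startswith("ng", i):
            key.append(("n", 1))
            i += 2
        else:
            key.append((word[i], 0))
            i += 1
    return key


words = ["nu", "nga", "ono"]
by_code = sorted(words)                    # plain code-table order
by_writing = sorted(words, key=maori_key)  # "ng" treated as one element
```

Code-table order puts "nga" before "nu" because g precedes u, while the
key-based order puts every n... word before any ng... word.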

	That seems quite
	reasonable. Perhaps offer two varieties of string comparison, one that
	respects locale and one that does a faster comparison purely by character
	code, for use when building tree based maps of strings or similar?
	
Exactly.  With the fast comparison freed from the struggle to almost sort
of look rather like dictionary order, without actually being it, there is
no compelling reason to use pure lexicographic order.  It might, for example,
be preferable to compare the lengths, and switch to lexicographic order only
for strings of the same length.  (Because the character set is finite, this
order has the pleasant property that if X < Z there are only finitely many
Y such that X < Y < Z.)
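
A minimal sketch of that length-first order, again in Python purely for
illustration.  The names shortlex_key and between are made up for this
example, and between enumerates strings over a small explicit alphabet
only to demonstrate the finiteness property.

```python
from itertools import product


def shortlex_key(s):
    """Compare by length first, then by character code within a length."""
    return (len(s), s)


def between(x, z, alphabet):
    """List every string Y over `alphabet` with X < Y < Z in this order.

    Any such Y has a length between len(x) and len(z), so a finite
    search over those lengths finds all of them.
    """
    lo, hi = shortlex_key(x), shortlex_key(z)
    found = []
    for n in range(len(x), len(z) + 1):
        for chars in product(sorted(alphabet), repeat=n):
            y = "".join(chars)
            if lo < shortlex_key(y) < hi:
                found.append(y)
    return found


ordered = sorted(["pear", "fig", "apple", "kiwi"], key=shortlex_key)
gap = between("b", "ab", "ab")
```

Under pure lexicographic order there are infinitely many strings between
"a" and "b" ("aa", "aaa", and so on); under the length-first order the gap
between any two strings can always be enumerated as above.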
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------


