[m-rev.] for review: make characters an instance of the uenum typeclass
Julien Fischer
jfischer at opturion.com
Tue Dec 20 19:36:32 AEDT 2022
Hi Peter,
On Tue, 20 Dec 2022, Peter Wang wrote:
>> diff --git a/NEWS b/NEWS
>> index a5322b4..913abd3 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -102,6 +102,18 @@ Changes to the Mercury standard library
>> - func `promise_only_solution/1`
>> - pred `promise_only_solution_io/4`
>>
>> +### Changes to the `char` module
>> +
>> +* The following type has had its typeclass memberships changed:
>> +
>> + - The type `character` is now an instance of the new `uenum` typeclass.
>> +
>
> Just state this without the leadup?
It is consistent with how it has been done elsewhere in the NEWS file.
>> diff --git a/extras/lex/lex.m b/extras/lex/lex.m
...
>> @@ -720,10 +720,10 @@ read_from_string(Offset, Result, String, unsafe_promise_unique(String)) :-
>> )
>> ].
>>
>> -:- instance regexp(sparse_bitset(T)) <= (regexp(T),enum(T)) where [
>> +:- instance regexp(sparse_bitset(T)) <= (regexp(T),uenum(T)) where [
>> re(SparseBitset) = charset(Charset) :-
>> Charset = sparse_bitset.foldl(
>> - func(Enum, Set0) = insert(Set0, char.det_from_int(to_int(Enum))),
>> + func(Enum, Set0) = insert(Set0, char.det_from_uint(to_uint(Enum))),
>> SparseBitset,
>> sparse_bitset.init)
>> ].
>
> (BTW, sparse_bitset is an inefficient representation for large charsets,
> like valid_unicode_chars. diet should be better.)
That's a separate change.
>> +:- pragma foreign_proc("C",
>> + to_uint(Character::in) = (UInt::out),
>> + [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
>> + does_not_affect_liveness],
>> +"
>> + UInt = (MR_UnsignedChar) Character;
>> +").
>> +
>> +:- pragma foreign_proc("C#",
>> + to_uint(Character::in) = (UInt::out),
>> + [will_not_call_mercury, promise_pure, thread_safe],
>> +"
>> + UInt = (uint) Character;
>> +").
>> +
>> +:- pragma foreign_proc("Java",
>> + to_uint(Character::in) = (UInt::out),
>> + [will_not_call_mercury, promise_pure, thread_safe],
>> +"
>> + UInt = Character;
>> +").
>> +
>> +:- pragma foreign_proc("C",
>> + from_uint(UInt::in, Character::out),
>> + [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
>> + does_not_affect_liveness],
>> +"
>> + Character = (MR_UnsignedChar) UInt;
>> + SUCCESS_INDICATOR = (UInt <= 0x10ffff);
>> +").
>> +
>> +:- pragma foreign_proc("C#",
>> + from_uint(UInt::in, Character::out),
>> + [will_not_call_mercury, promise_pure, thread_safe],
>> +"
>> + Character = (int) UInt;
>> + SUCCESS_INDICATOR = (UInt <= 0x10ffff);
>> +").
>> +
>> +:- pragma foreign_proc("Java",
>> + from_uint(UInt::in, Character::out),
>> + [will_not_call_mercury, promise_pure, thread_safe],
>> +"
>> + Character = UInt;
>> + SUCCESS_INDICATOR = ((UInt & 0xffffffffL) <= (0x10ffff & 0xffffffffL));
>> +").
>
> Do we need the foreign procs, or can we cast to/from int?
We can avoid the foreign_procs in the to_uint direction; I have replaced
them. Using foreign_procs for the from_uint direction requires fewer
comparisons on those targets that support unsigned integers directly.
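To illustrate the comparison count, here is a minimal standalone C sketch, not code from the patch. The function names and the MAX_CODE_POINT macro are hypothetical; only the 0x10ffff bound comes from the diff above. With an unsigned input the range check is one comparison, whereas a signed input also needs a lower-bound check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound used by the range check in the quoted diff. */
#define MAX_CODE_POINT 0x10ffffu

/*
 * Unsigned input: the value cannot be negative, so a single
 * comparison against the upper bound is enough.
 * (Hypothetical helper, not part of the Mercury runtime.)
 */
static bool code_point_from_uint(uint32_t u, uint32_t *ch)
{
    if (u > MAX_CODE_POINT) {
        return false;
    }
    *ch = u;
    return true;
}

/*
 * Signed input: both the lower and the upper bound must be
 * checked, which costs a second comparison.
 * (Hypothetical helper, not part of the Mercury runtime.)
 */
static bool code_point_from_int(int32_t i, uint32_t *ch)
{
    if (i < 0 || i > (int32_t) MAX_CODE_POINT) {
        return false;
    }
    *ch = (uint32_t) i;
    return true;
}
```

Both helpers accept the same set of values; the difference is only in how many comparisons are needed to reject an out-of-range input.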
>> diff --git a/tests/hard_coded/char_uint_conv.m b/tests/hard_coded/char_uint_conv.m
>> index e69de29..bc59026 100644
>> --- a/tests/hard_coded/char_uint_conv.m
>> +++ b/tests/hard_coded/char_uint_conv.m
>> + char.det_from_int(0x1fb00), % BLOCK SEXTANT-1
>> + char.det_from_int(0x1fbf9), % SEGEMENTED DIGIT NINE
>> +
>> + % CJK Unified Idenographs Extension B
>
> Ideographs
Fixed.
> That looks fine, otherwise.
Thanks.
Julien.