[m-rev.] for review: fix problems with the char module

Julien Fischer jfischer at opturion.com
Thu Sep 11 14:59:57 AEST 2014


On Thu, 11 Sep 2014, Peter Wang wrote:

> On Wed, 10 Sep 2014 17:36:38 +1000 (EST), Julien Fischer <jfischer at opturion.com> wrote:
>>
>> Fix problems with the char module.
>>
>> (1) The behaviour of digit_to_int/2 was inconsistent with that of is_digit/2.
>> The former succeeds for all of 0-9, a-z and A-Z while the latter succeeds only
>> for 0-9 (i.e. it was possible for digit_to_int/2 to succeed for non-decimal
>> characters, which is not what was intended in many of it uses).
>>
>> (2) Predicates involving hexadecimal digits were inconsistently named, they
>> were "hex digits" in one predicate name, "hex chars" in another.
>>
>> This change ensures that the following operations are supported for binary,
>> octal, decimal and hexadecimal digits and that we use a consistent naming
>> scheme for the predicates that implement them:
>>      - testing if a character is a digit of the given base
>>      - conversion to an int
>>      - conversion from an int
>>
>> In addition, we also add predicates for supporting these operations for user
>> defined bases, ranging from 2-36.
>>
>> library/char.m:
>>      Add the predicate is_decimal_digit/1, which is a synonym for is_digit/1.
>>
>>      Add the predicate is_base_digit/2.
>>
>>      Add the predicates int_to_{binary,octal,decimal,hex}_digit/2 and
>>      base_int_to_digit/3.
>>
>>      Add the predicates {binary,octal,decimal,hex}_digit_to_int/2 and
>>      base_digit_to_int/3.
>>
>>      Add det function versions of the above.
>>
>>      Delete the function det_digit_to_int/1 that I added the other day.
>>
>>      Mark the following as obsolete:
>>          - is_hex_digit/2
>>          - int_to_hex_char/2
>>          - int_to_digit/2
>>          - det_int_to_digit/1
>>          - det_int_to_digit/2
>
> It's probably the right thing to do, but the is_hex_digit and
> int_to_hex_char deprecations will be annoying :(

We can make them obsolete by comment only for now if you think the
presence of the pragmas will be particularly annoying.  I don't intend
to actually remove any of these for at least several release.

>> -* We have added the function det_digit_to_int/1 to the char module.
>> +* The following predicates and functions have been added to the char module:
>> +
>> +    - is_decimal_digit/1
>> +    - is_base_digit/2
>> +    - int_to_binary_digit/2
>> +    - int_to_octal_digit/2
>> +    - int_to_decimal_digit/2
>> +    - int_to_hex_digit/2
>> +    - base_int_to_digit/3, det_base_int_to_digit/2
>> +    - binary_digit_to_int/2, det_binary_digit_to_int/1
>> +    - octal_digit_to_int/2, det_octal_digit_to_int/1
>> +    - decimal_digit_to_int/2, det_decimal_digit_to_int/1
>> +    - hex_digit_to_int/2, det_hex_digit_to_int/1
>> +    - base_digit_to_int/3, det_base_digit_to_int/2
>> +
>> +  The following predicates in the char module have been deprecated and will
>> +  either be removed or have their semantics changed in a future release.
>> +
>> +    - is_hex_digit/2
>> +    - int_to_hex_char/2
>> +    - digit_to_int/2
>> +    - int_to_digit/2
>> +    - det_int_to_digit/1, det_int_to_digit/2
>> +
>> +  NOTE: existing code that calls the predicate char.digit_to_int/2 and assumes
>> +  that the call will succeed only for those characters for which
>> +  char.is_digit/1 is true may be broken.  char.digit_to_int/2 succeeds for
>> +  the characters 0-9, a-z and A-Z, not just 0-9.
>
> I suggest:
>
>    NOTE: existing code that calls char.digit_to_int/2 assuming that it
>    will only succeed for decimal digits (0-9) may be broken.

Ok.

>> @@ -141,39 +141,125 @@
>>       %
>>   :- pred char.is_binary_digit(char::in) is semidet.
>>
>> -    % True iff the character is a octal digit (0-7) in the ASCII range.
>> +    % True iff the character is an octal digit (0-7) in the ASCII range.
>>       %
>>   :- pred char.is_octal_digit(char::in) is semidet.
>>
>> -    % True iff the character is a hexadecimal digit (0-9, a-f, A-F)
>> -    % in the ASCII range.
>> +    % True iff the character is a decimal digit (0-9) in the ASCII range.
>> +    % Synonym for char.is_digit/1.
>> +    %
>> +:- pred char.is_decimal_digit(char::in) is semidet.
>> +
>> +    % True iff the character is a hexadecimal digit (0-9, a-f, A-F) in the
>> +    % ASCII range.
>>       %
>>   :- pred char.is_hex_digit(char::in) is semidet.
>> +
>> +    % is_base_digit(Base, Digit):
>> +    % True iff Digit is a digit in the given Base (0-9, a-z, A-Z).
>> +    % Throws an exception if Base < 2 or Base > 36.
>> +    %
>> +:- pred char.is_base_digit(int::in, char::in) is semidet.
>> +
>> +    % Convert an integer in the range 0-1 to a binary digit (0 or 1) in the
>> +    % ASCII range.
>> +    %
>> +:- pred char.int_to_binary_digit(int::in, char::out) is semidet.
>> +
>> +    % Convert an integer 0-7 to an octal digit (0-7) in the ASCII range.
>> +    %
>> +:- pred char.int_to_octal_digit(int::in, char::out) is semidet.
>> +
>> +    % Convert an integer 0-9 to a decimal digit (0-9) in the ASCII range.
>> +    %
>> +:- pred char.int_to_decimal_digit(int::in, char::out) is semidet.
>> +
>> +    % Convert an integer 0-15 to a hexadecimal digit (0-9, A-F) in the ASCII
>> +    % range.
>> +    %
>> +:- pred char.int_to_hex_digit(int::in, char::out) is semidet.
>
> to an uppercase hexadecimal digit

Done.

>> +    % base_int_to_digit(Base, Int, DigitChar):
>> +    % True iff `DigitChar' is a decimal digit (0-9) or an uppercase letter
>> +    % (A-Z) representing the value `Int' (0-35) in the given base.
>> +    % Throws an exception if `Base' < 2 or `Base' > 36.
>> +    %
>> +:- pred char.base_int_to_digit(int::in, int::in, char::out) is semidet.
>> +
>> +    % As above, but throw an exception instead of failing.
>> +    %
>> +:- func char.det_base_int_to_digit(int, int) = char.
>
>> +    % True iff char is a binary digit (0 or 1).
>> +    % Returns the character's value as a digit (0-1).
>> +    %
>> +:- pred char.binary_digit_to_int(char::in, int::out) is semidet.
>
> I suggest:
>
>    % binary_digit_to_int(Char, Int):
>    % True iff Char is a binary digit (0-1) representing the value Int.
>
> Similarly for the rest.

Done.

>
>> +
>> +    % As above, but throws an exception instead of failing.
>> +    %
>> +:- func char.det_binary_digit_to_int(char) = int.
>> +
>> +    % True iff char is an octal digit (0-7).
>> +    % Returns the character's value as a digit (0-7).
>> +    %
>> +:- pred char.octal_digit_to_int(char::in, int::out) is semidet.
>> +
>> +    % As above, but throws an exception instead of failing.
>> +    %
>> +:- func char.det_octal_digit_to_int(char) = int.
>> +
>> +    % True iff char is a decimal digit (0-9).
>> +    % Returns the character's value as a digit (0-9).
>> +    %
>> +:- pred char.decimal_digit_to_int(char::in, int::out) is semidet.
>> +
>> +    % As above, but throws an exception instead of failing.
>> +    %
>> +:- func char.det_decimal_digit_to_int(char) = int.
>> +
>> +    % True iff char is a hexadecimal digit (0-9, a-z or A-F).
>> +    % Returns the character's value as a digit (0-15).
>> +    %
>> +:- pred char.hex_digit_to_int(char::in, int::out) is semidet.
>> +
>> +    % As above, but throws an exception instead of failing.
>> +    %
>> +:- func char.det_hex_digit_to_int(char) = int.
>> +
>> +    % base_digit_to_int(Base, DigitChar, Int):
>> +    % True iff `DigitChar' is a decimal digit (0-9) or a letter (a-z, A-Z)
>> +    % representing the value `Int' (0-35) in the given base.
>> +    % Throws an exception if `Base' < 2 or `Base' > 36.
>> +    %
>> +:- pred char.base_digit_to_int(int::in, char::in, int::out) is semidet.
>> +
>> +    % As above, but throws an exception instead of failing.
>> +    %
>> +:- func char.det_base_digit_to_int(int, char) = int.
>> +
>> +:- pragma obsolete(char.is_hex_digit/2).
>>   :- pred char.is_hex_digit(char, int).
>>   :- mode char.is_hex_digit(in, out) is semidet.
>
> Please name its replacement in a comment.  Similarly for the rest.

Done.

>>
>> -    % Convert an integer 0-15 to a hexadecimal digit (0-9, A-F)
>> -    % in the ASCII range.
>> +    % Convert an integer 0-15 to a hexadecimal digit (0-9, A-F) in the ASCII
>> +    % range.
>>       %
>> +:- pragma obsolete(char.int_to_hex_char/2).
>>   :- pred char.int_to_hex_char(int, char).
>>   :- mode char.int_to_hex_char(in, out) is semidet.
>>
>>       % Succeeds if char is a decimal digit (0-9) or letter (a-z or A-Z).
>>       % Returns the character's value as a digit (0-9 or 10-35).
>>       %
>> +:- pragma obsolete(char.digit_to_int/2).
>>   :- pred char.digit_to_int(char::in, int::out) is semidet.
>>
>> -    % As above, but calls error/1 instead of failing.
>> +    % char.int_to_digit(Int, DigitChar):
>>       %
>> -:- func char.det_digit_to_int(char) = int.
>> -
>> -    % char.int_to_uppercase_digit(Int, DigitChar):
>> -    %
>> -    % True iff `Int' is an integer in the range 0-35 and
>> -    % `DigitChar' is a decimal digit or uppercase letter
>> -    % whose value as a digit is `Int'.
>> +    % True iff `Int' is an integer in the range 0-35 and `DigitChar' is a
>> +    % decimal digit or uppercase letter whose value as a digit is `Int'.
>>       %
>> +:- pragma obsolete(char.int_to_digit/2).
>>   :- pred char.int_to_digit(int, char).
>>   :- mode char.int_to_digit(in, out) is semidet.
>>   :- mode char.int_to_digit(out, in) is semidet.
>> @@ -181,7 +267,9 @@
>>       % Returns a decimal digit or uppercase letter corresponding to the value.
>>       % Calls error/1 if the integer is not in the range 0-35.
>>       %
>> +:- pragma obsolete(char.det_int_to_digit/1).
>>   :- func char.det_int_to_digit(int) = char.
>> +:- pragma obsolete(char.det_int_to_digit/2).
>>   :- pred char.det_int_to_digit(int::in, char::out) is det.
>>
>>       % Convert a char to a pretty_printer.doc for formatting.
>
>> @@ -372,71 +460,228 @@ char.to_upper(Char, Upper) :-
>>   % but these versions are very portable.
>>
>>   %-----------------------------------------------------------------------------%
>> +%
>> +% Digit classification.
>> +%
>> +
>> +is_binary_digit('0').
>> +is_binary_digit('1').
>
> (Ideally is_*_digit would just call *_digit_to_int and end up with the
> same code)

I'll look into that another time -- for this change, I stuck with what
the existing code did.

>> +is_base_digit(Base, Digit) :-
>> +    ( ( Base < 2 ; Base > 36) ->
>> +        error("char.is_base_digit: invalid base")
>> +    ;
>> +        base_digit_to_int(Base, Digit, _Int)
>> +    ).
>
> Invert the condition (and elsewhere).

Done.  Diff follows:

Cheers,
Julien.

diff --git a/NEWS b/NEWS
index 8fc219b..0af8eec 100644
--- a/NEWS
+++ b/NEWS
@@ -59,10 +59,8 @@ Changes to the Mercury standard library:
      - int_to_digit/2
      - det_int_to_digit/1, det_int_to_digit/2

-  NOTE: existing code that calls the predicate char.digit_to_int/2 and assumes
-  that the call will succeed only for those characters for which
-  char.is_digit/1 is true may be broken.  char.digit_to_int/2 succeeds for
-  the characters 0-9, a-z and A-Z, not just 0-9.
+  NOTE: existing code that calls char.digit_to_int/2 assuming that it will
+  only succeed for decimal digits (0-9) may be broken.

  Changes to the Mercury compiler:

diff --git a/library/char.m b/library/char.m
index 4af004d..76146a6 100644
--- a/library/char.m
+++ b/library/char.m
@@ -174,15 +174,15 @@
      %
  :- pred char.int_to_decimal_digit(int::in, char::out) is semidet.

-    % Convert an integer 0-15 to a hexadecimal digit (0-9, A-F) in the ASCII
-    % range.
+    % Convert an integer 0-15 to an uppercase hexadecimal digit (0-9, A-F) in
+    % the ASCII range.
      %
  :- pred char.int_to_hex_digit(int::in, char::out) is semidet.

-    % base_int_to_digit(Base, Int, DigitChar):
-    % True iff `DigitChar' is a decimal digit (0-9) or an uppercase letter
-    % (A-Z) representing the value `Int' (0-35) in the given base.
-    % Throws an exception if `Base' < 2 or `Base' > 36.
+    % base_int_to_digit(Base, Int, Char):
+    % True iff Char is a decimal digit (0-9) or an uppercase letter (A-Z)
+    % representing the value Int (0-35) in the given base.
+    % Throws an exception if Base < 2 or Base > 36.
      %
  :- pred char.base_int_to_digit(int::in, int::in, char::out) is semidet.

@@ -190,8 +190,8 @@
      %
  :- func char.det_base_int_to_digit(int, int) = char.

-    % True iff char is a binary digit (0 or 1).
-    % Returns the character's value as a digit (0-1).
+    % binary_digit_to_int(Char, Int):
+    % True iff Char is a binary digit (0-1) representing the value Int.
      %
  :- pred char.binary_digit_to_int(char::in, int::out) is semidet.

@@ -199,8 +199,8 @@
      %
  :- func char.det_binary_digit_to_int(char) = int.

-    % True iff char is an octal digit (0-7).
-    % Returns the character's value as a digit (0-7).
+    % octal_digit_to_int(Char, Int):
+    % True iff Char is an octal digit (0-7) representing the value Int.
      %
  :- pred char.octal_digit_to_int(char::in, int::out) is semidet.

@@ -208,8 +208,8 @@
      %
  :- func char.det_octal_digit_to_int(char) = int.

-    % True iff char is a decimal digit (0-9).
-    % Returns the character's value as a digit (0-9).
+    % decimal_digit_to_int(Char, Int):
+    % True iff Char is a decimal digit (0-9) representing the value Int.
      %
  :- pred char.decimal_digit_to_int(char::in, int::out) is semidet.

@@ -217,8 +217,9 @@
      %
  :- func char.det_decimal_digit_to_int(char) = int.

-    % True iff char is a hexadecimal digit (0-9, a-z or A-F).
-    % Returns the character's value as a digit (0-15).
+    % hex_digit_to_int(Char, Int):
+    % True iff Char is a hexadecimal digit (0-9, a-z or A-F) representing the
+    % value Int.
      %
  :- pred char.hex_digit_to_int(char::in, int::out) is semidet.

@@ -226,10 +227,10 @@
      %
  :- func char.det_hex_digit_to_int(char) = int.

-    % base_digit_to_int(Base, DigitChar, Int):
-    % True iff `DigitChar' is a decimal digit (0-9) or a letter (a-z, A-Z)
-    % representing the value `Int' (0-35) in the given base.
-    % Throws an exception if `Base' < 2 or `Base' > 36.
+    % base_digit_to_int(Base, Char, Int):
+    % True iff Char is a decimal digit (0-9) or a letter (a-z, A-Z)
+    % representing the value Int (0-35) in the given base.
+    % Throws an exception if Base < 2 or Base > 36.
      %
  :- pred char.base_digit_to_int(int::in, char::in, int::out) is semidet.

@@ -237,6 +238,8 @@
      %
  :- func char.det_base_digit_to_int(int, char) = int.

+    % Use hex_digit_to_int/2 instead.
+    %
  :- pragma obsolete(char.is_hex_digit/2).
  :- pred char.is_hex_digit(char, int).
  :- mode char.is_hex_digit(in, out) is semidet.
@@ -244,6 +247,8 @@
      % Convert an integer 0-15 to a hexadecimal digit (0-9, A-F) in the ASCII
      % range.
      %
+    % Use char.int_to_hex_digit/2 instead.
+    %
  :- pragma obsolete(char.int_to_hex_char/2).
  :- pred char.int_to_hex_char(int, char).
  :- mode char.int_to_hex_char(in, out) is semidet.
@@ -254,10 +259,18 @@
  :- pragma obsolete(char.digit_to_int/2).
  :- pred char.digit_to_int(char::in, int::out) is semidet.

-    % char.int_to_digit(Int, DigitChar):
+    % char.int_to_digit(Int, Char):
      %
-    % True iff `Int' is an integer in the range 0-35 and `DigitChar' is a
-    % decimal digit or uppercase letter whose value as a digit is `Int'.
+    % True iff Int is an integer in the range 0-35 and Char is a
+    % decimal digit or uppercase letter whose value as a digit is Int.
+    %
+    % Use whichever of int_to_binary_digit/2, int_to_octal_digit/2,
+    % int_to_decimal_digit/2, int_to_hex_digit/2 or base_int_to_digit/3 is
+    % appropriate instead of the (in, out) mode 
+    %
+    % Use whichever of binary_digit_to_int/2, octal_digit_to_int/2,
+    % decimal_digit_to_int/2, hex_digit_to_int/2 or base_digit_to_int/3
+    % is appropriate instead of the (out, in) mode.
      %
  :- pragma obsolete(char.int_to_digit/2).
  :- pred char.int_to_digit(int, char).
@@ -267,6 +280,10 @@
      % Returns a decimal digit or uppercase letter corresponding to the value.
      % Calls error/1 if the integer is not in the range 0-35.
      %
+    % Use whichever of det_int_to_binary_digit/1, det_int_to_octal_digit/1
+    % det_int_to_decimal_digit/1, det_int_to_hex_digit/1 or
+    % det_base_int_to_digit/2 is appropriate instead.
+    %
  :- pragma obsolete(char.det_int_to_digit/1).
  :- func char.det_int_to_digit(int) = char.
  :- pragma obsolete(char.det_int_to_digit/2).
@@ -513,10 +530,10 @@ is_hex_digit('E').
  is_hex_digit('F').

  is_base_digit(Base, Digit) :-
-    ( ( Base < 2 ; Base > 36) ->
-        error("char.is_base_digit: invalid base")
-    ;
+    ( Base > 1, Base < 37 ->
          base_digit_to_int(Base, Digit, _Int)
+    ;
+        error("char.is_base_digit: invalid base")
      ).

  %-----------------------------------------------------------------------------%
@@ -602,15 +619,15 @@ det_hex_digit_to_int(Digit) = Int :-
      ).

  base_digit_to_int(Base, Digit, Int) :-
-    ( ( Base < 2 ; Base > 36 ) ->
-        error("char.base_digit_to_int: invalid base")
-    ;
+    ( Base > 1, Base < 37 ->
          ( char.lower_upper(Digit, Upper) ->
              int_to_extended_digit(Int, Upper)
          ;
              int_to_extended_digit(Int, Digit)
          ),
          Int < Base
+    ;
+        error("char.base_digit_to_int: invalid base")
      ).

  det_base_digit_to_int(Base, Digit) = Int :-
@@ -669,11 +686,11 @@ int_to_hex_char(Int, Char) :-
      int_to_hex_digit(Int, Char).

  base_int_to_digit(Base, Int, Digit) :-
-    ( ( Base < 2 ; Base > 36) ->
-        error("char.base_int_to_digit: invalid base")
-    ;
+    ( Base > 1, Base < 37 ->
          Int < Base,
          int_to_extended_digit(Int, Digit)
+    ;
+        error("char.base_int_to_digit: invalid base")
      ).

  det_base_int_to_digit(Base, Int) = Digit :-



More information about the reviews mailing list