[m-rev.] for review: allow optional underscores in numeric literals

Julien Fischer jfischer at opturion.com
Mon Jan 9 16:38:24 AEDT 2017


For review by anyone.

In particular, I would like feedback on:

     1. the list of places in numeric literals where underscores are
        not allowed (see below).

     2. the wording of error messages involving numeric literals; currently
        the reference manual refers to them as "literals", while the lexer
        variously describes them as "constants" or "tokens".

This diff does not contain any user-facing documentation.  I'll add that
as a separate change once this is reviewed.

Julien.

------------------------------

Allow optional underscores in numeric literals.

Allow the optional use of underscores in numeric literals to improve their
readability (e.g. by grouping digits).  We allow any number
of underscores between digits and also between the radix prefix (if present) and
the initial digit.  (When integer type suffixes are supported we will also
allow them to be preceded by any number of underscores.)  The following are
*not* allowed:

    1. Leading underscores.
    2. Trailing underscores.
    3. Underscores inside the components of a radix prefix (e.g.
       0_xffff or 0__b101010.)
    4. Underscores immediately adjacent to the decimal point in a float
       literal (e.g. 123_._123.)
    5. Underscores immediately adjacent to the exponent ('e' or 'E') in
       a float literal (e.g. 123_e12 or 123E_12.)
    6. Underscores immediately adjacent to the optional sign of an exponent
       in a float literal (e.g. 123e_+12 or 123E-_12.)
    7. Underscores between the exponent indicator and the optional sign of
       an exponent (e.g. 123e_+12.)
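
As a rough illustration of the scheme above (not the lexer code from this
diff), the rules can be read as "every run of digits must start and end with
a digit, except that underscores may follow a radix prefix".  The sketch
below is standalone C; the names digit_value, scan_digits and underscores_ok
are hypothetical, and it assumes the exponent sign follows the 'e' or 'E'.
It ignores 0'c character codes and only checks underscore placement in a
string that is already known to be a candidate literal.

```c
#include <stdbool.h>

/* Value of c as a digit in the given base, or -1 if it is not one. */
static int digit_value(char c, int base)
{
    int v;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    else return -1;
    return v < base ? v : -1;
}

/*
 * Consume a run of digits that may contain interior underscores.
 * The run must begin and end with a digit.  On success, advance *p
 * past the run; on failure, leave *p unchanged.
 */
static bool scan_digits(const char **p, int base)
{
    const char *s = *p;
    bool last_is_digit = false;

    if (digit_value(*s, base) < 0) return false;   /* no leading '_' */
    while (*s == '_' || digit_value(*s, base) >= 0) {
        last_is_digit = (*s != '_');
        s++;
    }
    if (!last_is_digit) return false;              /* no trailing '_' */
    *p = s;
    return true;
}

/* Check underscore placement in a whole literal (hypothetical helper). */
bool underscores_ok(const char *s)
{
    int base = 0;

    if (s[0] == '0') {
        if (s[1] == 'b') base = 2;
        else if (s[1] == 'o') base = 8;
        else if (s[1] == 'x') base = 16;
    }
    if (base != 0) {
        s += 2;
        while (*s == '_') s++;     /* allowed between prefix and first digit */
        return scan_digits(&s, base) && *s == '\0';
    }

    if (!scan_digits(&s, 10)) return false;
    if (*s == '.') {
        s++;
        if (!scan_digits(&s, 10)) return false;
    }
    if (*s == 'e' || *s == 'E') {
        s++;
        if (*s == '+' || *s == '-') s++;
        if (!scan_digits(&s, 10)) return false;
    }
    return *s == '\0';
}
```

With this reading, underscores_ok("1_000_000") and underscores_ok("0x_ff_ff")
hold, while "0_xffff", "123_.123", "123E_12" and "123e-_12" are all rejected.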

library/lexer.m:
     Modify the scanner to account for underscores in numeric literals
     according to the scheme above.

library/string.m:
library/integer.m:
     Export undocumented functions for converting strings containing underscores
     into ints or integers respectively.
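
     For readers who do not want to dig through the diff, the conversion
     amounts to the usual digit accumulation with '_' skipped.  The following
     is a minimal sketch in standalone C, not the Mercury code itself; the
     names digit_in_base and parse_underscored are hypothetical, and it uses
     long long and ignores overflow, whereas integer.m is arbitrary precision.
     Like the predicate in the diff, it does not check where the underscores
     occur; that is left to the caller (the lexer).

```c
#include <stdbool.h>

/* Value of c as a digit in the given base, or -1 if it is not one. */
static int digit_in_base(char c, int base)
{
    int v;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    else return -1;
    return v < base ? v : -1;
}

/*
 * Convert an optionally signed string of digits and underscores.
 * Fails on an empty string, on a sign with nothing after it, and on
 * any character that is neither a digit in the base nor '_'.
 */
bool parse_underscored(int base, const char *s, long long *out)
{
    bool negative = false;
    long long n = 0;

    if (*s == '\0') return false;
    if (*s == '-' || *s == '+') {
        negative = (*s == '-');
        s++;
        if (*s == '\0') return false;   /* a lone sign is not a number */
    }
    for (; *s != '\0'; s++) {
        int d = digit_in_base(*s, base);
        if (d >= 0) {
            n = n * base + d;           /* accumulate the next digit */
        } else if (*s != '_') {
            return false;               /* neither a digit nor '_' */
        }
    }
    *out = negative ? -n : n;
    return true;
}
```

     For example, parse_underscored(10, "1_000_000", &v) succeeds with
     v == 1000000, while parse_underscored(8, "1_9", &v) fails.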

tests/hard_coded/parse_number_from_io.{m,exp}:
      Test parsing of valid numeric literals from file streams.

tests/hard_coded/parse_number_from_string.{m,exp}:
      Test parsing of valid and invalid numeric literals from strings.

tests/invalid/invalid_binary_literal.{m,err_exp}:
tests/invalid/invalid_decimal_literal.{m,err_exp}:
tests/invalid/invalid_octal_literal.{m,err_exp}:
tests/invalid/invalid_hex_literal.{m,err_exp}:
tests/invalid/invalid_float_literal.{m,err_exp}:
       Test parsing of invalid numeric literals from file streams.

tests/hard_coded/parse_number_from_{io,string}.m:
tests/hard_coded/parse_number_from_{io,string}.exp:
      Test parsing of valid numeric literals.

tests/hard_coded/Mmakefile:
tests/invalid/Mmakefile:
      Add the new test cases.

diff --git a/library/integer.m b/library/integer.m
index a87f033..11a92d5 100644
--- a/library/integer.m
+++ b/library/integer.m
@@ -234,6 +234,17 @@

  %---------------------------------------------------------------------------%

+:- interface.
+
+    % Exported for use by lexer.m.
+    %
+:- pred from_base_string_underscore(int::in, string::in, integer::out)
+    is semidet.
+
+%---------------------------------------------------------------------------%
+
+:- implementation.
+
  % Possible improvements:
  %
  % 1) allow negative digits (-base+1 .. base-1) in lists of digits
@@ -1519,4 +1530,36 @@ det_from_base_string(Base, String) = Integer :-
      ).

  %---------------------------------------------------------------------------%
+
+from_base_string_underscore(Base, String, Integer) :-
+    string.index(String, 0, Char),
+    Len = string.length(String),
+    ( if Char = ('-') then
+        Len > 1,
+        string.foldl_between(accumulate_integer_underscore(Base), String,
+            1, Len, integer.zero, PosInteger),
+        Integer = -PosInteger
+    else if Char = ('+') then
+        Len > 1,
+        string.foldl_between(accumulate_integer_underscore(Base), String,
+            1, Len, integer.zero, Integer)
+    else
+        string.foldl_between(accumulate_integer_underscore(Base), String,
+            0, Len, integer.zero, Integer)
+    ).
+
+:- pred accumulate_integer_underscore(int::in, char::in, integer::in, integer::out)
+    is semidet.
+
+accumulate_integer_underscore(Base, Char, !N) :-
+    ( if char.base_digit_to_int(Base, Char, Digit0) then
+        Digit = integer(Digit0),
+        !:N = (integer(Base) * !.N) + Digit
+    else if Char = '_' then
+        true
+    else
+        false
+    ).
+
+%---------------------------------------------------------------------------%
  %---------------------------------------------------------------------------%
diff --git a/library/lexer.m b/library/lexer.m
index c527ffd..fddd8aa 100644
--- a/library/lexer.m
+++ b/library/lexer.m
@@ -322,6 +322,66 @@ grab_string(String, Posn0, SubString, Posn, Posn) :-
      Posn = posn(_, _, Offset),
      string.unsafe_between(String, Offset0, Offset, SubString).

+    % As above, but the string is known to represent a float literal.
+    % Filter out any underscore characters from the returned string.
+    % We have to do this since the underlying mechanisms we currently use for
+    % converting strings into floats (sscanf in C, parseDouble in Java etc)
+    % cannot handle underscores in their input.
+    %
+:- pred grab_float_string(string::in, posn::in, string::out,
+    posn::in, posn::out) is det.
+
+grab_float_string(String, Posn0, FloatString, Posn, Posn) :-
+    Posn0 = posn(_, _, Offset0),
+    Posn = posn(_, _, Offset),
+    unsafe_get_float_between(String, Offset0, Offset, FloatString).
+
+:- pred unsafe_get_float_between(string::in, int::in, int::in,
+    string::uo) is det.
+:- pragma foreign_proc("C",
+    unsafe_get_float_between(Str::in, Start::in, End::in, FloatStr::uo),
+    [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
+        does_not_affect_liveness, no_sharing],
+"
+    int src;
+    int dst = 0;
+
+    MR_allocate_aligned_string_msg(FloatStr, End - Start, MR_ALLOC_ID);
+    for (src = Start; src < End; src++) {
+        if (Str[src] != '_') {
+            FloatStr[dst] = Str[src];
+            dst++;
+        }
+    }
+    FloatStr[dst] = '\\0';
+").
+
+:- pragma foreign_proc("C#",
+    unsafe_get_float_between(Str::in, Start::in, End::in, SubString::uo),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    SubString = Str.Substring(Start, End - Start).Replace(\"_\", \"\");
+").
+
+:- pragma foreign_proc("Java",
+    unsafe_get_float_between(Str::in, Start::in, End::in, FloatStr::uo),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    FloatStr = Str.substring(Start, End).replace(\"_\", \"\");
+").
+
+    % For use by the Erlang backend.
+    %
+unsafe_get_float_between(Str, Start, End, FloatStr) :-
+    string.unsafe_between(Str, Start, End, FloatStr0),
+    ( if string.contains_char(FloatStr0, '_') then
+        string.to_char_list(FloatStr0, Digits0),
+        list.negated_filter(is_underscore, Digits0, Digits),
+        string.from_char_list(Digits, FloatStr)
+    else
+        FloatStr = FloatStr0
+    ).
+
  :- pred string_set_line_number(int::in, posn::in, posn::out) is det.

  string_set_line_number(LineNumber, Posn0, Posn) :-
@@ -627,7 +687,7 @@ execute_get_token_action(Stream, Char, Action, ScannedPastWhiteSpace,
      ;
          Action = action_nonzero_digit,
          get_context(Stream, Context, !IO),
-        get_number(Stream, [Char], Token, !IO)
+        get_number(Stream, last_digit_is_not_underscore, [Char], Token, !IO)
      ;
          Action = action_special_token,
          get_context(Stream, Context, !IO),
@@ -702,7 +762,9 @@ execute_string_get_token_action(String, Len, Posn0, Char, Action,
          string_get_zero(String, Len, Posn0, Token, Context, !Posn)
      ;
          Action = action_nonzero_digit,
-        string_get_number(String, Len, Posn0, Token, Context, !Posn)
+        LastDigit = last_digit_is_not_underscore,
+        string_get_number(String, LastDigit, Len, Posn0, Token, Context,
+            !Posn)
      ;
          Action = action_special_token,
          string_get_context(Posn0, Context, !Posn),
@@ -779,7 +841,7 @@ handle_special_token(Char, ScannedPastWhiteSpace, Token) :-
              Token = SpecialToken
          )
      else
-        error("lexer.m, handle_special_token: unknown special token")
+        error("lexer.m: handle_special_token: unknown special token")
      ).

  :- pred special_token(char::in, token::out) is semidet.
@@ -1893,7 +1955,8 @@ get_zero(Stream, Token, !IO) :-
      ;
          Result = ok,
          ( if char.is_digit(Char) then
-            get_number(Stream, [Char], Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_number(Stream, LastDigit, [Char], Token, !IO)
          else if Char = '''' then
              get_char_code(Stream, Token, !IO)
          else if Char = 'b' then
@@ -1903,7 +1966,8 @@ get_zero(Stream, Token, !IO) :-
          else if Char = 'x' then
              get_hex(Stream, Token, !IO)
          else if Char = ('.') then
-            get_int_dot(Stream, ['0'], Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_int_dot(Stream, LastDigit, ['0'], Token, !IO)
          else if ( Char = 'e' ; Char = 'E' ) then
              get_float_exponent(Stream, [Char, '0'], Token, !IO)
          else
@@ -1912,13 +1976,19 @@ get_zero(Stream, Token, !IO) :-
          )
      ).

+:- type last_digit_is_underscore
+    --->    last_digit_is_underscore
+    ;       last_digit_is_not_underscore.
+
  :- pred string_get_zero(string::in, int::in, posn::in, token::out,
      string_token_context::out, posn::in, posn::out) is det.

  string_get_zero(String, Len, Posn0, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_number(String, Len, Posn0, Token, Context, !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_number(String, LastDigit, Len, Posn0, Token, Context,
+                !Posn)
          else if Char = '''' then
              string_get_char_code(String, Len, Posn0, Token, Context, !Posn)
          else if Char = 'b' then
@@ -1928,7 +1998,8 @@ string_get_zero(String, Len, Posn0, Token, Context, !Posn) :-
          else if Char = 'x' then
              string_get_hex(String, Len, Posn0, Token, Context, !Posn)
          else if Char = ('.') then
-            string_get_int_dot(String, Len, Posn0, Token, Context, !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_int_dot(String, LastDigit, Len, Posn0, Token, Context, !Posn)
          else if ( Char = 'e' ; Char = 'E' ) then
              string_get_float_exponent(String, Len, Posn0, Token, Context,
                  !Posn)
@@ -1984,7 +2055,10 @@ get_binary(Stream, Token, !IO) :-
      ;
          Result = ok,
          ( if char.is_binary_digit(Char) then
-            get_binary_2(Stream, [Char], Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_binary_2(Stream, LastDigit, [Char], Token, !IO)
+        else if Char = '_' then
+            get_binary(Stream, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
              Token = error("unterminated binary constant")
@@ -1998,7 +2072,11 @@ string_get_binary(String, Len, Posn0, Token, Context, !Posn) :-
      Posn1 = !.Posn,
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_binary_digit(Char) then
-            string_get_binary_2(String, Len, Posn1, Token, Context, !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_binary_2(String, LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            string_get_binary(String, Len, Posn0, Token, Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              Token = error("unterminated binary constant"),
@@ -2009,44 +2087,79 @@ string_get_binary(String, Len, Posn0, Token, Context, !Posn) :-
          string_get_context(Posn0, Context, !Posn)
      ).

-:- pred get_binary_2(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_binary_2(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_binary_2(Stream, !.RevChars, Token, !IO) :-
+get_binary_2(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_int(!.RevChars, base_2, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_int(!.RevChars, base_2, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated binary constant")
+        )
      ;
          Result = ok,
          ( if char.is_binary_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_binary_2(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_binary_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_binary_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_int(!.RevChars, base_2, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_int(!.RevChars, base_2, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated binary constant")
+            )
          )
      ).

-:- pred string_get_binary_2(string::in, int::in, posn::in, token::out,
-    string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_binary_2(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_binary_2(String, Len, Posn1, Token, Context, !Posn) :-
+string_get_binary_2(String, !.LastDigit, Len, Posn1, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_binary_digit(Char) then
-            string_get_binary_2(String, Len, Posn1, Token, Context, !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_binary_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_binary_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn1, BinaryString, !Posn),
-            conv_string_to_int(BinaryString, base_2, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_string(String, Posn1, BinaryString, !Posn),
+                conv_string_to_int(BinaryString, base_2, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated binary constant")
+            ),
              string_get_context(Posn1, Context, !Posn)
          )
      else
-        grab_string(String, Posn1, BinaryString, !Posn),
-        conv_string_to_int(BinaryString, base_2, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_string(String, Posn1, BinaryString, !Posn),
+            conv_string_to_int(BinaryString, base_2, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated binary constant")
+        ),
          string_get_context(Posn1, Context, !Posn)
      ).

@@ -2063,7 +2176,10 @@ get_octal(Stream, Token, !IO) :-
      ;
          Result = ok,
          ( if char.is_octal_digit(Char) then
-            get_octal_2(Stream, [Char], Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_octal_2(Stream, LastDigit, [Char], Token, !IO)
+        else if Char = '_' then
+            get_octal(Stream, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
              Token = error("unterminated octal constant")
@@ -2071,14 +2187,17 @@ get_octal(Stream, Token, !IO) :-
      ).

  :- pred string_get_octal(string::in, int::in, posn::in, token::out,
-    string_token_context::out,
-    posn::in, posn::out) is det.
+    string_token_context::out, posn::in, posn::out) is det.

  string_get_octal(String, Len, Posn0, Token, Context, !Posn) :-
      Posn1 = !.Posn,
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_octal_digit(Char) then
-            string_get_octal_2(String, Len, Posn1, Token, Context, !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_octal_2(String, LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            string_get_octal(String, Len, Posn0, Token, Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              Token = error("unterminated octal constant"),
@@ -2089,44 +2208,79 @@ string_get_octal(String, Len, Posn0, Token, Context, !Posn) :-
          string_get_context(Posn0, Context, !Posn)
      ).

-:- pred get_octal_2(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_octal_2(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_octal_2(Stream, !.RevChars, Token, !IO) :-
+get_octal_2(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_int(!.RevChars, base_8, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_int(!.RevChars, base_8, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated octal constant")
+        )
      ;
          Result = ok,
          ( if char.is_octal_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_octal_2(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_octal_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_octal_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_int(!.RevChars, base_8, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_int(!.RevChars, base_8, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated octal constant")
+            )
          )
      ).

-:- pred string_get_octal_2(string::in, int::in, posn::in, token::out,
-    string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_octal_2(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_octal_2(String, Len, Posn1, Token, Context, !Posn) :-
+string_get_octal_2(String, !.LastDigit, Len, Posn1, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_octal_digit(Char) then
-            string_get_octal_2(String, Len, Posn1, Token, Context, !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_octal_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_octal_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn1, BinaryString, !Posn),
-            conv_string_to_int(BinaryString, base_8, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_string(String, Posn1, BinaryString, !Posn),
+                conv_string_to_int(BinaryString, base_8, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated octal constant")
+            ),
              string_get_context(Posn1, Context, !Posn)
          )
      else
-        grab_string(String, Posn1, BinaryString, !Posn),
-        conv_string_to_int(BinaryString, base_8, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_string(String, Posn1, BinaryString, !Posn),
+            conv_string_to_int(BinaryString, base_8, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated octal constant")
+        ),
          string_get_context(Posn1, Context, !Posn)
      ).

@@ -2143,7 +2297,10 @@ get_hex(Stream, Token, !IO) :-
      ;
          Result = ok,
          ( if char.is_hex_digit(Char) then
-            get_hex_2(Stream, [Char], Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_hex_2(Stream, LastDigit, [Char], Token, !IO)
+        else if Char = '_' then
+            get_hex(Stream, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
              Token = error("unterminated hex constant")
@@ -2151,14 +2308,17 @@ get_hex(Stream, Token, !IO) :-
      ).

  :- pred string_get_hex(string::in, int::in, posn::in, token::out,
-    string_token_context::out,
-    posn::in, posn::out) is det.
+    string_token_context::out, posn::in, posn::out) is det.

  string_get_hex(String, Len, Posn0, Token, Context, !Posn) :-
      Posn1 = !.Posn,
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_hex_digit(Char) then
-            string_get_hex_2(String, Len, Posn1, Token, Context, !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_hex_2(String, LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            string_get_hex(String, Len, Posn0, Token, Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              Token = error("unterminated hex constant"),
@@ -2169,102 +2329,199 @@ string_get_hex(String, Len, Posn0, Token, Context, !Posn) :-
          string_get_context(Posn0, Context, !Posn)
      ).

-:- pred get_hex_2(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_hex_2(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_hex_2(Stream, !.RevChars, Token, !IO) :-
+get_hex_2(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_int(!.RevChars, base_16, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_int(!.RevChars, base_16, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated hex constant")
+        )
      ;
          Result = ok,
          ( if char.is_hex_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_hex_2(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_hex_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_hex_2(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_int(!.RevChars, base_16, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_int(!.RevChars, base_16, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated hex constant")
+            )
          )
      ).

-:- pred string_get_hex_2(string::in, int::in, posn::in, token::out,
-    string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_hex_2(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_hex_2(String, Len, Posn1, Token, Context, !Posn) :-
+string_get_hex_2(String, !.LastDigit, Len, Posn1, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_hex_digit(Char) then
-            string_get_hex_2(String, Len, Posn1, Token, Context, !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_hex_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_hex_2(String, !.LastDigit, Len, Posn1, Token, Context,
+                !Posn)
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn1, BinaryString, !Posn),
-            conv_string_to_int(BinaryString, base_16, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_string(String, Posn1, BinaryString, !Posn),
+                conv_string_to_int(BinaryString, base_16, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated hex constant")
+            ),
              string_get_context(Posn1, Context, !Posn)
          )
      else
-        grab_string(String, Posn1, BinaryString, !Posn),
-        conv_string_to_int(BinaryString, base_16, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_string(String, Posn1, BinaryString, !Posn),
+            conv_string_to_int(BinaryString, base_16, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated hex constant")
+        ),
          string_get_context(Posn1, Context, !Posn)
      ).

-:- pred get_number(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_number(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_number(Stream, !.RevChars, Token, !IO) :-
+get_number(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_int(!.RevChars, base_10, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_int(!.RevChars, base_10, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated decimal constant")
+        )
      ;
          Result = ok,
          ( if char.is_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_number(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_number(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_number(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else if Char = ('.') then
-            get_int_dot(Stream, !.RevChars, Token, !IO)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                get_int_dot(Stream, !.LastDigit, !.RevChars, Token, !IO)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant")
+            )
          else if ( Char = 'e' ; Char = 'E' ) then
-            !:RevChars = [Char | !.RevChars],
-            get_float_exponent(Stream, !.RevChars, Token, !IO)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                !:RevChars = [Char | !.RevChars],
+                get_float_exponent(Stream, !.RevChars, Token, !IO)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("underscore before exponent")
+            )
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_int(!.RevChars, base_10, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_int(!.RevChars, base_10, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant")
+            )
          )
      ).

-:- pred string_get_number(string::in, int::in, posn::in, token::out,
-    string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_number(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_number(String, Len, Posn0, Token, Context, !Posn) :-
+string_get_number(String, !.LastDigit, Len, Posn0, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_number(String, Len, Posn0, Token, Context, !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_number(String, !.LastDigit, Len, Posn0, Token, Context,
+                !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_number(String, !.LastDigit, Len, Posn0, Token, Context,
+                !Posn)
          else if Char = ('.') then
-            string_get_int_dot(String, Len, Posn0, Token, Context, !Posn)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                string_get_int_dot(String, !.LastDigit, Len, Posn0, Token, Context,
+                    !Posn)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant"),
+                string_get_context(Posn0, Context, !Posn)
+            )
          else if ( Char = 'e' ; Char = 'E' ) then
-            string_get_float_exponent(String, Len, Posn0, Token, Context,
-                !Posn)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                string_get_float_exponent(String, Len, Posn0, Token, Context,
+                    !Posn)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("underscore before exponent"),
+                string_get_context(Posn0, Context, !Posn)
+            )
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn0, NumberString, !Posn),
-            conv_string_to_int(NumberString, base_10, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_string(String, Posn0, NumberString, !Posn),
+                conv_string_to_int(NumberString, base_10, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant")
+            ),
              string_get_context(Posn0, Context, !Posn)
          )
      else
-        grab_string(String, Posn0, NumberString, !Posn),
-        conv_string_to_int(NumberString, base_10, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_string(String, Posn0, NumberString, !Posn),
+            conv_string_to_int(NumberString, base_10, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated decimal constant")
+        ),
          string_get_context(Posn0, Context, !Posn)
      ).

-:- pred get_int_dot(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_int_dot(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_int_dot(Stream, !.RevChars, Token, !IO) :-
+get_int_dot(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      % XXX The float literal syntax doesn't match ISO Prolog.
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
@@ -2273,96 +2530,163 @@ get_int_dot(Stream, !.RevChars, Token, !IO) :-
      ;
          Result = eof,
          io.putback_char(Stream, '.', !IO),
-        rev_char_list_to_int(!.RevChars, base_10, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_int(!.RevChars, base_10, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated decimal constant")
+        )
      ;
          Result = ok,
          ( if char.is_digit(Char) then
              !:RevChars = [Char, '.' | !.RevChars],
-            get_float_decimals(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_float_decimals(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            Token = error("underscore following decimal point")
          else
              io.putback_char(Stream, Char, !IO),
              % We can't putback the ".", because io.putback_char only
              % guarantees one character of pushback. So instead, we return
              % an `integer_dot' token; the main loop of get_token_list_2 will
              % handle this appropriately.
-            rev_char_list_to_int(!.RevChars, base_10, Token0),
-            ( if Token0 = integer(Int) then
-                Token = integer_dot(Int)
-            else
-                Token = Token0
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_int(!.RevChars, base_10, Token0),
+                ( if Token0 = integer(Int) then
+                    Token = integer_dot(Int)
+                else
+                    Token = Token0
+                )
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant")
              )
          )
      ).

-:- pred string_get_int_dot(string::in, int::in, posn::in, token::out,
-    string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_int_dot(string::in, last_digit_is_underscore::in, int::in,
+    posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_int_dot(String, Len, Posn0, Token, Context, !Posn) :-
+string_get_int_dot(String, !.LastDigit, Len, Posn0, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_float_decimals(String, Len, Posn0, Token, Context,
-                !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_float_decimals(String, !.LastDigit, Len, Posn0, Token,
+                Context, !Posn)
+        else if Char = '_' then
+            Token = error("underscore following decimal point"),
+            string_get_context(Posn0, Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn0, NumberString, !Posn),
-            conv_string_to_int(NumberString, base_10, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_string(String, Posn0, NumberString, !Posn),
+                conv_string_to_int(NumberString, base_10, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated decimal constant")
+            ),
              string_get_context(Posn0, Context, !Posn)
          )
      else
          string_ungetchar(String, !Posn),
-        grab_string(String, Posn0, NumberString, !Posn),
-        conv_string_to_int(NumberString, base_10, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_string(String, Posn0, NumberString, !Posn),
+            conv_string_to_int(NumberString, base_10, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated decimal constant")
+        ),
          string_get_context(Posn0, Context, !Posn)
      ).

      % We have read past the decimal point, so now get the decimals.
      %
-:- pred get_float_decimals(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_float_decimals(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_float_decimals(Stream, !.RevChars, Token, !IO) :-
+get_float_decimals(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_float(!.RevChars, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_float(!.RevChars, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("fractional part of float terminated by underscore")
+        )
      ;
          Result = ok,
          ( if char.is_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_float_decimals(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_float_decimals(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_float_decimals(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else if ( Char = 'e' ; Char = 'E' ) then
              !:RevChars = [Char | !.RevChars],
              get_float_exponent(Stream, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_float(!.RevChars, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_float(!.RevChars, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("fractional part of float terminated by underscore")
+            )
          )
      ).

-:- pred string_get_float_decimals(string::in, int::in, posn::in,
-    token::out, string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_float_decimals(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_float_decimals(String, Len, Posn0, Token, Context, !Posn) :-
+string_get_float_decimals(String, !.LastDigit, Len, Posn0, Token, Context,
+        !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_float_decimals(String, Len, Posn0, Token, Context,
-                !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_float_decimals(String, !.LastDigit, Len, Posn0, Token,
+                Context, !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_float_decimals(String, !.LastDigit, Len, Posn0, Token,
+                Context, !Posn)
          else if ( Char = 'e' ; Char = 'E' ) then
              string_get_float_exponent(String, Len, Posn0, Token, Context,
                  !Posn)
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn0, FloatString, !Posn),
-            conv_to_float(FloatString, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_float_string(String, Posn0, FloatString, !Posn),
+                conv_to_float(FloatString, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("fractional part of float terminated by underscore")
+            ),
              string_get_context(Posn0, Context, !Posn)
          )
      else
-        grab_string(String, Posn0, FloatString, !Posn),
-        conv_to_float(FloatString, Token),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            grab_float_string(String, Posn0, FloatString, !Posn),
+            conv_to_float(FloatString, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("fractional part of float terminated by underscore")
+        ),
          string_get_context(Posn0, Context, !Posn)
      ).

@@ -2384,7 +2708,8 @@ get_float_exponent(Stream, !.RevChars, Token, !IO) :-
              get_float_exponent_2(Stream, !.RevChars, Token, !IO)
          else if char.is_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_float_exponent_3(Stream, !.RevChars, Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_float_exponent_3(Stream, LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
              Token = error("unterminated exponent in float token")
@@ -2400,15 +2725,16 @@ string_get_float_exponent(String, Len, Posn0, Token, Context, !Posn) :-
              string_get_float_exponent_2(String, Len, Posn0, Token, Context,
                  !Posn)
          else if char.is_digit(Char) then
-            string_get_float_exponent_3(String, Len, Posn0, Token, Context,
-                !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_float_exponent_3(String, LastDigit, Len, Posn0, Token,
+                Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              Token = error("unterminated exponent in float token"),
              string_get_context(Posn0, Context, !Posn)
          )
      else
-        grab_string(String, Posn0, FloatString, !Posn),
+        grab_float_string(String, Posn0, FloatString, !Posn),
          conv_to_float(FloatString, Token),
          string_get_context(Posn0, Context, !Posn)
      ).
@@ -2432,7 +2758,8 @@ get_float_exponent_2(Stream, !.RevChars, Token, !IO) :-
          Result = ok,
          ( if char.is_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_float_exponent_3(Stream, !.RevChars, Token, !IO)
+            LastDigit = last_digit_is_not_underscore,
+            get_float_exponent_3(Stream, LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
              Token = error("unterminated exponent in float token")
@@ -2449,8 +2776,9 @@ get_float_exponent_2(Stream, !.RevChars, Token, !IO) :-
  string_get_float_exponent_2(String, Len, Posn0, Token, Context, !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_float_exponent_3(String, Len, Posn0, Token, Context,
-                !Posn)
+            LastDigit = last_digit_is_not_underscore,
+            string_get_float_exponent_3(String, LastDigit, Len, Posn0, Token,
+                Context, !Posn)
          else
              string_ungetchar(String, !Posn),
              Token = error("unterminated exponent in float token"),
@@ -2464,45 +2792,80 @@ string_get_float_exponent_2(String, Len, Posn0, Token, Context, !Posn) :-
      % We have read past the first digit of the exponent -
      % now get the remaining digits.
      %
-:- pred get_float_exponent_3(io.input_stream::in, list(char)::in, token::out,
-    io::di, io::uo) is det.
+:- pred get_float_exponent_3(io.input_stream::in, last_digit_is_underscore::in,
+    list(char)::in, token::out, io::di, io::uo) is det.

-get_float_exponent_3(Stream, !.RevChars, Token, !IO) :-
+get_float_exponent_3(Stream, !.LastDigit, !.RevChars, Token, !IO) :-
      io.read_char_unboxed(Stream, Result, Char, !IO),
      (
          Result = error(Error),
          Token = io_error(Error)
      ;
          Result = eof,
-        rev_char_list_to_float(!.RevChars, Token)
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            rev_char_list_to_float(!.RevChars, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated exponent in float token")
+        )
      ;
          Result = ok,
          ( if char.is_digit(Char) then
              !:RevChars = [Char | !.RevChars],
-            get_float_exponent_3(Stream, !.RevChars, Token, !IO)
+            !:LastDigit = last_digit_is_not_underscore,
+            get_float_exponent_3(Stream, !.LastDigit, !.RevChars, Token, !IO)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            get_float_exponent_3(Stream, !.LastDigit, !.RevChars, Token, !IO)
          else
              io.putback_char(Stream, Char, !IO),
-            rev_char_list_to_float(!.RevChars, Token)
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                rev_char_list_to_float(!.RevChars, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated exponent in float token")
+            )
          )
      ).

-:- pred string_get_float_exponent_3(string::in, int::in, posn::in,
-    token::out, string_token_context::out, posn::in, posn::out) is det.
+:- pred string_get_float_exponent_3(string::in, last_digit_is_underscore::in,
+    int::in, posn::in, token::out, string_token_context::out,
+    posn::in, posn::out) is det.

-string_get_float_exponent_3(String, Len, Posn0, Token, Context, !Posn) :-
+string_get_float_exponent_3(String, !.LastDigit, Len, Posn0, Token, Context,
+        !Posn) :-
      ( if string_read_char(String, Len, Char, !Posn) then
          ( if char.is_digit(Char) then
-            string_get_float_exponent_3(String, Len, Posn0, Token, Context,
-                !Posn)
+            !:LastDigit = last_digit_is_not_underscore,
+            string_get_float_exponent_3(String, !.LastDigit, Len, Posn0, Token,
+                Context, !Posn)
+        else if Char = '_' then
+            !:LastDigit = last_digit_is_underscore,
+            string_get_float_exponent_3(String, !.LastDigit, Len, Posn0, Token,
+                Context, !Posn)
          else
              string_ungetchar(String, !Posn),
-            grab_string(String, Posn0, FloatString, !Posn),
-            conv_to_float(FloatString, Token),
+            (
+                !.LastDigit = last_digit_is_not_underscore,
+                grab_float_string(String, Posn0, FloatString, !Posn),
+                conv_to_float(FloatString, Token)
+            ;
+                !.LastDigit = last_digit_is_underscore,
+                Token = error("unterminated exponent in float token")
+            ),
              string_get_context(Posn0, Context, !Posn)
          )
      else
-        grab_string(String, Posn0, FloatString, !Posn),
-        conv_to_float(FloatString, Token),
+        grab_float_string(String, Posn0, FloatString, !Posn),
+        (
+            !.LastDigit = last_digit_is_not_underscore,
+            conv_to_float(FloatString, Token)
+        ;
+            !.LastDigit = last_digit_is_underscore,
+            Token = error("unterminated exponent in float token")
+        ),
          string_get_context(Posn0, Context, !Posn)
      ).

@@ -2525,9 +2888,9 @@ rev_char_list_to_int(RevChars, Base, Token) :-

  conv_string_to_int(String, Base, Token) :-
      BaseInt = integer_base_int(Base),
-    ( if string.base_string_to_int(BaseInt, String, Int) then
+    ( if string.base_string_to_int_underscore(BaseInt, String, Int) then
          Token = integer(Int)
-    else if integer.from_base_string(BaseInt, String, Integer) then
+    else if integer.from_base_string_underscore(BaseInt, String, Integer) then
          Token = big_integer(Base, Integer)
      else
          Token = error("invalid character in int")
diff --git a/library/string.m b/library/string.m
index 1c4299a..cb70509 100644
--- a/library/string.m
+++ b/library/string.m
@@ -1377,6 +1377,16 @@
  :- interface.
  :- include_module format.
  :- include_module parse_util.
+
+    % Exported for use by lexer.m (XXX perhaps it ought to be defined in
+    % that module instead?)
+    %
+    % Like base_string_to_int/3, but allow for an arbitrary number of
+    % underscores between the other characters.  Leading and trailing
+    % underscores are allowed.
+    %
+:- pred base_string_to_int_underscore(int::in, string::in, int::out) is semidet.
+
  :- implementation.

  :- include_module parse_runtime.
@@ -5476,6 +5486,97 @@ accumulate_negative_int(Base, Char, N0, N) :-

  %---------------------%

+base_string_to_int_underscore(Base, String, Int) :-
+    index(String, 0, Char),
+    End = count_code_units(String),
+    ( if Char = ('-') then
+        End > 1,
+        foldl_between(base_negative_accumulator_underscore(Base), String, 1, End, 0, Int)
+    else if Char = ('+') then
+        End > 1,
+        foldl_between(base_accumulator_underscore(Base), String, 1, End, 0, Int)
+    else
+        foldl_between(base_accumulator_underscore(Base), String, 0, End, 0, Int)
+    ).
+
+:- func base_accumulator_underscore(int) = pred(char, int, int).
+:- mode base_accumulator_underscore(in) = out(pred(in, in, out) is semidet) is det.
+
+base_accumulator_underscore(Base) = Pred :-
+    % Avoid allocating a closure for the common bases. A more general, but
+    % finicky, way to avoid the allocation is to inline foldl_between so that
+    % the higher-order calls in base_string_to_int_underscore can be
+    % specialised. The redundant closures will also need to be deleted by
+    % unused argument elimination.
+    ( if Base = 10 then
+        Pred = accumulate_int_underscore(10)
+    else if Base = 16 then
+        Pred = accumulate_int_underscore(16)
+    else if Base = 8 then
+        Pred = accumulate_int_underscore(8)
+    else if Base = 2 then
+        Pred = accumulate_int_underscore(2)
+    else
+        Pred = accumulate_int_underscore(Base)
+    ).
+
+:- pred accumulate_int_underscore(int::in, char::in, int::in, int::out) is semidet.
+
+accumulate_int_underscore(Base, Char, N0, N) :-
+    ( if
+        char.base_digit_to_int(Base, Char, M)
+    then
+        N = (Base * N0) + M,
+        % Fail on overflow.
+        % XXX depends on undefined behaviour
+        N0 =< N
+    else if
+        Char = '_'
+    then
+        N = N0
+    else
+        false
+    ).
+
+:- func base_negative_accumulator_underscore(int) = pred(char, int, int).
+:- mode base_negative_accumulator_underscore(in) = out(pred(in, in, out) is semidet)
+    is det.
+
+base_negative_accumulator_underscore(Base) = Pred :-
+    % Avoid allocating a closure for the common bases.
+    ( if Base = 10 then
+        Pred = accumulate_negative_int_underscore(10)
+    else if Base = 16 then
+        Pred = accumulate_negative_int_underscore(16)
+    else if Base = 8 then
+        Pred = accumulate_negative_int_underscore(8)
+    else if Base = 2 then
+        Pred = accumulate_negative_int_underscore(2)
+    else
+        Pred = accumulate_negative_int_underscore(Base)
+    ).
+
+:- pred accumulate_negative_int_underscore(int::in, char::in,
+    int::in, int::out) is semidet.
+
+accumulate_negative_int_underscore(Base, Char, N0, N) :-
+    ( if
+        char.base_digit_to_int(Base, Char, M)
+    then
+        N = (Base * N0) - M,
+        % Fail on overflow.
+        % XXX depends on undefined behaviour
+        N =< N0
+    else if
+        Char = '_'
+    then
+        N = N0
+    else
+        false
+    ).
+
+%---------------------%
+
  :- pragma foreign_export("C", to_float(in, out),
      "ML_string_to_float").

diff --git a/tests/hard_coded/Mmakefile b/tests/hard_coded/Mmakefile
index 3be8400..9dab3ae 100644
--- a/tests/hard_coded/Mmakefile
+++ b/tests/hard_coded/Mmakefile
@@ -226,6 +226,8 @@ ORDINARY_PROGS =	\
  	pack_args_copy \
  	pack_args_float \
  	pack_args_intermod1 \
+	parse_number_from_io \
+	parse_number_from_string \
  	ppc_bug \
  	pprint_test \
  	pprint_test2 \
diff --git a/tests/hard_coded/parse_number_from_io.exp b/tests/hard_coded/parse_number_from_io.exp
index e69de29..b4aa73e 100644
--- a/tests/hard_coded/parse_number_from_io.exp
+++ b/tests/hard_coded/parse_number_from_io.exp
@@ -0,0 +1,52 @@
+Decimal:
+0
+0
+10
+-10
+10
+-10
+
+Binary:
+0
+0
+1
+-1
+68
+-68
+
+Octal:
+511
+-511
+511
+-511
+511
+-511
+511
+-511
+
+Hexadecimal:
+255
+-255
+255
+-255
+255
+-255
+4095
+
+Float:
+0.123
+-0.123
+0.123
+-0.123
+1.123
+-1.123
+12.123
+-12.123
+12.123
+-12.123
+12300000000000.0
+12300000000000.0
+1200000000000.0
+1200000000000.0
+1.2e-10
+1.2e-10
diff --git a/tests/hard_coded/parse_number_from_io.m b/tests/hard_coded/parse_number_from_io.m
index e69de29..e51b9c0 100644
--- a/tests/hard_coded/parse_number_from_io.m
+++ b/tests/hard_coded/parse_number_from_io.m
@@ -0,0 +1,70 @@
+%---------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et
+%---------------------------------------------------------------------------%
+
+:- module parse_number_from_io.
+:- interface.
+
+:- import_module io.
+
+:- pred main(io::di, io::uo) is det.
+
+:- implementation.
+
+main(!IO) :-
+    io.print_line("Decimal:", !IO),
+    io.print_line(0, !IO),
+    io.print_line(-0, !IO),
+    io.print_line(10, !IO),
+    io.print_line(-10, !IO),
+    io.print_line(1_0, !IO),
+    io.print_line(-1_0, !IO),
+    io.nl(!IO),
+
+    io.print_line("Binary:", !IO),
+    io.print_line(0b0, !IO),
+    io.print_line(-0b0, !IO),
+    io.print_line(0b_1, !IO),
+    io.print_line(-0b_1, !IO),
+    io.print_line(0b_1000_100, !IO),
+    io.print_line(-0b_1000_100, !IO),
+    io.nl(!IO),
+
+    io.print_line("Octal:", !IO),
+    io.print_line(0o777, !IO),
+    io.print_line(-0o777, !IO),
+    io.print_line(0o_777, !IO),
+    io.print_line(-0o_777, !IO),
+    io.print_line(0o_7_7_7, !IO),
+    io.print_line(-0o_7_7_7, !IO),
+    io.print_line(0o_7__7___7, !IO),
+    io.print_line(-0o_7__7___7, !IO),
+    io.nl(!IO),
+
+    io.print_line("Hexadecimal:", !IO),
+    io.print_line(0xff, !IO),
+    io.print_line(-0xff, !IO),
+    io.print_line(0x_ff, !IO),
+    io.print_line(-0x_ff, !IO),
+    io.print_line(0xf_f, !IO),
+    io.print_line(-0xf_f, !IO),
+    io.print_line(0x_f_f__f, !IO),
+    io.nl(!IO),
+
+    io.print_line("Float:", !IO),
+    io.print_line(0.123, !IO),
+    io.print_line(-0.123, !IO),
+    io.print_line(0.1_2__3, !IO),
+    io.print_line(-0.1_2__3, !IO),
+    io.print_line(1.123, !IO),
+    io.print_line(-1.123, !IO),
+    io.print_line(1_2.123, !IO),
+    io.print_line(-1_2.123, !IO),
+    io.print_line(1__2.1_2__3, !IO),
+    io.print_line(-1__2.1_2__3, !IO),
+    io.print_line(1_2_3e1_1, !IO),
+    io.print_line(1_2_3E1_1, !IO),
+    io.print_line(1_2e+1_1, !IO),
+    io.print_line(1_2E+1_1, !IO),
+    io.print_line(1_2e-1_1, !IO),
+    io.print_line(1_2E-1_1, !IO).
diff --git a/tests/hard_coded/parse_number_from_string.exp b/tests/hard_coded/parse_number_from_string.exp
index e69de29..aab148d 100644
--- a/tests/hard_coded/parse_number_from_string.exp
+++ b/tests/hard_coded/parse_number_from_string.exp
@@ -0,0 +1,105 @@
+Valid decimal literals:
+read_term("0.") = functor(integer(0), [], context("", 1))
+read_term("-0.") = functor(integer(0), [], context("", 1))
+read_term("10.") = functor(integer(10), [], context("", 1))
+read_term("-10.") = functor(integer(-10), [], context("", 1))
+read_term("1_0.") = functor(integer(10), [], context("", 1))
+read_term("-1_0.") = functor(integer(-10), [], context("", 1))
+read_term("1_000_000_000_000_000_000_000.") = functor(big_integer(base_10, i(5, [13877, 12907, 7261, 14976, 0])), [], context("", 1))
+read_term("-1_000_000_000_000_000_000_000.") = functor(atom("-"), [functor(big_integer(base_10, i(5, [13877, 12907, 7261, 14976, 0])), [], context("", 1))], context("", 1))
+
+Invalid decimal literals:
+read_term("123_.") = Syntax error: unterminated decimal constant
+read_term("-123_.") = Syntax error: unterminated decimal constant
+read_term("-_123") = Syntax error: operator or `.' expected
+
+Valid binary literals:
+read_term("0b0.") = functor(integer(0), [], context("", 1))
+read_term("-0b0.") = functor(integer(0), [], context("", 1))
+read_term("0b_1.") = functor(integer(1), [], context("", 1))
+read_term("-0b_1.") = functor(integer(-1), [], context("", 1))
+read_term("0b_1000_100.") = functor(integer(68), [], context("", 1))
+read_term("-0b_1000_100.") = functor(integer(-68), [], context("", 1))
+
+Invalid binary literals:
+read_term("0b.") = Syntax error: unterminated binary constant
+read_term("-0b.") = Syntax error: unterminated binary constant
+read_term("0b_.") = Syntax error: unterminated binary constant
+read_term("-0b_.") = Syntax error: unterminated binary constant
+read_term("0b11_.") = Syntax error: unterminated binary constant
+read_term("-0b11_.") = Syntax error: unterminated binary constant
+
+Valid octal literals:
+read_term("0o77.") = functor(integer(63), [], context("", 1))
+read_term("-0o77.") = functor(integer(-63), [], context("", 1))
+read_term("0o_77.") = functor(integer(63), [], context("", 1))
+read_term("-0o_77.") = functor(integer(-63), [], context("", 1))
+read_term("0o_7_7.") = functor(integer(63), [], context("", 1))
+read_term("-0o_7_7.") = functor(integer(-63), [], context("", 1))
+read_term("0o_7__7___7.") = functor(integer(511), [], context("", 1))
+read_term("-0o_7__7___7.") = functor(integer(-511), [], context("", 1))
+
+Invalid octal literals:
+read_term("0o.") = Syntax error: unterminated octal constant
+read_term("-0o") = Syntax error: unterminated octal constant
+read_term("0o_.") = Syntax error: unterminated octal constant
+read_term("-0o_.") = Syntax error: unterminated octal constant
+read_term("0o77_.") = Syntax error: unterminated octal constant
+read_term("-0o77_.") = Syntax error: unterminated octal constant
+
+Valid hexadecimal literals:
+read_term("0xff.") = functor(integer(255), [], context("", 1))
+read_term("-0xff.") = functor(integer(-255), [], context("", 1))
+read_term("0x_ff.") = functor(integer(255), [], context("", 1))
+read_term("-0x_ff.") = functor(integer(-255), [], context("", 1))
+read_term("0xf_f.") = functor(integer(255), [], context("", 1))
+read_term("-0xf_f.") = functor(integer(-255), [], context("", 1))
+read_term("0x_f_f__f.") = functor(integer(4095), [], context("", 1))
+read_term("-0x_f_f__f.") = functor(integer(-4095), [], context("", 1))
+read_term("0xfffffffffffffffffffffffff.") = functor(big_integer(base_16, i(8, [3, 16383, 16383, 16383, 16383, 16383, 16383, 16383])), [], context("", 1))
+read_term("-0xfffffffffffffffffffffffff.") = functor(atom("-"), [functor(big_integer(base_16, i(8, [3, 16383, 16383, 16383, 16383, 16383, 16383, 16383])), [], context("", 1))], context("", 1))
+
+Invalid hexadecimal literals:
+read_term("0x.") = Syntax error: unterminated hex constant
+read_term("-0x.") = Syntax error: unterminated hex constant
+read_term("0x_.") = Syntax error: unterminated hex constant
+read_term("-0x_.") = Syntax error: unterminated hex constant
+read_term("0xff_.") = Syntax error: unterminated hex constant
+read_term("-0xff_.") = Syntax error: unterminated hex constant
+
+Valid float literals:
+read_term("0.123.") = functor(float(0.123), [], context("", 1))
+read_term("-0.123.") = functor(float(-0.123), [], context("", 1))
+read_term("0.1_2__3.") = functor(float(0.123), [], context("", 1))
+read_term("-0.1_2__3.") = functor(float(-0.123), [], context("", 1))
+read_term("1.123.") = functor(float(1.123), [], context("", 1))
+read_term("-1.123.") = functor(float(-1.123), [], context("", 1))
+read_term("1_2.123.") = functor(float(12.123), [], context("", 1))
+read_term("-1_2.123.") = functor(float(-12.123), [], context("", 1))
+read_term("1__2.1_2__3.") = functor(float(12.123), [], context("", 1))
+read_term("-1__2.1_2__3.") = functor(float(-12.123), [], context("", 1))
+read_term("1_2_3e1_1.") = functor(float(12300000000000.0), [], context("", 1))
+read_term("1_2_3E1_1.") = functor(float(12300000000000.0), [], context("", 1))
+read_term("1_2e+1_1.") = functor(float(1200000000000.0), [], context("", 1))
+read_term("1_2E+1_1.") = functor(float(1200000000000.0), [], context("", 1))
+read_term("1_2e-1_1.") = functor(float(1.2e-10), [], context("", 1))
+read_term("1_2E-1_1.") = functor(float(1.2e-10), [], context("", 1))
+
+Invalid float literals:
+read_term("1_2_3.1_2_3_.") = Syntax error: fractional part of float terminated by underscore
+read_term("1_2_3e1_2_3_.") = Syntax error: unterminated exponent in float token
+read_term("123_._123.") = Syntax error: unterminated decimal constant
+read_term("123._123.") = Syntax error: underscore following decimal point
+read_term("123_.123.") = Syntax error: unterminated decimal constant
+read_term("123_e12.") = Syntax error: underscore before exponent
+read_term("123_E12.") = Syntax error: underscore before exponent
+read_term("123e_12.") = Syntax error: unterminated exponent in float token
+read_term("123E_12.") = Syntax error: unterminated exponent in float token
+read_term("123e12_.") = Syntax error: unterminated exponent in float token
+read_term("123E12_.") = Syntax error: unterminated exponent in float token
+read_term("12_e11.") = Syntax error: underscore before exponent
+read_term("12_E11.") = Syntax error: underscore before exponent
+read_term("123.12e-_12.") = Syntax error: unterminated exponent in float token
+read_term("123.12e+_12.") = Syntax error: unterminated exponent in float token
+read_term("123.12e12_.") = Syntax error: unterminated exponent in float token
+read_term("123.12E12_.") = Syntax error: unterminated exponent in float token
diff --git a/tests/hard_coded/parse_number_from_string.m b/tests/hard_coded/parse_number_from_string.m
index e69de29..527532a 100644
--- a/tests/hard_coded/parse_number_from_string.m
+++ b/tests/hard_coded/parse_number_from_string.m
@@ -0,0 +1,201 @@
+%---------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et
+%---------------------------------------------------------------------------%
+
+% Test the parsing of numeric literals from strings.
+
+:- module parse_number_from_string.
+:- interface.
+
+:- import_module io.
+
+:- pred main(io::di, io::uo) is det.
+
+%---------------------------------------------------------------------------%
+%---------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module list.
+:- import_module term.
+:- import_module term_io.
+:- import_module parser.
+:- import_module string.
+
+%---------------------------------------------------------------------------%
+
+main(!IO) :-
+    io.print_line("Valid decimal literals:", !IO),
+    list.foldl(run_test, valid_decimal_cases, !IO),
+    io.print_line("\nInvalid decimal literals:", !IO),
+    list.foldl(run_test, invalid_decimal_cases, !IO),
+    io.print_line("\nValid binary literals:", !IO),
+    list.foldl(run_test, valid_binary_cases, !IO),
+    io.print_line("\nInvalid binary literals:", !IO),
+    list.foldl(run_test, invalid_binary_cases, !IO),
+    io.print_line("\nValid octal literals:", !IO),
+    list.foldl(run_test, valid_octal_cases, !IO),
+    io.print_line("\nInvalid octal literals:", !IO),
+    list.foldl(run_test, invalid_octal_cases, !IO),
+    io.print_line("\nValid hexadecimal literals:", !IO),
+    list.foldl(run_test, valid_hex_cases, !IO),
+    io.print_line("\nInvalid hexadecimal literals:", !IO),
+    list.foldl(run_test, invalid_hex_cases, !IO),
+    io.print_line("\nValid float literals:", !IO),
+    list.foldl(run_test, valid_float_cases, !IO),
+    io.print_line("\nInvalid float literals:", !IO),
+    list.foldl(run_test, invalid_float_cases, !IO).
+
+:- pred run_test(string::in, io::di, io::uo) is det.
+
+run_test(String, !IO) :-
+    io.format("read_term(\"%s\") = ", [s(String)], !IO),
+    read_term_from_string("", String, _Posn, Result : read_term),
+    (
+        Result = eof,
+        io.print_line("<<eof>>", !IO)
+    ;
+        Result = error(Msg, _),
+        io.print_line(Msg, !IO)
+    ;
+        Result = term(_Varset, Term),
+        io.print_line(Term, !IO)
+    ).
+
+%---------------------------------------------------------------------------%
+
+:- func valid_decimal_cases = list(string).
+
+valid_decimal_cases = [
+    "0.",
+    "-0.",
+    "10.",
+    "-10.",
+    "1_0.",
+    "-1_0.",
+    "1_000_000_000_000_000_000_000.",
+    "-1_000_000_000_000_000_000_000."
+].
+
+:- func invalid_decimal_cases = list(string).
+
+invalid_decimal_cases = [
+    "123_.",
+    "-123_.",
+    "-_123"
+].
+
+:- func valid_binary_cases = list(string).
+
+valid_binary_cases = [
+    "0b0.",
+    "-0b0.",
+    "0b_1.",
+    "-0b_1.",
+    "0b_1000_100.",
+    "-0b_1000_100."
+].
+
+:- func invalid_binary_cases = list(string).
+
+invalid_binary_cases = [
+    "0b.",
+    "-0b.",
+    "0b_.",
+    "-0b_.",
+    "0b11_.",
+    "-0b11_."
+].
+
+:- func valid_octal_cases = list(string).
+
+valid_octal_cases = [
+    "0o77.",
+    "-0o77.",
+    "0o_77.",
+    "-0o_77.",
+    "0o_7_7.",
+    "-0o_7_7.",
+    "0o_7__7___7.",
+    "-0o_7__7___7."
+].
+
+:- func invalid_octal_cases = list(string).
+
+invalid_octal_cases = [
+    "0o.",
+    "-0o",
+    "0o_.",
+    "-0o_.",
+    "0o77_.",
+    "-0o77_."
+].
+
+:- func valid_hex_cases = list(string).
+
+valid_hex_cases = [
+    "0xff.",
+    "-0xff.",
+    "0x_ff.",
+    "-0x_ff.",
+    "0xf_f.",
+    "-0xf_f.",
+    "0x_f_f__f.",
+    "-0x_f_f__f.",
+    "0xfffffffffffffffffffffffff.",
+    "-0xfffffffffffffffffffffffff."
+].
+
+:- func invalid_hex_cases = list(string).
+
+invalid_hex_cases = [
+    "0x.",
+    "-0x.",
+    "0x_.",
+    "-0x_.",
+    "0xff_.",
+    "-0xff_."
+].
+
+:- func valid_float_cases = list(string).
+
+valid_float_cases = [
+    "0.123.",
+    "-0.123.",
+    "0.1_2__3.",
+    "-0.1_2__3.",
+    "1.123.",
+    "-1.123.",
+    "1_2.123.",
+    "-1_2.123.",
+    "1__2.1_2__3.",
+    "-1__2.1_2__3.",
+    "1_2_3e1_1.",
+    "1_2_3E1_1.",
+    "1_2e+1_1.",
+    "1_2E+1_1.",
+    "1_2e-1_1.",
+    "1_2E-1_1."
+].
+
+:- func invalid_float_cases = list(string).
+
+invalid_float_cases = [
+    "1_2_3.1_2_3_.",
+    "1_2_3e1_2_3_.",
+    "123_._123.",
+    "123._123.",
+    "123_.123.",
+    "123_e12.",
+    "123_E12.",
+    "123e_12.",
+    "123E_12.",
+    "123e12_.",
+    "123E12_.",
+    "12_e11.",
+    "12_E11.",
+    "123.12e-_12.",
+    "123.12e+_12.",
+    "123.12e12_.",
+    "123.12E12_."
+].
diff --git a/tests/invalid/Mmakefile b/tests/invalid/Mmakefile
index 79dd91e..0ca0932 100644
--- a/tests/invalid/Mmakefile
+++ b/tests/invalid/Mmakefile
@@ -171,6 +171,11 @@ SINGLEMODULE= \
  	instance_bug \
  	instance_no_type \
  	instance_var_bug \
+	invalid_binary_literal \
+	invalid_decimal_literal \
+	invalid_float_literal \
+	invalid_hex_literal \
+	invalid_octal_literal \
  	invalid_event \
  	invalid_export_detism \
  	invalid_instance_declarations \
diff --git a/tests/invalid/invalid_binary_literal.err_exp b/tests/invalid/invalid_binary_literal.err_exp
index e69de29..c6b6770 100644
--- a/tests/invalid/invalid_binary_literal.err_exp
+++ b/tests/invalid/invalid_binary_literal.err_exp
@@ -0,0 +1,30 @@
+invalid_binary_literal.m:006: Error: no clauses for function `foo1'/0.
+invalid_binary_literal.m:007: Error: no clauses for function `foo2'/0.
+invalid_binary_literal.m:008: Error: no clauses for function `foo3'/0.
+invalid_binary_literal.m:009: Error: no clauses for function `foo4'/0.
+invalid_binary_literal.m:010: Error: no clauses for function `foo5'/0.
+invalid_binary_literal.m:011: Error: no clauses for function `foo6'/0.
+invalid_binary_literal.m:012: Error: no clauses for function `foo7'/0.
+invalid_binary_literal.m:013: Error: no clauses for function `foo8'/0.
+invalid_binary_literal.m:017: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:017:   start of (sub)term.
+invalid_binary_literal.m:017: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:019: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:019:   start of (sub)term.
+invalid_binary_literal.m:019: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:021: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:021:   start of (sub)term.
+invalid_binary_literal.m:021: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:023: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:023:   start of (sub)term.
+invalid_binary_literal.m:023: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:025: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:025:   start of (sub)term.
+invalid_binary_literal.m:025: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:027: Syntax error at token `. ': unexpected token at
+invalid_binary_literal.m:027:   start of (sub)term.
+invalid_binary_literal.m:027: Syntax error: unterminated binary constant.
+invalid_binary_literal.m:029: Syntax error at variable `_b11': operator or `.'
+invalid_binary_literal.m:029:   expected.
+invalid_binary_literal.m:031: Syntax error at variable `_b11': operator or `.'
+invalid_binary_literal.m:031:   expected.
diff --git a/tests/invalid/invalid_binary_literal.m b/tests/invalid/invalid_binary_literal.m
index e69de29..6e1ea2f 100644
--- a/tests/invalid/invalid_binary_literal.m
+++ b/tests/invalid/invalid_binary_literal.m
@@ -0,0 +1,31 @@
+% Test parsing of invalid binary literals.
+
+:- module invalid_binary_literal.
+:- interface.
+
+:- func foo1 = int.
+:- func foo2 = int.
+:- func foo3 = int.
+:- func foo4 = int.
+:- func foo5 = int.
+:- func foo6 = int.
+:- func foo7 = int.
+:- func foo8 = int.
+
+:- implementation.
+
+foo1 = 0b.
+
+foo2 = -0b.
+
+foo3 = 0b_.
+
+foo4 = -0b_.
+
+foo5 = 0b11_.
+
+foo6 = -0b11_.
+
+foo7 = 0_b11.
+
+foo8 = -0_b11.
diff --git a/tests/invalid/invalid_decimal_literal.err_exp b/tests/invalid/invalid_decimal_literal.err_exp
index e69de29..1ab2c10 100644
--- a/tests/invalid/invalid_decimal_literal.err_exp
+++ b/tests/invalid/invalid_decimal_literal.err_exp
@@ -0,0 +1,4 @@
+invalid_decimal_literal.m:006: Error: no clauses for function `foo1'/0.
+invalid_decimal_literal.m:007: Error: no clauses for function `foo2'/0.
+invalid_decimal_literal.m:011: Syntax error: unterminated decimal constant.
+invalid_decimal_literal.m:013: Syntax error: unterminated decimal constant.
diff --git a/tests/invalid/invalid_decimal_literal.m b/tests/invalid/invalid_decimal_literal.m
index e69de29..f973d22 100644
--- a/tests/invalid/invalid_decimal_literal.m
+++ b/tests/invalid/invalid_decimal_literal.m
@@ -0,0 +1,13 @@
+% Test parsing of invalid decimal integer literals.
+
+:- module invalid_decimal_literal.
+:- interface.
+
+:- func foo1 = int.
+:- func foo2 = int.
+
+:- implementation.
+
+foo1 = 561_.
+
+foo2 = -444_.
diff --git a/tests/invalid/invalid_float_literal.err_exp b/tests/invalid/invalid_float_literal.err_exp
index e69de29..85e27cc 100644
--- a/tests/invalid/invalid_float_literal.err_exp
+++ b/tests/invalid/invalid_float_literal.err_exp
@@ -0,0 +1,51 @@
+invalid_float_literal.m:006: Error: no clauses for function `foo1'/0.
+invalid_float_literal.m:007: Error: no clauses for function `foo2'/0.
+invalid_float_literal.m:008: Error: no clauses for function `foo3'/0.
+invalid_float_literal.m:009: Error: no clauses for function `foo4'/0.
+invalid_float_literal.m:010: Error: no clauses for function `foo5'/0.
+invalid_float_literal.m:011: Error: no clauses for function `foo6'/0.
+invalid_float_literal.m:012: Error: no clauses for function `foo7'/0.
+invalid_float_literal.m:013: Error: no clauses for function `foo8'/0.
+invalid_float_literal.m:014: Error: no clauses for function `foo9'/0.
+invalid_float_literal.m:015: Error: no clauses for function `foo10'/0.
+invalid_float_literal.m:016: Error: no clauses for function `foo11'/0.
+invalid_float_literal.m:017: Error: no clauses for function `foo12'/0.
+invalid_float_literal.m:021: Syntax error at token `. ': unexpected token at
+invalid_float_literal.m:021:   start of (sub)term.
+invalid_float_literal.m:021: Syntax error: fractional part of float terminated
+invalid_float_literal.m:021:   by underscore.
+invalid_float_literal.m:023: Syntax error at token `. ': unexpected token at
+invalid_float_literal.m:023:   start of (sub)term.
+invalid_float_literal.m:023: Syntax error: unterminated exponent in float
+invalid_float_literal.m:023:   token.
+invalid_float_literal.m:025: In clause head: error: atom expected at 12.
+invalid_float_literal.m:025: Syntax error: underscore before exponent.
+invalid_float_literal.m:027: In clause head: error: atom expected at _12.
+invalid_float_literal.m:027: Syntax error: unterminated exponent in float
+invalid_float_literal.m:027:   token.
+invalid_float_literal.m:029: In clause head: error: atom expected at _123.
+invalid_float_literal.m:029: Syntax error: unterminated decimal constant.
+invalid_float_literal.m:031: In clause head: error: atom expected at 123.
+invalid_float_literal.m:031: Syntax error: underscore following decimal point.
+invalid_float_literal.m:033: In clause head: error: atom expected at 123.
+invalid_float_literal.m:033: Syntax error: unterminated decimal constant.
+invalid_float_literal.m:035: Syntax error: unterminated exponent in float
+invalid_float_literal.m:035:   token.
+invalid_float_literal.m:035: Error: clause for predicate
+invalid_float_literal.m:035:   `invalid_float_literal.-'/2
+invalid_float_literal.m:035:   without corresponding `:- pred' declaration.
+invalid_float_literal.m:035: Inferred :- pred -(T1, int).
+invalid_float_literal.m:037: Syntax error: unterminated exponent in float
+invalid_float_literal.m:037:   token.
+invalid_float_literal.m:037: Error: clause for predicate
+invalid_float_literal.m:037:   `invalid_float_literal.+'/2
+invalid_float_literal.m:037:   without corresponding `:- pred' declaration.
+invalid_float_literal.m:037: Inferred :- pred +(T1, int).
+invalid_float_literal.m:039: In clause head: error: atom expected at 12.
+invalid_float_literal.m:039: Syntax error: underscore before exponent.
+invalid_float_literal.m:041: In clause head: error: atom expected at _12.
+invalid_float_literal.m:041: Syntax error: unterminated exponent in float
+invalid_float_literal.m:041:   token.
+invalid_float_literal.m:043: In clause head: error: atom expected at _12.
+invalid_float_literal.m:043: Syntax error: unterminated exponent in float
+invalid_float_literal.m:043:   token.
diff --git a/tests/invalid/invalid_float_literal.m b/tests/invalid/invalid_float_literal.m
index e69de29..56b9feb 100644
--- a/tests/invalid/invalid_float_literal.m
+++ b/tests/invalid/invalid_float_literal.m
@@ -0,0 +1,43 @@
+% Test parsing of invalid float literals.
+
+:- module invalid_float_literal.
+:- interface.
+
+:- func foo1 = float.
+:- func foo2 = float.
+:- func foo3 = float.
+:- func foo4 = float.
+:- func foo5 = float.
+:- func foo6 = float.
+:- func foo7 = float.
+:- func foo8 = float.
+:- func foo9 = float.
+:- func foo10 = float.
+:- func foo11 = float.
+:- func foo12 = float.
+
+:- implementation.
+
+foo1 = 1_2_3.1_2_3_.
+
+foo2 = 1_2_3e1_2_3_.
+
+foo3 = 123_e12.
+
+foo4 = 123e_12.
+
+foo5 = 123_._123.
+
+foo6 = 123._123.
+
+foo7 = 123_.123.
+
+foo8 = 123.12e_-12.
+
+foo9 = 123.12e_+12.
+
+foo10 = 123_e12.
+
+foo11 = 123.12e-_12.
+
+foo12 = 123.12e+_12.
diff --git a/tests/invalid/invalid_hex_literal.err_exp b/tests/invalid/invalid_hex_literal.err_exp
index e69de29..cd402e3 100644
--- a/tests/invalid/invalid_hex_literal.err_exp
+++ b/tests/invalid/invalid_hex_literal.err_exp
@@ -0,0 +1,30 @@
+invalid_hex_literal.m:006: Error: no clauses for function `foo1'/0.
+invalid_hex_literal.m:007: Error: no clauses for function `foo2'/0.
+invalid_hex_literal.m:008: Error: no clauses for function `foo3'/0.
+invalid_hex_literal.m:009: Error: no clauses for function `foo4'/0.
+invalid_hex_literal.m:010: Error: no clauses for function `foo5'/0.
+invalid_hex_literal.m:011: Error: no clauses for function `foo6'/0.
+invalid_hex_literal.m:012: Error: no clauses for function `foo7'/0.
+invalid_hex_literal.m:013: Error: no clauses for function `foo8'/0.
+invalid_hex_literal.m:017: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:017:   start of (sub)term.
+invalid_hex_literal.m:017: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:019: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:019:   start of (sub)term.
+invalid_hex_literal.m:019: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:021: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:021:   start of (sub)term.
+invalid_hex_literal.m:021: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:023: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:023:   start of (sub)term.
+invalid_hex_literal.m:023: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:025: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:025:   start of (sub)term.
+invalid_hex_literal.m:025: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:027: Syntax error at token `. ': unexpected token at
+invalid_hex_literal.m:027:   start of (sub)term.
+invalid_hex_literal.m:027: Syntax error: unterminated hex constant.
+invalid_hex_literal.m:029: Syntax error at variable `_xff': operator or `.'
+invalid_hex_literal.m:029:   expected.
+invalid_hex_literal.m:031: Syntax error at variable `_xff': operator or `.'
+invalid_hex_literal.m:031:   expected.
diff --git a/tests/invalid/invalid_hex_literal.m b/tests/invalid/invalid_hex_literal.m
index e69de29..84743fd 100644
--- a/tests/invalid/invalid_hex_literal.m
+++ b/tests/invalid/invalid_hex_literal.m
@@ -0,0 +1,31 @@
+% Test parsing of invalid hex integer literals.
+
+:- module invalid_hex_literal.
+:- interface.
+
+:- func foo1 = int.
+:- func foo2 = int.
+:- func foo3 = int.
+:- func foo4 = int.
+:- func foo5 = int.
+:- func foo6 = int.
+:- func foo7 = int.
+:- func foo8 = int.
+
+:- implementation.
+
+foo1 = 0x.
+
+foo2 = -0x.
+
+foo3 = 0x_.
+
+foo4 = -0x_.
+
+foo5 = 0xff_.
+
+foo6 = -0xff_.
+
+foo7 = 0_xff.
+
+foo8 = -0_xff.
diff --git a/tests/invalid/invalid_octal_literal.err_exp b/tests/invalid/invalid_octal_literal.err_exp
index e69de29..d84cd8f 100644
--- a/tests/invalid/invalid_octal_literal.err_exp
+++ b/tests/invalid/invalid_octal_literal.err_exp
@@ -0,0 +1,30 @@
+invalid_octal_literal.m:006: Error: no clauses for function `foo1'/0.
+invalid_octal_literal.m:007: Error: no clauses for function `foo2'/0.
+invalid_octal_literal.m:008: Error: no clauses for function `foo3'/0.
+invalid_octal_literal.m:009: Error: no clauses for function `foo4'/0.
+invalid_octal_literal.m:010: Error: no clauses for function `foo5'/0.
+invalid_octal_literal.m:011: Error: no clauses for function `foo6'/0.
+invalid_octal_literal.m:012: Error: no clauses for function `foo7'/0.
+invalid_octal_literal.m:013: Error: no clauses for function `foo8'/0.
+invalid_octal_literal.m:017: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:017:   start of (sub)term.
+invalid_octal_literal.m:017: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:019: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:019:   start of (sub)term.
+invalid_octal_literal.m:019: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:021: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:021:   start of (sub)term.
+invalid_octal_literal.m:021: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:023: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:023:   start of (sub)term.
+invalid_octal_literal.m:023: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:025: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:025:   start of (sub)term.
+invalid_octal_literal.m:025: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:027: Syntax error at token `. ': unexpected token at
+invalid_octal_literal.m:027:   start of (sub)term.
+invalid_octal_literal.m:027: Syntax error: unterminated octal constant.
+invalid_octal_literal.m:029: Syntax error at variable `_o77': operator or `.'
+invalid_octal_literal.m:029:   expected.
+invalid_octal_literal.m:031: Syntax error at variable `_o77': operator or `.'
+invalid_octal_literal.m:031:   expected.
diff --git a/tests/invalid/invalid_octal_literal.m b/tests/invalid/invalid_octal_literal.m
index e69de29..5288018 100644
--- a/tests/invalid/invalid_octal_literal.m
+++ b/tests/invalid/invalid_octal_literal.m
@@ -0,0 +1,31 @@
+% Test parsing of invalid octal integer literals.
+
+:- module invalid_octal_literal.
+:- interface.
+
+:- func foo1 = int.
+:- func foo2 = int.
+:- func foo3 = int.
+:- func foo4 = int.
+:- func foo5 = int.
+:- func foo6 = int.
+:- func foo7 = int.
+:- func foo8 = int.
+
+:- implementation.
+
+foo1 = 0o.
+
+foo2 = -0o.
+
+foo3 = 0o_.
+
+foo4 = -0o_.
+
+foo5 = 0o77_.
+
+foo6 = -0o77_.
+
+foo7 = 0_o77.
+
+foo8 = -0_o77.
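
For reviewers who want to experiment with the placement rule outside the
lexer, here is a minimal standalone sketch of the "underscores only between
digits" check for a plain decimal digit sequence. It is not the code in
library/lexer.m; the module name underscore_sketch and the predicates
valid_digit_group/1 and is_digit_or_underscore/1 are hypothetical names
introduced only for illustration, and radix prefixes, signs and float syntax
are deliberately ignored.

```mercury
% Hypothetical sketch, not part of the diff above.
:- module underscore_sketch.
:- interface.

:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

:- import_module char.
:- import_module list.
:- import_module string.

main(!IO) :-
    list.foldl(check, ["1_000", "1__0", "123_", "_123"], !IO).

:- pred check(string::in, io::di, io::uo) is det.

check(Str, !IO) :-
    ( if valid_digit_group(Str) then
        Verdict = "accepted"
    else
        Verdict = "rejected"
    ),
    io.format("%s: %s\n", [s(Str), s(Verdict)], !IO).

    % Succeed iff Str is non-empty, starts and ends with a decimal digit,
    % and otherwise contains only decimal digits and underscores.
    % (Assumption: this only approximates the rule for one digit sequence.)
    %
:- pred valid_digit_group(string::in) is semidet.

valid_digit_group(Str) :-
    Chars = string.to_char_list(Str),
    Chars = [First | _],
    char.is_digit(First),
    list.last(Chars, Last),
    char.is_digit(Last),
    list.all_true(is_digit_or_underscore, Chars).

:- pred is_digit_or_underscore(char::in) is semidet.

is_digit_or_underscore(Char) :-
    ( char.is_digit(Char)
    ; Char = ('_')
    ).
```

With the four inputs in main, the sketch accepts "1_000" and "1__0" and
rejects "123_" and "_123". Note that the real lexer treats a leading
underscore as the start of a variable rather than as a malformed number,
which is why the invalid tests above report errors such as
"Syntax error at variable `_b11'".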

