[m-rev.] diff: allow UTF-8 in Java target files

Peter Wang novalazy at gmail.com
Mon Aug 24 17:30:45 AEST 2009


Branches: main

compiler/c_util.m:
        Allow UTF-8 string literals in Mercury source files to be written to
        Java target files unscathed.  Previously, each UTF-8 code unit that
        was part of a multi-byte sequence (i.e. value > 127) was escaped
        individually.  This assumes the compiler is built in a C grade.
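
        A rough illustration of the problem, as a Python model rather than
        the Mercury code itself.  The names quote_bytes and java_reads are
        hypothetical, and the octal escape form used for the old behaviour
        is an assumption about what escape_any_char emits.  The sketch
        treats a string as a sequence of 8-bit code units, as in a C grade,
        and compares escaping each high byte with passing it through.

```python
import re


def quote_bytes(src: bytes, passthrough_high: bool) -> bytes:
    """Quote 8-bit code units for a Java string literal (simplified model)."""
    out = bytearray()
    for b in src:
        if b in (0x22, 0x5C):
            # '"' and '\' get a backslash escape.
            out += b"\\" + bytes([b])
        elif 0x20 <= b < 0x7F:
            # Printable ASCII is written as-is.
            out.append(b)
        elif b >= 0x80 and passthrough_high:
            # New behaviour: leave UTF-8 code units untouched.
            out.append(b)
        else:
            # Assumed old behaviour: a three-digit octal escape per byte.
            out += b"\\%03o" % b
    return bytes(out)


def java_reads(literal: bytes) -> str:
    """Model how a Java compiler reading UTF-8 source decodes the literal."""
    text = literal.decode("utf-8")

    def unescape(m: re.Match) -> str:
        body = m.group(1)
        # A three-digit octal escape denotes one char with that value.
        return chr(int(body, 8)) if len(body) == 3 else body

    return re.sub(r"\\([0-3][0-7]{2}|.)", unescape, text, flags=re.DOTALL)


src = "é".encode("utf-8")                 # two code units: 0xC3 0xA9
old = quote_bytes(src, passthrough_high=False)
new = quote_bytes(src, passthrough_high=True)
print(old, repr(java_reads(old)))
print(new, repr(java_reads(new)))
```

        In this model the escaped form is read back as two separate
        characters (U+00C3 and U+00A9) instead of the single character the
        source contained, while the passed-through bytes decode to the
        original character.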

diff --git a/compiler/c_util.m b/compiler/c_util.m
index 87bb25f..e9d97a6 100644
--- a/compiler/c_util.m
+++ b/compiler/c_util.m
@@ -316,11 +316,28 @@ quote_one_char(Lang, Char, RevChars0, RevChars) :-
         java_escape_special_char(Char, RevEscapeChars)
     ->
         list.append(RevEscapeChars, RevChars0, RevChars)
-    ; escape_special_char(Char, EscapeChar) ->
+    ;
+        escape_special_char(Char, EscapeChar)
+    ->
         RevChars = [EscapeChar, '\\' | RevChars0]
-    ; is_c_source_char(Char) ->
+    ;
+        is_c_source_char(Char)
+    ->
+        RevChars = [Char | RevChars0]
+    ;
+        Lang = literal_java,
+        char.to_int(Char) >= 0x80
+    ->
+        % If the compiler is built in a C grade (8-bit strings), we assume that
+        % both the Mercury source file and Java target file use UTF-8 encoding.
+        % Each `Char' will be a UTF-8 code unit in a multi-byte sequence.
+        % If the compiler is built in a Java backend, each `Char' will be a
+        % UTF-16 code unit, possibly of a surrogate pair. In both cases the
+        % code units must be passed through without escaping.
         RevChars = [Char | RevChars0]
-    ; char.to_int(Char, 0) ->
+    ;
+        char.to_int(Char, 0)
+    ->
         RevChars = ['0', '\\' | RevChars0]
     ;
         escape_any_char(Char, EscapeChars),
