[m-rev.] for review: change string representation for erlang backend

Peter Wang wangp at students.csse.unimelb.edu.au
Mon Aug 13 13:58:28 AEST 2007


Estimated hours taken: 30
Branches: main

Change the representation of Mercury strings in the Erlang backend to use
binaries instead of the conventional list of integers representation.
Binaries require much less space and provide O(1) indexing (among other
things), although some operations are faster with lists of integers.  I did
not notice much speed difference with the Mercury compiler either way, though.

The HiPE compiler seems not to treat binaries in static data structures
efficiently as it does for lists, so we stick to the list representation
for strings in RTTI data to avoid a big performance drop.

compiler/elds.m:
	Modify the ELDS to allow two string representations, elds_binary and
	elds_list_of_ints.

compiler/elds_to_erlang.m:
compiler/erl_call_gen.m:
compiler/erl_code_gen.m:
compiler/erl_code_util.m:
compiler/erl_rtti.m:
compiler/erl_unify_gen.m:
	Conform to the change in the ELDS.

library/dir.m:
library/erlang_builtin.m:
library/erlang_rtti_implementation.m:
library/io.m:
library/library.m:
library/string.m:
	Update foreign code to account for the new string representation.

	Make the io module use `raw' streams, i.e. don't spawn a separate
	process to handle the file as we have our own process as well.

	Also set read_ahead and delayed_write options to enable caching and
	reduce the number of system calls.

util/mkinit_erl.c:
	Add a comment about the string representation in _init.erl files
	for initialising environment variable values.

doc/reference_manual.texi:
	Update the documentation.

Index: compiler/elds.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/elds.m,v
retrieving revision 1.17
diff -u -r1.17 elds.m
--- compiler/elds.m	1 Aug 2007 05:47:06 -0000	1.17
+++ compiler/elds.m	13 Aug 2007 03:38:33 -0000
@@ -212,7 +212,15 @@
     --->    elds_char(char)
     ;       elds_int(int)
     ;       elds_float(float)
-    ;       elds_string(string)
+
+    ;       elds_binary(string)
+            % We use Erlang binaries to represent most Mercury strings.
+
+    ;       elds_list_of_ints(string)
+            % In RTTI data we use the conventional Erlang representation of
+            % strings (a list of integers) because the HiPE compiler doesn't
+            % seem to treat binaries as static data as efficiently.
+
     ;       elds_atom_raw(string)
     ;       elds_atom(sym_name)
             % `elds_atom_raw' is useful to introduce arbitrary atoms into the
Index: compiler/elds_to_erlang.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/elds_to_erlang.m,v
retrieving revision 1.27
diff -u -r1.27 elds_to_erlang.m
--- compiler/elds_to_erlang.m	1 Aug 2007 05:51:04 -0000	1.27
+++ compiler/elds_to_erlang.m	13 Aug 2007 03:38:33 -0000
@@ -738,10 +738,16 @@
         io.write_float(Float, !IO),
         space(!IO)
     ;
-        Term = elds_string(String),
-        io.write_char('"', !IO),
+        Term = elds_binary(String),
+        io.write_string("<<""", !IO),
         write_with_escaping(in_string, String, !IO),
-        io.write_char('"', !IO),
+        io.write_string(""">>", !IO),
+        space(!IO)
+    ;
+        Term = elds_list_of_ints(String),
+        io.write_string("""", !IO),
+        write_with_escaping(in_string, String, !IO),
+        io.write_string("""", !IO),
         space(!IO)
     ;
         Term = elds_char(Char),
Index: compiler/erl_call_gen.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/erl_call_gen.m,v
retrieving revision 1.11
diff -u -r1.11 erl_call_gen.m
--- compiler/erl_call_gen.m	1 Aug 2007 05:47:06 -0000	1.11
+++ compiler/erl_call_gen.m	13 Aug 2007 03:38:33 -0000
@@ -781,7 +781,7 @@
         erl_gen_info_add_env_var_name(EnvVar, !Info),
         Args = [
             elds_term(elds_atom_raw("env_var")),
-            elds_term(elds_string(EnvVar))
+            elds_term(elds_binary(EnvVar))
         ]
     ;
         TraceExpr = trace_not(ExprA),
Index: compiler/erl_code_gen.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/erl_code_gen.m,v
retrieving revision 1.23
diff -u -r1.23 erl_code_gen.m
--- compiler/erl_code_gen.m	7 Aug 2007 07:09:52 -0000	1.23
+++ compiler/erl_code_gen.m	13 Aug 2007 03:38:33 -0000
@@ -803,9 +803,11 @@
             string.length(String) =< 255
         )
     then
+        % Atom = list_to_atom(binary_to_list(Var))
         erl_gen_info_new_named_var("Atom", AtomVar, !Info),
+        CharList = elds_call_builtin("binary_to_list", [expr_from_var(Var)]),
         StringToAtom = elds_eq(expr_from_var(AtomVar),
-            elds_call_builtin("list_to_atom", [expr_from_var(Var)])),
+            elds_call_builtin("list_to_atom", [CharList])),
         MaybeConvertToAtom = yes(StringToAtom),
         SwitchVar = AtomVar,
         GenCase = erl_gen_case_on_atom(CodeModel, InstMap,
Index: compiler/erl_code_util.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/erl_code_util.m,v
retrieving revision 1.13
diff -u -r1.13 erl_code_util.m
--- compiler/erl_code_util.m	7 Aug 2007 07:09:52 -0000	1.13
+++ compiler/erl_code_util.m	13 Aug 2007 03:38:33 -0000
@@ -540,7 +540,8 @@
     (
         ( Term0 = elds_int(_)
         ; Term0 = elds_float(_)
-        ; Term0 = elds_string(_)
+        ; Term0 = elds_binary(_)
+        ; Term0 = elds_list_of_ints(_)
         ; Term0 = elds_char(_)
         ; Term0 = elds_atom_raw(_)
         ; Term0 = elds_atom(_)
@@ -703,7 +704,8 @@
     (
         ( Term = elds_int(_)
         ; Term = elds_float(_)
-        ; Term = elds_string(_)
+        ; Term = elds_binary(_)
+        ; Term = elds_list_of_ints(_)
         ; Term = elds_char(_)
         ; Term = elds_atom_raw(_)
         ; Term = elds_atom(_)
@@ -840,7 +842,8 @@
     (
         ( Term = elds_int(_)
         ; Term = elds_float(_)
-        ; Term = elds_string(_)
+        ; Term = elds_binary(_)
+        ; Term = elds_list_of_ints(_)
         ; Term = elds_char(_)
         ; Term = elds_atom_raw(_)
         ; Term = elds_atom(_)
Index: compiler/erl_rtti.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/erl_rtti.m,v
retrieving revision 1.14
diff -u -r1.14 erl_rtti.m
--- compiler/erl_rtti.m	9 Jul 2007 04:48:44 -0000	1.14
+++ compiler/erl_rtti.m	13 Aug 2007 03:38:33 -0000
@@ -562,8 +562,8 @@
         elds_term(elds_int(Version)),
         UnifyExpr,
         CompareExpr,
-        elds_term(elds_string(sym_name_to_string(ModuleName))),
-        elds_term(elds_string(TypeName)),
+        elds_term(elds_list_of_ints(sym_name_to_string(ModuleName))),
+        elds_term(elds_list_of_ints(TypeName)),
         erlang_type_ctor_rep(Details),
         ELDSDetails
         ]),
@@ -803,7 +803,7 @@
     ; dynamic_cast(Term, Char) ->
         ELDS = elds_term(elds_char(Char))
     ; dynamic_cast(Term, String) ->
-        ELDS = elds_term(elds_string(String))
+        ELDS = elds_term(elds_list_of_ints(String))
     ; dynamic_cast(Term, Float) ->
         ELDS = elds_term(elds_float(Float))
 
Index: compiler/erl_unify_gen.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/compiler/erl_unify_gen.m,v
retrieving revision 1.10
diff -u -r1.10 erl_unify_gen.m
--- compiler/erl_unify_gen.m	7 Aug 2007 07:09:52 -0000	1.10
+++ compiler/erl_unify_gen.m	13 Aug 2007 03:38:33 -0000
@@ -266,7 +266,7 @@
         Term = elds_int(Int)
     ;
         ConsId = string_const(String),
-        Term = elds_string(String)
+        Term = elds_binary(String)
     ;
         ConsId = float_const(Float),
         Term = elds_float(Float)
Index: library/dir.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/dir.m,v
retrieving revision 1.41
diff -u -r1.41 dir.m
--- library/dir.m	3 Aug 2007 02:30:39 -0000	1.41
+++ library/dir.m	13 Aug 2007 03:38:33 -0000
@@ -909,12 +909,13 @@
     dir.make_directory(DirName::in, Res::out, _IO0::di, _IO::uo),
     [may_call_mercury, promise_pure, tabled_for_io, thread_safe, terminates],
 "
+    DirNameStr = binary_to_list(DirName),
     % filelib:ensure_dir makes all the parent directories.
-    case filelib:ensure_dir(DirName) of
+    case filelib:ensure_dir(DirNameStr) of
         ok ->
             ErrorIfExists = 0,
             Res = mercury__dir:'ML_make_single_directory_2'(ErrorIfExists,
-                DirName);
+                DirName);   % not DirNameStr
         {error, Reason} ->
             Res = mercury__dir:'ML_make_mkdir_res_error'(Reason)
     end
@@ -1078,11 +1079,13 @@
         Result::out, _IO0::di, _IO::uo),
     [may_call_mercury, promise_pure, tabled_for_io, thread_safe, terminates],
 "
-    case file:make_dir(DirName) of
+    DirNameStr = binary_to_list(DirName),
+    case file:make_dir(DirNameStr) of
         ok ->
             Result = mercury__dir:'ML_make_mkdir_res_ok'();
         {error, eexist} when ErrorIfExists =:= 0 ->
-            Result = mercury__dir:'ML_make_mkdir_res_exists'(eexist, DirName);
+            Result = mercury__dir:'ML_make_mkdir_res_exists'(eexist,
+                DirName);   % not DirNameStr
         {error, Reason} ->
             Result = mercury__dir:'ML_make_mkdir_res_error'(Reason)
     end
Index: library/erlang_builtin.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/erlang_builtin.m,v
retrieving revision 1.3
diff -u -r1.3 erlang_builtin.m
--- library/erlang_builtin.m	22 Jun 2007 04:42:21 -0000	1.3
+++ library/erlang_builtin.m	13 Aug 2007 03:38:33 -0000
@@ -87,14 +87,15 @@
             put(mutable_key(MutableName), Value),
             global_server_loop();
 
-        {init_env_var, EnvVarName} ->
-            case os:getenv(EnvVarName) of
+        {init_env_var, EnvVarNameStr} ->
+            % EnvVarNameStr is a string (list of integers), not a binary.
+            case os:getenv(EnvVarNameStr) of
                 false ->
                     Value = false;
                 _ ->
                     Value = true
             end,
-            put(env_var_key(EnvVarName), Value),
+            put(env_var_key(list_to_binary(EnvVarNameStr)), Value),
             global_server_loop();
 
         {trace_evaluate_runtime_condition, Cond, From} ->
Index: library/erlang_rtti_implementation.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/erlang_rtti_implementation.m,v
retrieving revision 1.14
diff -u -r1.14 erlang_rtti_implementation.m
--- library/erlang_rtti_implementation.m	25 Jul 2007 03:08:50 -0000	1.14
+++ library/erlang_rtti_implementation.m	13 Aug 2007 03:38:33 -0000
@@ -73,6 +73,7 @@
 :- implementation.
 
 :- import_module array.
+:- import_module char.
 :- import_module int.
 :- import_module require.
 :- import_module string.
@@ -371,7 +372,7 @@
         TypeCtorRep = etcr_du,
         FunctorReps = TypeCtorInfo ^ type_ctor_functors,
         FunctorRep = matching_du_functor(FunctorReps, Term),
-        Functor = FunctorRep ^ edu_name,
+        Functor = string.from_char_list(FunctorRep ^ edu_name),
         Arity = FunctorRep ^ edu_orig_arity,
         Arguments = list.map(
             get_du_functor_arg(TypeInfo, FunctorRep, Term), 1 .. Arity)
@@ -754,7 +755,8 @@
                         
                         MaybeArgName = ArgInfo ^ du_arg_name,
                         (
-                            MaybeArgName = yes(ArgName)
+                            MaybeArgName = yes(ArgName0),
+                            ArgName = string.from_char_list(ArgName0)
                         ;
                             MaybeArgName = no,
                             ArgName = ""
@@ -762,7 +764,7 @@
                         N = [ArgName | N0]
                     ), ArgInfos, [], RevArgTypes, [], RevArgNames),
 
-                Name = FunctorRep ^ edu_name,
+                Name = string.from_char_list(FunctorRep ^ edu_name),
                 Arity = FunctorRep ^ edu_orig_arity,
                 ArgTypes = list.reverse(RevArgTypes),
                 ArgNames = list.reverse(RevArgNames),
@@ -1005,7 +1007,7 @@
     type_ctor_module_name(TypeCtorInfo::in) = (ModuleName::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    ModuleName = element(?ML_tci_module_name, TypeCtorInfo)
+    ModuleName = list_to_binary(element(?ML_tci_module_name, TypeCtorInfo))
 ").
 
 type_ctor_module_name(_) = "dummy value" :-
@@ -1017,7 +1019,7 @@
     type_ctor_type_name(TypeCtorInfo::in) = (TypeName::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    TypeName = element(?ML_tci_type_name, TypeCtorInfo)
+    TypeName = list_to_binary(element(?ML_tci_type_name, TypeCtorInfo))
 ").
 
 type_ctor_type_name(_) = "dummy value" :-
@@ -1053,7 +1055,7 @@
     type_ctor_dummy_functor_name(TypeCtorInfo::in) = (Functor::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    Functor = element(?ML_tci_details, TypeCtorInfo)
+    Functor = list_to_binary(element(?ML_tci_details, TypeCtorInfo))
 ").
 
 type_ctor_dummy_functor_name(_) = "dummy value" :-
@@ -1310,7 +1312,7 @@
 
 :- type erlang_du_functor
     --->    erlang_du_functor(
-                edu_name            :: string,
+                edu_name            :: list(char),
                 edu_orig_arity      :: int,
                 edu_ordinal         :: int,
                 edu_rep             :: erlang_atom,
@@ -1320,7 +1322,7 @@
 
 :- type du_arg_info
     --->    du_arg_info(
-                du_arg_name         :: maybe(string),
+                du_arg_name         :: maybe(list(char)),
                 du_arg_type         :: maybe_pseudo_type_info
             ).
 
@@ -1355,15 +1357,15 @@
 :- type tc_name
     --->    tc_name(
                 tcn_module              :: module_name,
-                tcn_name                :: string,
+                tcn_name                :: list(char),
                 tcn_arity               :: int
             ).
 
 :- type module_name == sym_name.
 
 :- type sym_name
-    --->    unqualified(string)
-    ;       qualified(sym_name, string).
+    --->    unqualified(list(char))
+    ;       qualified(sym_name, list(char)).
 
 :- type tc_type == maybe_pseudo_type_info.
 
Index: library/io.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/io.m,v
retrieving revision 1.400
diff -u -r1.400 io.m
--- library/io.m	3 Aug 2007 02:30:39 -0000	1.400
+++ library/io.m	13 Aug 2007 03:38:33 -0000
@@ -2531,7 +2531,7 @@
 "
     % XXX see clearerr
     RetVal = 0,
-    RetStr = """"
+    RetStr = <<>>
 ").
 
 :- pred io.make_err_msg(string::in, string::out, io::di, io::uo) is det.
@@ -2618,7 +2618,7 @@
         undefined ->
             Msg = Msg0;
         Reason ->
-            Msg = Msg0 ++ file:format_error(Reason)
+            Msg = list_to_binary([Msg0, file:format_error(Reason)])
     end
 ").
 
@@ -2846,15 +2846,16 @@
         Time::out, _IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
 "
-    case filelib:last_modified(FileName) of
+    FileNameStr = binary_to_list(FileName),
+    case filelib:last_modified(FileNameStr) of
         {YMD, HMS} ->
             Status = 1,
-            Msg = """",
+            Msg = <<>>,
             % time_t in Erlang is in UTC.
             Time = {time_t, erlang:localtime_to_universaltime({YMD, HMS})};
         _ ->
             Status = 0,
-            Msg = ""filelib:last_modified failed"",
+            Msg = <<""filelib:last_modified failed"">>,
             Time = -1
     end
 ").
@@ -3101,11 +3102,12 @@
         Result::out, _IO0::di, _IO::uo),
     [may_call_mercury, promise_pure, tabled_for_io, thread_safe, terminates],
 "
+    FileNameStr = binary_to_list(FileName),
     case FollowSymLinks of
         0 -> Read = fun file:read_link_info/1;
         1 -> Read = fun file:read_file_info/1
     end,
-    case Read(FileName) of
+    case Read(FileNameStr) of
         {ok, FileInfo} ->
             #file_info{type = Type} = FileInfo,
             case Type of
@@ -3498,8 +3500,6 @@
     "ML_make_io_res_0_ok").
 :- pragma foreign_export("IL", (make_io_res_0_ok = out),
     "ML_make_io_res_0_ok").
-:- pragma foreign_export("Erlang", (make_io_res_0_ok = out),
-    "ML_make_io_res_0_ok").
 
 make_io_res_0_ok = ok.
 
@@ -6240,6 +6240,26 @@
     % Note that we send back acknowledgements for all messages.  This is to
     % ensure that two operations from the same process are done in order.
     %
+mercury_start_file_server(ParentPid, FileName, Mode) ->
+    case Mode of
+        [$r | _] ->
+            ModeList = [read, raw, read_ahead];
+        [$w | _] ->
+            ModeList = [write, raw, delayed_write];
+        [$a | _] ->
+            ModeList = [append, raw, delayed_write]
+    end,
+    case file:open(FileName, ModeList) of
+        {ok, IoDevice} ->
+            StreamId = make_ref(),
+            Stream = {'ML_stream', StreamId, self()},
+            ParentPid ! {self(), open_ack, {ok, Stream}},
+            mercury_file_server(IoDevice, 1, [])
+    ;
+        {error, Reason} ->
+            ParentPid ! {self(), open_ack, {error, Reason}}
+    end.
+
 mercury_file_server(IoDevice, LineNr0, PutBack0) ->
     receive
         {From, close} ->
@@ -6274,25 +6294,22 @@
     ;
         {From, write_char, Char} ->
             From ! {self(), write_char_ack},
-            % XXX use file:write with raw streams
             % XXX return error code
-            io:put_chars(IoDevice, [Char]),
+            file:write(IoDevice, [Char]),
             LineNr = LineNr0 + one_if_nl(Char),
             mercury_file_server(IoDevice, LineNr, PutBack0)
     ;
         {From, write_string, Chars} ->
             From ! {self(), write_string_ack},
-            % XXX use file:write with raw streams
             % XXX return error code
-            io:put_chars(IoDevice, Chars),
+            file:write(IoDevice, Chars),
             LineNr = LineNr0 + count_nls(Chars, 0),
             mercury_file_server(IoDevice, LineNr, PutBack0)
     ;
         {From, write_int, Val} ->
             From ! {self(), write_int_ack},
-            % XXX use file:write with raw streams
             % XXX return error code
-            io:format(IoDevice, ""~B"", [Val]),
+            file:write(IoDevice, integer_to_list(Val)),
             mercury_file_server(IoDevice, LineNr0, PutBack0)
     ;
         {From, sync} ->
@@ -6327,31 +6344,29 @@
 one_if_nl($\\n) -> 1;
 one_if_nl(_)    -> 0.
 
-count_nls([], N) -> N;
-count_nls([$\\n | Cs], N) -> count_nls(Cs, N + 1);
-count_nls([_    | Cs], N) -> count_nls(Cs, N).
+count_nls(Bin, N) ->
+    count_nls_2(Bin, size(Bin) - 1, N).
+
+count_nls_2(_,  -1, N) -> N;
+count_nls_2(Bin, I, N) ->
+    case Bin of
+        <<_:I/binary, $\\n, _/binary>> ->
+            count_nls_2(Bin, I - 1, N + 1);
+        _ ->
+            count_nls_2(Bin, I - 1, N)
+    end.
 
 % Client side.
 
 mercury_open_stream(FileName, Mode) ->
-    case Mode of
-        [$r | _] ->
-            ModeList = [read];
-        [$w | _] ->
-            ModeList = [write];
-        [$a | _] ->
-            ModeList = [append]
-    end,
-    case file:open(FileName, ModeList) of
-        {ok, IoDevice} ->
-            Pid = spawn(fun() -> 
-                mercury_file_server(IoDevice, 1, [])
-            end),
-            StreamId = make_ref(),
-            Stream = {'ML_stream', StreamId, Pid},
-            {ok, Stream};
-        {error, Reason} ->
-            {error, Reason}
+    ParentPid = self(),
+    Pid = spawn(fun() -> 
+        % Raw streams can only be used by the process which opened it.
+        mercury_start_file_server(ParentPid, FileName, Mode)
+    end),
+    receive
+        {Pid, open_ack, Result} ->
+            Result
     end.
 
 mercury_close_stream(Stream) ->
@@ -8950,8 +8965,11 @@
         StreamId::out, Stream::out, _IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
 "
+    FileNameStr = binary_to_list(FileName),
+    ModeStr = binary_to_list(Mode),
+
     % Text and binary streams are exactly the same so far.
-    case mercury__io:mercury_open_stream(FileName, Mode) of
+    case mercury__io:mercury_open_stream(FileNameStr, ModeStr) of
         {ok, Stream} ->
             {'ML_stream', StreamId, _Pid} = Stream,
             ResultCode = 0;
@@ -8968,8 +8986,11 @@
         StreamId::out, Stream::out, _IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
 "
+    FileNameStr = binary_to_list(FileName),
+    ModeStr = binary_to_list(Mode),
+
     % Text and binary streams are exactly the same so far.
-    case mercury__io:mercury_open_stream(FileName, Mode) of
+    case mercury__io:mercury_open_stream(FileNameStr, ModeStr) of
         {ok, Stream} ->
             {'ML_stream', StreamId, _Pid} = Stream,
             ResultCode = 0;
@@ -9179,7 +9200,8 @@
     % 3. the error code is returned in an inefficient way
     % 4. standard output and standard error are always tied together
     %
-    OutputCode = os:cmd(Command ++ ""; echo -n $?""),
+    CommandStr = binary_to_list(Command),
+    OutputCode = os:cmd(CommandStr ++ ""; echo -n $?""),
     case string:rchr(OutputCode, $\\n) of
         0 ->
             Code = OutputCode;
@@ -9190,9 +9212,9 @@
     {Status, []} = string:to_integer(Code),
     case Status =:= 0 of
         true ->
-            Msg = """";
+            Msg = <<>>;
         false ->
-            Msg = ""error invoking system command""
+            Msg = <<""error invoking system command"">>
     end
 ").
 
@@ -9339,7 +9361,8 @@
     io.command_line_arguments(Args::out, _IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
 "
-    Args = init:get_plain_arguments()
+    ArgStrings = init:get_plain_arguments(),
+    Args = lists:map(fun list_to_binary/1, ArgStrings)
 ").
 
 :- pragma foreign_proc("Erlang",
@@ -9495,12 +9518,13 @@
     io.getenv(Var::in, Value::out),
     [will_not_call_mercury, tabled_for_io],
 "
-    case os:getenv(Var) of
+    case os:getenv(binary_to_list(Var)) of
         false ->
             SUCCESS_INDICATOR = false,
-            Value = """";
-        Value ->
-            SUCCESS_INDICATOR = true
+            Value = <<>>;
+        ValueStr ->
+            SUCCESS_INDICATOR = true,
+            Value = list_to_binary(ValueStr)
     end
 ").
 
@@ -9564,7 +9588,9 @@
     io.setenv(Var::in, Value::in),
     [will_not_call_mercury, tabled_for_io],
 "
-    os:putenv(Var, Value),
+    VarStr = binary_to_list(Var),
+    ValueStr = binary_to_list(Value),
+    os:putenv(VarStr, ValueStr),
     SUCCESS_INDICATOR = true
 ").
 
@@ -9767,6 +9793,10 @@
     [will_not_call_mercury, promise_pure, tabled_for_io,
         does_not_affect_liveness],
 "
+    DirStr = binary_to_list(Dir),
+    PrefixStr = binary_to_list(Prefix),
+    SepStr = binary_to_list(Sep),
+
     % Constructs a temporary name by concatenating Dir, Sep, Prefix
     % three hex digits, '.', and 3 more hex digits.
 
@@ -9786,16 +9816,17 @@
     Seed = {A1 + Pid, A2, A3},
 
     case
-        mercury__io:'ML_do_make_temp_2'(Dir, Prefix, Sep, MaxTries, Seed)
+        mercury__io:'ML_do_make_temp_2'(DirStr, PrefixStr, SepStr,
+            MaxTries, Seed)
     of
         {ok, FileName0} ->
-            FileName = FileName0,
+            FileName = list_to_binary(FileName0),
             Error = 0,
-            ErrorMessage = """";
+            ErrorMessage = <<>>;
         {error, Reason} ->
-            FileName = """",
+            FileName = <<>>,
             Error = -1,
-            ErrorMessage = Reason
+            ErrorMessage = list_to_binary(Reason)
     end
 ").
 
@@ -10026,13 +10057,15 @@
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe,
         does_not_affect_liveness],
 "
-    case file:delete(FileName) of
+    FileNameStr = binary_to_list(FileName),
+    case file:delete(FileNameStr) of
         ok ->
             RetVal = 0,
-            RetStr = """";
+            RetStr = <<>>;
         {error, Reason} ->
             RetVal = -1,
-            RetStr = ""remove failed: "" ++ file:format_error(Reason)
+            ReasonStr = file:format_error(Reason),
+            RetStr = list_to_binary([""remove failed: "", ReasonStr])
     end
 ").
 
@@ -10154,13 +10187,16 @@
         RetStr::out, _IO0::di, _IO::uo),
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe],
 "
-    case file:rename(OldFileName, NewFileName) of
+    OldFileNameStr = binary_to_list(OldFileName),
+    NewFileNameStr = binary_to_list(NewFileName),
+    case file:rename(OldFileNameStr, NewFileNameStr) of
         ok ->
             RetVal = 0,
-            RetStr = """";
+            RetStr = <<>>;
         {error, Reason} ->
             RetVal = -1,
-            RetStr = ""rename_file failed: "" ++ file:format_error(Reason)
+            ReasonStr = file:format_error(Reason),
+            RetStr = list_to_binary([""rename_file failed: "", ReasonStr])
     end
 ").
 
@@ -10224,7 +10260,9 @@
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe,
         does_not_affect_liveness],
 "
-    case file:make_symlink(FileName, LinkFileName) of
+    FileNameStr = binary_to_list(FileName),
+    LinkFileNameStr = binary_to_list(LinkFileName),
+    case file:make_symlink(FileNameStr, LinkFileNameStr) of
         ok ->
             Status = 0;
         {error, _Reason} ->
@@ -10310,14 +10348,15 @@
     [will_not_call_mercury, promise_pure, tabled_for_io, thread_safe,
         does_not_affect_liveness],
 "
-    case file:read_link(FileName) of
-        {ok, TargetFileName} ->
+    case file:read_link(binary_to_list(FileName)) of
+        {ok, TargetFileNameStr} ->
+            TargetFileName = list_to_binary(TargetFileNameStr),
             Status = 1,
-            Error = """";
+            Error = <<>>;
         {error, Reason} ->
             Status = 0,
-            TargetFileName = """",
-            Error = file:format_error(Reason)
+            TargetFileName = <<>>,
+            Error = list_to_binary(file:format_error(Reason))
     end
 ").
 
Index: library/library.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/library.m,v
retrieving revision 1.113
diff -u -r1.113 library.m
--- library/library.m	9 Aug 2007 01:36:11 -0000	1.113
+++ library/library.m	13 Aug 2007 03:38:33 -0000
@@ -204,7 +204,7 @@
     library.version(Version::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    Version = ?MR_VERSION "" configured for "" ?MR_FULLARCH
+    Version = << ?MR_VERSION "" configured for "" ?MR_FULLARCH >>
 ").
 
 %---------------------------------------------------------------------------%
Index: library/string.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/string.m,v
retrieving revision 1.263
diff -u -r1.263 string.m
--- library/string.m	9 Jul 2007 04:02:18 -0000	1.263
+++ library/string.m	13 Aug 2007 03:38:33 -0000
@@ -1278,7 +1278,7 @@
     [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
         does_not_affect_liveness],
 "
-    CharList = Str
+    CharList = binary_to_list(Str)
 ").
 
 string.to_char_list_2(Str, CharList) :-
@@ -1361,8 +1361,8 @@
     [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
         does_not_affect_liveness],
 "
-    Str = CharList,
-    SUCCESS_INDICATOR = true
+    SUCCESS_INDICATOR = true,
+    Str = list_to_binary(CharList)
 ").
 
 :- pragma promise_equivalent_clauses(string.semidet_from_char_list/2).
@@ -1574,7 +1574,7 @@
     [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
         does_not_affect_liveness],
 "
-    Str = lists:concat(Strs)
+    Str = list_to_binary(Strs)
 ").
 
     % We implement string.join_list in C as this minimises the amount of
@@ -1703,26 +1703,32 @@
 }").
 
 :- pragma foreign_proc("Erlang",
-    sub_string_search_start(WholeString::in, Pattern::in, BeginAt::in,
-        Index::out),
+    sub_string_search_start(String::in, SubString::in, BeginAt::in, Index::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    String = lists:nthtail(BeginAt, WholeString),
-    case Pattern of
-        """" ->
-            % string:str does not handle the empty pattern as we would like.
-            Index = BeginAt,
-            SUCCESS_INDICATOR = true;
-        _ ->
-            case string:str(String, Pattern) of
-                0 ->
-                    Index = -1,
-                    SUCCESS_INDICATOR = false;
-                Match ->
-                    Index = BeginAt + Match - 1,
-                    SUCCESS_INDICATOR = true
-            end
-    end
+    Index = mercury__string:sub_string_search_start_2(String, SubString,
+        BeginAt, size(String), size(SubString)),
+    SUCCESS_INDICATOR = (Index =/= -1)
+").
+
+:- pragma foreign_decl("Erlang", local, "
+-export([sub_string_search_start_2/5]).
+").
+
+:- pragma foreign_code("Erlang", "
+sub_string_search_start_2(String, SubString, I, Length, SubLength) ->
+    case I + SubLength =< Length of
+        true ->
+            case String of
+                <<_:I/binary, SubString:SubLength/binary, _/binary>> ->
+                    I;
+                _ ->
+                    sub_string_search_start_2(String, SubString, I + 1,
+                        Length, SubLength)
+            end;
+        false ->
+            -1
+    end.
 ").
 
 % This is only used if there is no matching foreign_proc definition
@@ -3288,7 +3294,8 @@
     string.lowlevel_float_to_string(FloatVal::in, FloatString::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    [FloatString] = io_lib:format(""~.12f"", [FloatVal])
+    List = io_lib:format(""~.12f"", [FloatVal]),
+    FloatString = list_to_binary(List)
 ").
 
 string.det_to_float(FloatString) =
@@ -3392,8 +3399,9 @@
     [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
         does_not_affect_liveness],
 "
+    S = binary_to_list(FloatString),
     % string:to_float fails on integers, so tack on a trailing '.0' string.
-    case string:to_float(FloatString ++ "".0"") of
+    case string:to_float(S ++ "".0"") of
         {FloatVal, []} ->
             SUCCESS_INDICATOR = true;
         {FloatVal, "".0""} ->
@@ -3451,12 +3459,6 @@
 "
     SUCCESS_INDICATOR = (Str.IndexOf(Ch) != -1);
 ").
-:- pragma foreign_proc("Erlang",
-    string.contains_char(Str::in, Ch::in),
-    [will_not_call_mercury, promise_pure, thread_safe],
-"
-    SUCCESS_INDICATOR = (string:chr(Str, Ch) =/= 0)
-").
 string.contains_char(String, Char) :-
     string.contains_char(String, Char, 0, string.length(String)).
 
@@ -3542,8 +3544,7 @@
     string.unsafe_index(Str::in, Index::in, Ch::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    % `lists:nth' counts from 1.
-    Ch = lists:nth(Index + 1, Str)
+    <<_:Index/binary, Ch/integer, _/binary>> = Str
 ").
 string.unsafe_index(Str, Index, Char) :-
     ( string.first_char(Str, First, Rest) ->
@@ -3704,8 +3705,8 @@
     string.unsafe_set_char_2(Ch::in, Index::in, Str0::in, Str::out),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    {Left, [_ | Right]} = lists:split(Index, Str0),
-    Str = Left ++ [Ch | Right]
+    <<Left:Index/binary, _/integer, Right/binary>> = Str0,
+    Str = list_to_binary([Left, Ch, Right])
 ").
 
 % :- pragma foreign_proc("C",
@@ -3757,7 +3758,7 @@
     string.length(Str::in, Length::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    Length = string:len(Str)
+    Length = size(Str)
 ").
 
 :- pragma foreign_proc("C",
@@ -3783,7 +3784,7 @@
     string.length(Str::ui, Length::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    Length = string:len(Str)
+    Length = size(Str)
 ").
 
 string.length(Str0, Len) :-
@@ -3838,7 +3839,15 @@
     string.append_iii(S1::in, S2::in, S3::in),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    SUCCESS_INDICATOR = string:equal(S1 ++ S2, S3)
+    S1_length = size(S1),
+    S2_length = size(S2),
+    case S1_length + S2_length =:= size(S3) of
+        true ->
+            <<Left:S1_length/binary, Right/binary>> = S3,
+            SUCCESS_INDICATOR = (Left =:= S1 andalso Right =:= S2);
+        false ->
+            SUCCESS_INDICATOR = false
+    end
 ").
 
 string.append_iii(X, Y, Z) :-
@@ -3909,7 +3918,7 @@
     string.append_iio(S1::in, S2::in, S3::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    S3 = S1 ++ S2
+    S3 = list_to_binary([S1, S2])
 ").
 
 string.append_iio(X, Y, Z) :-
@@ -3962,7 +3971,7 @@
     string.append_ooi_3(S1Len::in, _S3Len::in, S1::out, S2::out, S3::in),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    {S1, S2} = lists:split(S1Len, S3)
+    << S1:S1Len/binary, S2/binary >> = S3
 ").
 
 string.append_ooi_3(S1Len, _S3Len, S1, S2, S3) :-
@@ -4025,16 +4034,23 @@
     [will_not_call_mercury, promise_pure, thread_safe, will_not_modify_trail,
         does_not_affect_liveness],
 "
-    Start =
-        if
-            Start0 < 0 -> 0;
-            true -> Start0
-        end,
+    if
+        Start0 < 0 ->
+            Start = 0;
+        true ->
+            Start = Start0
+    end,
     if
         Count =< 0 ->
-            SubString = """";
+            SubString = <<>>;
         true ->
-            SubString = string:substr(Str, 1 + Start, Count)
+            End = size(Str),
+            case Start + Count >= End of
+                true ->
+                    <<_:Start/binary, SubString/binary>> = Str;
+                false ->
+                    <<_:Start/binary, SubString:Count/binary, _/binary>> = Str
+            end
     end
 ").
 
@@ -4065,7 +4081,7 @@
     string.unsafe_substring(Str::in, Start::in, Count::in, SubString::uo),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    SubString = string:sub_string(Str, 1 + Start, Start + Count)
+    << _:Start/binary, SubString:Count/binary, _/binary >> = Str
 ").
 
 :- pragma foreign_proc("C",
@@ -4122,13 +4138,13 @@
 "
     if
         Count =< 0 ->
-            Left = """",
+            Left = <<>>,
             Right = Str;
-        Count > length(Str) ->
+        Count > size(Str) ->
             Left = Str,
-            Right = """";
+            Right = <<>>;
         true ->
-            {Left, Right} = lists:split(Count, Str)
+            << Left:Count/binary, Right/binary >> = Str
     end
 ").
 
@@ -4189,14 +4205,10 @@
     [will_not_call_mercury, promise_pure, thread_safe],
 "
     case Str of
-        [First0 | Rest0] ->
-            SUCCESS_INDICATOR = true,
-            First = First0,
-            Rest = Rest0;
+        <<First, Rest/binary>> ->
+            SUCCESS_INDICATOR = true;
         _ ->
-            SUCCESS_INDICATOR = false,
-            First = 0,
-            Rest = []
+            SUCCESS_INDICATOR = false
     end
 ").
 
@@ -4238,7 +4250,7 @@
     [will_not_call_mercury, promise_pure, thread_safe],
 "
     case Str of
-        [First | Rest] ->
+        << First, Rest/binary >> ->
             SUCCESS_INDICATOR = true;
         _ ->
             SUCCESS_INDICATOR = false,
@@ -4296,11 +4308,11 @@
     [will_not_call_mercury, promise_pure, thread_safe],
 "
     case Str of
-        [First | Rest] ->
+        <<First, Rest/binary>> ->
             SUCCESS_INDICATOR = true;
         _ ->
             SUCCESS_INDICATOR = false,
-            Rest = []
+            Rest = <<>>
     end
 ").
 
@@ -4354,14 +4366,12 @@
     [will_not_call_mercury, promise_pure, thread_safe],
 "
     case Str of
-        [First0 | Rest0] ->
-            SUCCESS_INDICATOR = true,
-            First = First0,
-            Rest = Rest0;
+        <<First, Rest/binary>> ->
+            SUCCESS_INDICATOR = true;
         _ ->
             SUCCESS_INDICATOR = false,
             First = 0,
-            Rest = []
+            Rest = <<>>
     end
 ").
 
@@ -4394,7 +4404,7 @@
     string.first_char(Str::uo, First::in, Rest::in),
     [will_not_call_mercury, promise_pure, thread_safe],
 "
-    Str = [First | Rest]
+    Str = list_to_binary([First, Rest])
 ").
 
 %-----------------------------------------------------------------------------%
Index: util/mkinit_erl.c
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/util/mkinit_erl.c,v
retrieving revision 1.3
diff -u -r1.3 mkinit_erl.c
--- util/mkinit_erl.c	18 Jun 2007 05:41:31 -0000	1.3
+++ util/mkinit_erl.c	13 Aug 2007 03:38:33 -0000
@@ -262,6 +262,7 @@
 
     printf("init_env_vars() -> \n");
     for (i = 0; i < mercury_env_var_next; i++) {
+        /* The environment variable is passed as a string, not a binary. */
         printf("\t'ML_erlang_global_server' ! {init_env_var, \"%s\"},\n",
             mercury_env_vars[i]);
     }
Index: doc/reference_manual.texi
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/doc/reference_manual.texi,v
retrieving revision 1.401
diff -u -r1.401 reference_manual.texi
--- doc/reference_manual.texi	25 Jul 2007 06:12:28 -0000	1.401
+++ doc/reference_manual.texi	13 Aug 2007 03:38:34 -0000
@@ -6784,7 +6784,8 @@
 
 The Mercury types @code{int}, @code{float} and @code{char}
 are mapped to Erlang integers, floats and integers respectively.
-A Mercury @code{string} is represented as a list of integers in Erlang.
+A Mercury @code{string} is represented by an Erlang binary,
+not by a list of integers.
 
 Mercury variables whose type is a Mercury discriminated union type
 will be passed as an Erlang tuple with the first element of the tuple
@@ -6831,7 +6832,7 @@
 Procedures which are nondeterministic take as a final argument a
 success continuation.  This is an function which has an input variable
 for each variable of the Mercury procedure with an output mode.  For
-each solution, the success continuation with the values of those
+each solution, the success continuation is called with the values of those
 output variables.  When there are no more solutions the Erlang
 function returns with an undefined value.
 
@@ -7905,12 +7906,11 @@
 @samp{SUCCESS_INDICATOR}.  For example:
 
 @example
-:- pred string.contains_char(string, character).
-:- mode string.contains_char(in, in) is semidet.
+:- pred contains_char(list(char)::in, char::in) is semidet.
 
 :- pragma foreign_proc("Erlang",
-	string.contains_char(Str::in, Ch::in),
-        [will_not_call_mercury, promise_pure],
+	contains_char(Str::in, Ch::in),
+        [will_not_call_mercury, promise_pure, thread_safe],
         "SUCCESS_INDICATOR = (string:chr(Str, Ch) =/= 0)").
 @end example
 
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list