[m-rev.] For review: mangle.m added to library

Ralph Becket rafe at cs.mu.OZ.AU
Tue Dec 7 16:05:39 AEDT 2004


Estimated hours taken: 1.5
Branches: main

library/mangle.m:
	Added.  This module provides general-purpose string (de)mangling.
	Mangled names contain only characters from [a-zA-Z0-9_], so they are
	suitable for use as identifiers in C programs.

library/library.m:
	Register the new addition.

NEWS:
	Mention the new addition.

tests/hard_coded/test_mangle.m:
tests/hard_coded/test_mangle.exp:
tests/hard_coded/Mmakefile:
	Added a test case.
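
A minimal usage sketch of the new interface, for reviewers who want to try
it without reading the test case.  The module name mangle_example and the
predicate show_round_trip are made up for illustration; the sample names are
the ones used in the header comment of library/mangle.m.

```mercury
:- module mangle_example.

:- interface.

:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

:- import_module list, mangle, string.

main(!IO) :-
    % Sample names taken from the examples in library/mangle.m.
    Names = ["hello_world", "foo/0", "this+that!", "Zap"],
    list.foldl(show_round_trip, Names, !IO).

    % Hypothetical helper: mangle a name, demangle the result, print both.
    %
:- pred show_round_trip(string::in, io::di, io::uo) is det.

show_round_trip(Name, !IO) :-
    Mangled = mangled(Name),
    % demangled/1 throws an exception if its argument is not a well-formed
    % mangled string, so only pass it the output of mangled/1 here.
    RoundTrip = demangled(Mangled),
    io.format("%s -> %s -> %s\n",
        [s(Name), s(Mangled), s(RoundTrip)], !IO).
```

According to the table in the module header, the mangled forms should be
hello_world, fooZ2F0, thisZ2BthatZ21 and ZZap respectively, and each
round trip should give back the original name.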

Index: NEWS
===================================================================
RCS file: /home/mercury1/repository/mercury/NEWS,v
retrieving revision 1.350
diff -u -r1.350 NEWS
--- NEWS	11 Nov 2004 13:46:41 -0000	1.350
+++ NEWS	7 Dec 2004 04:53:08 -0000
@@ -16,11 +16,11 @@
 * Experimental support for user-defined constrained types has been added.
 
 Changes to the Mercury standard library:
-* We've added several new modules: cord, for sequences with O(1) consing and
-  concatenation, array2d, for two-dimensional arrays, and version_array,
-  version_array2d, version_bitmap, version_hash_table, and version_store,
-  implementing non-unique versions of these types supporting O(1) access for
-  non-persistent use.
+* We've added several new modules: mangle, for name mangling, cord, for
+  sequences with O(1) consing and concatenation, array2d, for two-dimensional
+  arrays, and version_array, version_array2d, version_bitmap,
+  version_hash_table, and version_store, implementing non-unique versions of
+  these types supporting O(1) access for non-persistent use.
 * New procedures have been added to many of the existing standard library
   modules.  Most notably, these include procedures for creating
   directories and symbolic links, for checking file types and file
@@ -114,6 +114,10 @@
   chapter of the Mercury Language Reference Manual.
 
 Changes to the Mercury standard library:
+
+* We've added a new module, mangle, for converting arbitrary strings into
+  a form suitable for use as identifiers in, say, C programs, and back
+  again.
 
 * We've added some new higher-order predicates, rbtree.foldl2/6
   and rbtree.foldl3 to the rbtree module.  The predicate 
Index: library/library.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/library.m,v
retrieving revision 1.75
diff -u -r1.75 library.m
--- library/library.m	16 Nov 2004 00:45:12 -0000	1.75
+++ library/library.m	7 Dec 2004 04:49:29 -0000
@@ -73,6 +73,7 @@
 :- import_module io.
 :- import_module lexer.
 :- import_module list.
+:- import_module mangle.
 :- import_module map.
 :- import_module math.
 :- import_module multi_map.
@@ -186,6 +187,7 @@
 mercury_std_library_module("lexer").
 mercury_std_library_module("library").
 mercury_std_library_module("list").
+mercury_std_library_module("mangle").
 mercury_std_library_module("map").
 mercury_std_library_module("math").
 mercury_std_library_module("multi_map").
Index: library/mangle.m
===================================================================
RCS file: library/mangle.m
diff -N library/mangle.m
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ library/mangle.m	7 Dec 2004 04:48:39 -0000
@@ -0,0 +1,165 @@
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+% Copyright (C) 2004 The University of Melbourne.
+% This file may only be copied under the terms of the GNU Library General
+% Public License - see the file COPYING.LIB in the Mercury distribution.
+%-----------------------------------------------------------------------------%
+% mangle.m
+% Ralph Becket <rafe at cs.mu.oz.au>
+% Tue Dec  7 14:17:10 EST 2004
+%
+% Reasonably general purpose name (de)mangling facilities for converting
+% arbitrary strings into a form suitable for use as identifiers in a C
+% program, say, and back again.
+%
+% The mangling scheme works by translating characters in the unmangled string
+% into sequences of characters in the mangled string as follows:
+%
+%   Unmangled               Mangled 
+%   ---------               -------
+%   Z                       ZZ
+%   c in [a-zA-Y0-9_]       c
+%   c not in [a-zA-Z0-9_]   Z## (where ## is the two-digit hex code for c)
+%
+% Examples (assuming ASCII character encoding):
+%
+%   hello_world             hello_world
+%   foo/0                   fooZ2F0
+%   this+that!              thisZ2BthatZ21
+%   Zap                     ZZap
+%   Buzz                    Buzz
+%
+%-----------------------------------------------------------------------------%
+
+:- module mangle.
+
+:- interface.
+
+:- import_module string.
+
+
+
+    % mangled(UnmangledString) = MangledString.
+    %
+:- func mangled(string) = string.
+
+    % demangled(MangledString) = UnmangledString.
+    %
+:- func demangled(string) = string.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module char.
+:- import_module exception.
+:- import_module int.
+:- import_module list.
+:- import_module string.
+
+
+
+:- type strings == list(string).
+
+%-----------------------------------------------------------------------------%
+
+    % We try to construct as few substrings as possible by traversing the
+    % input and only cutting it around special characters.
+    %
+mangled(S0) = S :-
+    N  = length(S0) - 1,
+    int.fold_down2(mangle(S0), 0, N, N, J, [], Ss0),
+    Ss = ( if   Ss0 \= [], J >= 0
+           then [unsafe_substring(S0, 0, J + 1) | Ss0]
+           else Ss0
+         ),
+    S  = ( if Ss = [] then S0 else string.append_list(Ss) ).
+
+
+    % mangle(UnmangledString, CurrentIndex, NextCutPoint0, NextCutPoint,
+    %   MangledStringParts0, MangledStringParts).
+    %
+:- pred mangle(string::in, int::in, int::in, int::out,
+            strings::in, strings::out) is det.
+
+mangle(S0, I, J0, J, Ss0, Ss) :-
+    C = S0 ^ unsafe_elem(I),
+    ( if C = 'Z' then
+        J  = I - 1,
+        Ss = ( if   I = J0
+               then ["ZZ" | Ss0]
+               else ["ZZ", unsafe_substring(S0, I + 1, J0 - I) | Ss0]
+             )
+      else if is_alnum_or_underscore(C) then
+        J  = J0,
+        Ss = Ss0
+      else
+        HH = pad_left(int_to_base_string(to_int(C), 16), '0', 2),
+        J  = I - 1,
+        Ss = ( if   I = J0
+               then ["Z", HH | Ss0]
+               else ["Z", HH, unsafe_substring(S0, I + 1, J0 - I) | Ss0]
+             )
+    ).
+
+%-----------------------------------------------------------------------------%
+
+    % We try to construct as few substrings as possible by traversing the
+    % input and only cutting it around special sequences.
+    %
+demangled(S0) = S :-
+    demangle(S0, 0, 0, length(S0), [], RevSs),
+    S = ( if RevSs = [] then S0 else string.append_list(reverse(RevSs)) ).
+
+
+    % demangle(MangledString, LastCutPoint, CurrentIndex, MangledStringLength,
+    %   RevDemangledStringParts0, RevDemangledStringParts).
+    %
+:- pred demangle(string::in, int::in, int::in, int::in,
+            strings::in, strings::out) is det.
+
+demangle(S0, I, J, L, RevSs0, RevSs) :-
+    ( if J >= L then
+        RevSs = ( if   RevSs0 \= [], I < L
+                  then [unsafe_substring(S0, I, L - I) | RevSs0]
+                  else RevSs0
+                )
+      else
+        C = S0 ^ unsafe_elem(J),
+        ( if C \= 'Z' then
+                % This is not an escaped sequence.
+                %
+            demangle(S0, I, J + 1, L, RevSs0, RevSs)
+          else if
+            J + 1 < L,
+            S0 ^ unsafe_elem(J + 1) = 'Z'
+          then
+                % This is a `ZZ' escaped sequence.
+                %
+            RevSs1 = ( if   I = J
+                       then ["Z" | RevSs0]
+                       else ["Z", unsafe_substring(S0, I, J - I) | RevSs0]
+                     ),
+            demangle(S0, J + 2, J + 2, L, RevSs1, RevSs)
+          else if
+            J + 2 < L,
+            HH = unsafe_substring(S0, J + 1, 2),
+            base_string_to_int(16, HH, CharCode),
+            char.to_int(Char, CharCode),
+            D = char_to_string(Char)
+          then
+                % This is a `Z##' escaped sequence.
+                %
+            RevSs1 = ( if   I = J
+                       then [D | RevSs0]
+                       else [D, unsafe_substring(S0, I, J - I) | RevSs0]
+                     ),
+            demangle(S0, J + 3, J + 3, L, RevSs1, RevSs)
+          else
+            throw("mangle.demangle: improperly mangled string `" ++ S0 ++ "'")
+        )
+    ).
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
Index: tests/hard_coded/Mmakefile
===================================================================
RCS file: /home/mercury1/repository/tests/hard_coded/Mmakefile,v
retrieving revision 1.242
diff -u -r1.242 Mmakefile
--- tests/hard_coded/Mmakefile	2 Dec 2004 08:03:57 -0000	1.242
+++ tests/hard_coded/Mmakefile	7 Dec 2004 05:03:10 -0000
@@ -166,6 +166,7 @@
 	test_bitset \
 	test_cord \
 	test_imported_no_tag \
+	test_mangle \
 	tim_qual1 \
 	time_test \
 	trans_intermod_user_equality \
Index: tests/hard_coded/test_mangle.exp
===================================================================
RCS file: tests/hard_coded/test_mangle.exp
diff -N tests/hard_coded/test_mangle.exp
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ tests/hard_coded/test_mangle.exp	7 Dec 2004 05:02:26 -0000
@@ -0,0 +1,24 @@
+mangled("foo") = "foo"
+demangled("foo") = "foo"
+mangled("foo_bar") = "foo_bar"
+demangled("foo_bar") = "foo_bar"
+mangled("") = ""
+demangled("") = ""
+mangled("FOO") = "FOO"
+demangled("FOO") = "FOO"
+mangled("FoO") = "FoO"
+demangled("FoO") = "FoO"
+mangled("Buzz") = "Buzz"
+demangled("Buzz") = "Buzz"
+mangled("BUZZ") = "BUZZZZ"
+demangled("BUZZZZ") = "BUZZ"
+mangled("f/0") = "fZ2F0"
+demangled("fZ2F0") = "f/0"
+mangled("fZ2F0") = "fZZ2F0"
+demangled("fZZ2F0") = "fZ2F0"
+mangled("bang!bang!") = "bangZ21bangZ21"
+demangled("bangZ21bangZ21") = "bang!bang!"
+mangled("this+that") = "thisZ2Bthat"
+demangled("thisZ2Bthat") = "this+that"
+mangled("s p a c e s") = "sZ20pZ20aZ20cZ20eZ20s"
+demangled("sZ20pZ20aZ20cZ20eZ20s") = "s p a c e s"
Index: tests/hard_coded/test_mangle.m
===================================================================
RCS file: tests/hard_coded/test_mangle.m
diff -N tests/hard_coded/test_mangle.m
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ tests/hard_coded/test_mangle.m	7 Dec 2004 05:02:00 -0000
@@ -0,0 +1,43 @@
+%-----------------------------------------------------------------------------%
+% test_mangle.m
+% Ralph Becket <rafe at cs.mu.oz.au>
+% Tue Dec  7 14:48:28 EST 2004
+% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+%
+%-----------------------------------------------------------------------------%
+
+:- module test_mangle.
+
+:- interface.
+
+:- import_module io.
+
+
+
+:- pred main(io :: di, io :: uo) is det.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module string, list, mangle.
+
+%-----------------------------------------------------------------------------%
+
+main(!IO) :-
+    list.foldl(
+        test_mangle,
+        ["foo", "foo_bar", "", "FOO", "FoO", "Buzz", "BUZZ",
+         "f/0", "fZ2F0", "bang!bang!", "this+that", "s p a c e s"],
+        !IO
+    ).
+
+
+:- pred test_mangle(string::in, io::di, io::uo) is det.
+
+test_mangle(String, !IO) :-
+    Mangled = mangled(String),
+    Demangled = demangled(Mangled),
+    io.format("mangled(\"%s\") = \"%s\"\n", [s(String), s(Mangled)], !IO),
+    io.format("demangled(\"%s\") = \"%s\"\n", [s(Mangled), s(Demangled)], !IO).