[m-rev.] For review: mangle.m added to library
Ralph Becket
rafe at cs.mu.OZ.AU
Tue Dec 7 16:05:39 AEDT 2004
Estimated hours taken: 1.5
Branches: main
library/mangle.m:
Added. This module provides fairly general-purpose string
(de)mangling functionality. The mangled names are suitable for use
as identifiers in C programs.
library/library.m:
Register the new addition.
NEWS:
Mention the new addition.
tests/hard_coded/test_mangle.m:
tests/hard_coded/test_mangle.exp:
tests/hard_coded/Mmakefile:
Added a test case.
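
For reviewers who want something simple to compare against, here is a
minimal sketch of the mangling direction, written one character at a
time. It is not the code in this patch, which cuts substrings around
special characters instead. The names naive_mangle, naive_mangled and
mangle_char are hypothetical and used only for illustration. The sketch
also assumes that every `Z##' escape should be exactly two hex digits,
so it pads codes below 16 with a leading zero.

```mercury
:- module naive_mangle.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char, list, string.

    % Hypothetical reference version: mangle each character separately
    % and concatenate the results.
    %
:- func naive_mangled(string) = string.

naive_mangled(S) =
    string.append_list(list.map(mangle_char, string.to_char_list(S))).

:- func mangle_char(char) = string.

mangle_char(C) =
    ( if C = 'Z' then
        % `Z' is the escape character, so it is doubled.
        "ZZ"
    else if char.is_alnum_or_underscore(C) then
        string.char_to_string(C)
    else
        % Assumption: pad to two hex digits so the escape is always `Z##'.
        "Z" ++ string.pad_left(
            string.int_to_base_string(char.to_int(C), 16), '0', 2)
    ).

:- pred show(string::in, io::di, io::uo) is det.

show(S, !IO) :-
    io.format("%s -> %s\n", [s(S), s(naive_mangled(S))], !IO).

main(!IO) :-
    list.foldl(show,
        ["hello_world", "foo/0", "this+that!", "Zap", "Buzz"], !IO).
```

The inputs are the ones listed under "Examples" in the header comment
of mangle.m, so the printed pairs can be compared directly against that
table and against the output of mangled/1.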
Index: NEWS
===================================================================
RCS file: /home/mercury1/repository/mercury/NEWS,v
retrieving revision 1.350
diff -u -r1.350 NEWS
--- NEWS 11 Nov 2004 13:46:41 -0000 1.350
+++ NEWS 7 Dec 2004 04:53:08 -0000
@@ -16,11 +16,11 @@
* Experimental support for user-defined constrained types has been added.
Changes to the Mercury standard library:
-* We've added several new modules: cord, for sequences with O(1) consing and
- concatenation, array2d, for two-dimensional arrays, and version_array,
- version_array2d, version_bitmap, version_hash_table, and version_store,
- implementing non-unique versions of these types supporting O(1) access for
- non-persistent use.
+* We've added several new modules: mangle, for name mangling, cord, for
+ sequences with O(1) consing and concatenation, array2d, for two-dimensional
+ arrays, and version_array, version_array2d, version_bitmap,
+ version_hash_table, and version_store, implementing non-unique versions of
+ these types supporting O(1) access for non-persistent use.
* New procedures have been added to many of the existing standard library
modules. Most notably, these include procedures for creating
directories and symbolic links, for checking file types and file
@@ -114,6 +114,10 @@
chapter of the Mercury Language Reference Manual.
Changes to the Mercury standard library:
+
+* We've added a new module, mangle.m, providing the means for converting
+ arbitrary strings into forms suitable for use as, say, identifiers in C
+ programs, and back again.
* We've added some new higher-order predicates, rbtree.foldl2/6
and rbtree.foldl3 to the rbtree module. The predicate
Index: library/library.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/library.m,v
retrieving revision 1.75
diff -u -r1.75 library.m
--- library/library.m 16 Nov 2004 00:45:12 -0000 1.75
+++ library/library.m 7 Dec 2004 04:49:29 -0000
@@ -73,6 +73,7 @@
:- import_module io.
:- import_module lexer.
:- import_module list.
+:- import_module mangle.
:- import_module map.
:- import_module math.
:- import_module multi_map.
@@ -186,6 +187,7 @@
mercury_std_library_module("lexer").
mercury_std_library_module("library").
mercury_std_library_module("list").
+mercury_std_library_module("mangle").
mercury_std_library_module("map").
mercury_std_library_module("math").
mercury_std_library_module("multi_map").
Index: library/mangle.m
===================================================================
RCS file: library/mangle.m
diff -N library/mangle.m
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ library/mangle.m 7 Dec 2004 04:48:39 -0000
@@ -0,0 +1,165 @@
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+% Copyright (C) 2004 The University of Melbourne.
+% This file may only be copied under the terms of the GNU Library General
+% Public License - see the file COPYING.LIB in the Mercury distribution.
+%---------------------------------------------------------------------------%
+% mangle.m
+% Ralph Becket <rafe at cs.mu.oz.au>
+% Tue Dec 7 14:17:10 EST 2004
+%
+% Reasonably general purpose name (de)mangling facilities for converting
+% arbitrary strings into a form suitable for use as identifiers in a C
+% program, say, and back again.
+%
+% The mangling scheme works by translating characters in the unmangled string
+% into sequences of characters in the mangled string as follows:
+%
+%   Unmangled               Mangled
+%   ---------               -------
+%   Z                       ZZ
+%   c in [a-zA-Y0-9_]       c
+%   c not in [a-zA-Z0-9_]   Z## (where ## is the hexadecimal code for c)
+%
+% Examples (assuming ASCII character encoding):
+%
+%   hello_world     hello_world
+%   foo/0           fooZ2F0
+%   this+that!      thisZ2BthatZ21
+%   Zap             ZZap
+%   Buzz            Buzz
+%
+%-----------------------------------------------------------------------------%
+
+:- module mangle.
+
+:- interface.
+
+:- import_module string.
+
+
+
+ % mangled(UnmangledString) = MangledString.
+ %
+:- func mangled(string) = string.
+
+ % demangled(MangledString) = UnmangledString.
+ %
+:- func demangled(string) = string.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module char.
+:- import_module exception.
+:- import_module int.
+:- import_module list.
+:- import_module string.
+
+
+
+:- type strings == list(string).
+
+%-----------------------------------------------------------------------------%
+
+ % We try to construct as few substrings as possible by traversing the
+ % input and only cutting it around special characters.
+ %
+mangled(S0) = S :-
+ N = length(S0) - 1,
+ int.fold_down2(mangle(S0), 0, N, N, J, [], Ss0),
+ Ss = ( if Ss0 \= [], J >= 0
+ then [unsafe_substring(S0, 0, J + 1) | Ss0]
+ else Ss0
+ ),
+ S = ( if Ss = [] then S0 else string.append_list(Ss) ).
+
+
+ % mangle(UnmangledString, CurrentIndex, NextCutPoint0, NextCutPoint,
+ % MangledStringParts0, MangledStringParts).
+ %
+:- pred mangle(string::in, int::in, int::in, int::out,
+ strings::in, strings::out) is det.
+
+mangle(S0, I, J0, J, Ss0, Ss) :-
+ C = S0 ^ unsafe_elem(I),
+ ( if C = 'Z' then
+ J = I - 1,
+ Ss = ( if I = J0
+ then ["ZZ" | Ss0]
+ else ["ZZ", unsafe_substring(S0, I + 1, J0 - I) | Ss0]
+ )
+ else if is_alnum_or_underscore(C) then
+ J = J0,
+ Ss = Ss0
+ else
+ HH = pad_left(int_to_base_string(to_int(C), 16), '0', 2),
+ J = I - 1,
+ Ss = ( if I = J0
+ then ["Z", HH | Ss0]
+ else ["Z", HH, unsafe_substring(S0, I + 1, J0 - I) | Ss0]
+ )
+ ).
+
+%-----------------------------------------------------------------------------%
+
+ % We try to construct as few substrings as possible by traversing the
+ % input and only cutting it around special sequences.
+ %
+demangled(S0) = S :-
+ demangle(S0, 0, 0, length(S0), [], RevSs),
+ S = ( if RevSs = [] then S0 else string.append_list(reverse(RevSs)) ).
+
+
+ % demangle(MangledString, LastCutPoint, CurrentIndex, MangledStringLength,
+ % RevDemangledStringParts0, RevDemangledStringParts).
+ %
+:- pred demangle(string::in, int::in, int::in, int::in,
+ strings::in, strings::out) is det.
+
+demangle(S0, I, J, L, RevSs0, RevSs) :-
+ ( if J >= L then
+ RevSs = ( if RevSs0 \= [], I < L
+ then [unsafe_substring(S0, I, L - I) | RevSs0]
+ else RevSs0
+ )
+ else
+ C = S0 ^ unsafe_elem(J),
+ ( if C \= 'Z' then
+ % This is not an escaped sequence.
+ %
+ demangle(S0, I, J + 1, L, RevSs0, RevSs)
+ else if
+ J + 1 < L,
+ S0 ^ unsafe_elem(J + 1) = 'Z'
+ then
+ % This is a `ZZ' escaped sequence.
+ %
+ RevSs1 = ( if I = J
+ then ["Z" | RevSs0]
+ else ["Z", unsafe_substring(S0, I, J - I) | RevSs0]
+ ),
+ demangle(S0, J + 2, J + 2, L, RevSs1, RevSs)
+ else if
+ J + 2 < L,
+ HH = unsafe_substring(S0, J + 1, 2),
+ base_string_to_int(16, HH, CharCode),
+ char.to_int(Char, CharCode),
+ D = char_to_string(Char)
+ then
+ % This is a `Z##' escaped sequence.
+ %
+ RevSs1 = ( if I = J
+ then [D | RevSs0]
+ else [D, unsafe_substring(S0, I, J - I) | RevSs0]
+ ),
+ demangle(S0, J + 3, J + 3, L, RevSs1, RevSs)
+ else
+ throw("mangle.demangle: improperly mangled string `" ++ S0 ++ "'")
+ )
+ ).
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
Index: tests/hard_coded/Mmakefile
===================================================================
RCS file: /home/mercury1/repository/tests/hard_coded/Mmakefile,v
retrieving revision 1.242
diff -u -r1.242 Mmakefile
--- tests/hard_coded/Mmakefile 2 Dec 2004 08:03:57 -0000 1.242
+++ tests/hard_coded/Mmakefile 7 Dec 2004 05:03:10 -0000
@@ -166,6 +166,7 @@
test_bitset \
test_cord \
test_imported_no_tag \
+ test_mangle \
tim_qual1 \
time_test \
trans_intermod_user_equality \
Index: tests/hard_coded/test_mangle.exp
===================================================================
RCS file: tests/hard_coded/test_mangle.exp
diff -N tests/hard_coded/test_mangle.exp
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ tests/hard_coded/test_mangle.exp 7 Dec 2004 05:02:26 -0000
@@ -0,0 +1,24 @@
+mangled("foo") = "foo"
+demangled("foo") = "foo"
+mangled("foo_bar") = "foo_bar"
+demangled("foo_bar") = "foo_bar"
+mangled("") = ""
+demangled("") = ""
+mangled("FOO") = "FOO"
+demangled("FOO") = "FOO"
+mangled("FoO") = "FoO"
+demangled("FoO") = "FoO"
+mangled("Buzz") = "Buzz"
+demangled("Buzz") = "Buzz"
+mangled("BUZZ") = "BUZZZZ"
+demangled("BUZZZZ") = "BUZZ"
+mangled("f/0") = "fZ2F0"
+demangled("fZ2F0") = "f/0"
+mangled("fZ2F0") = "fZZ2F0"
+demangled("fZZ2F0") = "fZ2F0"
+mangled("bang!bang!") = "bangZ21bangZ21"
+demangled("bangZ21bangZ21") = "bang!bang!"
+mangled("this+that") = "thisZ2Bthat"
+demangled("thisZ2Bthat") = "this+that"
+mangled("s p a c e s") = "sZ20pZ20aZ20cZ20eZ20s"
+demangled("sZ20pZ20aZ20cZ20eZ20s") = "s p a c e s"
Index: tests/hard_coded/test_mangle.m
===================================================================
RCS file: tests/hard_coded/test_mangle.m
diff -N tests/hard_coded/test_mangle.m
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ tests/hard_coded/test_mangle.m 7 Dec 2004 05:02:00 -0000
@@ -0,0 +1,43 @@
+%-----------------------------------------------------------------------------%
+% test_mangle.m
+% Ralph Becket <rafe at cs.mu.oz.au>
+% Tue Dec 7 14:48:28 EST 2004
+% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+%
+%-----------------------------------------------------------------------------%
+
+:- module test_mangle.
+
+:- interface.
+
+:- import_module io.
+
+
+
+:- pred main(io :: di, io :: uo) is det.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module string, list, mangle.
+
+%-----------------------------------------------------------------------------%
+
+main(!IO) :-
+ list.foldl(
+ test_mangle,
+ ["foo", "foo_bar", "", "FOO", "FoO", "Buzz", "BUZZ",
+ "f/0", "fZ2F0", "bang!bang!", "this+that", "s p a c e s"],
+ !IO
+ ).
+
+
+:- pred test_mangle(string::in, io::di, io::uo) is det.
+
+test_mangle(String, !IO) :-
+ Mangled = mangled(String),
+ Demangled = demangled(Mangled),
+ io.format("mangled(\"%s\") = \"%s\"\n", [s(String), s(Mangled)], !IO),
+ io.format("demangled(\"%s\") = \"%s\"\n", [s(Mangled), s(Demangled)], !IO).