[mercury-users] Canonical terms
Michael Day
mikeday at yeslogic.com
Mon Jan 12 19:24:41 AEDT 2009
Hi,
In Prince we have some big style structures which tend to proliferate
and take up a lot of memory, so we canonicalise them by keeping a map
which we check every time a new style is created. Java programmers will
find this reminiscent of string interning, and Lisp programmers will
think of hash-consing, it's all the same idea.
Recently we have created a generic module for doing this with arbitrary
types which I include below. It has a very simple interface:
:- type canon(T).
:- func make_canon(T) = canon(T).
:- func get_canon_val(canon(T)) = T.
Declaratively, T = get_canon_val(make_canon(T)), so you only get back
what you put in. Operationally, it keeps a map of values it has seen
before in a mutable variable, and make_canon will return a reference to
a previous equal value if one exists.
Originally there was only a single function, make_canon(T) = T, however
introducing the new type provides an advantage for comparisons. Once you
canonicalise all the values of a type, testing for equality should only
require a pointer comparison, with no examination of the internal
structure of the object itself. Since canon(T) is just a c_pointer, it
should be compared more efficiently than comparing T itself.
Obviously this module is C-specific at the moment, but it would be easy
to make it portable.
One remaining weakness of this approach is that the mutable map will
keep a reference to canonical values indefinitely, even if all other
references go out of scope, so they can never be collected. The only
solution to this would appear to be something like the weak references
found in Java, which permit garbage collection if they are the only
remaining reference to an object. Alternatively, a reset predicate could
flush the entire map, which would be appropriate for situations when all
of the values are known to be out of scope.
:- module canon.
% Copyright (C) 2009 YesLogic Pty. Ltd.
:- interface.
:- type canon(T).
:- func make_canon(T) = canon(T).
:- func get_canon_val(canon(T)) = T.
%--------------------------------------------------------------------%
:- implementation.
:- import_module map, type_desc, univ.
:- type canon(T)
---> canon(c_pointer).
:- type canon_map == map(univ, c_pointer).
:- mutable(canon_maps, map(type_desc, canon_map), map.init, ground,
[untrailed]).
%--------------------------------------------------------------------%
:- pragma promise_pure(make_canon/1).
make_canon(T) = Canon :-
TypeDesc = type_of(T),
Univ = univ(T),
semipure get_canon_maps(CanonMaps0),
( if search(CanonMaps0, TypeDesc, Map0) then
( if search(Map0, Univ, Canon0) then
Canon = canon(Canon0)
else
Canon0 = unsafe_c_pointer(T),
Canon = canon(Canon0),
set(Map0, Univ, Canon0, Map),
set(CanonMaps0, TypeDesc, Map, CanonMaps),
impure set_canon_maps(CanonMaps)
)
else
Map0 = map.init,
Canon0 = unsafe_c_pointer(T),
Canon = canon(Canon0),
set(Map0, Univ, Canon0, Map),
set(CanonMaps0, TypeDesc, Map, CanonMaps),
impure set_canon_maps(CanonMaps)
).
%--------------------------------------------------------------------%
:- pragma foreign_proc(c,
get_canon_val(Canon::in) = (Val::out),
[will_not_call_mercury, thread_safe, promise_pure], "
Val = Canon;
").
:- func unsafe_c_pointer(T) = c_pointer.
:- pragma foreign_proc(c,
unsafe_c_pointer(Val::in) = (Ptr::out),
[will_not_call_mercury, thread_safe, promise_pure], "
Ptr = Val;
").
%--------------------------------------------------------------------%
:- end_module canon.
Best regards,
Michael
--
Print XML with Prince!
http://www.princexml.com
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to: mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions: mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
More information about the users
mailing list