[mercury-users] Canonical terms

Michael Day mikeday at yeslogic.com
Mon Jan 12 19:24:41 AEDT 2009


Hi,

In Prince we have some big style structures which tend to proliferate 
and take up a lot of memory, so we canonicalise them by keeping a map 
which we check every time a new style is created. Java programmers will 
find this reminiscent of string interning, and Lisp programmers will 
think of hash-consing, it's all the same idea.

Recently we have created a generic module for doing this with arbitrary 
types which I include below. It has a very simple interface:

:- type canon(T).

:- func make_canon(T) = canon(T).

:- func get_canon_val(canon(T)) = T.

Declaratively, T = get_canon_val(make_canon(T)), so you only get back 
what you put in. Operationally, it keeps a map of values it has seen 
before in a mutable variable, and make_canon will return a reference to 
a previous equal value if one exists.

Originally there was only a single function, make_canon(T) = T, however 
introducing the new type provides an advantage for comparisons. Once you 
canonicalise all the values of a type, testing for equality should only 
require a pointer comparison, with no examination of the internal 
structure of the object itself. Since canon(T) is just a c_pointer, it 
should be compared more efficiently than comparing T itself.

Obviously this module is C-specific at the moment, but it would be easy 
to make it portable.

One remaining weakness of this approach is that the mutable map will 
keep a reference to canonical values indefinitely, even if all other 
references go out of scope, so they can never be collected. The only 
solution to this would appear to be something like the weak references 
found in Java, which permit garbage collection if they are the only 
remaining reference to an object. Alternatively, a reset predicate could 
flush the entire map, which would be appropriate for situations when all 
of the values are known to be out of scope.

:- module canon.

% Copyright (C) 2009 YesLogic Pty. Ltd.

:- interface.

:- type canon(T).

:- func make_canon(T) = canon(T).

:- func get_canon_val(canon(T)) = T.

%--------------------------------------------------------------------%

:- implementation.

:- import_module map, type_desc, univ.

:- type canon(T)
     --->    canon(c_pointer).

:- type canon_map == map(univ, c_pointer).

:- mutable(canon_maps, map(type_desc, canon_map), map.init, ground,
     [untrailed]).

%--------------------------------------------------------------------%

:- pragma promise_pure(make_canon/1).

make_canon(T) = Canon :-
     TypeDesc = type_of(T),
     Univ = univ(T),
     semipure get_canon_maps(CanonMaps0),
     ( if search(CanonMaps0, TypeDesc, Map0) then
	( if search(Map0, Univ, Canon0) then
	    Canon = canon(Canon0)
	else
	    Canon0 = unsafe_c_pointer(T),
	    Canon = canon(Canon0),
	    set(Map0, Univ, Canon0, Map),
	    set(CanonMaps0, TypeDesc, Map, CanonMaps),
	    impure set_canon_maps(CanonMaps)
	)
     else
	Map0 = map.init,
	Canon0 = unsafe_c_pointer(T),
	Canon = canon(Canon0),
	set(Map0, Univ, Canon0, Map),
	set(CanonMaps0, TypeDesc, Map, CanonMaps),
	impure set_canon_maps(CanonMaps)
     ).

%--------------------------------------------------------------------%

:- pragma foreign_proc(c,
	get_canon_val(Canon::in) = (Val::out),
	[will_not_call_mercury, thread_safe, promise_pure], "
     Val = Canon;
").

:- func unsafe_c_pointer(T) = c_pointer.

:- pragma foreign_proc(c,
	unsafe_c_pointer(Val::in) = (Ptr::out),
	[will_not_call_mercury, thread_safe, promise_pure], "
     Ptr = Val;
").

%--------------------------------------------------------------------%

:- end_module canon.

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to:       mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions:          mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the users mailing list