[m-dev.] for review: new predicates in map.m

Fergus Henderson fjh at cs.mu.OZ.AU
Tue Oct 20 20:38:12 AEST 1998


On 20-Oct-1998, Peter Schachte <pets at cs.mu.OZ.AU> wrote:
> On Tue, Oct 20, 1998 at 03:03:28PM +1000, Fergus Henderson wrote:
> 
> > Let me ask another question, about map equality.
> > 
> > 	:- mode p(in, in, in, in) is semidet.
> > 	p(K1, V1, K2, V2) :-
> > 		map__init(MA0),
> > 		map__det_insert(MA0, K1, V1, MA1),
> > 		map__det_insert(MA1, K2, V2, MA),
> > 		map__init(MB0),
> > 		map__det_insert(MB0, K2, V2, MB1),
> > 		map__det_insert(MB1, K1, V1, MB),
> > 		MA = MB.
> > 
> > Should the unification `MA = MB' succeed?
> 
> Yes.  Maps should be independent of the order in which elements are
> inserted.

OK, call this "conceptual equality", as opposed to the
"representational equality". 
We discussed this quite a bit at the Mercury meeting tonight.

I prefer the "conceptual equality" approach, but
it does have some drawbacks: defining a user-defined equality
for the `map' type is straight-forward, but that implies a need to
promise representation-invariance for every operation which
examines the representation, and explicitly coding all those
promises may be tedious.  It's a fair bit of work for little
practical benefit.  On the other hand, the alternative of not
requiring such promises would be dangerous, because users might
unknowingly define operations on an abstract type which expose
the representation and which therefore break referential transparency.

Zoltan suggested a third approach, namely disallowing equality
for certain types.  This is a bit problematical because checking
this statically is difficult, since equality may be needed in
some modes but not others, and we don't have mode-dependent type classes.

However, even if that problem were overcome, disallowing equality
wouldn't solve the dilemma.  You still need to specify
the semantics for equality, even if you don't let users
call it, because referential transparency means that the semantics
for equality affect the semantics of other predicates. 
For example, since `map__keys' is declared `det',
then for every X, Y such that X = Y we have

	(map__keys(X, XKs), map__keys(Y, YKs)) => XKs = YKs

If map__keys is allowed to return different results for
maps that are conceptually equal but not representationally
equal, this implies that the equality on maps must be a
representational equality rather than a conceptual equality.

> If you take the oposite view, then you have to make sure
> that two maps unify iff the elements were added in the same order.

You need to ensure that two maps unify if the elements were added in
the same order.  But the converse (if two maps unify then the
elements were added in the same order) does not necessarily
have to hold.

> > > My knee-jerk reaction to adding a map__sorted_keys/2 predicate is that
> > > it's unnecessary
> > 
> > What do you suggest as the alternative?
> > Using map__keys plus list__sort?  That would be inefficient.
> 
> Yes.  How inefficient is sorting a sorted list?  I would expect it to
> be linear (you're not using an unmodified quicksort are you?), and
> since the cost of collecting the keys is linear, it's only a constant
> factor difference.

Yes, but constant factors may be important; in this case, I would guess
that the constant factor could easily be 2 or more.  I think that is
high enough to warrant a `map__sorted_keys'.

Having the unsorted version is less justifiable, since it is
hard to imagine a reasonable implementation of `map' for which `map__keys'
would not return the elements in sorted order anyway -- especially
if the equality on maps is conceptual equality rather than representational
equality.  Dynamically resizing hash tables would probably need to sort
the keys, just to ensure that map__keys returns the same results for
hash tables with the same elements that happened to have different sizes
(you don't want to enforce the invariant that hash tables with a given
number of elements always have the same size, because then a sequence
of insertions and deletions that cause the number of elements to
repeatedly cross a threshold would result in repeatedly resizing
the hash table, thus giving you O(N) for each such insertion/deletion
instead of amortized O(1)).  So the only type that I can think of for
which it would help would be fixed-size hash tables.

If `map__keys' wasn't there already, then I think it would not be
worth adding it anytime in the near future.  But since it is there
already, I don't think it is worth the bother of removing it.

Anyway, to summarize, I recommend that we do the following:

	- add `map__sorted_keys';

	- leave `map__keys' there, as is;

	- fix the implicit dependencies on the result of `map__keys'
	  in the compiler sources by doing a global search-and-replace
	  (s/map__keys/map__sorted_keys/g);
	
	- for the time being, leave equality on maps as is (representational);
	  but also make sure that future modifications do not add any
	  new predicates or change the behaviour of any existing predicates
	  in ways which could expose the representation (in other
	  words, preserve the status quo: currently the only predicate
	  that exposes the representation is the implicit unification
	  predicate).

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.



More information about the developers mailing list