[mercury-users] XML / DOM

Richard A. O'Keefe ok at atlas.otago.ac.nz
Tue Dec 19 12:26:54 AEDT 2000

Thomas Conway <conway at cs.mu.OZ.AU> wrote:

	I agree with you that a *real* tree is much nicer to deal with than
	an indirect representation, however I would note that if you're dealing
	with beasts such as XPath, then it is rather inconvenient that the
	nodes in the document tree don't have identity.

They can quite trivially be *given* identity.

	Being able to deal
	nicely with IDREF[S] and so forth using `pointers' is rather useful.
But in *NO* way requires the use of mutable data structures or cyclic terms!
I mean, I process SGML documents with Prolog, and I *do* deal "nicely"
with IDREFs.

	Like it or not, pointers/identity
	lets you perform some operations MUCH more efficiently. For example
	consider the case where I want the parent of a node.

By a not very suprising coincidence, I am putting the final touches on
a paper which points out that this doesn't even come close to being true.
Here's the opening:

\title{O(1) reversible tree navigation without cycles}
\author{Richard A. O'Keefe\\University of Otago}
\date{December 2000}
Imperative programmers often use cyclically linked trees
in order to achieve O(1) navigation time to neighbours.
Some logic programmers believe that cyclic terms are necessary
to achieve the same in logic-based languages.
An old but little-known technique
provides O(1) time and space navigation without cyclic links,
in the form of reversible predicates.
A small modification provides O(1) amortised time and space editing.

(You have to read very carefully there to realise that the editing
version is O(1) but *not* reversible.)

I have benchmark results for several Prologs, GHC, and Clean.
In fact the Clean version is competitive with C, largely due to the
storage inefficiencies forced by a DOM-style set of rigid interlocking links.

	If I have only
	values, first I have to unify subtrees to search for matches (which
	may be arbitarily expensive),


	and even then I have no way of knowing
	if this is the *real* parent,


	or if there is more than one possible
	parent (unless I return all possible parents, but then we have the
	same problem all over again).
and Wrong again.

	Without node identity, it would be almost impossible to implement
	XPath and XSLT efficiently.
I have no idea about implementing XSLT; I've tried to read the spec and
was so disgusted that I have trouble thinking about it.  Yet another typical
W3C spec with oodles of syntax, but only warm fuzzies where the hard parts
of semantics are concerned.

But XPath, no, you don't need node identity for that at all.
You only need O(1) navigation, and you *DON'T* need rigidly linked
cycles of mutable objects for that at all.

The really strange thing about this is that I was worried about getting
this paper published, because it's only a minor extension of an *old*
technique that David H. D. Warren showed me back in about 1983.

The really neat thing about it is that the crucial idea is to distinguish
carefully between a tree and a pointer to a place within a tree.  As long
as you think those have to be the same thing, you'll have a hard time
breaking the hypnosis of Javascript.

mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe

More information about the users mailing list