# [mercury-users] XML / DOM

Richard A. O'Keefe ok at atlas.otago.ac.nz
Tue Dec 19 12:26:54 AEDT 2000

Thomas Conway <conway at cs.mu.OZ.AU> wrote:

I agree with you that a *real* tree is much nicer to deal with than
an indirect representation, however I would note that if you're dealing
with beasts such as XPath, then it is rather inconvenient that the
nodes in the document tree don't have identity.

They can quite trivially be *given* identity.

Being able to deal
nicely with IDREF[S] and so forth using pointers' is rather useful.

But in *NO* way requires the use of mutable data structures or cyclic terms!
I mean, I process SGML documents with Prolog, and I *do* deal "nicely"
with IDREFs.

Like it or not, pointers/identity
lets you perform some operations MUCH more efficiently. For example
consider the case where I want the parent of a node.

By a not very suprising coincidence, I am putting the final touches on
a paper which points out that this doesn't even come close to being true.
Here's the opening:

\title{O(1) reversible tree navigation without cycles}
\author{Richard A. O'Keefe\\University of Otago}
\date{December 2000}
\begin{document}
\maketitle
\begin{abstract}
Imperative programmers often use cyclically linked trees
in order to achieve O(1) navigation time to neighbours.
Some logic programmers believe that cyclic terms are necessary
to achieve the same in logic-based languages.
An old but little-known technique
in the form of reversible predicates.
A small modification provides O(1) amortised time and space editing.
\end{abstract}

(You have to read very carefully there to realise that the editing
version is O(1) but *not* reversible.)

I have benchmark results for several Prologs, GHC, and Clean.
In fact the Clean version is competitive with C, largely due to the
storage inefficiencies forced by a DOM-style set of rigid interlocking links.

If I have only
values, first I have to unify subtrees to search for matches (which
may be arbitarily expensive),

Wrong.

and even then I have no way of knowing
if this is the *real* parent,

Wrong.

or if there is more than one possible
parent (unless I return all possible parents, but then we have the
same problem all over again).

and Wrong again.

Without node identity, it would be almost impossible to implement
XPath and XSLT efficiently.

I have no idea about implementing XSLT; I've tried to read the spec and
was so disgusted that I have trouble thinking about it.  Yet another typical
W3C spec with oodles of syntax, but only warm fuzzies where the hard parts
of semantics are concerned.

But XPath, no, you don't need node identity for that at all.
cycles of mutable objects for that at all.

this paper published, because it's only a minor extension of an *old*
technique that David H. D. Warren showed me back in about 1983.

The really neat thing about it is that the crucial idea is to distinguish
carefully between a tree and a pointer to a place within a tree.  As long
as you think those have to be the same thing, you'll have a hard time
breaking the hypnosis of Javascript.

--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
`