[mercury-users] XML / DOM

Richard A. O'Keefe ok at atlas.otago.ac.nz
Tue Dec 19 13:58:28 AEDT 2000

	So are you suggesting that nodes be identified by contatining a `thingy'
	that uniquely identifies them? eg
No.  You are still assuming that a tree and a pointer to a subtree
are the same data structure.

You do NOT in fact need identity at all.
XML is a little bit confusing, because it doesn't (yet) have an official
semantics other than the two (different ones) it inherited from SGML,
although it does have several semi-official ones.  One of the things that
has been giving the XML Core people a hard time is precisely this lack of
a single clear semantics; now they have to clean up a situation where every
XML-related specification assumes a *different* semantics.

But if we take the idea of a GROVE from SGML, nodes do NOT strictly speaking
have identity.  Nodes are distinguished by their properties.  ("A document
is a database", not a plex.)  Some nodes have ID properties, and in the
grove that a parser hands to an application, there must be only one node
occurrence with any particular ID, but that is not a pointer.

What you need is a data type "pointer" which gives you the following
operations, amongst others:

    start(+Tree, -Ptr)
    ptr_tree(+Ptr, -Tree) or equivalent
    left_right(?LeftPtr, ?RightPtr)
    parent_first_child(?ParentPtr, ?FirstChildPtr)
    up_down(?ParentPtr, ?ChildPtr)

I cannot recall XPath syntax off the top of my head, but suppose
I am given something like


and I want to find the TR element that contains a TD that contains
an A with NAME = fred anywhere.

interesting_tr(TablePtr, TrPtr) :-
    up_down(TablePtr, TrPtr),
    up_down(TrPtr, TdPtr),
    up_down_star(TdPtr, APtr),
    ptr_tree(APtr, element(a,Atts,_)),
    memberchk(name=fred, Atts).

I'd like to see anyone do *that* more easily with the DOM!

But now, let's suppose that we have a table mapping anchors (X) to
the <A> elements (<A ... NAME=X...>) that define them, and we want
to find the TablePtr and TrPtr as above.  I'll assume obvious
operations on the table.

interesting_tr(AnchorTable, TablePtr, TrPtr) :-
    lookup(fred, AnchorTable, APtr),
    up_down_star(TdPtr, APtr),
    up_down(TrPtr, TdPtr),
    up_down(TablePtr, TrPtr).

How can you get such a table?

It is 3:40 and I am still answering e-mail.  Sigh.

make_anchor_table(RootPtr, AnchorTable) :-
    make_anchor_table(RootPtr, AnchorTable0, AnchorTable).

make_anchor_table(P, T0, T) :-
    (   ptr_tree(P, element(a,Atts,_), member(name=X, Atts) ->
	insert_anchor(X, P, T0, T1)
    ;/* not an <A NAME=X> element */
	T1 = T0
    (   at_bottom(P) -> T2 = T1
    ;   parent_first_child(P, C), make_anchor_table(C, T1, T2)
    (   at_right(P) -> T = T2
    ;   left_right(P, R), make_anchor_table(R, T2, T)

This code is the equivalent of

    void walk_and_make_anchor_table(ptr p, tbl t) {
        if (p->is_element && streql(p->name, "a")) {
            att *a = find_attr(p, "name");
            if (a != 0) insert_anchor(a->name, a->value, t);
	if (p->first_child != 0)
	    walk_and_make_anchor_table(t->first_child, t);
	if (p->right_sibling != 0)
	    walk_and_make_anchor_table(t->right_sibling, t)
    tbl make_anchor_table(ptr root) {
	tbl t = init_anchor();

	walk_and_make_anchor_table(root, t);
	return t;

The predicate that shows child to parent navigation is
interesting_tr/3.  The steps it takes once it has looked up
the anchor name really are O(1) time,
 - without cycles
 - without mutable objects
 - without node identity

	If you could either provide a sketch of how this is done, or a link
	to your paper, then that would be very useful.
I am not willing to broadcast this until the paper is sent off, I'm sorry.
My publication record is rather thin of recent years, and if someone else
should publish this first, my job will quite literally be at risk.

But I have provided plenty of hints!

	> I have no idea about implementing XSLT; I've tried to read the spec and
	> was so disgusted that I have trouble thinking about it.  Yet another typical
	> W3C spec with oodles of syntax, but only warm fuzzies where the hard parts
	> of semantics are concerned.
	Indeed, the spec is light on semantics, however XSLT is nonetheless
	quite a nice *declarative* way of transforming documents.
Before I call something a "nice declarative way" to do anything,
I require
 - that it be readable, to count as nice
 - that it have a clear semantics, to count as declarative.
XSL fails on both counts.  If you want a nice declarative way of
transforming documents, look at HaXML.

By the way, I am proposing to set
    "Produce a formal semantics of CSS2 rule selection"
as a fourth-year project next year.  If anyone knows where I can find
such an animal, I'll have to take that project off the list.

XSL gets its formatting concepts from CSS, and I've read the CSS1 and CSS2
"specifications" *and* the book written by the people that invented it, and
I still don't know what the formatting model of CSS and XSL actually is.
They've given me lots of warm fuzzies, but no clear model, and in the real
world some warm fuzzies are tarantulas.

mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe

More information about the users mailing list