[m-dev.] for review: namespaces in xml interface

Tyson Dowd trd at cs.mu.OZ.AU
Wed Jan 3 13:42:02 AEDT 2001


On 02-Jan-2001, Ina Cheng <inch at students.cs.mu.oz.au> wrote:
> Hi,
> 
> Add some predicates in the xml interface to take into account 
> namespaces in xml.
> 
> Ina
> 
> ========================================================================
> 
> Estimated hours taken: 6 days
> 
> Add a module to turn an xml document into a namespace-aware xml document.
> 
> extra/xml/tryit.m
>         change the sample program to print out the new xml output
> 
> extra/xml/xml.m
>         add xml.ns.m to the wrapper module
> 
> extra/xml/xml.ns.m
>         the new module containing predicates to turn an xml document to a 
>         namespace-aware xml document
> 

> Index: xml.ns.m
> ===================================================================
> RCS file: xml.ns.m
> diff -N xml.ns.m
> --- /dev/null   Wed Nov 15 09:24:47 2000
> +++ xml.ns.m    Tue Jan  2 10:29:20 2001
> @@ -0,0 +1,228 @@
> +%---------------------------------------------------------------------------%
> +% Copyright (C) 2000 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU Library General
> +% Public License - see the file COPYING.LIB in the Mercury distribution.
> +%---------------------------------------------------------------------------%
> +%
> +% File: xml.ns.m
> +% Main author: conway at cs.mu.oz.au, inch at students.cs.mu.oz.au
> +%
> +% This module provides predicates to turn an xml document into a
> "namespace-
> +% aware" xml document.

Your mailer (pine?) seems to like wrapping long lines.  If you can, it
might be a good idea to turn line wrapping off when sending diffs.

You'll want to make 2000 into 2000,2001 in the copyright message.

> +%
> +%---------------------------------------------------------------------------%
> +:- module xml:ns.
> +
> +:- interface.
> +
> +:- import_module list, array, string, std_util, io.
> +:- import_module xml:doc.
> +
> +:- type nsDocument
> +       --->    nsDoc(
> +                   prestuff    :: list(ref(nsContent)),
> +                   root        :: ref(nsContent),
> +                   poststuff   :: list(ref(nsContent)),
> +                   content     :: array(nsContent)
> +               ).

Just because Tom gets away with writing code without any comments
doesn't mean you will ;-)

A line explaining what each type represents, and how it relates to the
similar types used in other modules, would be nice.

A few lines at the top of the module explaining the difference
between a namespace-aware XML document and a normal XML document would
also be nice, along with a reference to any relevant specifications.

> +
> +:- type nsContent
> +       --->    nsElement(nsElement)
> +       ;       pi(string, string)
> +       ;       comment(string)
> +       ;       data(string)
> +       .
> +
> +:- type nsElement
> +       --->    nsElement(
> +                   eName       :: qName,
> +                   eAttrs      :: list(nsAttribute),
> +                   eContent    :: list(ref(content)),
> +                   eNamespaces :: nsList       % Prefix - URI
> +               ).
> +
> +:- type nsAttribute
> +       --->    nsAttribute(
> +                   aName       :: qName,
> +                   aValue      :: string
> +               ).
> +
> +:- type qName
> +       --->    qName(
> +                   localName   :: string,
> +                   nsURI       :: string
> +               ).
> +
> +:- type nsList == list(pair(string, string)).
> +
> +:- pred nsTranslate((xml:doc):document::in, nsDocument::out) is det.

I think it would be clearer to define an nsURI type

	:- type nsURI == string. 

so that we can tell which strings are supposed to be URIs more easily.
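
For instance, qName could then be declared along these lines.  This is
only a sketch: the module name nsuri_sketch is made up, and the field
comments are my reading of what each field is meant to hold.

```mercury
:- module nsuri_sketch.
:- interface.

	% A namespace URI is still just a string underneath, but the
	% synonym documents which strings are meant to be URIs.
:- type nsURI == string.

:- type qName
	--->	qName(
			localName	:: string,	% name without its prefix
			nsURI		:: nsURI	% namespace the name is in
		).
```

The same synonym could be used for the URI half of each pair in nsList.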


Also, all these predicates could do with a single-line comment
explaining what they do.

> +:- pred extractNamespaceDecls(list(attribute), nsList, list(attribute)).
> +:- mode extractNamespaceDecls(in, out, out) is det.
> +
> +extractNamespaceDecls([], [], []).
> +extractNamespaceDecls([Attr|Attrs], NSList, NewAttrs) :-
> +       split_on_colon(Attr^aName, Prefix, Suffix),
> +       (
> +               % eg. < book xmlns:isbn="someURI" >
> +               % Prefix = xmlns
> +               % Suffix = isbn
> +
> +               is_xmlns(Prefix) ->
> +               NSList = [(Suffix - Attr^aValue) | NSList0],
> +               NewAttrs = NewAttrs0
> +       ;
> +               NSList = NSList0,
> +               NewAttrs = [Attr|NewAttrs0]
> +       ),

You should format this code with the `->' in the same column as the
`(', `;' and `)'.
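
Applied to a condition like the one above, that layout looks roughly as
follows.  This is a standalone sketch rather than a patch against
xml.ns.m: the module ite_layout, the attr type and the split_xmlns
predicate are all made up for illustration, and the `xmlns:' test is
done with string__append instead of your split_on_colon and is_xmlns.

```mercury
:- module ite_layout.
:- interface.
:- import_module list.

	% Simplified, hypothetical stand-in for the attribute type.
:- type attr
	--->	attr(string, string).	% name, value

	% split_xmlns(Attrs, Decls, Others):
	% Decls holds the `xmlns:prefix' declarations found in Attrs
	% (with `xmlns:' stripped from the name), Others holds the rest.
:- pred split_xmlns(list(attr)::in, list(attr)::out, list(attr)::out)
	is det.

:- implementation.
:- import_module string.

split_xmlns([], [], []).
split_xmlns([attr(Name, Value) | Attrs], Decls, Others) :-
	split_xmlns(Attrs, Decls0, Others0),
	(
		% eg. for xmlns:isbn="someURI", Suffix = "isbn"
		string__append("xmlns:", Suffix, Name)
	->
		Decls = [attr(Suffix, Value) | Decls0],
		Others = Others0
	;
		Decls = Decls0,
		Others = [attr(Name, Value) | Others0]
	).
```

With the `->' on a line of its own, the condition, the then-branch and
the else-branch each stand out as a separate block.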

> +       extractNamespaceDecls(Attrs, NSList0, NewAttrs0).
> +
> +
> +:- pred namespaceizeName(namespaces, string, string, qName).
> +:- mode namespaceizeName(in, in, in, out) is det.
> +
> +namespaceizeName(Namespaces, Default, Name, QName) :-
> +       split_on_colon(Name, Prefix, Suffix),
> +       (
> +               % for case when element name = prefix:suffix
> +               map__search(Namespaces, Prefix, URI)
> +       ->
> +               QName = qName(Suffix, URI)
> +       ;
> +               % for case when attribute name = xmlns:suffix
> +               is_xmlns(Prefix),
> +               map__search(Namespaces, Suffix, URI)
> +       ->
> +               QName = qName(Suffix, URI)
> +       ;
> +               % for case when element name has no prefix
> +               QName = qName(Suffix, Default)
> +       ).

This level of commenting is good -- put that everywhere and I'd be very
happy.

> +
> +:- pred is_xmlns(string::in) is semidet.
> +is_xmlns("xmlns").

A comment explaining that this string comes from the XML namespaces
specification would be good.

> +
> +
> +:- func convert_type(content) = nsContent.
> +
> +convert_type(comment(S)) = comment(S).
> +convert_type(data(S)) = data(S).
> +convert_type(pi(S,S0)) = pi(S,S0).
> +
> +% XXX how to use inst such that I don't have to define the following line
> +
> +convert_type(element(_)) = nsElement(nsElement(qName("",""),[],[],[])).

Something like

:- inst not_an_element ==
        bound(          comment(ground)
                ;       data(ground)
                ;       pi(ground, ground)
        ).

:- func convert_type(content) = nsContent.
:- mode convert_type(in(not_an_element)) = out is det.


Or you could just call require__error/1 on the element(_) case.
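
A rough sketch of the require__error/1 version follows.  To keep it in
one module, the two content types here are simplified stand-ins with
made-up constructor names (nsPi, nsComment, nsData), not the real types
from xml:doc and xml:ns, and the error message text is just a
suggestion.

```mercury
:- module convert_sketch.
:- interface.

	% Hypothetical, simplified stand-ins for the real content types.
:- type content
	--->	element(string)
	;	pi(string, string)
	;	comment(string)
	;	data(string).

:- type nsContent
	--->	nsElement(string)
	;	nsPi(string, string)
	;	nsComment(string)
	;	nsData(string).

:- func convert_type(content) = nsContent.

:- implementation.
:- import_module require.

convert_type(comment(S)) = nsComment(S).
convert_type(data(S)) = nsData(S).
convert_type(pi(Target, Data)) = nsPi(Target, Data).
	% Elements are expected to be translated elsewhere, so getting
	% here is treated as a bug in the caller.
convert_type(element(_)) = _ :-
	error("convert_type: unexpected element").
```

The function stays det without a dummy result, because the call to
error/1 is erroneous and so never returns.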

> +
> +
> +:- pred foldl(pred(array(content), namespaces, string, ref(content),
> +       T, T), array(content), namespaces, string, list(ref(content)), T, T).
> +:- mode foldl(pred(in, in, in, in, in, out) is det, in, in, in, in, in,
> +       out) is det.
> +
> +foldl(_Pred, _, _, _, [], Acc, Acc).
> +foldl(Pred, Content, NameSpaces, Default, [Ref|Refs], Acc0, Acc) :-
> +       foldl(Pred, Content, NameSpaces, Default, Refs, Acc1, Acc),
> +       call(Pred, Content, NameSpaces, Default, Ref, Acc0, Acc1).
> +
> 

Calling this foldl is a bit confusing, since the module also imports
list and the name is easily mistaken for list__foldl.

At the very least, you should write the fully qualified name
	xml:ns:foldl
when you use it.

Finally, `:' will go away as the module qualifier at some point in the
future, so it's best to use `__' for now, as we will be keeping it.

This is a problem with the entire xml directory, so I'm not too
concerned if you don't fix it now.

Otherwise the code is fine.  I'm happy for you to commit once you have
addressed these comments.  Don't take too long with the commenting; I'm
really only after a quick summary of what the code does and how it does
it.

-- 
       Tyson Dowd           # 
                            #  Surreal humour isn't everyone's cup of fur.
     trd at cs.mu.oz.au        # 
http://www.cs.mu.oz.au/~trd #