[m-rev.] For review: Convert Mercury terms to XML

Julien Fischer juliensf at cs.mu.OZ.AU
Mon Dec 6 16:49:32 AEDT 2004


On Sat, 4 Dec 2004, Ian MacLarty wrote:

> For review by anyone.
>
...
>
>  %---------------------------------------------------------------------------%
>  %---------------------------------------------------------------------------%
> Index: library/to_xml.m
> ===================================================================
> RCS file: library/to_xml.m
> diff -N library/to_xml.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ library/to_xml.m	4 Dec 2004 10:46:28 -0000
> @@ -0,0 +1,1046 @@
> +%-----------------------------------------------------------------------------r
> +% Copyright (C) 1993-2004 The University of Melbourne.
> +% This file may only be copied under the terms of the GNU Library General
> +% Public License - see the file COPYING.LIB in the Mercury distribution.
> +%-----------------------------------------------------------------------------%
> +%
> +% File: xml_out.m.
> +% Main author: maclarty.
> +% Stability: low.
> +%
> +% A Mercury term to XML converter.
> +%
> +% This module contains predicates to write arbitrary Mercury terms to
> +% an output stream as XML.
> +%
> +% Each functor in a term is given a corresponding well-formed element name
> +% in the XML document.
> +%
> +% The element names `String', `Int', `Char' and `Float' are reserved for the
> +% corresponding Mercury builtin types.  The values for these elements are
> +% stored as parsed characater data inside the element.
> +%
> +% Elements for discriminated unions get their name from the functor name and
> +% the type name.  Array element names consist of the string "Array--" followed
> +% by the type name of the elements of the array.  All other types are assigned
> +% the reserved `Unrecognised' element name.
> +%
> +% Each element (including `Unrecognised') may also have certain attributes set:
> +%
> +% functor - the original functor name as returned by
> +% 	deconstruct.deconstruct/5.  This attribute will be present for
> +%	every element except elements for builtin types.
> +%
> +% typename - the type name of the Mercury type the element represents.
> +%	This attribute will also always be present for all elements except
> +%	elements for builtin types.
> +%
> +% field - discriminated union functor arguments (including those with a
> +%	builtin type) that have a field name will have this attribute set.
> +%
> +% The XML document can also be annotated with a style sheet reference.  Once a
> +% Mercury term is in XML it can be converted to many different formats using
> +% the appropriate stylesheet.  For example in the extras/xml_stylesheets
> +% distribution there are stylesheets to convert XML documents generated with
> +% this library back to Mercury terms that can be read with io.read/3.  There is
> +% also a stylesheet in extras to browse a Mercury term with Mozilla using XUL.
> +%
> +% To support third parties generating XML which is compatible with the XML
> +% generated by this library, a DTD for a Mercury type can also be generated.
> +%
> +% The generated DTD is also a good reference when creating a stylesheet as
> +% it contains comments describing the mapping from functors to elements.
> +%
> +%-----------------------------------------------------------------------------%
> +%-----------------------------------------------------------------------------%
> +
> +:- module to_xml.
> +:- interface.
> +
> +:- import_module io, int, deconstruct, std_util.
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- type maybe_stylesheet
> +	--->	with_stylesheet(string, string)	% stylesheet type and href.
You could perhaps make stylesheet_type and stylesheet_href field names
here.

> +	;	no_stylesheet.
> +
> +	% Values of this type specifys the DOCTYPE of an XML document when
> +	% the DOCTYPE is defined by an external DTD.
> +	%
> +:- type doctype
> +	--->	public(string)			% FPI
> +	;	public(string, string)		% FPI, URL
> +	;	system(string).			% URL
> +
> +	% Values of this type specify whether a DTD should be included in
> +	% a generated XML document and if so how.
> +	%
> +:- type maybe_dtd
> +			% Embed the entire DTD in the document.
> +	--->	embed
> +			% Included a reference to an external DTD.
> +	;	external(doctype)
> +			% Do not include any DOCTYPE information.
> +	;	no_dtd.
> +
> +	% Values of this type indicate whether a DTD was successfully
> +	% generated or not.  A DTD cannot be generated for a type with more
> +	% than one top-level functor since only one root element can be
> +	% specified by a DTD.
> +	%
> +:- type dtd_generation_result
> +	--->	ok
> +	;	multiple_functors_for_root.
> +
> +	% write_xml_doc(Term, MaybeStyleSheet, MaybeDTD, DTDResult, !IO).
> +	% Write Term to the current output stream as an XML document.
> +	% MaybeStyleSheet and MaybeDTD specify whether or not a stylesheet
> +	% reference and/or a DTD should be included.  Any non-canonical terms
> +	% will be canonicalized.  If an embedded DTD was requested and the type
> +	% has more than one top level functor then multiplt_functors_for_root
> +	% will be returned in DTDResult and nothing will be written.
> +	%
> +:- pred write_xml_doc(T::in, maybe_stylesheet::in, maybe_dtd::in,
> +	dtd_generation_result::out, io::di, io::uo) is det.
> +
> +	% write_xml_doc(Stream, Term, MaybeStyleSheet, MaybeDTD, DTDResult,
> +	%	!IO).
> +	% Same as write_xml_doc/5 except write the XML doc to the given
> +	% output stream.
> +	%
> +:- pred write_xml_doc(io.output_stream::in, T::in, maybe_stylesheet::in,
> +	maybe_dtd::in, dtd_generation_result::out, io::di, io::uo) is det.
> +
> +	% write_xml_doc_cc(Term, MaybeStyleSheet, MaybeDTD, DTDResult, !IO).
> +	% Write Term to the current output stream as an XML document.
> +	% MaybeStyleSheet and MaybeDTD specify whether or not a stylesheet
> +	% reference and/or a DTD should be included.  Any non-canonical terms
> +	% will be be written out in full. If an embedded DTD was requested and
> +	% the type has more than one top level functor then
> +	% multiplt_functors_for_root will be returned in DTDResult and nothing
> +	% will be written.
> +	%
> +:- pred write_xml_doc_cc(T::in, maybe_stylesheet::in, maybe_dtd::in,
> +	dtd_generation_result::out, io::di, io::uo) is cc_multi.
> +
> +	% write_xml_doc_cc(Stream, Term, MaybeStyleSheet, MaybeDTD, DTDResult,
> +	%	!IO).
> +	% Same as write_xml_doc/5 except write the XML doc to the given
> +	% output stream.
> +	%
> +:- pred write_xml_doc_cc(io.output_stream::in, T::in, maybe_stylesheet::in,
> +	maybe_dtd::in, dtd_generation_result::out, io::di, io::uo) is cc_multi.
> +
> +	% True if the given type doesn't have multiple top level functors.
> +	%
> +:- pred ok_to_generate_dtd(type_desc::in) is semidet.
s/doesn't/does not/

> +
> +	% write_dtd(Term, DTDResult, !IO).
> +	% Write a DTD for the given term to the current output stream.  If the
> +	% type of Term has more than one top level functor then
> +	% multiple_functors_for_root will be returned in DTDResult and
> +	% nothing will be written, otherwise ok is returned in DTDResult.
> +	%
> +:- pred write_dtd(T::unused, dtd_generation_result::out, io::di, io::uo)
> +	is det.
> +
> +	% write_dtd(Stream, Term, DTDResult, !IO).
> +	% Write a DTD for the given term to the specified output stream. If the
> +	% type of Term has more than one top level functor then
> +	% multiple_functors_for_root will be returned in DTDResult and
> +	% nothing will be written, otherwise ok is returned in DTDResult.
> +	%
> +:- pred write_dtd(io.output_stream::in, T::unused, dtd_generation_result::out,
> +	io::di, io::uo) is det.
> +
> +	% write_dtd_for_type(Type, DTDResult, !IO).
> +	% Write a DTD for the given type to the current output stream. If the
> +	% type has more than one top level functor then
> +	% multiple_functors_for_root will be returned in DTDResult and nothing
> +	% will be written, otherwise ok is returned in DTDResult.
> +	%
> +:- pred write_dtd_from_type(type_desc::in, dtd_generation_result::out,
> +	io::di, io::uo) is det.
> +
> +	% write_dtd_for_type(Stream, Type, DTDResult, !IO).
> +	% Write a DTD for the given type to the given output stream. If the
> +	% type has more than one top level functor then
> +	% multiple_functors_for_root will be returned in DTDResult and nothing
> +	% will be written, otherwise ok is returned in DTDResult.
> +	%
> +:- pred write_dtd_from_type(io.output_stream::in, type_desc::in,
> +	dtd_generation_result::out, io::di, io::uo) is det.
> +
> +	% write_xml_element(NonCanon, IndentLevel, Term, !IO).
> +	% Write XML elements for the given term and all it's descendents,
> +	% using IndentLevel as the initial indentation level (each
> +	% indentation level is one space character).  No <?xml ... ?>
> +	% header will be written.  Non canonical terms will be handled
> +	% according to the value of NonCanon.  See the deconstruct
> +	% library for more information on this argument.
s/deconstruct library/deconstruct module in the standard library/

> +	%
> +:- pred write_xml_element(deconstruct.noncanon_handling, int, T, io, io).
> +:- mode write_xml_element(in(do_not_allow), in, in, di, uo) is det.
> +:- mode write_xml_element(in(canonicalize), in, in,  di, uo) is det.
> +:- mode write_xml_element(in(include_details_cc), in, in, di, uo) is cc_multi.
> +:- mode write_xml_element(in, in, in, di, uo) is cc_multi.
> +
Somewhere in the interface you should have a comment mentioning that
many of theses predicates can throw the xml_internal_error (or whatever
it is) exception.

> +%-----------------------------------------------------------------------------%
> +%-----------------------------------------------------------------------------%
> +
> +:- implementation.
> +
> +:- import_module string, list, char, exception, bool, array.
> +:- import_module exception, map.
> +
> +%-----------------------------------------------------------------------------%
> +
> +write_xml_doc(X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO) :-
> +	(
> +		( MaybeDTD \= embed ; ok_to_generate_dtd(type_of(X)) )
> +	->
> +		DTDResult = ok,
> +		write_xml_header(no, !IO),
> +		write_stylesheet_ref(MaybeStyleSheet, !IO),
> +		write_doctype(canonicalize, X, MaybeDTD, _, !IO),
> +		write_xml_element(canonicalize, 0, X, !IO)
> +	;
> +		DTDResult = multiple_functors_for_root
> +	).
> +
> +write_xml_doc(Stream, X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO) :-
> +	io.set_output_stream(Stream, OrigStream, !IO),
> +	write_xml_doc(X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO),
> +	io.set_output_stream(OrigStream, _, !IO).
> +
> +write_xml_doc_cc(X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO) :-
> +	(
> +		( MaybeDTD = embed ; ok_to_generate_dtd(type_of(X)) )
> +	->
> +		DTDResult = ok,
> +		write_xml_header(no, !IO),
> +		write_stylesheet_ref(MaybeStyleSheet, !IO),
> +		write_doctype(include_details_cc, X, MaybeDTD, _, !IO),
> +		write_xml_element(include_details_cc, 0, X, !IO)
> +	;
> +		DTDResult = multiple_functors_for_root
> +	).
> +
> +write_xml_doc_cc(Stream, X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO) :-
> +	io.set_output_stream(Stream, OrigStream, !IO),
> +	write_xml_doc_cc(X, MaybeStyleSheet, MaybeDTD, DTDResult, !IO),
> +	io.set_output_stream(OrigStream, _, !IO).
> +
> +write_xml_element(NonCanon, IndentLevel, X, !IO) :-
> +	type_to_univ(X, Univ),
> +	write_xml_element_univ(NonCanon, IndentLevel, Univ, [], _, !IO).
> +
> +write_dtd(Term, DTDResult, !IO) :-
> +	type_of(Term) = TypeDesc,
> +	write_dtd_from_type(TypeDesc, DTDResult, !IO).
> +
> +write_dtd(Stream, Term, DTDResult, !IO) :-
> +	io.set_output_stream(Stream, OrigStream, !IO),
> +	write_dtd(Term, DTDResult, !IO),
> +	io.set_output_stream(OrigStream, _, !IO).
> +
> +write_dtd_from_type(Stream, TypeDesc, DTDResult, !IO) :-
> +	io.set_output_stream(Stream, OrigStream, !IO),
> +	write_dtd_from_type(TypeDesc, DTDResult, !IO),
> +	io.set_output_stream(OrigStream, _, !IO).
> +
> +:- pred write_xml_header(maybe(string)::in, io::di, io::uo) is det.
> +
> +write_xml_header(MaybeEncoding, !IO) :-
> +	io.write_string("<?xml version=""1.0""", !IO),
> +	(
> +		MaybeEncoding = yes(Encoding),
> +		io.write_string(" encoding=""", !IO),
> +		io.write_string(Encoding, !IO),
> +		io.write_string("""?>\n", !IO)
> +	;
> +		MaybeEncoding = no,
> +		io.write_string("?>\n", !IO)
> +	).
> +
> +:- pred write_stylesheet_ref(maybe_stylesheet::in, io::di, io::uo) is det.
> +
> +write_stylesheet_ref(no_stylesheet, !IO).
> +write_stylesheet_ref(with_stylesheet(Type, Href), !IO) :-
> +	io.write_string("<?xml-stylesheet type=""", !IO),
> +	io.write_string(Type, !IO),
> +	io.write_string(""" href=""", !IO),
> +	io.write_string(Href, !IO),
> +	io.write_string("""?>\n", !IO).
> +
> +:- pred write_doctype(deconstruct.noncanon_handling, T, maybe_dtd,
> +	dtd_generation_result, io, io).
> +:- mode write_doctype(in(canonicalize), in, in, out, di, uo) is det.
> +:- mode write_doctype(in(do_not_allow), in, in, out, di, uo) is det.
> +:- mode write_doctype(in(include_details_cc), in, in, out, di, uo) is cc_multi.
> +:- mode write_doctype(in, in, in, out, di, uo) is cc_multi.
> +
> +write_doctype(_, _, no_dtd, ok, !IO).
> +write_doctype(_, T, embed, DTDResult, !IO) :-
> +	write_dtd(T, DTDResult, !IO),
> +	io.nl(!IO).
> +write_doctype(NonCanon, T, external(DocType), ok, !IO) :-
> +	deconstruct.deconstruct(T, NonCanon, Functor, _, _),
> +	Root = get_element(type_of(T), from_functor_name(Functor)),
> +	io.write_string("<!DOCTYPE ", !IO),
> +	io.write_string(Root, !IO),
> +	(
> +		DocType = public(PUBLIC),
> +		io.write_string(" PUBLIC """, !IO),
> +		io.write_string(PUBLIC, !IO)
> +	;
> +		DocType = public(PUBLIC, SYSTEM),
> +		io.write_string(" PUBLIC """, !IO),
> +		io.write_string(PUBLIC, !IO),
> +		io.write_string(""" """, !IO),
> +		io.write_string(SYSTEM, !IO)
> +	;
> +		DocType = system(SYSTEM),
> +		io.write_string(" SYSTEM """, !IO),
> +		io.write_string(SYSTEM, !IO)
> +	),
> +	io.write_string(""">\n", !IO).
> +
> +%-----------------------------------------------------------------------------%
> +%
> +% Some reserved element names.  Reserved element names all start with a
> +% capital letter so as not to conflict with a mangled element name.
> +%
> +
> +:- func reserved_prefix = string.
> +
> +	% A prefix for functors that start with a capital letter or
> +	% a non-letter.
> +	%
> +reserved_prefix = "Tag_".
> +
Move the comment so that it is above the function declaration.

> +:- pred common_mercury_functor(string, string).
> +:- mode common_mercury_functor(in, out) is semidet.
> +:- mode common_mercury_functor(out, in) is semidet.
> +
> +	% These should all start with a capital letter so as not to
> +	% conflict with a mangled name.
> +	%
> +common_mercury_functor("[|]", "List").
> +common_mercury_functor("[]", "Nil").
> +common_mercury_functor("{}", "Tuple").
> +
> +	% A general element for types whos structure we do not generate
> +	% DTD rules for.
> +	%
s/whos/whose/

> +:- func unrecognized_element = string.
> +
> +unrecognized_element = "Unrecognized".
> +
> +:- func array_element = string.
> +
> +array_element = "Array".
> +
> +:- pred is_primitive_type(type_desc::in, string::out) is semidet.
> +
> +is_primitive_type(TypeDesc, Element) :-
> +	(
> +		type_of("") = TypeDesc
> +	->
> +		Element = "String"
> +	;
> +		type_of('c') = TypeDesc
> +	->
> +		Element = "Char"
> +	;
> +		type_of(1) = TypeDesc
> +	->
> +		Element = "Int"
> +	;
> +		type_of(1.0) = TypeDesc,
> +		Element = "Float"
> +	).
> +
> +%-----------------------------------------------------------------------------%
> +%
> +% Mangling functions.
> +%
> +% We use the following mangling scheme to create well formed element names
> +% that do not begin with a capital letter (capitals are used for reserved
> +% elements).
> +%
> +% If the string to be mangled begins with a capital letter then we prefix it
> +% with another string reserved for this purpose.  Then we replace all
> +% characters which aren't alpha numeric or underscores with '-' followed by
> +% the character code.
> +%
It would be helpful to give some examples of the mangling scheme in the
comment here.  Name mangling seems to be one of those things that is
prone to break easily, so it is important that this bit is documented
thoroughly.

> +
> +:- func mangle(string) = string.
> +
> +mangle(Functor) = Element :-
> +	string.split(Functor, 1, Head, Tail),
> +	(
> +		string.is_alpha(Head),
> +		string.to_lower(Head) = Head
> +	->
> +		First = Head,
> +		Rest = Tail
> +	;
> +		First = reserved_prefix,
> +		Rest = Head ++ Tail
> +	),
> +	string.foldl(mangle_char, Rest, [], ElementChrs),
> +	Element = First ++ string.from_char_list(ElementChrs).
> +
> +:- pred mangle_char(char::in, list(char)::in, list(char)::out)
> +	is det.
> +
> +	% XXX This is system dependent since char.to_int is system dependent.
> +	%
> +mangle_char(Chr, PrevChrs, list.append(PrevChrs, Chrs)) :-
> +	(
> +		char.is_alnum_or_underscore(Chr)
> +	->
> +		Chrs = [Chr]
> +	;
> +		Chrs = ['-' | string.to_char_list(string.int_to_string(
> +			char.to_int(Chr)))]
> +	).
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- type element_request
> +	--->	from_functor_name(string)
> +	;	from_functor_num(int).
> +
> +:- pragma memo(get_element/2).
> +
> +	% Return an element for a functor for the type.
> +	%
> +:- func get_element(type_desc, element_request) = string.
> +
> +get_element(TypeDesc, ElementRequest) = Element :-
> +	(
> +		(
> +			ElementRequest = from_functor_num(FunctorNum),
> +			get_functor(TypeDesc, FunctorNum, Functor, _, _)
> +		;
> +			ElementRequest = from_functor_name(Functor),
> +			is_discriminated_union(TypeDesc, _)
> +		)
> +	->
> +		(
> +			common_mercury_functor(Functor, ReservedElement)
> +		->
> +			MangledElement = ReservedElement
> +		;
> +			MangledElement = mangle(Functor)
> +		),
> +		MangledType = "--" ++ mangle(type_name(TypeDesc))
> +	;
> +		is_primitive_type(TypeDesc, PrimitiveElement)
> +	->
> +		% primitive element names are reserved, so no mangling is
> +		% required.
> +		MangledElement = PrimitiveElement,
> +		MangledType = ""
> +	;
> +		is_array(TypeDesc, _)
> +	->
> +		% array element name is also reserved.
> +		MangledElement = array_element,
> +		MangledType = "--" ++ mangle(type_name(TypeDesc))
> +	;
> +		MangledElement = unrecognized_element,
> +		MangledType = ""
> +	),
> +	Element = MangledElement ++ MangledType.
> +
> +	% Return a list of elements and argument types for all the
> +	% functors in a discriminated union or a list with just one
> +	% element for the type if it's not a du.
> +	%
> +:- pred get_elements_and_args(type_desc::in, list(string)::out,
> +	list(string)::out, list(list(type_desc))::out) is det.
> +
> +get_elements_and_args(TypeDesc, Elements, Functors, MaybeArgTypeLists) :-
> +	NumFunctors = num_functors(TypeDesc),
> +	(
> +		NumFunctors > 0
> +	->
> +		FunctorNums = 0`..`(NumFunctors - 1),
I would leave some space around the `..`

> +		Elements = list.map(func(X) = get_element(TypeDesc,
> +			from_functor_num(X)), FunctorNums),
> +		(
> +			list.map3(get_functor(TypeDesc), FunctorNums,
> +				Functors0, _, MaybeArgTypeLists0)
> +		->
> +			Functors = Functors0,
> +			MaybeArgTypeLists = MaybeArgTypeLists0
> +		;
> +			throw(write_xml_internal_error("get_elements_and_args",
> +				"get_functor failed for discriminated union"))
> +		)
> +	;
> +		Elements = [get_element(TypeDesc, from_functor_num(0))],
> +		(
> +			is_array(TypeDesc, ArgType)
> +		->
> +			MaybeArgTypeLists = [[ArgType]],
> +			array.from_list([1], Array),
> +			% We want the same functor name returned by
> +			% deconstruct so it matches the "functor" field in
> +			% the `Array--*' element.
> +			deconstruct.deconstruct(Array, canonicalize, Functor,
> +				_, _),
> +			Functors = [Functor]
> +		;
> +			MaybeArgTypeLists = [[]],
> +			% We make these the same so the "functor" attribute
> +			% isn't fixed.
> +			Functors = Elements
> +		)
> +	).
> +
> +:- pred primitive_value(univ::in, string::out) is semidet.
> +
> +primitive_value(Univ, PrimValue) :-
> +	(
> +		univ_to_type(Univ, String)
> +	->
> +		PrimValue = String`with_type`string
> +	;
> +		univ_to_type(Univ, Char)
> +	->
> +		PrimValue = char_to_string(Char)
> +	;
> +		univ_to_type(Univ, Int)
> +	->
> +		PrimValue = int_to_string(Int)
> +	;
> +		univ_to_type(Univ, Float),
> +		PrimValue = float_to_string(Float)
> +	).
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- pred write_xml_element_univ(deconstruct.noncanon_handling,
> +	int, univ, list(maybe(string)), list(maybe(string)), io, io).
> +:- mode write_xml_element_univ(in(do_not_allow), in, in, in, out, di, uo)
> +	is det.
> +:- mode write_xml_element_univ(in(canonicalize), in, in, in, out, di, uo)
> +	is det.
> +:- mode write_xml_element_univ(in(include_details_cc), in, in, in, out,
> +	di, uo) is cc_multi.
> +:- mode write_xml_element_univ(in, in, in, in, out, di, uo) is cc_multi.
> +
> +	% Write an element and all it's descendents to the current output
> +	% stream.
> +	% If MaybeFields isn't empty then its head is used for the "field"
> +	% attribute and the Tail is returned in RemainingMaybeFieldNames.
> +	% This is so it can be called using foldl2.
> +	%
> +write_xml_element_univ(NonCanon, IndentLevel, Univ,
> +		MaybeFieldNames, RemainingMaybeFieldNames, !IO) :-
> +	(
> +		MaybeFieldNames = [MaybeFieldName | RemainingMaybeFieldNames]
> +	;
> +		MaybeFieldNames = [],
> +		RemainingMaybeFieldNames = [],
> +		MaybeFieldName = no
> +	),
> +	deconstruct.deconstruct(Term, NonCanon, Functor, Arity, Args),
> +	Term = univ_value(Univ),
> +	TypeDesc = type_of(Term),
> +	Element = get_element(TypeDesc, from_functor_name(Functor)),
> +	(
> +		primitive_value(Univ, PrimValue)
> +	->
> +		indent(IndentLevel, !IO),
> +		write_primitive_element(Element, PrimValue, MaybeFieldName,
> +			!IO)
> +	;
> +		(
> +			Args = [],
> +			indent(IndentLevel, !IO),
> +			write_empty_element(Element, Functor, MaybeFieldName,
> +				TypeDesc, !IO)
> +		;
> +			Args = [_ | _],
> +			get_field_names(TypeDesc, Functor, Arity,
> +				ChildMaybeFieldNames),
> +			indent(IndentLevel, !IO),
> +			write_element_start(Element, Functor, MaybeFieldName,
> +				TypeDesc, !IO),
> +			write_child_xml_elements(NonCanon, IndentLevel + 1,
> +				Args, ChildMaybeFieldNames, !IO),
> +			indent(IndentLevel, !IO),
> +			write_element_end(Element, !IO)
> +		)
> +	).
> +
> +:- pred is_discriminated_union(type_desc::in, int::out) is semidet.
> +
> +is_discriminated_union(TypeDesc, NumFunctors) :-
> +	NumFunctors = num_functors(TypeDesc),
> +	NumFunctors > -1.
> +
> +:- pred is_array(type_desc::in, type_desc::out) is semidet.
> +
> +is_array(TypeDesc, ArgType) :-
> +	type_ctor_and_args(TypeDesc, TypeCtor, ArgTypes),
> +	ArgTypes = [ArgType],
> +	type_ctor_name(TypeCtor) = "array",
> +	type_ctor_module_name(TypeCtor) = "array".
> +
> +:- pragma memo(get_field_names/4).
> +
> +:- pred get_field_names(type_desc::in, string::in, int::in,
> +	list(maybe(string))::out) is det.
> +
Is there any reason this couldn't be a function?

> +get_field_names(TypeDesc, Functor, Arity, MaybeFields) :-
> +	NumFunctors = num_functors(TypeDesc),
> +	(
> +		NumFunctors > 0
> +	->
> +		FunctorNums = 0`..`(NumFunctors - 1),
> +		(
> +			find_field_names(TypeDesc, FunctorNums, Functor,
> +				Arity, FoundMaybeFields)
> +		->
> +			MaybeFields = FoundMaybeFields
> +		;
> +			MaybeFields = []
> +		)
> +	;
> +		MaybeFields = []
> +	).
> +
> +:- pred find_field_names(type_desc::in, list(int)::in, string::in, int::in,
> +	list(maybe(string))::out) is semidet.
> +
> +find_field_names(TypeDesc, [FunctorNum | FunctorNums], Functor, Arity,
> +		MaybeFieldNames) :-
> +	(
> +		get_functor(TypeDesc, FunctorNum, Functor, Arity, _,
> +			FoundFieldNames)
> +	->
> +		MaybeFieldNames = FoundFieldNames
> +	;
> +		find_field_names(TypeDesc, FunctorNums, Functor, Arity,
> +			MaybeFieldNames)
> +	).
> +
> +%-----------------------------------------------------------------------------%
> +%
> +% The following is done to get around an unimplemented feature where higher
> +% order terms with more than one mode can't be passed around (so we can't just
> +% pass write_xml_element_univ to foldl).
> +%
I'd put an XXX in front of that comment.

> +
> +:- pred write_child_xml_elements(deconstruct.noncanon_handling,
> +	int, list(univ), list(maybe(string)), io, io).
> +:- mode write_child_xml_elements(in(do_not_allow), in, in, in, di, uo)
> +	is det.
> +:- mode write_child_xml_elements(in(canonicalize), in, in, in, di, uo)
> +	is det.
> +:- mode write_child_xml_elements(in(include_details_cc), in, in, in,
> +	di, uo) is cc_multi.
> +:- mode write_child_xml_elements(in, in, in, in, di, uo) is cc_multi.
> +
> +write_child_xml_elements(NonCanon, IndentLevel, Args,
> +		MaybeFieldNames, !IO) :-
> +	(
> +		NonCanon = do_not_allow,
> +		list.foldl2(
> +			write_xml_element_univ_do_not_allow(
> +				IndentLevel), Args,
> +			MaybeFieldNames, _, !IO)
> +	;
> +		NonCanon = canonicalize,
> +		list.foldl2(
> +			write_xml_element_univ_canonicalize(
> +				IndentLevel), Args,
> +			MaybeFieldNames, _, !IO)
> +	;
> +		NonCanon = include_details_cc,
> +		list.foldl2(
> +			write_xml_element_univ_include_details_cc(
> +				IndentLevel), Args,
> +			MaybeFieldNames, _, !IO)
> +	).
> +
> +:- pred write_xml_element_univ_do_not_allow( int, univ,
> +	list(maybe(string)), list(maybe(string)), io, io).
> +:- mode write_xml_element_univ_do_not_allow(in, in, in, out, di, uo)
> +	is det.
> +
You could use predmode syntax here.

> +write_xml_element_univ_do_not_allow(IndentLevel, Univ,
> +		MaybeFieldNames0, MaybeFieldNames, !IO) :-
> +	write_xml_element_univ(do_not_allow, IndentLevel,
> +		Univ, MaybeFieldNames0, MaybeFieldNames, !IO).
> +
> +:- pred write_xml_element_univ_canonicalize( int, univ,
> +	list(maybe(string)), list(maybe(string)), io, io).
> +:- mode write_xml_element_univ_canonicalize(in, in, in, out, di, uo)
> +	is det.
> +
and here

> +write_xml_element_univ_canonicalize(IndentLevel, Univ,
> +		MaybeFieldNames0, MaybeFieldNames, !IO) :-
> +	write_xml_element_univ(canonicalize, IndentLevel,
> +		Univ, MaybeFieldNames0, MaybeFieldNames, !IO).
> +
> +:- pred write_xml_element_univ_include_details_cc(int, univ,
> +	list(maybe(string)), list(maybe(string)), io, io).
> +:- mode write_xml_element_univ_include_details_cc(in, in, in, out, di, uo)
> +	is cc_multi.
> +
and here

> +write_xml_element_univ_include_details_cc(IndentLevel, Univ,
> +		MaybeFieldNames0, MaybeFieldNames, !IO) :-
> +	write_xml_element_univ(include_details_cc,
> +		IndentLevel, Univ, MaybeFieldNames0, MaybeFieldNames, !IO).
> +
> +%-----------------------------------------------------------------------------%
> +%
> +% Predicates for writing elements
> +%
> +
> +:- pred indent(int::in, io::di, io::uo) is det.
> +
> +indent(IndentLevel, !IO) :-
> +	(
> +		IndentLevel > 0
> +	->
> +		io.write_char('\t', !IO),
> +		indent(IndentLevel - 1, !IO)
> +	;
> +		true
> +	).
> +
> +:- pred write_primitive_element(string::in, string::in,
> +	maybe(string)::in, io::di, io::uo) is det.
> +
> +write_primitive_element(Element, Value, MaybeFieldName, !IO) :-
> +	io.write_string("<", !IO),
> +	io.write_string(Element, !IO),
> +	write_field_name_attribute(MaybeFieldName, !IO),
> +	io.write_string(">", !IO),
> +	write_xml_escaped_string(Value, !IO),
> +	io.write_string("</", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(">\n", !IO).
> +
> +:- pred write_xml_escaped_string(string::in, io::di, io::uo) is det.
> +
> +write_xml_escaped_string(Str, !IO) :-
> +	string.foldl(write_xml_escaped_char, Str, !IO).
> +
> +:- pred write_xml_escaped_char(char::in, io::di, io::uo) is det.
> +
> +write_xml_escaped_char(Chr, !IO) :-
> +	(
> +		xml_predefined_entity(Chr, Str)
> +	->
> +		io.write_string(Str, !IO)
> +	;
> +		io.write_char(Chr, !IO)
> +	).
> +
> +:- pred write_element_start(string::in, string::in,
> +	maybe(string)::in, type_desc::in, io::di, io::uo) is det.
> +
> +write_element_start(Element, Functor, MaybeField, TypeDesc, !IO) :-
> +	io.write_string("<", !IO),
> +	io.write_string(Element, !IO),
> +	write_functor_attribute(Functor, !IO),
> +	write_field_name_attribute(MaybeField, !IO),
> +	write_type_name_attribute(TypeDesc, !IO),
> +	io.write_string(">\n", !IO).
> +
> +:- pred write_empty_element(string::in, string::in, maybe(string)::in,
> +	type_desc::in, io::di, io::uo) is det.
> +
> +write_empty_element(Element, Functor, MaybeField, TypeDesc, !IO) :-
> +	io.write_string("<", !IO),
> +	io.write_string(Element, !IO),
> +	write_functor_attribute(Functor, !IO),
> +	write_field_name_attribute(MaybeField, !IO),
> +	write_type_name_attribute(TypeDesc, !IO),
> +	io.write_string(" />\n", !IO).
> +
> +:- pred write_field_name_attribute(maybe(string)::in, io::di, io::uo) is det.
> +
> +write_field_name_attribute(no, !IO).
> +write_field_name_attribute(yes(Field), !IO) :-
> +	io.write_string(" field=""", !IO),
> +	write_xml_escaped_string(Field, !IO),
> +	io.write_string("""", !IO).
> +
> +:- pred write_type_name_attribute(type_desc::in, io::di, io::uo) is det.
> +
> +write_type_name_attribute(TypeDesc, !IO) :-
> +	io.write_string(" typename=""", !IO),
> +	write_xml_escaped_string(type_name(TypeDesc), !IO),
> +	io.write_string("""", !IO).
> +
> +:- pred write_functor_attribute(string::in, io::di, io::uo) is det.
> +
> +write_functor_attribute(Functor, !IO) :-
> +	io.write_string(" functor=""", !IO),
> +	write_xml_escaped_string(Functor, !IO),
> +	io.write_string("""", !IO).
> +
> +:- pred write_element_end(string::in, io::di, io::uo) is det.
> +
> +write_element_end(Element, !IO) :-
> +	io.write_string("</", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(">\n", !IO).
> +
> +:- pred xml_predefined_entity(char::in, string::out) is semidet.
> +
> +xml_predefined_entity(('<'), "<").
> +xml_predefined_entity(('>'), ">").
> +xml_predefined_entity(('&'), "&").
> +xml_predefined_entity(('\''), "'").
> +xml_predefined_entity(('\"'), """).
> +
> +%-----------------------------------------------------------------------------%
> +%
> +% Predicates to write the DTD for a type.
> +%
> +
> +write_dtd_from_type(TypeDesc, DTDResult, !IO) :-
> +	(
> +		get_elements_and_args(TypeDesc, [RootElement], [_],
> +			[ArgTypes])
> +	->
> +		io.write_string("<!DOCTYPE ", !IO),
> +		io.write_string(RootElement, !IO),
> +		io.write_string(" [\n\n<!-- Builtin Mercury types -->\n\n",
> +			!IO),
> +		list.foldl(write_primitive_dtd_element,
> +			["String", "Char", "Int", "Float"], !IO),
> +		io.nl(!IO),
> +		some [!AlreadyDone] (
> +			!:AlreadyDone = map.init,
> +			map.set(!.AlreadyDone, type_of(1), unit,
> +				!:AlreadyDone),
> +			map.set(!.AlreadyDone, type_of('c'), unit,
> +				!:AlreadyDone),
> +			map.set(!.AlreadyDone, type_of(""), unit,
> +				!:AlreadyDone),
> +			map.set(!.AlreadyDone, type_of(1.0), unit,
> +				!:AlreadyDone),
> +			AlreadyDone = !.AlreadyDone
> +		),
> +		write_dtd_types([TypeDesc | ArgTypes], AlreadyDone, [],
> +			Unrecognised, !IO),
> +		(
> +			Unrecognised = []
> +		;
> +			Unrecognised = [_ | _],
> +			io.write_string("<!-- The following types have ", !IO),
> +			io.write_string("been assigned to the ", !IO),
> +			io.write_string("`Unrecognised' element:\n\t", !IO),
> +			io.write_list(list.map(type_name, Unrecognised),
> +				"\n\t", io.write_string, !IO),
> +			io.write_string("\n-->\n\n", !IO),
> +			write_unrecognised_dtd_element(!IO)
> +		),
> +		io.write_string("\n]>", !IO),
> +		DTDResult = ok
> +	;
> +		DTDResult = multiple_functors_for_root
> +	).
> +
> +ok_to_generate_dtd(TypeDesc) :-
> +	get_elements_and_args(TypeDesc, [_], [_], [_]).
> +
> +:- pred write_primitive_dtd_element(string::in, io::di, io::uo)
> +	is det.
> +
> +write_primitive_dtd_element(Element, !IO) :-
> +	io.write_string("<!ELEMENT ", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(" (#PCDATA)>\n", !IO),
> +	write_dtd_field_attlist(Element, !IO).
> +
> +	% Write out the DTD entries for all the given types and add the written
> +	% types to AlreadyDone.  Children types found along the way are added
> +	% to the 1st argument.  We stop when all the types have had their DTD
s/1st/first/
> +	% entry written.  We also keep track of types assigned to the
> +	% `Unrecognised' element so we can print a comment about them in the
> +	% DTD.
> +	%
> +:- pred write_dtd_types(list(type_desc)::in, map(type_desc, unit)::in,
> +	list(type_desc)::in, list(type_desc)::out, io::di, io::uo) is det.
> +
> +write_dtd_types([], _, Unrecognised, Unrecognised, !IO).
> +write_dtd_types([TypeDesc | TypeDescs], AlreadyDone, Unrecognised0,
> +		Unrecognised, !IO) :-
> +	(
> +		map.search(AlreadyDone, TypeDesc, _)
> +	->
> +		write_dtd_types(TypeDescs, AlreadyDone, Unrecognised0,
> +			Unrecognised, !IO)
> +	;
> +		write_dtd_type_elements(TypeDesc, ChildArgTypes,
> +			IsUnrecognised, !IO),
> +		(
> +			IsUnrecognised = yes
> +		->
> +			list.merge([TypeDesc], Unrecognised0, NewUnrecognised)

> +		;
> +			NewUnrecognised = Unrecognised0
> +		),
> +		map.set(AlreadyDone, TypeDesc, unit, NewAlreadyDone),
> +		write_dtd_types(append(ChildArgTypes, TypeDescs),
> +			NewAlreadyDone, NewUnrecognised, Unrecognised, !IO)
> +	).
> +
> +:- pred write_unrecognised_dtd_element(io::di, io::uo) is det.
> +
> +write_unrecognised_dtd_element(!IO) :-
> +	io.write_string("<!ELEMENT ", !IO),
> +	io.write_string(unrecognized_element, !IO),
> +	io.write_string(" ANY>\n", !IO),
> +	io.write_string("<!ATTLIST ", !IO),
> +	io.write_string(unrecognized_element, !IO),
> +	io.write_string(" functor CDATA #REQUIRED>\n", !IO),
> +	write_dtd_field_attlist(unrecognized_element, !IO),
> +	write_dtd_type_attlist(unrecognized_element, !IO).
> +
> +:- pred write_dtd_field_attlist(string::in, io::di, io::uo) is det.
> +
> +write_dtd_field_attlist(Element, !IO) :-
> +	io.write_string("<!ATTLIST ", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(" field CDATA #IMPLIED>\n", !IO).
> +
> +:- pred write_dtd_type_attlist(string::in, io::di, io::uo) is det.
> +
> +write_dtd_type_attlist(Element, !IO) :-
> +	io.write_string("<!ATTLIST ", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(" typename CDATA #IMPLIED>\n", !IO).
> +
> +:- pred write_dtd_type_elements(type_desc::in, list(type_desc)::out, bool::out,
> +	io::di, io::uo) is det.
> +
> +	% Write DTD entries for all the functors for a type
> +	%
> +write_dtd_type_elements(TypeDesc, ChildArgTypes, IsUnrecognised, !IO) :-
> +	get_elements_and_args(TypeDesc, Elements, Functors, ArgTypeLists),
> +	list.condense(ArgTypeLists, ChildArgTypes),
> +	(
> +		% unrecognized elements don't have multiple functors.
> +		Elements \= [unrecognized_element]
> +	->
> +		io.write_string("<!-- Elements for functors of type """, !IO),
> +		io.write_string(type_name(TypeDesc), !IO),
> +		io.write_string(""" -->\n\n", !IO),
> +		write_dtd_entries(TypeDesc, Elements, Functors,
> +			ArgTypeLists, !IO),
> +		IsUnrecognised = no
> +	;
> +		IsUnrecognised = yes
> +	).
> +
> +:- type write_xml_internal_error
> +	--->	write_xml_internal_error(string, string).
> +
Is this name still appropriate now that you've changed the
module name?  Are you expecting users to (possibly) catch exceptions
of this type? If so, this type should be exported.

> +:- pred write_dtd_entries(type_desc::in,
> +	list(string)::in, list(string)::in, list(list(type_desc))::in,
> +	io::di, io::uo) is det.
> +
> +	% Write all the given DTD entries.
> +	%
> +write_dtd_entries(_, [], [], [], !IO).
> +write_dtd_entries(TypeDesc, [Element | Elements], [Functor | Functors],
> +		[ArgTypeList | ArgTypeLists], !IO) :-
> +	io.write_string("<!ELEMENT ", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(" ", !IO),
> +	(
> +		ArgTypeList = [],
> +		io.write_string("EMPTY>\n", !IO)
> +	;
> +		ArgTypeList = [Head | Tail],
> +		(
> +			Tail = [_ | _],
> +			Braces = yes
> +		;
> +			Tail = [],
> +			(
> +				num_functors(Head) > 1
> +			->
> +				Braces = no
> +			;
> +				Braces = yes
> +			)
> +		),
> +
> +		% Put extra braces for arrays for the * at the end.
> +		( is_array(TypeDesc, _) -> io.write_string("(", !IO) ; true ),
> +
> +		( Braces = yes, io.write_string("(", !IO) ; Braces = no ),
> +
> +		io.write_list(ArgTypeList, ",",
> +			write_dtd_allowed_functors_regex, !IO),
> +
> +		( Braces = yes, io.write_string(")", !IO) ; Braces = no ),
> +
> +		( is_array(TypeDesc, _) -> io.write_string("*)", !IO) ; true ),
> +
> +		io.write_string(">\n", !IO)
> +	),
> +	write_dtd_field_attlist(Element, !IO),
> +	write_dtd_type_attlist(Element, !IO),
> +	io.write_string("<!ATTLIST ", !IO),
> +	io.write_string(Element, !IO),
> +	io.write_string(" functor CDATA #FIXED """, !IO),
> +	write_xml_escaped_string(Functor, !IO),
> +	io.write_string(""">\n\n", !IO),
> +	write_dtd_entries(TypeDesc, Elements, Functors, ArgTypeLists,
> +		!IO).
> +
> +write_dtd_entries(_, [_ | _], [], [], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +write_dtd_entries(_, [], [_ | _], [], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +write_dtd_entries(_, [_ | _], [_ | _], [], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +write_dtd_entries(_, [], [], [_ | _], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +write_dtd_entries(_, [_ | _], [], [_ | _], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +write_dtd_entries(_, [], [_ | _], [_ | _], !IO) :-
> +	throw(write_xml_internal_error("write_dtd_cons_elements",
> +		"lists not of equal length")).
> +
> +	% Write the allowed functors for the type as a DTD rule regular
> +	% expression.
> +	%
> +:- pred write_dtd_allowed_functors_regex(type_desc::in, io::di, io::uo)
> +	is det.
> +
> +write_dtd_allowed_functors_regex(TypeDesc, !IO) :-
> +	get_elements_and_args(TypeDesc, Elements, _, _),
> +	(
> +		length(Elements) > 1
> +	->
> +		io.write_string("(", !IO),
> +		io.write_list(Elements, "|", io.write_string, !IO),
> +		io.write_string(")", !IO)
> +	;
> +		io.write_list(Elements, "|", io.write_string, !IO)
> +	).
> +

That looks good.  I think you can commit it, although until the
situation with the tabled functions is sorted out it would be better
to comment out the memoing pragmas and put an XXX comment with them.

Cheers,
Julien.



--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the reviews mailing list