[m-rev.] For review: a new, improved pretty printer
Julien Fischer
juliensf at csse.unimelb.edu.au
Wed Aug 1 16:47:22 AEST 2007
On Wed, 1 Aug 2007, Ralph Becket wrote:
> Estimated hours taken: 30
> Branches: main
>
> Add a new, improved pretty printer to the library. The key advantages over
> pprint are
> - better performance on large terms;
> - better output (line overruns are completely avoided where possible);
> - better control (now supports maximum lines output and two different
> styles of limit on how deeply formatting of arbitrary terms can go);
> - support for user-specifiable formatting for arbitrary types.
A TODO item here is that the old pprint module is still used in the
debugger and in the compiler.
> NEWS:
> Mention the new change.
>
> library/library.m:
> Add pretty_printer.m.
>
> library/pprint.m:
> Add a comment to say this module has been superceded.
>
> library/pretty_printer.m:
> Added.
>
> tests/hard_coded/Mmakefile:
> tests/hard_coded/test_pretty_printer.exp:
> tests/hard_coded/test_pretty_printer.m:
> A test suite.
>
> Index: NEWS
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/NEWS,v
> retrieving revision 1.473
> diff -u -r1.473 NEWS
> --- NEWS 31 Jul 2007 07:59:23 -0000 1.473
> +++ NEWS 1 Aug 2007 04:04:24 -0000
> @@ -21,6 +21,13 @@
>
> Changes to the Mercury standard library:
>
> +* An improved pretty printer module, pretty_printer.m has been added. This
> + supercedes pprint.m in that it is more economical, produces better
> + quality output (line overruns are completely avoided wherever possible),
> + has better control over the amount of output produced, and supports
> + user-specifiable formatting for arbitrary types. Further use of pprint is
> + deprecated.
...
> Index: library/pretty_printer.m
> ===================================================================
> RCS file: library/pretty_printer.m
> diff -N library/pretty_printer.m
> --- /dev/null 1 Jan 1970 00:00:00 -0000
> +++ library/pretty_printer.m 1 Aug 2007 00:33:23 -0000
> @@ -0,0 +1,783 @@
> +%-----------------------------------------------------------------------------%
> +% pretty_printer.m
> +% Ralph Becket <rafe at csse.unimelb.edu.au>
> +% Fri Jun 1 14:49:30 EST 2007
> +% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
Add the copyright message and format this as per the other library
modules.
> +% This module defines the doc type and a pretty printer for formatting
> +% doc lists.
> +%
> +% The doc type includes data constructors for outputting strings, newlines,
> +% forming groups, indented blocks, and arbitrary values.
> +%
> +% The key feature of the algorithm is this: newlines in a group are ignored if
> +% the group can fit on the remainder of the current line. [The algorithm is
> +% similar to those of Oppen and Wadler, although it uses neither coroutines or
> +% laziness.]
> +%
> +% When a newline is printed, indentation is also output according to the
> +% current indentation level.
> +%
> +% The pretty printer includes special support for formatting Mercury style
> +% terms in a way that respects Mercury's operator precedence and
> +% bracketing rules.
> +%
> +% The pretty printer takes a parameter specifying a collection of user-defined
> +% formatting functions for handling certain types rather than using the
> +% default built-in mechanism. This allows one to, say, format maps as
> +% sequences of (key -> value) pairs rather than exposing the underlying
> +% 234-tree structure.
> +%
> +% The amount of output produced is controlled via limit parameters. Three
> +% kinds of limits are supported: the output line width, the maximum number of
> +% lines to be output, and a limit on the depth for formatting arbitrary terms.
> +% Output is replaced with ellipsis ("...") when limits are exceeded.
> +%
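For readers of the archive, the group rule described above can be sketched
roughly as follows. This is not the code from the patch: the doc type is cut
down to four constructors, and group_width, group_fits, report and the module
name are hypothetical names used only for illustration. The assumption is that
a group "fits" when the total width of its strings is no more than the space
left on the current line.

```mercury
:- module group_fit_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

    % A cut-down stand-in for the doc type in the patch.
    %
:- type doc
    --->    s(string)
    ;       nl
    ;       open_group
    ;       close_group.

    % group_width(Docs, Depth, Width0) = Width:
    % Docs is what follows an open_group and Depth is the number of groups
    % still open (1 on entry). nls add nothing to the width because they
    % are ignored when the group is laid out on one line.
    %
:- func group_width(list(doc), int, int) = int.

group_width([], _, Width) = Width.
group_width([s(S) | Docs], Depth, Width) =
    group_width(Docs, Depth, Width + string.length(S)).
group_width([nl | Docs], Depth, Width) =
    group_width(Docs, Depth, Width).
group_width([open_group | Docs], Depth, Width) =
    group_width(Docs, Depth + 1, Width).
group_width([close_group | Docs], Depth, Width) =
    ( if Depth =< 1 then
        Width
    else
        group_width(Docs, Depth - 1, Width)
    ).

    % Succeeds if the group fits in the space remaining on the line.
    %
:- pred group_fits(list(doc)::in, int::in) is semidet.

group_fits(Docs, Remaining) :-
    group_width(Docs, 1, 0) =< Remaining.

:- pred report(list(doc)::in, int::in, io::di, io::uo) is det.

report(Group, Remaining, !IO) :-
    io.write_int(Remaining, !IO),
    ( if group_fits(Group, Remaining) then
        io.write_string(" columns left: fits, nls are ignored\n", !IO)
    else
        io.write_string(" columns left: does not fit, nls are output\n", !IO)
    ).

main(!IO) :-
    % The docs following an open_group: three strings of four chars each.
    Group = [s("foo("), nl, s("bar,"), nl, s("baz)"), close_group],
    report(Group, 20, !IO),
    report(Group, 10, !IO).
```

The group here is 12 characters wide, so it fits when 20 columns remain but
not when only 10 do.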
...
> +:- type doc
> + ---> s(string) % Output a literal string. Strings
> + % containing newlines, hard
> + % tabs, etc. will lead to strange
> + % output.
> + ; nl % Output a newline if the enclosing
> + % group does not fit on the current
> + % line.
> + ; open_group % Open a new group (groups control
> + % how nls are handled).
s/nls/newlines/
> + ; close_group % Close a group.
> + ; indent(string) % Append a string to indentation.
> + ; outdent % Remove the last indentation string.
> + ; docs(docs) % An embedded sequence of docs.
> + ; pp_univ(univ) % Use a specialised pretty printer
> + % if available, otherwise use the
> + % generic pretty printer.
> + ; pp_list(list(univ), doc) % Pretty print a list of items
> + % using the given doc as a
> + % separator between items. Each
> + % item - separator pair is placed
> + % inside a group, preceded by nl
> + % and set_arg_priority.
> + ; pp_term(string, list(univ)) % Pretty print a term with zero or
> + % more arguments. If the term
> + % corresponds to a Mercury operator
> + % it will be printed with appropriate
> + % fixity and, if necessary, in
> + % parentheses. The term name will be
> + % quoted and escaped if necessary.
> + ; set_op_priority(ops.priority)
> + % Set the current priority for printing
> + % operator terms with the correct
> + % parenthesisation.
> + ; set_limit(pp_limit). % Set the truncation limit. This
> + % should not be necessary for user
> + % defined pretty printers!
> +
> +:- type docs == list(doc).
> +
> + % indent = indent(" ").
> + % A convenient abbreviation.
> + %
> +:- func indent = doc.
> +
> + % pp(X) = pp_univ(univ(X)).
> + % A convenient abbreviation.
> + %
> +:- func pp(T) = doc.
> +
> + % set_arg_priority =
> + % set_op_priority(ops.arg_priority(ops.init_mercury_op_table))
> + %
> + % This is a useful shorthand when pretty-printing term arguments.
> + %
> +:- func set_arg_priority = doc.
> +
> + % The pretty-printer limit type, used to truncate conversion to docs
> + % after the limit has been reached. The linear version simply emits
> + % the first N functors before truncating. The triangular version
> + % allocates N - 1 "units" to printing the first argument of the current
> + % term, N - 2 "units" to printing the second argument of the current
> + % term, and so forth. [The term "functor" is not quite correct here:
> + % strictly speaking one "unit" is consumed every time a user defined
> + % pretty printer or the generic term printer is used.] Truncation is
> + % indicated by "..." in the output.
> + %
> +:- type pp_limit
> + ---> linear(int) % Print this many functors.
> + ; triangular(int). % Print first arg with limit n-1,
> + % second arg with limit n-2, ...
> +
> + % The type and inst of pretty-printer converters.
> + % The first argument is the univ of the value to be formatted.
> + % The second argument is the list of argument type_descs for
> + % the type of the first argument.
> + %
> +:- type pp == ( func(univ, list(type_desc)) = docs ).
I suggest naming this pp_convertor.
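To show the shape of such a converter under the suggested name, here is a
minimal sketch. It assumes the signature quoted above and nothing else from
the patch: doc is a two-constructor stand-in, and point, pp_point,
apply_convertor, write_doc and the module name are hypothetical.

```mercury
:- module pp_convertor_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.
:- import_module string.
:- import_module type_desc.
:- import_module univ.

    % Stand-ins for the types in the patch (only two doc constructors).
    %
:- type doc
    --->    s(string)
    ;       nl.

:- type docs == list(doc).

    % The converter type from the patch, under the suggested name.
    %
:- type pp_convertor == ( func(univ, list(type_desc)) = docs ).

    % A hypothetical user type with its own formatting.
    %
:- type point
    --->    point(int, int).

    % Format a point as <X, Y>. The second argument is unused because
    % point has no type parameters.
    %
:- func pp_point(univ, list(type_desc)) = docs.

pp_point(Univ, _ArgTypeDescs) =
    ( if univ_to_type(Univ, Point), Point = point(X, Y) then
        [s("<"), s(string.int_to_string(X)), s(", "),
            s(string.int_to_string(Y)), s(">")]
    else
        % Not a point: this sketch just emits a placeholder.
        [s("?")]
    ).

    % Apply a converter to a value of a type with no type parameters.
    %
:- func apply_convertor(pp_convertor, univ) = docs.

apply_convertor(Convertor, Univ) = Convertor(Univ, []).

:- pred write_doc(doc::in, io::di, io::uo) is det.

write_doc(s(S), !IO) :-
    io.write_string(S, !IO).
write_doc(nl, !IO) :-
    io.nl(!IO).

main(!IO) :-
    Docs = apply_convertor(pp_point, univ(point(3, 4))),
    list.foldl(write_doc, Docs, !IO),
    io.nl(!IO).
```

A converter has to recover the typed value from the univ itself, which is why
the fallback branch is needed even though it should never be taken once the
converter is registered for the right type.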
> + % A pp_map maps types to pps. Types are identified by module name, type
> + % name, and type arity.
> + %
> +:- type pp_map.
> +
> + % Construct a new pp_map.
> + %
> +:- func new_pp_map = pp_map.
> +
> + % set_pp_mapping(ModuleName, TypeName, TypeArity, PP, PPMap)
> + % Update PPMap to use PP to format the type
> + % ModuleName.TypeName/TypeArity.
> + %
Document whether ModuleName must be fully qualified or whether a partially
qualified name will also work.
> +:- func set_pp_mapping(string, string, int, pp, pp_map) = pp_map.
> +
> +
> +
> + % format(Stream, PPMap, LineWidth, MaxLines, Limit, Docs, !State).
> + % Format Docs to fit on lines of LineWidth chars, truncating after
> + % MaxLines lines, fomatting pp_univ(_) docs using pretty-printer
> + % converters PPs starting with pretty-printer limits Limit.
> + %
> +:- pred format(Stream::in, pp_map::in, int::in, int::in, pp_limit::in,
> + docs::in, State::di, State::uo)
> + is det
> + <= stream.writer(Stream, string, State).
> +
> + % Convenience predicates. A user-configurable set of type-specific
> + % pretty-printers and formatting parameters are attached to the IO state.
I would use I/O state rather than IO state since that's what the rest
of the library documentation uses.
> + % The io state-specific format predicate below uses this settings.
Likewise, use I/O state here.
> +:- type pp_params
> + ---> pp_params(
> + pp_line_width :: int,
> + pp_max_lines :: int,
> + pp_limit :: pp_limit
> + ).
> +
Document this.
> +:- pred get_default_pp_map(pp_map::out, io::di, io::uo) is det.
> +:- pred set_default_pp_map(pp_map::in, io::di, io::uo) is det.
> +:- pred set_default_pp(string::in, string::in, int::in, pp::in,
> + io::di, io::uo) is det.
> +
> + % The initial default pp_params are pp_params(78, 100, triangular(100)).
> + %
> +:- pred get_default_pp_params(pp_params::out, io::di, io::uo) is det.
> +:- pred set_default_pp_params(pp_params::in, io::di, io::uo) is det.
> +
> + % format(Docs, !IO)
> + % format(Stream, Docs, !IO)
> + % Format Docs to io.stdout_stream or Stream respectively, using
> + % the default pp_map and pp_params.
> + %
> +:- pred format(docs::in, io::di, io::uo) is det.
> +:- pred format(io.output_stream::in, docs::in, io::di, io::uo) is det.
I suggest s/Stream/FileStream/ in the above comment.
...
> +:- mutable(io_pp_map, pp_map, new_pp_map, ground,
> + [attach_to_io_state, untrailed]).
> +
> +:- mutable(io_pp_params, pp_params, pp_params(78, 100, triangular(100)),
> + ground, [attach_to_io_state, untrailed]).
I think that both of these should also be thread_local.
(The fact that the default map attached to the I/O state is thread-local
and that each thread can have its own set should be documented
in the interface.)
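As a minimal sketch of the attribute in use, the module below declares a
thread_local mutable attached to the I/O state. The mutable line_width and the
module name are hypothetical and not part of the patch. The example spawns no
threads; it only shows the declaration and the get_/set_ access predicates
that the compiler generates for it.

```mercury
:- module thread_local_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.

    % With thread_local, each thread sees its own copy of the mutable.
    % A thread that updates it does not affect the value in other threads.
    %
:- mutable(line_width, int, 78, ground,
    [attach_to_io_state, untrailed, thread_local]).

main(!IO) :-
    get_line_width(Width0, !IO),
    set_line_width(Width0 + 2, !IO),
    get_line_width(Width, !IO),
    io.write_int(Width, !IO),
    io.nl(!IO).
```

For the two mutables in the patch the change would just be adding thread_local
to each attribute list.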
Julien.