[m-rev.] For review: a new, improved pretty printer

Julien Fischer juliensf at csse.unimelb.edu.au
Wed Aug 1 16:47:22 AEST 2007


On Wed, 1 Aug 2007, Ralph Becket wrote:

> Estimated hours taken: 30
> Branches: main
>
> Add a new, improved pretty printer to the library.  The key advantages over
> pprint are
> - better performance on large terms;
> - better output (line overruns are completely avoided where possible);
> - better control (now supports maximum lines output and two different
>  styles of limit on how deeply formatting of arbitrary terms can go);
> - support for user-specifiable formatting for arbitrary types.

A TODO item here is that the old pprint module is still used in the
debugger and in the compiler.

> NEWS:
> 	Mention the new change.
>
> library/library.m:
> 	Add pretty_printer.m.
>
> library/pprint.m:
> 	Add a comment to say this module has been superseded.
>
> library/pretty_printer.m:
> 	Added.
>
> tests/hard_coded/Mmakefile:
> tests/hard_coded/test_pretty_printer.exp:
> tests/hard_coded/test_pretty_printer.m:
> 	A test suite.
>
> Index: NEWS
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/NEWS,v
> retrieving revision 1.473
> diff -u -r1.473 NEWS
> --- NEWS	31 Jul 2007 07:59:23 -0000	1.473
> +++ NEWS	1 Aug 2007 04:04:24 -0000
> @@ -21,6 +21,13 @@
>
> Changes to the Mercury standard library:
>
> +* An improved pretty printer module, pretty_printer.m, has been added.  This
> +  supersedes pprint.m in that it is more economical, produces better
> +  quality output (line overruns are completely avoided wherever possible),
> +  has better control over the amount of output produced, and supports
> +  user-specifiable formatting for arbitrary types.  Further use of pprint is
> +  deprecated.

...

> Index: library/pretty_printer.m
> ===================================================================
> RCS file: library/pretty_printer.m
> diff -N library/pretty_printer.m
> --- /dev/null	1 Jan 1970 00:00:00 -0000
> +++ library/pretty_printer.m	1 Aug 2007 00:33:23 -0000
> @@ -0,0 +1,783 @@
> +%-----------------------------------------------------------------------------%
> +% pretty_printer.m
> +% Ralph Becket <rafe at csse.unimelb.edu.au>
> +% Fri Jun  1 14:49:30 EST 2007
> +% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0

Add the copyright message and format this as per the other library
modules.

> +% This module defines the doc type and a pretty printer for formatting
> +% doc lists.
> +%
> +% The doc type includes data constructors for outputting strings, newlines,
> +% forming groups, indented blocks, and arbitrary values.
> +%
> +% The key feature of the algorithm is this: newlines in a group are ignored if
> +% the group can fit on the remainder of the current line.  [The algorithm is
> +% similar to those of Oppen and Wadler, although it uses neither coroutines nor
> +% laziness.]
> +%
> +% When a newline is printed, indentation is also output according to the
> +% current indentation level.
> +%
> +% The pretty printer includes special support for formatting Mercury style
> +% terms in a way that respects Mercury's operator precedence and
> +% bracketing rules.
> +%
> +% The pretty printer takes a parameter specifying a collection of user-defined
> +% formatting functions for handling certain types rather than using the
> +% default built-in mechanism.  This allows one to, say, format maps as
> +% sequences of (key -> value) pairs rather than exposing the underlying
> +% 234-tree structure.
> +%
> +% The amount of output produced is controlled via limit parameters.  Three
> +% kinds of limits are supported: the output line width, the maximum number of
> +% lines to be output, and a limit on the depth for formatting arbitrary terms.
> +% Output is replaced with ellipsis ("...") when limits are exceeded.
> +%
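
To illustrate the group rule described in the comment above, here is a
minimal sketch in Python rather than Mercury.  The encoding of docs as
tuples and strings, and the names format_docs and group_width, are
assumptions made for illustration; this is not the patch's implementation
and it ignores pp_univ, limits and maximum lines.

```python
# Docs are encoded (hypothetically) as:
#   ("s", text), ("indent", text), "nl", "open_group", "close_group", "outdent"

def group_width(docs, start):
    """Width of the group opened at docs[start] if laid out on one line."""
    depth = 0
    width = 0
    for doc in docs[start:]:
        if doc == "open_group":
            depth += 1
        elif doc == "close_group":
            depth -= 1
            if depth == 0:
                break
        elif isinstance(doc, tuple) and doc[0] == "s":
            width += len(doc[1])
    return width


def format_docs(docs, line_width):
    out = []       # output fragments
    col = 0        # current column
    indents = []   # stack of indentation strings
    flat = 0       # > 0 while inside a group that fits on the current line
    for i, doc in enumerate(docs):
        if doc == "open_group":
            # A group nested in a flat group is flat too; otherwise check fit.
            if flat > 0 or col + group_width(docs, i) <= line_width:
                flat += 1
        elif doc == "close_group":
            if flat > 0:
                flat -= 1
        elif doc == "nl":
            if flat == 0:  # newlines are ignored inside a group that fits
                indentation = "".join(indents)
                out.append("\n" + indentation)
                col = len(indentation)
        elif doc == "outdent":
            indents.pop()
        elif doc[0] == "indent":
            indents.append(doc[1])
        elif doc[0] == "s":
            out.append(doc[1])
            col += len(doc[1])
    return "".join(out)


docs = ["open_group", ("s", "foo("), ("indent", "  "), "nl", ("s", "bar,"),
        "nl", ("s", "baz"), "outdent", ("s", ")"), "close_group"]
print(format_docs(docs, 20))   # fits: foo(bar,baz)
print(format_docs(docs, 8))    # does not fit: newlines and indentation appear
```

With a width of 20 the whole group fits, so both newlines are dropped; with
a width of 8 it does not, so each nl becomes a line break followed by the
current indentation.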

...

> +:- type doc
> +    --->    s(string)                   % Output a literal string.  Strings
> +                                        %   containing newlines, hard
> +                                        %   tabs, etc. will lead to strange
> +                                        %   output.
> +    ;       nl                          % Output a newline if the enclosing
> +                                        %   group does not fit on the current
> +                                        %   line.
> +    ;       open_group                  % Open a new group (groups control
> +                                        %   how nls are handled).

s/nls/newlines/

> +    ;       close_group                 % Close a group.
> +    ;       indent(string)              % Append a string to indentation.
> +    ;       outdent                     % Remove the last indentation string.
> +    ;       docs(docs)                  % An embedded sequence of docs.
> +    ;       pp_univ(univ)               % Use a specialised pretty printer
> +                                        %  if available, otherwise use the
> +                                        %  generic pretty printer.
> +    ;       pp_list(list(univ), doc)    % Pretty print a list of items
> +                                        %  using the given doc as a
> +                                        %  separator between items.  Each
> +                                        %  item - separator pair is placed
> +                                        %  inside a group, preceded by nl
> +                                        %  and set_arg_priority.
> +    ;       pp_term(string, list(univ)) % Pretty print a term with zero or
> +                                        %  more arguments.  If the term
> +                                        %  corresponds to a Mercury operator
> +                                        %  it will be printed with appropriate
> +                                        %  fixity and, if necessary, in
> +                                        %  parentheses.  The term name will be
> +                                        %  quoted and escaped if necessary.
> +    ;       set_op_priority(ops.priority)
> +                                        % Set the current priority for printing
> +                                        %  operator terms with the correct
> +                                        %  parenthesisation.
> +    ;       set_limit(pp_limit).        % Set the truncation limit.  This
> +                                        %  should not be necessary for user
> +                                        %  defined pretty printers!
> +
> +:- type docs == list(doc).
> +
> +    % indent = indent("  ").
> +    %   A convenient abbreviation.
> +    %
> +:- func indent = doc.
> +
> +    % pp(X) = pp_univ(univ(X)).
> +    %   A convenient abbreviation.
> +    %
> +:- func pp(T) = doc.
> +
> +    % set_arg_priority =
> +    %   set_op_priority(ops.arg_priority(ops.init_mercury_op_table))
> +    %
> +    % This is a useful shorthand when pretty-printing term arguments.
> +    %
> +:- func set_arg_priority = doc.
> +
> +    % The pretty-printer limit type, used to truncate conversion to docs
> +    % after the limit has been reached.  The linear version simply emits
> +    % the first N functors before truncating.  The triangular version
> +    % allocates N - 1 "units" to printing the first argument of the current
> +    % term, N - 2 "units" to printing the second argument of the current
> +    % term, and so forth.  [The term "functor" is not quite correct here:
> +    % strictly speaking one "unit" is consumed every time a user defined
> +    % pretty printer or the generic term printer is used.]  Truncation is
> +    % indicated by "..." in the output.
> +    %
> +:- type pp_limit
> +    --->    linear(int)                 % Print this many functors.
> +    ;       triangular(int).            % Print first arg with limit n-1,
> +                                        % second arg with limit n-2, ...
> +
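
The difference between the two limit styles may be easier to see with a
small example.  The following Python sketch is one reading of the comment
above, not the patch's code: terms are encoded as hypothetical
(name, args) pairs, one "unit" is charged per functor (the comment notes
this is a simplification), and the names show_linear and show_triangular
are made up for illustration.

```python
def wrap(name, parts):
    """Render a functor applied to already-rendered arguments."""
    return name + "(" + ", ".join(parts) + ")" if parts else name


def show_linear(term, budget):
    """Linear limit: a single budget shared across the whole term.

    Returns (text, remaining budget).
    """
    if budget <= 0:
        return "...", 0
    name, args = term
    budget -= 1
    parts = []
    for arg in args:
        text, budget = show_linear(arg, budget)
        parts.append(text)
    return wrap(name, parts), budget


def show_triangular(term, n):
    """Triangular limit: argument i (counting from 1) gets limit n - i."""
    if n <= 0:
        return "..."
    name, args = term
    parts = [show_triangular(arg, n - i)
             for i, arg in enumerate(args, start=1)]
    return wrap(name, parts)


# f(g(a, b), h(c, d))
term = ("f", [("g", [("a", []), ("b", [])]),
              ("h", [("c", []), ("d", [])])])
print(show_linear(term, 3)[0])    # f(g(a, ...), ...)
print(show_triangular(term, 3))   # f(g(a, ...), h(..., ...))
```

Under this reading the linear limit spends its whole budget on the leftmost
part of the term, whereas the triangular limit still shows the outline of
later arguments.
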
> +    % The type and inst of pretty-printer converters.
> +    % The first argument is the univ of the value to be formatted.
> +    % The second argument is the list of argument type_descs for
> +    % the type of the first argument.
> +    %
> +:- type pp == ( func(univ, list(type_desc)) = docs ).

I suggest naming this pp_convertor.

> +    % A pp_map maps types to pps.  Types are identified by module name, type
> +    % name, and type arity.
> +    %
> +:- type pp_map.
> +
> +    % Construct a new pp_map.
> +    %
> +:- func new_pp_map = pp_map.
> +
> +    % set_pp_mapping(ModuleName, TypeName, TypeArity, PP, PPMap)
> +    %   Update PPMap to use PP to format the type
> +    %   ModuleName.TypeName/TypeArity.
> +    %

Document whether ModuleName must be fully qualified or if partial
qualifications will work.
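
To show why this matters, here is a minimal Python sketch.  It assumes the
map is keyed on the exact (module name, type name, arity) triple, which the
quoted interface does not confirm; lookup_pp and the module name "foo.bar"
are hypothetical.

```python
def new_pp_map():
    """An empty map from (module name, type name, arity) to a converter."""
    return {}


def set_pp_mapping(module_name, type_name, arity, pp, pp_map):
    """Return a copy of pp_map with the converter registered for the type."""
    updated = dict(pp_map)
    updated[(module_name, type_name, arity)] = pp
    return updated


def lookup_pp(module_name, type_name, arity, pp_map):
    """Hypothetical exact-match lookup; returns None if nothing is registered."""
    return pp_map.get((module_name, type_name, arity))


def my_pp(value, arg_types):
    return [("s", str(value))]


pp_map = set_pp_mapping("foo.bar", "t", 1, my_pp, new_pp_map())
print(lookup_pp("foo.bar", "t", 1, pp_map) is my_pp)   # fully qualified: found
print(lookup_pp("bar", "t", 1, pp_map))                # partially qualified: None
```

If the lookup is an exact match like this, a converter registered under a
partially qualified module name would silently never be used, so the
documentation should say which form is expected.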

> +:- func set_pp_mapping(string, string, int, pp, pp_map) = pp_map.
> +
> +
> +
> +    % format(Stream, PPMap, LineWidth, MaxLines, Limit, Docs, !State).
> +    %   Format Docs to fit on lines of LineWidth chars, truncating after
> +    %   MaxLines lines, formatting pp_univ(_) docs using pretty-printer
> +    %   converters PPs starting with pretty-printer limits Limit.
> +    %
> +:- pred format(Stream::in, pp_map::in, int::in, int::in, pp_limit::in,
> +        docs::in, State::di, State::uo)
> +        is det
> +        <= stream.writer(Stream, string, State).
> +
> +    % Convenience predicates.  A user-configurable set of type-specific
> +    % pretty-printers and formatting parameters are attached to the IO state.

I would use I/O state rather than IO state since that's what the rest
of the library documentation uses.

> +    % The io state-specific format predicate below uses these settings.

Likewise, use I/O state here.

> +:- type pp_params
> +    --->    pp_params(
> +                pp_line_width   :: int,
> +                pp_max_lines    :: int,
> +                pp_limit        :: pp_limit
> +            ).
> +

Document the pp_params type and its fields.


> +:- pred get_default_pp_map(pp_map::out, io::di, io::uo) is det.
> +:- pred set_default_pp_map(pp_map::in, io::di, io::uo) is det.
> +:- pred set_default_pp(string::in, string::in, int::in, pp::in,
> +        io::di, io::uo) is det.
> +
> +    % The initial default pp_params are pp_params(78, 100, triangular(100)).
> +    %
> +:- pred get_default_pp_params(pp_params::out, io::di, io::uo) is det.
> +:- pred set_default_pp_params(pp_params::in, io::di, io::uo) is det.
> +
> +    % format(Docs, !IO)
> +    % format(Stream, Docs, !IO)
> +    %   Format Docs to io.stdout_stream or Stream respectively, using
> +    %   the default pp_map and pp_params.
> +    %
> +:- pred format(docs::in, io::di, io::uo) is det.
> +:- pred format(io.output_stream::in, docs::in, io::di, io::uo) is det.

I suggest s/Stream/FileStream/ in the above comment.

...

> +:- mutable(io_pp_map, pp_map, new_pp_map, ground,
> +    [attach_to_io_state, untrailed]).
> +
> +:- mutable(io_pp_params, pp_params, pp_params(78, 100, triangular(100)),
> +    ground, [attach_to_io_state, untrailed]).

I think that both of these should also be thread_local. 
(The fact that the default map attached to the I/O state is thread
local and that each thread can have its own set should be documented
in the interface.)

Julien.