[m-dev.] For review: pprint performance bug fix and misc. cha nges

Mark Anthony BROWN dougl at cs.mu.OZ.AU
Wed May 24 02:59:29 AEST 2000


Hi,

Aside from the comments below, this diff looks fine to me.

Cheers,
Mark.

Ralph Becket writes:
> The diff I posted wasn't quite correct (I'd forgotten to do an update
> first).  Here's the correct diff.  The only real difference is that
> I've taken out Mark's try_flatten/4 predicate and reinserted flatten/1.
> The other changes remain the same.
> 
> Index: pprint.m
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/library/pprint.m,v
> retrieving revision 1.3
> diff -u -u -r1.3 pprint.m
> --- pprint.m	2000/05/23 07:43:38	1.3
> +++ pprint.m	2000/05/23 12:09:55
> @@ -28,7 +28,7 @@
>  %    unless that is unavoidable; and
>  % 3. the pretty printer is bounded in that it never needs to
>  %    look more than k characters ahead to make a formatting
> -%    decision (although see the XXX comment below).
> +%    decision (although see the XXX comments below).
>  %

Hmm.  Point 3 now holds for this pretty printer, right?  So is there
still a need for the reference to the XXX comments here?

>  % I have made three small changes:
>  %
> @@ -52,6 +52,11 @@
>  % errors and warnings that should be prefixed with the
>  % offending source file and line number.
>  %
> +% Performance problems due to the current lack of support
> +% for laziness in Mercury has meant that the formatting

s/has/have/

> +% decision procedure has had to be recoded to preserve
> +% linear runtime behaviour in an eager language.
> +%
>  % I have also added several obvious general purpose
>  % formatting functions.
>  %
> @@ -212,7 +217,13 @@
>  :- func braces(doc)                     = doc.
>  
>      % separated(PP, Sep, [X1,...,Xn]) =
> -    %   PP(X1) `<>` Sep `<>` ... Sep `<>` PP(Xn)
> +    %   PP(X1) `<>` (Sep `<>` ... (Sep `<>` PP(Xn)) ... )
> +    %
> +    % Note that if you want to pack as many things on one
> +    % line as possible with some sort of separator, the
> +    % following example illustrates a suitable idiom:
> +    %
> +    %   separated(PP, group(comma_space_line), Xs)
>      %
>  :- func separated(func(T) = doc, doc, list(T)) = doc.
>  
> @@ -346,6 +357,11 @@
>  
>  
> %---------------------------------------------------------------------------
> ---%
>  
> +    % XXX The last clause in this predicate has been recoded to use
> +    % flattening_works/3, an eager decision procedure, to avoid
> +    % unacceptable performance on large terms.  A spot of laziness
> +    % would do away with the need for the `fits_XXX' etc. predicates.
> +
>  :- func be(int, int, list(pair(string, doc))) = simple_doc.
>  
>  be(_, _, [])                      = nil.
> @@ -356,52 +372,86 @@
>  be(W, K, [_ - 'TEXT'(S)     | Z]) = S `text` be(W, (K + string__length(S)),
> Z).
>  be(W, _, [I - 'LINE'        | Z]) = I `line` be(W, string__length(I), Z).
>  be(W, K, [I - 'GROUP'(X)    | Z]) =
> -    ( if
> -        try_flatten(X, FlatX, W - K, _),
> -        Flattened = be(W, K, [I - FlatX | Z]),
> -        fits(W - K, Flattened)
> -      then
> -        Flattened
> +    ( if flattening_works(X, Z, W - K) then
> +        be(W, K, [I - flatten(X) | Z])
>        else
>          be(W, K, [I - X | Z])
>      ).
>  
> -%--------------------------------------------------------------------------
> ----%
> +%
> ----------------------------------------------------------------------------
> %
>  
> -:- func extend(string, int) = string.
> +    % Decide whether flattening a given doc will allow it and
> +    % up to the next possible 'LINE' in the following docs to
> +    % fit on the remainder of the line.
> +    %
> +    % XXX This solution is necessary to avoid crippling performance
> +    % problems on large terms.  A spot of laziness would do away
> +    % with the need for the next three predicates.
> +    %
> +:- pred flattening_works(doc, list(pair(string, doc)), int).
> +:- mode flattening_works(in, in, in) is semidet.
>  
> -extend(I, J) = string__append(I, string__duplicate_char(' ', J)).
> +flattening_works(DocToFlatten, FollowingDocs, RemainingWidth) :-
> +    fits_flattened([DocToFlatten], RemainingWidth, RemainingWidth0),
> +    fits_on_rest(FollowingDocs, RemainingWidth0).
> +
> +%
> ----------------------------------------------------------------------------
> %
> +
> +    % Decide if a flattened list of docs will fit on the remainder
> +    % of the line.  Computes the space left over if so.
> +    %
> +:- pred fits_flattened(list(doc), int, int).
> +:- mode fits_flattened(in, in, out) is semidet.
> +
> +fits_flattened([]                 ) --> [].
> +fits_flattened(['NIL'         | Z]) --> fits_flattened(Z).
> +fits_flattened(['SEQ'(X, Y)   | Z]) --> fits_flattened([X, Y | Z]).
> +fits_flattened(['NEST'(_, X)  | Z]) --> fits_flattened([X | Z]).
> +fits_flattened(['LABEL'(_, X) | Z]) --> fits_flattened([X | Z]).
> +fits_flattened(['LINE'        | Z]) --> fits_flattened(Z).
> +fits_flattened(['GROUP'(X)    | Z]) --> fits_flattened([X | Z]).
> +fits_flattened(['TEXT'(S)     | Z], R0, R) :-
> +    L = string__length(S),
> +    R0 > L,

Shouldn't this be '>='?

> +    fits_flattened(Z, R0 - L, R).
> +
> +%
> ----------------------------------------------------------------------------
> %
> +
> +    % Decide if a list of indent-doc pairs, up to the first 'LINE',
> +    % will fit on the remainder of the line.
> +    %
> +:- pred fits_on_rest(list(pair(string, doc)), int).
> +:- mode fits_on_rest(in, in) is semidet.
> +
> +fits_on_rest([]                     , _).
> +fits_on_rest([_ - 'NIL'         | Z], R) :- fits_on_rest(Z, R).
> +fits_on_rest([I - 'SEQ'(X, Y)   | Z], R) :- fits_on_rest([I - X, I - Y |
> Z], R).
> +fits_on_rest([I - 'NEST'(_, X)  | Z], R) :- fits_on_rest([I - X | Z], R).
> +fits_on_rest([I - 'LABEL'(_, X) | Z], R) :- fits_on_rest([I - X | Z], R).
> +fits_on_rest([_ - 'LINE'        | _], _).
> +fits_on_rest([I - 'GROUP'(X)    | Z], R) :- fits_on_rest([I - X | Z], R).
> +fits_on_rest([_ - 'TEXT'(S)     | Z], R) :-
> +    L = string__length(S),
> +    R > L,

Likewise.

> +    fits_on_rest(Z, R - L).
>  
>  
> %---------------------------------------------------------------------------
> ---%
>  
> -    % While flattening documents, we keep track of the amount of
> -    % space available on the line.  This predicate fails if there
> -    % is not enough space for the flattened term.
> -    %
> -:- pred try_flatten(doc, doc, int, int).
> -:- mode try_flatten(in, out, in, out) is semidet.
> -
> -try_flatten('NIL', 'NIL') -->
> -    [].
> -try_flatten('SEQ'(X, Y), 'SEQ'(FX, FY)) -->
> -    try_flatten(X, FX),
> -    try_flatten(Y, FY).
> -try_flatten('NEST'(_, X), FX) -->
> -    try_flatten(X, FX).
> -try_flatten('LABEL'(_, X), FX) -->
> -    try_flatten(X, FX).
> -try_flatten('TEXT'(S), 'TEXT'(S)) -->
> -    =(W0),
> -    { W = W0 - string__length(S) },
> -    { W >= 0 },
> -    :=(W).
> -try_flatten('LINE', 'NIL') -->
> -    [].
> -try_flatten('GROUP'(X), FX) -->
> -    try_flatten(X, FX).
> +:- func flatten(doc) = doc.
>  
> +flatten('NIL')          = 'NIL'.
> +flatten('SEQ'(X, Y))    = 'SEQ'(flatten(X), flatten(Y)).
> +flatten('NEST'(_, X))   = flatten(X).
> +flatten('LABEL'(_, X))  = flatten(X).
> +flatten('TEXT'(S))      = 'TEXT'(S).
> +flatten('LINE')         = 'NIL'.
> +flatten('GROUP'(X))     = flatten(X).
> +
>  
> %---------------------------------------------------------------------------
> ---%
>  
> +    % XXX This predicate has been obviated by the eager code above.
> +
> +/*
>  :- pred fits(int, simple_doc).
>  :- mode fits(in, in) is semidet.
>  
> @@ -414,6 +464,13 @@
>      ;
>          X = _ `line` _
>      ).
> +*/
> +
> +%--------------------------------------------------------------------------
> ----%
> +
> +:- func extend(string, int) = string.
> +
> +extend(I, J) = string__append(I, string__duplicate_char(' ', J)).
>  
>  
> %---------------------------------------------------------------------------
> ---%
>  
> @@ -434,7 +491,7 @@
>      ( if Xs = [] then
>          PP(X)
>        else
> -        PP(X) `<>` Sep `<>` separated(PP, Sep, Xs)
> +        PP(X) `<>` (Sep `<>` separated(PP, Sep, Xs))
>      ).
>  
>  
> %---------------------------------------------------------------------------
> ---%
> @@ -478,21 +535,33 @@
>              parentheses(
>                  group(
>                      nest(2,
> -		        line `<>` separated(id, comma_space_line, Args)
> -		    ) `<>` line
> +                        line `<>` separated(id, comma_space_line, Args)
> +                    )

Personally, I prefer the original---I like matching parentheses to
either be on the same line, or else at the same level of indentation.
But I'm not too fussed.

>                  )
> -            )
> +        )
>      ).
>  
> -%--------------------------------------------------------------------------
> ----%
> +%
> ----------------------------------------------------------------------------
> %
> +
> +:- func list_members_to_doc(list(T)) = doc.
> +
> +list_members_to_doc([]) =
> +    nil.
> +
> +list_members_to_doc([H | T]) =
> +    ( if T = [] then
> +        to_doc(H)
> +      else
> +        to_doc(H) `<>` group(comma_space_line) `<>` list_members_to_doc(T)
> +    ).

This doesn't appear to be used anywhere.

>  
> +%
> ----------------------------------------------------------------------------
> %
> +
>  word_wrapped(String) =
> -    list__foldr(
> -        ( func(Word, Sequel) =
> -            group(line `<>` text(Word) `<>` space) `<>` Sequel
> -        ),
> -        string__words(char__is_whitespace, String),
> -        nil
> +    separated(
> +        text,
> +        group(space_line),
> +        string__words(char__is_whitespace, String)
>      ).
>  
>  
> %---------------------------------------------------------------------------
> ---%
> 

-- 
Mark Brown, PhD student            )O+  |  "Another of Fortran's breakthroughs
(m.brown at cs.mu.oz.au)                   |  was the GOTO statement, which was...
Dept. of Computer Science and Software  |  uniquely simple and understandable"
Engineering, University of Melbourne    |              -- IEEE, 1994
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list