[m-rev.] Fixes to pprint and change for --dump-mlds

Ralph Becket rafe at cs.mu.OZ.AU
Wed Oct 24 15:03:14 AEST 2001
Previous message: [m-rev.] Fixes to pprint and change for --dump-mlds
Next message: [m-rev.] Fixes to pprint and change for --dump-mlds
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Mark Brown, Wednesday, 24 October 2001:
> This looks good.  I haven't finished the review yet, but here is a first
> round of comments.
> 
> On 23-Oct-2001, Ralph Becket <rafe at cs.mu.OZ.AU> wrote:
> > Estimated hours taken: 2.5
> > Branches: main
> > 
> > Fixed a performance bug in pprint and a couple of formatting bugs.
> 
> There should be regression tests for the formatting bug fixes.  Does
> this change affect any of the existing tests?  If so, the expected
> output of the test may need to be updated.

There are currently no regression tests that involve pprint.  I'll
sort some out (see comment below).

> >  best(W, K, X)           = Best
> >  :-
> > -    be(W, K, ["" - X], Best, []).
> > +    be(W, K, ["" - X], [], RevBest),
> > +    Best = list__reverse(RevBest).
> 
> This raises a (minor) issue though: is the reversed list still of type
> simple_doc?  I don't think it should be, since it is intended to be
> interpreted differently.  So you could define a rev_simple_doc type, or
> just use list(string) in the type declarations of be/5 and push_string/3.
> 
> Also, the new list order should be documented, e.g. in the comment above
> the code for be/5.

Done.

> This change is fine, provided you have tested that the performance is
> acceptable.  Doing an MLDS dump of compiler/make_hlds.m might make a
> good test case.

I tried that.  On ceres it took about 2-5mins per mlds_dump file (there
are six), each of which contained more than 1.1million lines!  The
compiler hit a peak memory usage of 450MBytes without thrashing.

I'd say that was acceptable.

> Please address the above issues and post another diff, including test
> cases for where formatting used to go wrong.  I'll review the remaining
> changes after that (since I'll be able to review the formatting changes
> more effectively if I review the test cases at the same time).

It would probably be more useful to put in current correct output tests.
I'd like to put that off until I've finished another change I'm working
on to do with formatting infix operators.

- Ralph


Index: pprint.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/pprint.m,v
retrieving revision 1.5
diff -u -r1.5 pprint.m
--- pprint.m	26 Apr 2001 14:55:13 -0000	1.5
+++ pprint.m	24 Oct 2001 04:31:03 -0000
@@ -217,13 +217,13 @@
 :- func braces(doc)                     = doc.
 
     % separated(PP, Sep, [X1,...,Xn]) =
-    %   PP(X1) `<>` (Sep `<>` ... (Sep `<>` PP(Xn)) ... )
+    %   PP(X1) `<>` group(Sep `<>` ... group(Sep `<>` PP(Xn)) ... )
     %
     % Note that if you want to pack as many things on one
     % line as possible with some sort of separator, the
     % following example illustrates a suitable idiom:
     %
-    %   separated(PP, group(comma_space_line), Xs)
+    %   separated(PP, comma_space_line, Xs)
     %
 :- func separated(func(T) = doc, doc, list(T)) = doc.
 
@@ -305,6 +305,9 @@
 :- type simple_doc
     ==      list(string).
 
+:- type rev_simple_doc
+    ==      simple_doc.
+
 %------------------------------------------------------------------------------%
 
 nil                     = 'NIL'.
@@ -352,7 +355,8 @@
 
 best(W, K, X)           = Best
 :-
-    be(W, K, ["" - X], Best, []).
+    be(W, K, ["" - X], [], RevBest),
+    Best = list__reverse(RevBest).
 
 %------------------------------------------------------------------------------%
 
@@ -362,18 +366,40 @@
     % running times under a strict language.  The second important
     % change is that be/5 is now a predicate that accumulates its output
     % as a list of strings, doing away with the need for a more elaborate
-    % simple_doc type.
+    % simple_doc type.  The accumulated strings must be reversed to obtain
+    % the printing order.
+    %
+    % W is the number of characters on a line.
+    % K is the number of characters for output on the current line so far.
+    % I is the current indentation string as affected by NEST and LABEL.
+
+:- pred be(int, int, list(pair(string, doc)), rev_simple_doc, rev_simple_doc).
+:- mode be(in, in, in, in, out) is det.
+
+be(_, _, [])                      -->
+    [].
+
+be(W, K, [_ - 'NIL'         | Z]) -->
+    be(W, K, Z).
 
-:- pred be(int, int, list(pair(string, doc)), simple_doc, simple_doc).
-:- mode be(in, in, in, out, in) is det.
+be(W, K, [I - 'SEQ'(X, Y)   | Z]) -->
+    be(W, K, [I - X, I - Y | Z]).
+
+be(W, K, [I - 'NEST'(J, X)  | Z]) -->
+    be(W, K, [extend(I, J) - X | Z]).
+
+be(W, K, [I - 'LABEL'(L, X) | Z]) -->
+    be(W, K, [(I ++ L) - X | Z]).
+
+be(W, K, [_ - 'TEXT'(S)     | Z]) -->
+    push_string(S),
+    be(W, (K + string__length(S)), Z).
+
+be(W, _, [I - 'LINE'        | Z]) -->
+    push_string("\n"),
+    push_string(I),
+    be(W, string__length(I), Z).
 
-be(_, _, [], Out, In)             :-  Out = list__reverse(In).
-be(W, K, [_ - 'NIL'         | Z]) --> be(W, K, Z).
-be(W, K, [I - 'SEQ'(X, Y)   | Z]) --> be(W, K, [I - X, I - Y | Z]).
-be(W, K, [I - 'NEST'(J, X)  | Z]) --> be(W, K, [extend(I, J) - X | Z]).
-be(W, K, [I - 'LABEL'(L, X) | Z]) --> be(W, K, [string__append(I, L) - X | Z]).
-be(W, K, [_ - 'TEXT'(S)     | Z]) --> [S], be(W, (K + string__length(S)), Z).
-be(W, _, [I - 'LINE'        | Z]) --> ["\n", I], be(W, string__length(I), Z).
 be(W, K, [I - 'GROUP'(X)    | Z]) -->
     ( if { flattening_works(X, Z, W - K) } then
         be(W, K, [I - flatten(X) | Z])
@@ -381,13 +407,20 @@
         be(W, K, [I - X | Z])
     ).
 
+
+
+:- pred push_string(string, rev_simple_doc, rev_simple_doc).
+:- mode push_string(in, in, out) is det.
+
+push_string(S, Ss, [S | Ss]).
+
 %------------------------------------------------------------------------------%
 
     % Decide whether flattening a given doc will allow it and
     % up to the next possible 'LINE' in the following docs to
     % fit on the remainder of the line.
     %
-    % XXX This solution is necessary to avoid crippling performance
+    % This solution is necessary to avoid crippling performance
     % problems on large terms.  A spot of laziness would do away
     % with the need for the next three predicates.
     %
@@ -454,7 +487,7 @@
 
 :- func extend(string, int) = string.
 
-extend(I, J) = string__append(I, string__duplicate_char(' ', J)).
+extend(I, J) = I ++ string__duplicate_char(' ', J).
 
 %------------------------------------------------------------------------------%
 
@@ -473,7 +506,7 @@
 
 separated(PP, Sep, [X | Xs]) =
     ( if Xs = [] then PP(X)
-                 else PP(X) `<>` (Sep `<>` separated(PP, Sep, Xs))
+                 else PP(X) `<>` group(Sep `<>` separated(PP, Sep, Xs))
     ).
 
 %------------------------------------------------------------------------------%
@@ -521,14 +554,15 @@
             ( func(UnivArg) = to_doc(Depth - 1, univ_value(UnivArg)) ),
             UnivArgs
         ),
-        Doc = text(Name) `<>`
-            parentheses(
+        Doc =
+            text(Name) `<>` parentheses(
                 group(
                     nest(2,
-                        line `<>` separated(id, comma_space_line, Args)
+                        line `<>`
+                        separated(id, comma_space_line, Args)
                     )
                 )
-        )
+            )
     ).
 
 % ---------------------------------------------------------------------------- %
@@ -618,7 +652,13 @@
 :- func list_to_doc(int, list(T)) = doc.
 
 list_to_doc(Depth, Xs) =
-    brackets(separated_to_depth(to_doc, group(comma_space_line), Depth, Xs)).
+    brackets(
+        nest(1,
+            group(
+                separated_to_depth(to_doc, comma_space_line, Depth, Xs)
+            )
+        )
+    ).
 
 %------------------------------------------------------------------------------%
 
@@ -657,7 +697,7 @@
         ( func(UnivArg) = to_doc(Depth - 1, univ_value(UnivArg)) ),
         UnivArgs
     ),
-    Doc = braces(separated(id, group(comma_space_line), Args)).
+    Doc = braces(separated(id, comma_space_line, Args)).
 
 %------------------------------------------------------------------------------%
 
@@ -676,16 +716,12 @@
         Doc = PP(Depth, X)
       else
         Doc = PP(Depth, X) `<>`
-              (Sep `<>` separated_to_depth(PP, Sep, Depth - 1, Xs))
+              group(Sep `<>` separated_to_depth(PP, Sep, Depth - 1, Xs))
     ).
 
 %------------------------------------------------------------------------------%
 
 word_wrapped(String) =
-    separated(
-        text,
-        group(space_line),
-        string__words(char__is_whitespace, String)
-    ).
+    separated(text, space_line, string__words(char__is_whitespace, String)).
 
 %------------------------------------------------------------------------------%

--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
Previous message: [m-rev.] Fixes to pprint and change for --dump-mlds
Next message: [m-rev.] Fixes to pprint and change for --dump-mlds
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list