[m-dev.] pprint performance

Ralph Becket rafe at cs.mu.OZ.AU
Sat Feb 23 11:49:11 AEDT 2002


I've been looking into the performance problems experienced when using
pprint to format very large terms.

I have improved the performance of the pprint write algorithm by a
factor of four on large terms (balanced n-ary trees containing ~1
million nodes now take ~20s to write rather than ~80s).  Writing now
allocates only a single cons cell for each literal to be written,
which is significantly more economical than before.  The writing
procedure is now definitely linear in the size of the input term.

The real problem now is that converting a term to a doc form suitable
for pprint__write expands the size of the term significantly (every pair
of brackets costs about eight words of storage and every comma six or
seven words).  One would expect most very large terms to contain a
significant amount of sharing; unfortunately, all of that sharing is
lost in the conversion process.  This can lead to an exponential
blow-up in the memory footprint of formatted terms.
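
To make the loss of sharing concrete, here is a small self-contained
sketch (in Haskell rather than Mercury, with toy Term/Doc types and a
toDoc function that are purely hypothetical, not the pprint interface).
A term built as a DAG with n+1 distinct nodes flattens to a doc with
roughly 4 * 2^n text fragments, which is the blow-up described above:

    module ShareDemo where

    -- Toy term and doc types; every occurrence of a subterm is
    -- converted again, so sharing in the Term graph is not reflected
    -- in the Doc.
    data Term = Leaf Int | Pair Term Term
    data Doc  = Text String | Doc :<> Doc

    toDoc :: Term -> Doc
    toDoc (Leaf n)   = Text (show n)
    toDoc (Pair a b) =
        Text "(" :<> toDoc a :<> Text ", " :<> toDoc b :<> Text ")"

    -- Count the text fragments in a doc.
    docSize :: Doc -> Int
    docSize (Text _)  = 1
    docSize (a :<> b) = docSize a + docSize b

    -- shared n has only n+1 distinct nodes in memory, because each
    -- level reuses the level below it.
    shared :: Int -> Term
    shared 0 = Leaf 0
    shared n = let t = shared (n - 1) in Pair t t

    main :: IO ()
    main = print (docSize (toDoc (shared 20)))
    -- prints 4194301: about four million fragments from 21 nodes.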

I've tried some experiments making judicious use of pragma memo, but
this is not a solution (it actually makes matters much worse).

A reasonable solution is probably to use some form of lazy evaluation,
whereby the term conversion is done on demand by the write procedure.
This should improve matters on those back-ends that support the
extras/lazy_evaluation/lazy.m module (currently the C back-ends), but
may cause problems for back-ends that don't.
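
As a rough illustration of the on-demand idea, here is a sketch, again
in Haskell and again with hypothetical names (LDoc, toLDoc and
writeUpTo are not part of pprint).  Explicit thunks of type () -> LDoc
stand in for the delay/force style of suspension that
extras/lazy_evaluation/lazy.m would provide; the point is that the
writer only converts the part of the term it actually prints:

    module LazyDocDemo where

    data Term = Leaf Int | Pair Term Term

    -- A doc whose children are suspensions: nothing below a node is
    -- converted until the writer forces it.
    data LDoc = LText String | LSeq [() -> LDoc]

    force :: (() -> LDoc) -> LDoc
    force f = f ()

    toLDoc :: Term -> LDoc
    toLDoc (Leaf n)   = LText (show n)
    toLDoc (Pair a b) =
        LSeq [ \_ -> LText "("
             , \_ -> toLDoc a      -- conversion of a is delayed
             , \_ -> LText ", "
             , \_ -> toLDoc b      -- conversion of b is delayed
             , \_ -> LText ")" ]

    -- A writer with a character budget: once the budget is spent, no
    -- further suspensions are forced, so the unprinted part of the
    -- term is never converted at all.
    writeUpTo :: Int -> LDoc -> IO Int
    writeUpTo budget _ | budget <= 0 = return 0
    writeUpTo _      (LText s)       = putStr s >> return (length s)
    writeUpTo budget (LSeq ds)       = go budget ds
      where
        go _ []       = return 0
        go b (d:rest)
          | b <= 0    = return 0
          | otherwise = do n <- writeUpTo b (force d)
                           m <- go (b - n) rest
                           return (n + m)

    main :: IO ()
    main = do
        -- A term with 2^30 logical leaves, built with full sharing.
        let big = iterate (\t -> Pair t t) (Leaf 0) !! 30
        n <- writeUpTo 40 (toLDoc big)
        putStrLn ("\n<" ++ show n ++ " characters written>")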

I'll do some experiments and report back.

- Ralph




