[m-dev.] storing GetVars vs Vars Re: source-to-source debugger proposal

Peter Moulder Peter.Moulder at infotech.monash.edu.au
Fri Sep 28 21:58:15 AEST 2007

On Fri, Sep 28, 2007 at 06:14:05PM +1000, Zoltan Somogyi wrote:

> The closure will be even more expensive to create than a simple association
> list,

That isn't obvious to me at least (being less familiar with Mercury
implementation).  I'm assuming that the usual case is that
VarsA/GetVarsA won't actually be called, so the runtime cost of adding a
closure to the shadow stack is adding a function pointer, and the closure
information; is the closure information implemented as a pointer to a
record of two–three words (typeinfo-for-T, L, and (in the case of
GetVarsB) E), or is it more expensive than that?  If it is that then the
GetVars runtime costs are

  - Allocate the closure record.
  - Fill it.  (Just two/three word copies, they're already local variables.)
  - Include a function pointer and pointer to this record in the
    shadow stack.  (Two word copies.)  (I'm assuming that each shadow
    stack frame has more information than this, so one pointer or two
    doesn't change the number of allocations required for creating the
    shadow stack frame.)

The VarsA/VarsB approach doesn't require those things, but does require
constructing a list (n allocations and 2n word copies), constructing n
pairs (another n allocations and 2n word copies), and constructing n
univ's (which I guess is another n allocations and 2n word copies,
without looking at univ's implementation); and inserting one pointer
into the shadow stack instead of the two pointers for the GetVars

So GetVars approach has

  1 allocation
  Up to 2n+2 word copies (2n+2 if all vars are bound and each has a
    different typeinfo)
  2 (extra) words of space in the shadow stack frame

(for cases where GetVars doesn't actually get called) while the Vars
approach has

  3n allocations
  6n+1 word copies
  1 (extra) word of space in the shadow stack frame

So the GetVars approach looks as if it wins in terms of runtime cost
(ignoring speed costs of code size differences, see below) in the case
that GetVars doesn't actually get called.

If GetVars does get called, then of course it still needs to construct
Vars, so it's the sum of the above two plus the cost of an extra
function call (GetVars).  Without actually testing, it looks as though
the break-even point is for the number of GetVars calls to be very
roughly half the number of these VarsA/GetVarsA constructions.

The GetVars approach does require extra code size, because of the
GetVars function definition and the construction of the closure stuff.
Even so, it's not clear that this code size increase causes worse cache
usage: the construction of the closure doesn't need as much code as the
construction of the list; so the CPU pulls in (reads) less code in the
assumed-common case that GetVars doesn't actually get called.  The
reduced memory usage from fewer allocations may also result in a more
compact heap (depending on the gc implementation).

I may well have made several mistakes in the above: I haven't been very
careful, but I thought it worth posting to at least show that GetVars
might be cheaper and is worthy of a second look.


mercury-developers mailing list
Post messages to:       mercury-developers at csse.unimelb.edu.au
Administrative Queries: owner-mercury-developers at csse.unimelb.edu.au
Subscriptions:          mercury-developers-request at csse.unimelb.edu.au

More information about the developers mailing list