[m-dev.] Mercury and GCC split stacks

Zoltan Somogyi zoltan.somogyi at runbox.com
Mon Sep 8 13:28:50 AEST 2014

On Mon, 8 Sep 2014 12:12:05 +1000, Paul Bone <paul at bone.id.au> wrote:
> > Unfortunately, it is not only address space. Address space must be mapped
> > by page table entries, and these are cached in TLBs. TLBs are typically the
> > smallest caches in a CPU (since each entry corresponds to a page's worth
> > of data, not e.g. a 64 byte cache block worth of data). There are some CPUs
> > today whose TLBs are too small to hold the PTEs required to address just
> > the data items that fit into their last-level cache. These can suffer
> > more from TLB misses than from cache misses.
> If you don't hit the protected pages often, then they shouldn't be present
> in these caches most of the time.

True, but that isn't my point. My point is that this TLB pressure is the reason
why using superpages, if possible, is a good idea.
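
To put rough numbers on that, here is a small sketch. The TLB size and
last-level cache size are round figures I am assuming for illustration, not
measurements of any particular CPU; the helper name entries_needed is likewise
made up for this example.

```python
# Illustrative figures only: assumed values, not any particular CPU.
KIB = 1024
MIB = 1024 * KIB

def entries_needed(region_bytes, page_size):
    """Number of TLB entries required to map region_bytes (ceiling division)."""
    return -(-region_bytes // page_size)

tlb_entries = 512        # assumed TLB capacity
llc_size = 8 * MIB       # assumed last-level cache size

small = entries_needed(llc_size, 4 * KIB)   # mapping the LLC with 4 KiB pages
large = entries_needed(llc_size, 4 * MIB)   # mapping the LLC with 4 MiB superpages

print("entries to map the LLC with 4 KiB pages:", small)
print("entries to map the LLC with 4 MiB pages:", large)
print("TLB can map the whole LLC with 4 KiB pages:", small <= tlb_entries)
```

Under these assumptions, data that fits entirely in the cache can still miss
in the TLB when mapped with small pages, while a couple of superpage entries
would cover it.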

> > The best way to avoid such problems is to use whatever support for large pages
> > (sometimes called superpages) the TLB supports, since in that case a TLB entry
> > can map e.g. 4MB, not just e.g. 4KB.
> Ah, so you can trade-off between TLB utilisation and address space
> utilisation.  Which is good because the latter is under much less pressure.

Unfortunately, this does not work very well either. The number of superpages
you can use is typically a small fixed number, so the "tradeoff" is not
very flexible. (Superpages were initially intended only for mapping
the kernel's code and data structures.)

> What if your stack segments were 4MB in size (or a multiple) plus 4KB for
> the protected page and were 4MB aligned?

You could do this, but only until you ran out of the small number
of superpages you are allowed. This number is in the single, maybe double
digits on current and foreseeable machines; the 10,000 threads
you were talking about in an earlier email are right out.
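
A back-of-the-envelope sketch of that mismatch follows. The figure of 32
usable superpage entries is an assumption picked from the "double digits"
range above, and the layout (segments packed one after another, each 4MB
aligned and followed by its 4KB protected page) is also just one assumed
arrangement; aligned_stride is a made-up helper name.

```python
# Illustrative arithmetic only; the superpage entry count is an assumption.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

def aligned_stride(segment, guard, alignment):
    """Distance between consecutive segment starts when each start is aligned."""
    total = segment + guard
    return -(-total // alignment) * alignment   # round up to the alignment

threads = 10_000
stride = aligned_stride(4 * MIB, 4 * KIB, 4 * MIB)
address_space = threads * stride
superpage_entries = 32   # assumed number of superpage mappings available

print(stride // MIB, "MiB of address space per stack segment")
print(address_space / GIB, "GiB of address space for all stacks")
print(threads - superpage_entries, "stacks left without a superpage mapping")
```

The address space is not the problem on a 64-bit machine; the number of
superpage mappings is.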

