[m-dev.] Mercury and GCC split stacks

Zoltan Somogyi zoltan.somogyi at runbox.com
Mon Sep 8 10:13:25 AEST 2014



On Mon, 8 Sep 2014 00:08:54 +1000, Paul Bone <paul at bone.id.au> wrote:
> If it's only in address space then I don't think it would be excessive on a
> 64-bit system.
> 
> If we assume 10,000 threads with an average of 10 stack segments each, that's
> 100,000 stack segments. With one protected page each, that is 400,000KB of
> memory (assuming 4KB pages), or 409,600,000 bytes.  It's true that this would be
> excessive if these pages were mapped to physical memory.

Unfortunately, it is not only address space. Address space must be mapped
by page table entries, and these are cached in TLBs. TLBs are typically the
smallest caches in a CPU (since each entry corresponds to a page's worth
of data, not e.g. a 64-byte cache block's worth of data). There are some CPUs
today whose TLBs are too small to hold the PTEs required to address just
the data items that fit into their last level cache. These can suffer
more from TLB misses than from cache misses.
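
As a rough back-of-the-envelope illustration of that last point, here is a
small Python sketch. The TLB entry count and cache size are made-up example
numbers, not measurements of any particular CPU, and the helper names are
purely illustrative.

```python
# Illustrative numbers only; they are assumptions, not data for a real CPU.
PAGE_SIZE = 4 * 1024               # 4KB regular page
SUPERPAGE_SIZE = 4 * 1024 * 1024   # 4MB superpage
TLB_ENTRIES = 512                  # assumed number of TLB entries
LLC_SIZE = 8 * 1024 * 1024         # assumed 8MB last level cache


def tlb_reach(entries, page_size):
    """Bytes of memory that can be addressed without a TLB miss."""
    return entries * page_size


def ptes_needed(data_bytes, page_size):
    """Minimum number of PTEs needed to map data_bytes of densely packed data."""
    return -(-data_bytes // page_size)  # ceiling division


def tlb_covers_cache(entries, page_size, cache_size):
    """True if the TLB can map at least as much data as the cache can hold."""
    return tlb_reach(entries, page_size) >= cache_size


if __name__ == "__main__":
    for name, size in (("regular", PAGE_SIZE), ("superpage", SUPERPAGE_SIZE)):
        print(name,
              "reach:", tlb_reach(TLB_ENTRIES, size),
              "PTEs for LLC:", ptes_needed(LLC_SIZE, size),
              "covers LLC:", tlb_covers_cache(TLB_ENTRIES, size, LLC_SIZE))
```

With these assumed numbers, regular pages give the TLB a reach of only 2MB
against an 8MB cache, while the same number of entries mapping 4MB pages
covers the cache many times over.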

The best way to avoid such problems is to use whatever support for large pages
(sometimes called superpages) the TLB offers, since in that case a TLB entry
can map e.g. 4MB, not just e.g. 4KB.

However, the OS can replace an aligned sequence of regular-sized virtual
pages with one such superpage only if (a) those regular pages are loaded into
contiguous locations in main memory, and (b) they all have the exact
same permissions. If those regular-sized pages hold stack segments
with redzones, the second condition will virtually never be true.
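
The two conditions can be sketched as a toy model in Python. This is not how
any kernel implements promotion; the Page type, the permission strings and
the stack_run helper are all hypothetical, and the 1024 pages per superpage
simply follows from the 4MB and 4KB example sizes.

```python
from dataclasses import dataclass
from typing import List, Sequence

PAGES_PER_SUPERPAGE = 1024  # 4MB / 4KB, using the example sizes


@dataclass(frozen=True)
class Page:
    frame: int   # physical frame number backing this virtual page
    perms: str   # e.g. "rw-" for stack data, "---" for a redzone


def can_promote(pages: Sequence[Page]) -> bool:
    """Check conditions (a) and (b) for one aligned run of regular pages."""
    if len(pages) != PAGES_PER_SUPERPAGE:
        return False
    first = pages[0]
    # (a) the pages occupy contiguous locations in main memory
    contiguous = all(p.frame == first.frame + i for i, p in enumerate(pages))
    # (b) the pages all have exactly the same permissions
    same_perms = all(p.perms == first.perms for p in pages)
    return contiguous and same_perms


def stack_run(segment_pages: int, base_frame: int = 0) -> List[Page]:
    """Physically contiguous stack segments, each ending in one redzone page."""
    pages = []
    for i in range(PAGES_PER_SUPERPAGE):
        is_redzone = (i % segment_pages) == segment_pages - 1
        pages.append(Page(base_frame + i, "---" if is_redzone else "rw-"))
    return pages


if __name__ == "__main__":
    plain = [Page(i, "rw-") for i in range(PAGES_PER_SUPERPAGE)]
    print(can_promote(plain))          # no redzones in the run
    print(can_promote(stack_run(16)))  # 16-page segments with redzones
```

In this model, even a perfectly contiguous run fails the check as soon as a
single redzone page with different permissions falls inside it.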

I haven't yet heard of any version of Linux that employs superpages
like this automatically, but all recent versions will do that for you
if you explicitly ask them to. This problem with redzones prevents us
from asking, and prevents us from benefiting from any future automatic
employment of superpages.

Zoltan.



