[m-rev.] diff: improve simplify_info

Paul Bone paul at bone.id.au
Thu Mar 9 11:02:26 AEDT 2017


On Wed, Mar 08, 2017 at 12:25:13PM +1100, Zoltan Somogyi wrote:
> 
> On Wed, 8 Mar 2017 10:23:29 +1100, Peter Wang <novalazy at gmail.com> wrote:
> 
> > On Wed, 08 Mar 2017 08:14:15 +1100 (AEDT), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> >
> > You can check Boehm GC's size rounding for small objects by printing
> > the GC_size_map with GC_mercury_write_size_map().  Here are the results
> > on x86-64.  The i'th number in the list gives the allocation size in
> > words for a request of i words.  (The zeros are yet to be populated.)
> > 
> >       2    2    4    4    6    6    8    8
> >      10   10   12   12   14   14   16   16
> >      18   18   20   20   22   22   24   24
> >      26   26   28   28   30   30   32   32
> >      34   34   36   36   38   38   40   40
> >      42   42   44   44   46   46   48   48
> 
> Thanks for that.
> 
> I was under the impression that there was much more rounding up
> of allocation sizes, though that came from what Fergus told me about
> much older versions of Boehm. Any idea *when* that changed? I just looked
> on Hans Boehm's gc page, and in "git blame" for the seemingly-relevant
> gc header files in the boehm directory, and did not find that information.
>
> At least this explains why making the top level of module_info 11 words
> in size could be better than making it either 8 or 16 words in size.
> 
> I will fix the comments to say that Boehm rounds up not to the next
> power of two, but only to the next multiple of two, but I may also do
> some experiments to see whether this opens up the possibility of
> any new speedups.

AIUI, Boehm GC tries to tune these size classes based on the allocation
patterns it sees.  However, sizes will always be rounded up to a whole number
of granules (currently/usually 1 granule = 2 words).
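
To make the rounding concrete, here is a minimal sketch that reproduces the
size map quoted above.  It assumes 1 granule = 2 words, and it ignores any
tuning the GC may do at run time.  WORDS_PER_GRANULE and round_up_to_granule
are names made up for illustration, they are not part of Boehm GC.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 1 granule = 2 words, as in the x86-64 size map quoted above. */
#define WORDS_PER_GRANULE 2

/*
 * Hypothetical helper, not a Boehm GC function: round a request of
 * `words` words up to a whole number of granules, and return the
 * resulting allocation size in words.
 */
size_t round_up_to_granule(size_t words)
{
    assert(words > 0);
    size_t granules = (words + WORDS_PER_GRANULE - 1) / WORDS_PER_GRANULE;
    return granules * WORDS_PER_GRANULE;
}
```

Under this model an 11 word request occupies 12 words, so it wastes one word,
while 8 and 16 word requests waste none.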

If the GC does tune these allocation sizes, it may be best not to segregate
objects as we have done in the past, particularly for objects of 9, 10, 11,
etc. words.  However, I think it will still be useful to partition structures
into frequently updated and rarely updated parts (I think we do this for
code_info during code generation), particularly if there's a lot of data
that's rarely updated.  An update then copies only the small, frequently
updated part, which decreases memory churn measured in bytes, and should
therefore be more cache-friendly and trigger GC events less often.
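
As a rough illustration of the churn argument, the sketch below counts the
words allocated by a series of updates.  It assumes that each update
allocates a fresh copy of the cell being updated, that 1 granule = 2 words,
and that the frequently updated part needs one extra word to point at the
rarely updated part.  All names and the field counts in the usage note are
hypothetical; they do not come from the Mercury compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 1 granule = 2 words. */
#define WORDS_PER_GRANULE 2

/* Allocation size in words for a request of `words` words. */
size_t alloc_words(size_t words)
{
    assert(words > 0);
    return (words + WORDS_PER_GRANULE - 1) / WORDS_PER_GRANULE
        * WORDS_PER_GRANULE;
}

/* Words allocated when every update copies one flat cell. */
size_t churn_flat(size_t total_fields, size_t updates)
{
    return updates * alloc_words(total_fields);
}

/*
 * Words allocated when updates touch only the frequently updated part.
 * That part holds its own fields plus one pointer to the rarely
 * updated part, which is shared between copies and never recopied.
 */
size_t churn_split(size_t hot_fields, size_t updates)
{
    return updates * alloc_words(hot_fields + 1);
}
```

For example, with a 20 field structure of which 4 fields are frequently
updated, churn_flat(20, 1000) gives 20000 words while churn_split(4, 1000)
gives 6000 words, at the cost of one extra indirection when reading a
rarely updated field.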

Thanks.


-- 
Paul Bone
http://paul.bone.id.au

