Functional syntax (Was: RE: [mercury-users] Hi, and determinism)

Richard A. O'Keefe ok at atlas.otago.ac.nz
Thu Feb 8 12:42:17 AEDT 2001


I wrote:
	> My experience using bit-fields in C suggests the opposite.
	> Any time that fetching/storing a field requires more than a simple
	> memory reference instruction, performance suffers.
	> Packing to the byte level is fine, especially if the compiler
	> re-orders fields in descending order of alignment
	> (double[8] > int[4] > short[2] > char[1], to use C terminology).
	
Ralph Becket replied:
	I'm surprised: I would have imagined that being cache friendly 
	through a smaller memory footprint and (more importantly) reducing 
	the amount of boxing required would net good results.

To clarify:
(a) I explicitly said "in C".  There is no boxing in C.
(b) I did not say NO PACKING AT ALL.  Indeed, I explicitly said
    "Packing to the byte level is fine."  I am not saying that using 32
    bits is faster than using 5 bits; I am saying that a lot of the time
    using 8 bits is faster than using 5 bits.
(c) There is nothing wrong with being cache-friendly.  But there are three
    ways of being cache-friendly:
    - Have smaller data
    - Have smaller code
    - Control the timing of when code/data are used so that you need
      less at one time.
    An important thing to understand about using bitfields is that they
    bloat your object code, especially storing bitfields.  (The DEC-10
    of fond memory was a notable exception.)  For example, given
	struct { int a:7, b:5, c:4; } x;
	...
	x.b = y;
    the code would be
	load x into r1
	mask r1 with 0xF07F
	load y into r2
	shift r2 left 7 bits
	add r2 to r1
	store r1 into x
    whereas 
	struct { char a, b, c; } x;
	...
	x.b = y;
    would have
	load y into r1
	store r1 into x.b
    as its code.
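    The read-modify-write sequence above can be written out by hand in
    C.  This is only a sketch: the layout chosen (a in bits 0-6, b in
    bits 7-11, c in bits 12-15) is one the compiler COULD pick, since
    the actual allocation order of bit-fields is implementation-defined.

	```c
	#include <assert.h>
	#include <stdint.h>

	/* Manual equivalent of `x.b = y` for the hypothetical layout:
	   a in bits 0-6, b in bits 7-11, c in bits 12-15. */
	static uint16_t store_b(uint16_t x, unsigned y)
	{
	    x &= 0xF07F;                      /* mask out the 5 bits of b */
	    x |= (uint16_t)((y & 0x1F) << 7); /* shift y into position    */
	    return x;                         /* caller stores the word   */
	}

	int main(void)
	{
	    uint16_t x = 0xFFFF;
	    x = store_b(x, 0);    /* b := 0; a and c are untouched */
	    assert(x == 0xF07F);
	    return 0;
	}
	```

    Every one of those operations becomes at least one instruction;
    the byte-sized field needs none of them.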

    If there's just one place that accesses the bit fields, it's a touch
    slow, but not to worry.  If, thanks to in-line expansion, the field
    access gets replicated a few thousand times, it is going to do bad
    things to your instruction cache, as well as your cycle count.

What about the data cache?  The question is not whether packing can ever
help the data cache; of course it can.  The question is HOW MUCH packing is
USEFUL.  In this little example, both packed records would be 32 bits, so
there would be no space advantage to the tighter packing.  (Yes, I know
the fields only need 16 bits, but most C compilers make records a multiple
of 32 bits anyway.)
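This is easy to check, bearing in mind that struct sizes are
implementation-defined, so the printed numbers are typical rather than
guaranteed:

	```c
	#include <stdio.h>

	struct bits  { int  a:7, b:5, c:4; };  /* 16 bits of fields */
	struct bytes { char a, b, c; };        /* 24 bits of fields */

	int main(void)
	{
	    /* With int bit-fields the struct takes int alignment, so it
	       is typically 4 bytes; the char struct is typically 3 bytes,
	       and padding often rounds it up inside arrays of larger
	       structs anyway. */
	    printf("bitfields: %zu bytes\n", sizeof(struct bits));
	    printf("chars:     %zu bytes\n", sizeof(struct bytes));
	    return 0;
	}
	```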

There is another point about being kind to the data cache, which is that
very often the best thing to do is NOT to pack the fields tighter together
but to split them into entirely separate arrays.  If, for example,
some field is a key, then putting the keys in a separate array means you
get more keys in per cache reload, and you finally touch just one cache
block of satellite data.  I've known people to get really spectacular
speedups that way (with enough data you'll reduce page faults, let alone
cache refills), WITHOUT actually packing the data any tighter.
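A minimal sketch of the key-splitting idea (the record shape and names
are invented for illustration):

	```c
	#include <assert.h>
	#include <string.h>

	#define N 4

	/* Array-of-structs form: scanning the keys drags 60 bytes of
	   satellite data through the cache alongside every key. */
	struct rec { int key; char payload[60]; };

	/* Split form: the keys sit contiguously in their own array, so a
	   scan touches a fraction of the memory, and only one cache block
	   of payload is fetched at the very end. */
	static int find_split(const int *keys, int n, int wanted)
	{
	    for (int i = 0; i < n; i++)
	        if (keys[i] == wanted)
	            return i;           /* index into the payload array */
	    return -1;
	}

	int main(void)
	{
	    int keys[N] = { 7, 42, 3, 19 };
	    char payloads[N][60];
	    memset(payloads, 0, sizeof payloads);

	    int i = find_split(keys, N, 3);
	    assert(i == 2);             /* now touch payloads[i] once */
	    return 0;
	}
	```

Nothing here is packed any tighter than before; the win comes purely
from which bytes share a cache line.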
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------


