[mercury-users] version_array vs. field update performance

Ondrej Bojar bojar at csse.unimelb.edu.au
Wed Feb 7 11:48:10 AEDT 2007
Previous message: [mercury-users] version_array vs. field update performance
Next message: [mercury-users] version_array vs. field update performance
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi.

The issue was a bit suspicious to me, so I'm now collecting data for a 
bigger picture. I'm now using benchmark_det instead of looking at the 
profile as I did in my first experiment.

This is the va code (full source is attached):

:- pred varr(int::in,
            version_array(float)::in,
            version_array(float)::out) is det.
varr(Iters, V, OutV) :-
   fold_up(
     (pred(I::in, IV::in, OV::out) is det :-
       Pos = I mod 8,
       PrePos = (-I) mod 8,
       OV = IV^elem(Pos) := IV^elem(PrePos)+1.0
     ), 0, Iters, V, OutV).
% dummily increase value from a field and put it to a different
% field, repeat Iters times.

There is no backtracking involved.

The whole predicate is run N times.

Here are milisecond counts when varying N (i.e. benchmark_det repetition 
count) and Iters (i.e. the number of reads and the number of writes to 
the array).

The machine is 64 bit, floats are untreated (I'm not sure if they are 
boxed or not in rotd-2006-11-06), and so optimization: I am not using 
any optimization flags to mmc. The grade is asm_fast.gc.

All the vectors are 8 fields wide (you were right, Ralph :-).

representation: c(c1::float, c2::float, c3::float...)
see the source code for the details on converting field indices Pos and 
PrePos to the correct tuple field. memoization is involved.

N\RWs   10      100     1000    10000
10      0       0       30      60
100     0       30      60      370
1000    40      60      390     -

representation: version_array(float)

N\RWs   10      100     1000    10000
10      0       0       60      6040
100     0       10      740     72440
1000    0       50      7510    -

...so something's weird here, there is no backtracking involved, but 
still the more updates to a vector, the slower the array becomes

representation: array(float), the array is N times copied to create a 
unique copy, at each write, the change is destructive

N\RWs   10      100     1000    10000
10      0       0       0       20
100     0       0       30      270
1000    0       30      280     -

representation: array(float), the array is updated with set_slow, which 
possibly makes a copy at each update. I'm not really sure if this is 
happening, of if it has been optimized away.

N\RWs   10      100     1000    10000
10      0       20      0       40
100     30      0       30      510
1000    10      30      480     -

My previous result with version arrays equal to tuples was rather a 
coincidence of the number of iterations and read/writes, and probably 
heavily influenced by profiling.

Now I'd say that for short vectors as I have, arrays win, 'slow' arrays 
are comparable to tuples, version_arrays are slower, due to a bug or 
such now much slower...

More numbers are being calculated, and I will rerun it again just to be 
sure that there are no effects of other processes possibly running on 
the machine.

O.



Ralph Becket wrote:
> Ondrej Bojar, Tuesday,  6 February 2007:
> 
>>Hi,
>>
>>I'm working with many vectors of floats, all of a fixed length. I create 
>>lots of them, then stepwise set and update some of the fields and sort 
>>them according to a scalar product with a constant "weights" vector.
>>
>>Currently I use version_array to store each vector.
>>
>>About 7% of run time is used on 'version_array.elem :=', and another 6% 
>>is used on 'version_array.elem'.
> 
> 
> Are you using unsafe_rewind after backtracking?  If you aren't then
> every version_array access is paying a cost for any updates on previous
> search paths.
> 
> Also, unless you're working on a 64 bit machine, floats are normally
> boxed which has a significant impact of FP intensive programs.  You can
> try using --single-precision-floats if you like if you have a very
> recent version of the compiler (I think Peter Wang checked this in the
> other day - check the reviews mailing list archive for this month to be
> sure).
> 
> 
>>Given that the vector length is fixed, I could use tuples or data terms 
>>with field access functions to get or update vector components.
> 
> 
> It's hard to imagine tuple updates beating version arrays unless they
> are very small (under ten or twenty fields would be my guess).
> 
> 
>>In a small test, I tried to compared the performance of each of the 
>>representations and I was surprised to see that there is nearly no speed 
>>difference observable. Would this be what you expect?
> 
> 
> Yes: a version array update allocates a four word heap cell to record
> the overwritten contents and then does an in-place update of the array;
> a tuple update has to copy the entire tuple.
> 
> 
>>I was hoping to see there is less overhead in the field update
>>routines.
>>
>>Can arrays (with the di/uo workaround hassle) be faster?
> 
> 
> Arrays are about 1.5 - 3 times faster, but you can't backtrack over
> them.
> 
> -- Ralph
> --------------------------------------------------------------------------
> mercury-users mailing list
> Post messages to:       mercury-users at csse.unimelb.edu.au
> Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
> Subscriptions:          mercury-users-request at csse.unimelb.edu.au
> --------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testva_versus_tuple.m
Type: text/x-objcsrc
Size: 4064 bytes
Desc: not available
URL: <http://lists.mercurylang.org/archives/users/attachments/20070207/d0f1e162/attachment.bin>
Previous message: [mercury-users] version_array vs. field update performance
Next message: [mercury-users] version_array vs. field update performance
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the users mailing list