[m-dev.] tail call to loop optimisation for low-level grades
Peter Wang
novalazy at gmail.com
Tue Jul 1 17:35:42 AEST 2008
On 2008-07-01, Zoltan Somogyi <zs at csse.unimelb.edu.au> wrote:
> On 27-Jun-2008, Peter Wang <novalazy at gmail.com> wrote:
> > I was looking at why the asm_fast.gc implementation of string.hash is
> > about 85% slower than the C version. The loop looks like this,
> > after some cleanup:
> >
> > MR_def_static(shash__unchecked_hash_2_5_0)
> >     if (MR_r2 >= MR_r3) {
> >         MR_GOTO_LAB(shash__unchecked_hash_2_5_0_i2);
> >     }
> >     {
> >     MR_String Str = (MR_String) MR_r1;
> >     MR_Word MR_tempr1 = Str[MR_r2];
> >
> >     MR_r2 = MR_r2 + 1;
> >     MR_r4 = (MR_r4 ^ (MR_r4 << 5)) ^ MR_tempr1;
> >     MR_np_localtailcall(shash__unchecked_hash_2_5_0);
> >     }
>
> Your cleanup shows the C << operator, whereas the Mercury source in string.m
> uses the Mercury << operator, which does bounds checks. Did you replace
> << with unchecked_left_shift in your copy of string.m?

Yes. That's what it should be if string.hash is to match the
MR_hash_string() macro.
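
For comparison, the quoted loop body can be written as ordinary C. This is
only a rough sketch of the update step shown in the generated code
(accumulator in MR_r4, index in MR_r2, length in MR_r3); it is not the
MR_hash_string() macro itself. The function name, the zero starting value and
the cast to unsigned char are assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the Mercury runtime.
 * Applies the same per-character update as the quoted loop:
 *     hash = (hash ^ (hash << 5)) ^ next_char
 * The C << used here performs no bounds check on the shift count,
 * which is why the Mercury side needs unchecked_left_shift to match.
 */
static unsigned long sketch_hash(const char *str, size_t len)
{
    unsigned long hash = 0;  /* assumption: accumulator starts at zero */
    size_t i;

    for (i = 0; i < len; i++) {
        /* assumption: characters are treated as unsigned values */
        hash = (hash ^ (hash << 5)) ^ (unsigned char) str[i];
    }
    return hash;
}
```

Calling sketch_hash("ab", 2) walks the loop twice: the accumulator is 97
after 'a' and 3107 after 'b'.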

BTW, for this example, it turns out that the slowdown is because r4 is a
fake register. If you just exchange r1 and r4 before entering the loop,
you get the same (or similar, I forget) speedup. So maybe that is a
better way to go?
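
To illustrate the point about fake registers, here is a hedged sketch in plain
C. It assumes that a "fake" register is one that lives in a memory array
rather than in a machine register, and it models that with a volatile array so
the compiler must load and store the slot on every iteration. The array, the
slot number and both function names are hypothetical; this is not the Mercury
runtime's register mapping, only a picture of why moving the accumulator
before entering the loop can help.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory-resident register bank. */
static volatile unsigned long sketch_fake_reg[8];

/* Accumulator stays in the memory-backed slot for the whole loop,
 * so every iteration reads it from and writes it back to memory. */
static unsigned long hash_acc_in_memory(const char *str, size_t len)
{
    size_t i;

    sketch_fake_reg[4] = 0;
    for (i = 0; i < len; i++) {
        unsigned long h = sketch_fake_reg[4];            /* load from memory */
        sketch_fake_reg[4] = (h ^ (h << 5)) ^ (unsigned char) str[i];
    }
    return sketch_fake_reg[4];
}

/* Accumulator is moved into a local before entering the loop, which
 * the compiler is free to keep in a machine register, and is stored
 * back to the memory-backed slot once at the end. */
static unsigned long hash_acc_in_local(const char *str, size_t len)
{
    unsigned long acc = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        acc = (acc ^ (acc << 5)) ^ (unsigned char) str[i];
    }
    sketch_fake_reg[4] = acc;                            /* single store */
    return acc;
}
```

Both functions compute the same value; only the amount of memory traffic
inside the loop differs. Whether the difference is measurable depends on the
compiler and the machine.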

Peter