[m-rev.] for review: improve hash tables

Julien Fischer juliensf at csse.unimelb.edu.au
Mon Jul 19 15:31:45 AEST 2010


On Mon, 19 Jul 2010, Peter Wang wrote:

> [This should be safe for 10.04 as well]

Ok, although it's now part of 10.04.1.

> Branches: main
>
> Speed up hash tables using a "fat" list representation for bucket chains,
> instead of assoc_list.  The benefits are less type_info constructions (an
> assoc_list requires a type_info for list and a type_info for pair) and,
> likely, better locality.  This change speeds up hash_table_test.m by 25%,
> on an x86-64 machine.

How did you test for that?

> For non-version hash tables, also use unsafe array operations to avoid
> bounds checks and the remaining type_info constructions.  This improves on
> the previous change by another 20%.
>
> library/hash_table.m:
>        Use fat list representation.
>
>        Use unsafe array operations.
>
> library/version_hash_table.m:
>        Use fat list representation.

You should announce the improvements in the NEWS file.

> diff --git a/library/hash_table.m b/library/hash_table.m
> index 7989291..e535123 100644
> --- a/library/hash_table.m
> +++ b/library/hash_table.m
> @@ -223,11 +223,17 @@
>                 num_occupants           :: int,
>                 max_occupants           :: int,
>                 hash_pred               :: hash_pred(K),
> -                buckets                 :: array(assoc_list(K, V))
> +                buckets                 :: array(hash_table_alist(K, V))
>             ).
>
> :- implementation.
>
> +:- type buckets(K, V) == array(hash_table_alist(K, V)).
> +
> +:- type hash_table_alist(K, V)
> +    --->    ht_nil
> +    ;       ht_cons(K, V, hash_table_alist(K, V)).
> +

I suggest adding a comment there explaining why you use that type rather
than assoc_list.

> set(!.HT, K, V) = !:HT :-
>     H = find_slot(!.HT, K),
> -    AL0 = !.HT ^ buckets ^ elem(H),
> -    ( if assoc_list.remove(AL0, K, _, AL1) then
> -        AL = [K - V | AL1],
> +    Buckets0 = !.HT ^ buckets,
> +    array.unsafe_lookup(Buckets0, H, AL0),

As a matter of style, I think that calls to unsafe operations should
have a comment explaining *why* there is safe**.  (In this module, that's
probably quite a few comments, so perhaps just a comment at the head of
the implementation section explaining why all the calls to usnafe ops
are safe.)

** or in the (hopefully) rare circumstances where it is unsafe, giving
a rationale for their use.

...

That looks fine.

Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list