[m-rev.] for review: improve hash tables

Julien Fischer juliensf at csse.unimelb.edu.au
Mon Jul 19 17:12:51 AEST 2010


On Mon, 19 Jul 2010, Peter Wang wrote:

> On 2010-07-19, Julien Fischer <juliensf at csse.unimelb.edu.au> wrote:
>>
>> On Mon, 19 Jul 2010, Peter Wang wrote:
>>
>>> [This should be safe for 10.04 as well]
>>
>> Ok, although it's now part of 10.04.1.
>>
>>> Branches: main
>>>
>>> Speed up hash tables using a "fat" list representation for bucket chains,
>>> instead of assoc_list.  The benefits are less type_info constructions (an
>>> assoc_list requires a type_info for list and a type_info for pair) and,
>>> likely, better locality.  This change speeds up hash_table_test.m by 25%,
>>> on an x86-64 machine.
>>
>> How did you test for that?
>
> Test for what?

Test that the speedup was 25%.  (I guess I actually mean how did you measure that?)

>>> set(!.HT, K, V) = !:HT :-
>>>    H = find_slot(!.HT, K),
>>> -    AL0 = !.HT ^ buckets ^ elem(H),
>>> -    ( if assoc_list.remove(AL0, K, _, AL1) then
>>> -        AL = [K - V | AL1],
>>> +    Buckets0 = !.HT ^ buckets,
>>> +    array.unsafe_lookup(Buckets0, H, AL0),
>>
>> As a matter of style, I think that calls to unsafe operations should
>> have a comment explaining *why* there is safe**.  (In this module, that's
>> probably quite a few comments, so perhaps just a comment at the head of
>> the implementation section explaining why all the calls to usnafe ops
>> are safe.)
>>
>> ** or in the (hopefully) rare circumstances where it is unsafe, giving
>> a rationale for their use.
>>
>
> Committed with these changes.
>
> diff --git a/NEWS b/NEWS
> index 831b662..74bdee7 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -7,6 +7,8 @@ Changes to the Mercury standard library:
>   map_keys_only/3 map_values_only/3 and map_values/3.  They complement the
>   functions of the same name.
>
> +* We have improved the performance of hash tables and version hash tables.
> +
> NEWS for Mercury 10.04
> ----------------------
>
> diff --git a/library/hash_table.m b/library/hash_table.m
> index e535123..699b3da 100644
> --- a/library/hash_table.m
> +++ b/library/hash_table.m
> @@ -228,6 +228,13 @@
>
> :- implementation.
>
> +    % We use a custom association list representation to reduce type_info
> +    % allocations (assoc_list requires a type_info for list, and one for pair),
> +    % and better locality.
> +    %
> +    % Array bounds checks may be omitted in this module because the array
> +    % indices are computed by: hash(Key) mod size(Array)
> +
> :- type buckets(K, V) == array(hash_table_alist(K, V)).
>
> :- type hash_table_alist(K, V)

That's fine - thanks for that.

Cheers,
Julien.
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list