[m-rev.] for review: improve hash tables

Peter Wang novalazy at gmail.com
Mon Jul 19 17:03:49 AEST 2010


On 2010-07-19, Julien Fischer <juliensf at csse.unimelb.edu.au> wrote:
> 
> On Mon, 19 Jul 2010, Peter Wang wrote:
> 
> >[This should be safe for 10.04 as well]
> 
> Ok, although it's now part of 10.04.1.
> 
> >Branches: main
> >
> >Speed up hash tables using a "fat" list representation for bucket chains,
> >instead of assoc_list.  The benefits are less type_info constructions (an
> >assoc_list requires a type_info for list and a type_info for pair) and,
> >likely, better locality.  This change speeds up hash_table_test.m by 25%,
> >on an x86-64 machine.
> 
> How did you test for that?

Test for what?

> >set(!.HT, K, V) = !:HT :-
> >    H = find_slot(!.HT, K),
> >-    AL0 = !.HT ^ buckets ^ elem(H),
> >-    ( if assoc_list.remove(AL0, K, _, AL1) then
> >-        AL = [K - V | AL1],
> >+    Buckets0 = !.HT ^ buckets,
> >+    array.unsafe_lookup(Buckets0, H, AL0),
> 
> As a matter of style, I think that calls to unsafe operations should
> have a comment explaining *why* there is safe**.  (In this module, that's
> probably quite a few comments, so perhaps just a comment at the head of
> the implementation section explaining why all the calls to usnafe ops
> are safe.)
> 
> ** or in the (hopefully) rare circumstances where it is unsafe, giving
> a rationale for their use.
> 

Committed with these changes.

diff --git a/NEWS b/NEWS
index 831b662..74bdee7 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,8 @@ Changes to the Mercury standard library:
   map_keys_only/3 map_values_only/3 and map_values/3.  They complement the
   functions of the same name.
 
+* We have improved the performance of hash tables and version hash tables.
+
 NEWS for Mercury 10.04
 ----------------------
 
diff --git a/library/hash_table.m b/library/hash_table.m
index e535123..699b3da 100644
--- a/library/hash_table.m
+++ b/library/hash_table.m
@@ -228,6 +228,13 @@
 
 :- implementation.
 
+    % We use a custom association list representation to reduce type_info
+    % allocations (assoc_list requires a type_info for list, and one for pair),
+    % and better locality.
+    %
+    % Array bounds checks may be omitted in this module because the array
+    % indices are computed by: hash(Key) mod size(Array)
+
 :- type buckets(K, V) == array(hash_table_alist(K, V)).
 
 :- type hash_table_alist(K, V)

Peter
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list