[m-dev.] For review: hash table implementation
Ralph Becket
rbeck at microsoft.com
Fri Feb 2 01:37:53 AEDT 2001
>From Fergus Henderson on 01/02/2001 13:36:00
> That looks good.
>
> The main worry is how it will interact with uniqueness checking.
Yes and that's going to become more of an issue in general as time
goes on, I reckon.
> > % Insert an item into a hash table; if one is already there then
> > % it is overwritten.
> > %
> > :- func insert(hash_table(K, V), K, V) = hash_table(K, V).
> > :- mode insert(hash_table_di, in, in) = hash_table_uo is det.
>
> That should be spelt `set', for consistence with `map__set'.
> It would be nice to have `det_insert' and `det_update' too.
Done.
> I suggest s/(if any)// and instead spell out a bit more clearly
> exactly what happens in that case:
Done.
> > :- func elem(K, hash_table(K, V)) = V.
> > :- mode elem(in, hash_table_ui) = out is det.
>
> A comment here, like the one in map.m, would make things much
> clearer.
Done, and for 'elem :=', too.
> Hmm, did you consider using a bitmap rather than using maybe(K)?
No, but an excellent idea.
> That would reduce the number of cells allocated quite a bit.
> For a hash table with B buckets and N occupants, it would be four
> cells instead of N + 3 cells.
>
> cells words
> ralph's original ht 1 6
> representation + key array 1 B + 1
> + yes/1 maybes N N
> + value array 1 B + 1
> = N + 3 2B + N + 8
>
> with bitmap ht 1 7
> + key array 1 B + 1
> + bit map 1 B/8 + 1
> + value array 1 B + 1
> = 4 2.125*B + 10
>
> The access cost would probably be reasonable similar.
I'd be surprised to see any difference in access time. Also, let's hope
we get of the order of 32 bits/word rather than 8 :)
I'll implement this and re-present the diff.
> > % We calculate D = H2 \/ 1 (to ensure that D is non-zero and
> > % odd and is therefore coprime to the number of buckets)
>
> It might be better to use (H2 << 1) \/ 1; I imagine that for typical
> hash functions the lower bits of H2 will be more random that the upper
ones.
Good idea - done. I've used (H2 + H2 + 1) since << is more heavyweight
than is required.
> > :- pred find_slot_0(hash_table(K, V), K, int, int, int, bool).
> > :- mode find_slot_0(hash_table_ui, in, in, in, out, out) is det.
>
> Don't put a ------- line between things like `find_slot' and
> `find_slot_0'. Conceptually these are a unit.
>
> Also, `find_slot_0' should be named `find_slot_2',
> according to our coding guidelines, and for consistency
> with the code elsewhere.
Consider that (and my coding style) changed.
New diff coming any time now...
--
Ralph Becket | MSR Cambridge | rbeck at microsoft.com
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to: mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions: mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------
More information about the developers
mailing list