[m-dev.] hash tables in standard library

Jonathan Morgan jonmmorgan at gmail.com
Fri Aug 15 16:02:40 AEST 2008

On Fri, Aug 15, 2008 at 3:51 PM, Ralph Becket <rafe at csse.unimelb.edu.au> wrote:
> Zoltan Somogyi, Friday, 15 August 2008:
>> On 15-Aug-2008, Ralph Becket <rafe at csse.unimelb.edu.au> wrote:
>> > My understanding is that Python uses hash tables as method tables, from
>> > which you never delete anything.
>> That is one limitation of open addressing. The other is that as the
>> hash table gets closer and closer to full, the performance falls off a cliff.
>> With chaining, there is no such effect, and a table with N slots can in fact
>> contain more than N items. As the ratio between the number of items
>> and the number of slots increases beyond one, the performance comes to
>> be dominated more and more by the linear search of the chains, but
>> even this can be fixed by using expandable hash tables (as in e.g.
>> linear hashing).
> I guess one could use a map rather than a list for each bucket in order
> to limit worst case behaviour.

Or possibly (depending on how you implement it) just resize the hash
table.  There will be different tradeoffs for different
implementations, and I'm sure all of these ideas have been tried
before (I know in 253 using a balanced tree or some other log n search
mechanism was suggested as one way to prevent linear searches).  I'm
not sure what it is best for a standard library to do - I just know if
I were designing a generic hash table I would use separate chaining,
because it seems more logical than open addressing to me.  The Python
implementers obviously think differently, and their dictionary is
pretty fast and has been very carefully tuned to suit their usage

mercury-developers mailing list
Post messages to:       mercury-developers at csse.unimelb.edu.au
Administrative Queries: owner-mercury-developers at csse.unimelb.edu.au
Subscriptions:          mercury-developers-request at csse.unimelb.edu.au

More information about the developers mailing list