[m-rev.] for review: string switches using tries

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue Feb 24 16:19:06 AEDT 2015



On Tue, 24 Feb 2015 11:02:00 +0800, Sebastian Godelet <sebastian.godelet at outlook.com> wrote:
> I have a question of how using tries for implementing string switches
> compares speed and complexity wise with using a switch on the hash codes
> of the case constants followed by a if else chain using string compare,
> like it is done in Java 8 and C# as far as I recall. Cheers, Sebastian.

I am also interested, but I don't have a definitive answer, and I don't have
a way of getting one by myself.

Theoretically, the comparison between the performance characteristics
of tries and hash tables with open addressing stack up like this:

- Trie switches examine each code unit in the value of the switched-on variable
   exactly once, while hash switches examine each code unit two or times:
   once to calculate the hash value, once to confirm the match, and may examine
   an initial prefix of code units zero or more times between those if the
   selected hash slot has a collision.

- On the other hand, each trie node is a jump that is reasonably likely
   to be hard for the CPU to predict.

The first consideration argues in favour of tries being faster, the second
in favour of hash switches being faster. Which consideration wins in practice
depends on (a) the relative raw costs of things such as mispredicted jumps,
which depend on the hardware microarchitecture, and (b) the relative frequency
with which those costs are incurred, which depends on the set of strings
being selected among and the frequency distribution of the switched-on
key strings, which in turn depend on the Mercury program. 

I don't have access to any real workloads in which switches on strings
are a nontrivial part of the task, so I am missing real data for part (b).
When other developers have had a chance to try out this diff and have found
that it works for them, I was planning to ask for volunteer performance
evaluators on the mercury-users list. I think it likely that Michael Day
would be interested.

Zoltan.




More information about the reviews mailing list