[m-rev.] for review: fix and speed up trie string switches
Julien Fischer
jfischer at opturion.com
Sun Mar 31 02:56:59 AEDT 2024
On Sat, 30 Mar 2024, Zoltan Somogyi wrote:
> Fix trie string switches, speed them up, ...
>
> ... and turn them back on.
>
> compiler/c_util.m:
> Provide version of output_quoted_string_c which accepts strings
*a* version
> that may not be well formed, and prints out any code_units that
> do not belong to a well-formed UTF-8 code point using an octal
> escape sequence.
>
> Ask for some predicates with trivial bodies to be inlined.
>
> compiler/llds_out_data.m:
> Use the new predicate in c_util to output string constants.
> This fixes the bug that could cause trie string switches to fail
> in the presence of strings containing multi-byte code points.
> In such cases, given the string constants that trie switches can generate,
> which contain only part of such a multi-byte code unit sequence,
> llds_out_data.m used to write out a C string constant that contained
> the unicode replacement character instead of those code units.
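For illustration only, here is a minimal C sketch of the escaping behaviour described above. It is not the Mercury code in c_util.m; the names quote_c_string and utf8_seq_len are hypothetical, and the handling of control characters is an assumption. Code units that form a well-formed UTF-8 sequence are copied through, and any other code unit is written as a three-digit octal escape, so a trie prefix that ends in the middle of a multi-byte sequence survives intact.

```c
#include <stdio.h>
#include <stddef.h>

/* Length of the well-formed UTF-8 sequence starting at s (n bytes are
 * available), or 0 if s[0] does not start a well-formed sequence. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned long cp, min;
    size_t len, i;

    if (s[0] < 0x80) return 1;
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (n < len) return 0;          /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong encodings, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    return len;
}

/* Hypothetical helper: write s[0..n) as a C string literal into out.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if the buffer is too small. */
size_t quote_c_string(const unsigned char *s, size_t n, char *out, size_t cap)
{
    size_t i = 0, o = 0;

    if (cap < 3) return 0;
    out[o++] = '"';
    while (i < n) {
        size_t len = utf8_seq_len(s + i, n - i);
        if (o + 6 > cap) return 0;  /* worst case below, closing quote, NUL */
        if (len == 0 || (len == 1 && (s[i] < 0x20 || s[i] == 0x7F))) {
            /* Ill-formed code unit (or control character): octal escape. */
            o += (size_t) sprintf(out + o, "\\%03o", (unsigned) s[i]);
            i++;
        } else if (len == 1 && (s[i] == '"' || s[i] == '\\')) {
            out[o++] = '\\';
            out[o++] = (char) s[i++];
        } else {
            /* Well-formed sequence: copy its code units unchanged. */
            while (len-- > 0) out[o++] = (char) s[i++];
        }
    }
    out[o++] = '"';
    out[o] = '\0';
    return o;
}
```

With this sketch, the two code units 'a', 0xC3 (a prefix that stops halfway through a two-byte sequence) come out as "a\303" rather than as "a" followed by a replacement character.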
>
> compiler/string_switch.m:
> It a trie node has four or more code units that can lead to matches, then
s/It/If/
> use binary search instead of linear search to find out what to do next.
>
> To make this possible, separate the action of generating such search code
> from the action of finding out what to do for each possible code unit
> value.
>
> Generate comments that can be helpful in tracking down bugs such as
> the one described above.
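As a rough picture of the linear versus binary search choice described above, here is a small C sketch. It is only an assumption about the general shape: the real change makes string_switch.m emit LLDS comparison code rather than a table lookup, and struct edge and find_target are hypothetical names. The threshold of four comes from the log message.

```c
#include <stddef.h>

/* Hypothetical: one outgoing edge of a trie node. The edges of a node
 * are assumed to be sorted by code unit value. */
struct edge {
    unsigned char code_unit;    /* next code unit of the string */
    int target;                 /* what to do next, e.g. a child node id */
};

/* Return the target for code unit cu, or -1 if no edge matches. */
int find_target(const struct edge *edges, size_t n, unsigned char cu)
{
    size_t lo = 0, hi = n;

    if (n < 4) {
        /* Few alternatives: a linear scan is cheap enough. */
        for (size_t i = 0; i < n; i++) {
            if (edges[i].code_unit == cu) return edges[i].target;
        }
        return -1;
    }

    /* Four or more alternatives: binary search over the sorted edges. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (edges[mid].code_unit < cu) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    if (lo < n && edges[lo].code_unit == cu) return edges[lo].target;
    return -1;
}
```

Keeping the edge table separate from the search routine mirrors the separation mentioned in the log message: deciding what each code unit leads to is independent of how the lookup is performed.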
>
> compiler/switch_gen.m:
> Allow trie string switches once again.
>
> runtime/mercury_ml_expand_body.h:
> Add an XXX for new behavior that I found the need for while
> debugging this change.
>
> tests/hard_coded/string_switch*.{m,exp}:
> Add some new alternatives to each swith to create nodes
s/swith/switch/
> at which the LLDS code generator will now use binary search.
> Add a query string to test one of these alternatives.
> (All four test cases are identical, apart from their module names.)
>
> Expect the changes for the existing keys caused by the new switch arms,
> as well as the extra outputs for the new query string.
That looks fine otherwise.
Julien.