[m-dev.] Re: tabled_for_io interacts badly with c code which contains labels

Zoltan Somogyi zs at cs.mu.OZ.AU
Tue Aug 9 15:31:36 AEST 2005

On 09-Aug-2005, Ian MacLarty <maclarty at cs.mu.OZ.AU> wrote:
> What about changing the transformation to something like the following?
> p(A, B, S0, S) :-
>   (if
>           % Get the global I/O table, the global I/O
>           % counter, and the starting point for tabling
>           % I/O actions, if we are in the tabled range.
>       table_io_in_range(T0, Counter, Start)
>   then
>           % Look up the input arguments.
>       impure table_lookup_insert_start_int(T0, Counter,
>           Start, T),
>       (if
>           semipure table_io_has_occurred(T)
>       then
>           semipure table_memo_get_answer_block(T, Block),
>           impure table_restore_string_answer(Block, 0, B),
>           RunOrigCode = no,
>           SaveAnswers = no
>       else
>           RunOrigCode = yes,
>           SaveAnswers = yes
>       )
>   else
>       RunOrigCode = yes,
>       SaveAnswers = no
>   ),
>   (
>       RunOrigCode = yes,
>       <original code>,
>       (
>           SaveAnswers = yes,
>           % Save the answers in the table.
>           impure table_io_create_answer_block(T, 1, Block),
>           impure table_save_string_answer(Block, 0, B)
>       ;
>           SaveAnswers = no
>       )
>   ;
>       RunOrigCode = no,
>       table_io_copy_io_state(S0, S)
>   ).

That code is not mode correct: B is produced in the first conjunct
(the big if-then-else) if RunOrigCode = no and in the second conjunct
(the switch on RunOrigCode) if RunOrigCode = yes. I guess you could
make it mode correct by moving the table_memo_get_answer_block and
table_restore_string_answer goals to the RunOrigCode = no switch arm,
but that would require computing a dummy value for T, etc.

> Surely not copying the code would be more cache efficient?

(1) The size of the copied code is negligible.

(2) Your code has the overhead of an extra switch, which with the current
code generator requires several copies between abstract machine registers,
which on x86 are in main memory.

(3) Only one copy of the code is ever executed (and thus needs to be
in the instruction cache) during normal execution. After a retry, the
overhead of MR_trace is a far bigger concern than the cache effects
of copied code.

Overall, I think your solution is a performance loss for pretty much all
procedures during normal execution. I think it is far simpler to tell people
"if the code you want to put in an I/O primitive has a label, put the code
in a separate function and call the function". That way, any performance
effects are felt only by that predicate, not all predicates.
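
To make that advice concrete, here is a minimal sketch in plain C. The
names read_first_line and io_primitive_body are hypothetical, made up for
illustration; they are not part of Mercury or its runtime. The assumption
is that the trouble comes from the primitive's C code being copied within
a single generated function, so that any label it defines ends up defined
twice. Labels in C have function scope, so moving the labelled code into
its own function leaves nothing in the primitive's body to clash.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: all the code that uses labels lives here, in an
 * ordinary C function, so the labels `close_file' and `done' are scoped
 * to this function alone.
 * Returns 0 on success and -1 on failure; on failure buf is left empty.
 */
int
read_first_line(const char *path, char *buf, size_t buf_size)
{
    FILE *fp;
    int result = -1;

    assert(buf != NULL && buf_size > 0);
    buf[0] = '\0';

    fp = fopen(path, "r");
    if (fp == NULL) {
        goto done;
    }
    if (fgets(buf, (int) buf_size, fp) == NULL) {
        goto close_file;
    }
    /* Strip the trailing newline, if any. */
    buf[strcspn(buf, "\n")] = '\0';
    result = 0;

close_file:
    fclose(fp);
done:
    return result;
}

/*
 * Stand-in for the body of the I/O primitive. The idea is that the
 * foreign code attached to the predicate would contain only this call,
 * so copying that body copies no labels.
 */
int
io_primitive_body(const char *path, char *buf, size_t buf_size)
{
    return read_first_line(path, buf, buf_size);
}
```

The cost is one extra C function call, and it is paid only by the
primitive that needs the label, which is the trade-off argued for above.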
