[m-users.] Returning a predicate from a function

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue Dec 27 00:50:16 AEDT 2022


2022-12-26 20:47 GMT+11:00 "Mark Clements" <mark.clements at ki.se>:
> As a follow-up, I was interested in the speed and memory requirements for lists compared with nondet predicates. I set up an experiment to sum the integers between 1 and 10 million (yes, I know that there is a closed-form solution :). The code is here: https://gist.github.com/mclements/b09eee918c24710f59a2eb3a6a0a7a90. I found:
> 1. foldl using range(1,10000000) took the least time and used the least maximum resident memory
> 2. unsorted_aggregate with promise_equivalent_solutions took 3 x the least time and used similar memory
> 3. foldl using 1..10000000 took 6 x the least time and 5 x the least memory
> 4. foldl using unsorted_solutions and promise_equivalent_solutions took 8 x the least time and 5 x the least memory
> 5. aggregate took 150 x the least time and 13 x the least memory.

Nondet code is inherently slower than det code, because

- the stack frames of nondet predicates are much bigger; and
- these stack frames must be kept around significantly longer.

Both of those things reduce the effectiveness of data caches.
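To make that concrete, here is a minimal sketch of the two styles being compared
(it is not the code from your gist; the module name, the nums/1 generator and
the accumulator lambdas are made up for illustration). The det fold runs
entirely on ordinary det stack frames, while the nondet generator's stack frame
has to stay live on the nondet stack until its last solution has been produced:

:- module sum_sketch.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is cc_multi.
:- implementation.
:- import_module int.
:- import_module solutions.

    % A nondet generator of the integers 1 .. 10000000. Its stack frame
    % stays on the nondet stack until its last solution has been produced.
:- pred nums(int::out) is nondet.
nums(N) :-
    int.nondet_int_in_range(1, 10000000, N).

main(!IO) :-
    % Det version: an ordinary fold, using only det stack frames.
    int.fold_up(
        (pred(I::in, A0::in, A::out) is det :- A = A0 + I),
        1, 10000000, 0, DetSum),
    % Nondet version: each solution of nums/1 is passed to the
    % accumulator as soon as it is produced.
    unsorted_aggregate(nums,
        (pred(I::in, A0::in, A::out) is det :- A = A0 + I),
        0, NondetSum),
    io.write_int(DetSum, !IO),
    io.nl(!IO),
    io.write_int(NondetSum, !IO),
    io.nl(!IO).

Declaring main as cc_multi is one way of committing to whichever solution order
unsorted_aggregate happens to use; wrapping the call in a
promise_equivalent_solutions scope, as in your test, is another.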

> Note that aggregate creates a sorted list and then uses foldl. Sorting a large (already sorted) list is apparently very slow.

Sorting an already sorted list is very fast, but that is irrelevant here, because
aggregate does NOT build an unsorted list and then sort it. Instead, it adds
each solution to the current set of solutions as that solution is computed. This has
better worst-case behavior, in that a predicate that returns the same solution,
say, a million times would chug along with a set of size 1, while the
build-a-list-then-sort-it approach would probably run out of memory before
getting to the sorting part.
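
In case that description is too abstract, here is a rough conceptual sketch of
that add-one-solution step, with the set represented as a sorted list without
duplicates. It is NOT the library's actual code; add_solution/3 is a made-up
name, specialised to int, and it needs list imported:

    % Add one new solution X to the current set of solutions, where the
    % set is a sorted list without duplicates. The cost of each call is
    % proportional to the position of X in the set.
:- pred add_solution(int::in, list(int)::in, list(int)::out) is det.

add_solution(X, [], [X]).
add_solution(X, [H | T], Set) :-
    compare(R, X, H),
    (
        R = (<),
        Set = [X, H | T]
    ;
        R = (=),
        % A duplicate solution leaves the set unchanged, which is why
        % the same-solution-a-million-times case stays at size 1.
        Set = [H | T]
    ;
        R = (>),
        add_solution(X, T, Set0),
        Set = [H | Set0]
    ).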

It is this repeated adding to the set, with the set represented by a
sorted list without duplicates, that is responsible for your algorithm being
"very slow". It could be improved by e.g. switching to a set representation
whose operations cost O(log N), but there is no point, because no possible
improvement in the implementation of the solutions or aggregate predicates
will beat the closed-form solution, and, as you discovered, none will beat
an approach using unsorted_aggregate with a purpose-specific aggregation
operation.
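
(The unsorted_aggregate call in the sketch near the start of this message is
exactly such a purpose-specific aggregation: it adds each solution into a
running sum instead of into a set.) The O(log N)-per-operation set
representation mentioned above could look roughly like the sketch below, e.g.
using set_tree234 from the standard library; sum_via_tree_set/1 is a made-up
name, and it assumes the nums/1 generator from the earlier sketch plus an
import of set_tree234:

    % Build the set of solutions with set_tree234, whose insertion is
    % O(log N), instead of with a sorted list, then sum its elements.
:- pred sum_via_tree_set(int::out) is cc_multi.
sum_via_tree_set(Sum) :-
    unsorted_aggregate(nums,
        (pred(I::in, S0::in, S::out) is det :-
            set_tree234.insert(I, S0, S)),
        set_tree234.init, SolutionSet),
    set_tree234.fold(
        (pred(I::in, A0::in, A::out) is det :- A = A0 + I),
        SolutionSet, 0, Sum).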

> I realise that this experiment may not generalise to most Mercury programs.

It does not. Your test does not resemble the workload of typical Mercury programs,
because it does way less computation per memory access, and its structure is much
more uniform.

Zoltan.

