[m-users.] A more precise determinism checker

Sean Charles (emacstheviking) objitsu at gmail.com
Sun Mar 20 22:26:19 AEDT 2022


This has been a most interesting thread of conversation indeed.

As a relative newcomer to the Mercury language (almost 2 years of daily use now), I initially found the determinism / mode / inst compiler messages the most frustrating part of the system but I have come to understand what they mean, what the compiler is trying to do and, running on a ten year old iMac I truly appreciate the sheer speed of the compiler to not only do its determinism analysis but also the conversion of the internal form to C (my chosen output grade) and to build the executable.

Having read the arguments for and against, for me, as a novice user, I don't think the current compiler could really add more than it has to offer without slowing down its overall execution time. I chose Mercury because I had heard it was designed for 'large applications' that needed to be robust. It hasn't let me down.

On another slightly tangential note, your comment Zoltan,

The "code" in this case consists mainly of a set of machine learning systems
that were not written by humans at all, so you can't understand it by asking its
authors.

...this beautifully sums up why I believe that
	(a) there will always be the need for human developers and
	(b) I will never trust any code I can't see, examine and --understand--.

And I am with Roger Penrose who says that until we understand what we mean by 'understand', that building a true fully-sentient artificial intelligence isn't something achievable. Yet. Maybe? Who knows?!

Sean.

> On 19 Mar 2022, at 17:07, Zoltan Somogyi <zoltan.somogyi at runbox.com> wrote:
> 
> 
> 2022-03-20 03:12 GMT+11:00 "Philip White" <philip at pswhite.org>:
>> I've been working in my spare time on a compiler for a relational
>> programming language heavily inspired by mercury - it has strong and
>> statically checked types, modes, and determinism.
> 
> That is nice to hear.
> 
>> The determinism checking approach of this language is totally different
>> than Mercury's approach, and I believe it yields much more precise
>> results, albeit via a more expensive algorithm. I'm curious if the
>> compiler writers ever considered something like this. If not, would you
>> every consider using it in the current mercury compiler?
> 
> No, and no.
> 
> We the implementors of the Mercury compiler consider that compilation
> speed is very important in making a language practical. If, after changing
> the code of a module, the process of rebuilding the executable takes long enough
> for you to lose your focus, your productivity will suffer, and we don't want that.
> 
> The Mercury determinism analysis algorithm was purposely designed to be simple.
> Its simplicity does not only make it fast, it also makes it *easily understandable*.
> From time to time, we get proposals to extend it, e.g. by teaching it that if the
> type of X contains only the function symbols f, g and h, then in the else case
> of an if-then-else of the form "( if X = f then ... else ...)", X can only be g or h,
> so a switch on X with only those two arms should be considered "det".
> We have always turned such proposals down, not because of the impact
> on compiler speed (which would probably be small), but because it would make
> the rules of determinism analysis both harder to learn and harder to predict
> even once learned.
> 
>> Predicates written in this language go through a mode-resolution phase
>> in which clauses in a conjunction are re-ordered so that variables are
>> properly instantiated for each predicate that needs to be invoked; this
>> phase also selects the mode to use at each invocation. It's after this
>> phase that I run the determinism checker. I imagine the Mercury
>> compiler does something pretty similar.
> 
> Yes, it does. You can see it yourself by invoking mmc with the -v (verbose)
> option, which tells the compiler to print a line for each phase of compilation.
> You will see that type and mode analysis are followed by two passes
> that gather information for determinism analysis, and then determinism
> analysis itself.
> 
>> Naively, we "execute" the predicate with each combination of
>> constructors that each variable can have. This guarantees that we will
>> hit every possible code-path of the predicate. The result of each
>> execution is a determinism. For example, if the predicate finishes
>> twice, then the result of that execution is "multi". The final
>> determinism of the predicate is the join of all the execution results
>> (i.e. the join operation of the determinism lattice). For example, if
>> one path yields "multi" and another yields "failure", then the final
>> determinism is "nondet".
>> 
>> This naive approach would be prohibitively expensive for predicates
>> with lots variables, especially if those variables have types with many
>> constructors.
> 
> And some types, such as integers and floats, have billions of constructors,
> and strings even more.
> 
>> We can improve it by initially treating every variable as
>> abstract; every time the predicate attempts to pattern match a
>> variable, we restart execution once for each of that variable's
>> constructors. The algorithm is roughly the same, but we only end up
>> running variable combinations on-demand, so we save a lot of work
>> for variables that are merely moved around and never inspected.
> 
> This approach multiplies the amount of work you have to do by *many*
> orders of magnitude. Using techniques from areas such as model checking
> or concolic testing, you may be able to get most of those orders of magnitude
> back, but you would have to do a *lot* of work to achieve even 1% of the
> speed of the Mercury determinism analysis algorithm.
> 
>> There are a few details that I've glossed over, but this is the general
>> idea. It's definitely a more expensive algorithm than something that is
>> more syntax directed, but I think it will notice switch statements more
>> often than Mercury does, and it will also determine that a set of
>> variables is exhaustive more often than the Mercury compiler. (I make
>> this claim mostly based on my understanding of the Mercury compiler
>> from the documentation, rather than from experience; I bet I can
>> furnish examples if you would like to see any, though)
> 
> I believe the claim that you could get more precise determinisms
> than the Mercury compiler. I don't believe that the effort will be worth it,
> for a simple reason: the current Mercury determinism analysis algorithm,
> simple as it is, is *already* smart enough in practice. It is very rare for me
> to get an error message from the compiler about a determinism error
> where that error is caused by the compiler not being smart enough to see
> the actual determinism of the code. In virtually all cases, when the error
> message says that the inferred determinism of this predicate is e.g. semidet,
> then that will be the actual determinism of the code, because of a bug
> in the actual code, such as a missing case in a switch. Very rarely, the cause
> is a bug in the mode declaration; I forgot to tell the compiler that e.g.
> this predicate expects its caller to specify a value for an argument
> that makes unreachable the "missing" case in that switch. Having the cause of the
> error message be a limitation in determinism analysis is even rarer,
> and the only such limitation that I bump into more than about once a year
> is that switch detection does not look into disjunctions deep enough.
> That is a problem so rare you probably don't even know what I am talking about :-(
> 
>> I'd love to hear what people think of this approach. I haven't yet
>> tested it at large scale with huge predicates, so my biggest concern is
>> that it will be really, really slow.
> 
> I can tell you right away that I don't believe anyone can make your approach
> fast enough for people to accept, simply because the threshold of how long
> people are willing to wait for compilation is so low nowadays. Look at the
> Rust community, which is struggling to make the speed of the Rust compiler acceptable,
> despite the fact that they are using algorithms with much lower complexities
> than what you are proposing.
> 
> Sorry to be the bearer of bad news, but you did ask ...
> 
> Zoltan.
> _______________________________________________
> users mailing list
> users at lists.mercurylang.org
> https://lists.mercurylang.org/listinfo/users

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mercurylang.org/archives/users/attachments/20220320/3727825f/attachment-0001.html>


More information about the users mailing list