[m-rev.] HLC backend speedup for N-queens

fabrice nicol fabrnicol at gmail.com
Mon Jan 18 20:09:53 AEDT 2021

Previous message: [m-rev.] diff: use global registers on aarch64
Next message: [m-rev.] HLC backend speedup for N-queens
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Below is a benchmark of N-queens runs for N=14 to N=16, using the 
Mercury repository test file cqueens.m, slightly modified using an extra 
`solutions' predicate added to `main' and cutting down on output.

Mercury does a good job, somewhere in-between two fairly naive C++ 
implementations, and twice as fast as the closest Prolog compiler.

The C benchmark is only given as incidental reference, since its 
algorithm is heavily optimized using geometrical properties and 
low-level techniques.

What struck me particularly is that the HLC backend now outperforms the 
LLC one significantly in this test case, which is pleasantly unexpected. 
Further, the speedup increases with N (Table 2).

*Table 1   Prolog/C++/Mercury  Benchmark for N-queens*

*N                    14        15  16 *
    ------------------------------------------
    Yap 6.4 interpreter [1] 265
    SWI interpreter [1]     185    1225
    C++ naive  [2]           31.8   221    1608
    GNU-Prolog 1.4.5* [1]    24.8   165    1192
    Mercury asm_fast.gc      10.8    93     732
    Mercury hlc.gc            9.5    80     615
    C++ naive  [3]            4.3    28.5   190
    C optimized [4]           0.05    0.25    1.6
    -------------------------------------------
    Intel Core-i7 5820K CPU @ 3.30GHz, 78 GiB RAM.
    Ubuntu 20.10, gcc/g++ 10.2.0. Mercury 20.06.1 built from source.
    *Used as a compiler. Real execution times averaged over 5 runs.

    [1] https://github.com/SWI-Prolog/bench/blob/master/queens_8.pl 
generalized to N queens
    [2] https://gist.github.com/iyedb/9300013
    [3] 
https://stackoverflow.com/questions/20024429/n-queens-solutions-c, 
top-rated reply
    [4] http://www.ic-net.or.jp/home/takaken/e/queen/


*Table 2   HLC to LLC speedup for N-queens*

*    N                       14     15    16*
     ----------------------------------------

     speedup                 12%    14%   16%

However, optimization options (-O1 to -O6) had no statistically 
significant effect on either Mercury grades, performance-wise.

If these results proved essentially correct for other common test cases, 
it might be relevant to switch the default Unix compilation grade from 
asm_fast.gc (as is currently the case) to hlc.gc.

Fabrice Nicol

Disclaimer: I'm a statistician with some experience in programming, not 
a computer scientist.

Previous message: [m-rev.] diff: use global registers on aarch64
Next message: [m-rev.] HLC backend speedup for N-queens
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the reviews mailing list