[m-dev.] MLDS back-end performance: a preliminary benchmark
Fergus Henderson
fjh at cs.mu.OZ.AU
Thu Nov 11 06:03:39 AEDT 1999
On 09-Nov-1999, Fergus Henderson <fjh at cs.mu.OZ.AU> wrote:
> I ran the N-queens benchmark (for N = 10,
> and finding the first solution only).
The previous benchmark ran too quickly for the times to be reliable.
So I tried again, this time finding all solutions.
This was also after I had implemented tail recursion for the MLDS
back-end. The MLDS back-end beats the asm_fast.gc back-end by 37%
on both user time and elapsed time. So it looks like that result
is going to hold up.
On the other hand, for `crypt', the MLDS back-end loses by 28%
at -O6 (but wins by 41% at -O2). For `poly', the MLDS back-end
loses by 4% at -O6 (but wins by 24% at -O2). Note that I used
`--cflags -O3' for the MLDS back-end throughout, so maybe that
explains why the MLDS back-end often wins at -O2.
For `deriv', the MLDS back-end wins by about 15%.
Interestingly, the MLDS back-end runs faster at `-O2'
than at `-O6' for this benchmark.
For `tak', with `-O2' the times are basically a wash -- the asm_fast.gc
back-end wins by a whisker (0.5%). But with `-O6' the MLDS back-end
slows down, and the LLDS back-end speeds up;
the LLDS back-end at -O6 beats the MLDS back-end at -O2 by 11%.
Since `-O6' often slows the MLDS back-end code down,
it seems that some of the front-end optimizations which
are good for the LLDS back-end are not good for the MLDS back-end.
I haven't tried other optimization levels yet.
But it looks like we'll need to tweak things a bit.
Anyway, what follows is the raw benchmark times,
plus most of the very hacked-together stuff that I used
to run the benchmark.
Cheers,
Fergus.
The following times are on hg
(a 500MHz Pentium III running Red Hat Linux 6.0).
Name              Options
----              -------
foo.hlc.gcc       --high-level-c --gcc-nested-functions \
                  --gcc-local-labels --inline-alloc --cflags -O3
foo.hlc           --high-level-c --inline-alloc --cflags -O3
foo               (none)
foo.hlc.gcc.O6    --high-level-c --gcc-nested-functions \
                  --gcc-local-labels -O6 --cflags -O3
foo.hlc.O6        --high-level-c -O6 --cflags -O3
foo.O6            -O6
User    System Elapsed CPU
time    time   time    usage             Page faults
----    ------ ------- -----             -----------
queens10all.hlc.gcc
29.680u 0.000s 0:29.68 100.0% 0+0k 0+0io 33pf+0w
29.670u 0.000s 0:29.70 99.8% 0+0k 0+0io 33pf+0w
29.660u 0.000s 0:29.82 99.4% 0+0k 0+0io 33pf+0w
29.910u 0.030s 0:30.02 99.7% 0+0k 0+0io 33pf+0w
29.710u 0.000s 0:29.79 99.7% 0+0k 0+0io 33pf+0w
29.670u 0.000s 0:29.68 99.9% 0+0k 0+0io 33pf+0w
29.740u 0.000s 0:29.75 99.9% 0+0k 0+0io 33pf+0w
29.730u 0.000s 0:29.73 100.0% 0+0k 0+0io 33pf+0w
queens10all.hlc.gcc.O6
26.640u 0.020s 0:26.68 99.9% 0+0k 0+0io 33pf+0w
26.680u 0.010s 0:26.69 100.0% 0+0k 0+0io 33pf+0w
26.710u 0.000s 0:26.71 100.0% 0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0% 0+0k 0+0io 33pf+0w
26.660u 0.020s 0:26.69 99.9% 0+0k 0+0io 33pf+0w
26.670u 0.020s 0:26.68 100.0% 0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0% 0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0% 0+0k 0+0io 33pf+0w
queens10all.hlc
8.850u 0.000s 0:08.86 99.8% 0+0k 0+0io 32pf+0w
8.860u 0.000s 0:08.90 99.5% 0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0% 0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0% 0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0% 0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0% 0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.85 99.8% 0+0k 0+0io 32pf+0w
8.860u 0.000s 0:08.87 99.8% 0+0k 0+0io 32pf+0w
queens10all.hlc.O6
8.220u 0.010s 0:08.22 100.1% 0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0% 0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0% 0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0% 0+0k 0+0io 33pf+0w
8.230u 0.000s 0:08.23 100.0% 0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.23 99.8% 0+0k 0+0io 33pf+0w
8.220u 0.010s 0:08.23 100.0% 0+0k 0+0io 33pf+0w
8.240u 0.000s 0:08.23 100.1% 0+0k 0+0io 33pf+0w
queens10all
14.420u 0.050s 0:14.48 99.9% 0+0k 0+0io 53pf+0w
14.510u 0.040s 0:14.56 99.9% 0+0k 0+0io 53pf+0w
14.410u 0.050s 0:14.48 99.8% 0+0k 0+0io 53pf+0w
14.660u 0.080s 0:14.86 99.1% 0+0k 0+0io 53pf+0w
14.440u 0.040s 0:14.48 100.0% 0+0k 0+0io 53pf+0w
14.520u 0.020s 0:14.54 100.0% 0+0k 0+0io 53pf+0w
14.440u 0.030s 0:14.48 99.9% 0+0k 0+0io 53pf+0w
14.520u 0.020s 0:14.55 99.9% 0+0k 0+0io 53pf+0w
queens10all.O6
11.330u 0.030s 0:11.37 99.9% 0+0k 0+0io 53pf+0w
11.260u 0.050s 0:11.31 100.0% 0+0k 0+0io 53pf+0w
11.310u 0.030s 0:11.37 99.7% 0+0k 0+0io 53pf+0w
11.270u 0.040s 0:11.30 100.0% 0+0k 0+0io 53pf+0w
11.330u 0.030s 0:11.36 100.0% 0+0k 0+0io 53pf+0w
11.280u 0.030s 0:11.31 100.0% 0+0k 0+0io 53pf+0w
11.310u 0.050s 0:11.36 100.0% 0+0k 0+0io 53pf+0w
11.280u 0.030s 0:11.31 100.0% 0+0k 0+0io 53pf+0w
crypt.hlc.gcc
1.020u 0.010s 0:01.02 100.9% 0+0k 0+0io 30pf+0w
1.020u 0.000s 0:01.02 100.0% 0+0k 0+0io 30pf+0w
1.030u 0.010s 0:01.03 100.9% 0+0k 0+0io 30pf+0w
1.010u 0.010s 0:01.02 100.0% 0+0k 0+0io 30pf+0w
1.020u 0.010s 0:01.02 100.9% 0+0k 0+0io 30pf+0w
1.020u 0.000s 0:01.02 100.0% 0+0k 0+0io 30pf+0w
1.030u 0.010s 0:01.03 100.9% 0+0k 0+0io 30pf+0w
1.420u 0.000s 0:01.42 100.0% 0+0k 0+0io 30pf+0w
crypt.hlc.gcc.O6
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.68 101.4% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.68 101.4% 0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0% 0+0k 0+0io 28pf+0w
crypt.hlc
0.880u 0.000s 0:00.88 100.0% 0+0k 0+0io 28pf+0w
0.890u 0.000s 0:00.88 101.1% 0+0k 0+0io 28pf+0w
0.890u 0.000s 0:00.88 101.1% 0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0% 0+0k 0+0io 28pf+0w
0.880u 0.010s 0:00.88 101.1% 0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0% 0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0% 0+0k 0+0io 28pf+0w
0.880u 0.010s 0:00.88 101.1% 0+0k 0+0io 28pf+0w
crypt.hlc.O6
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.680u 0.010s 0:00.68 101.4% 0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0% 0+0k 0+0io 30pf+0w
0.690u 0.000s 0:00.68 101.4% 0+0k 0+0io 30pf+0w
crypt
1.200u 0.040s 0:01.24 100.0% 0+0k 0+0io 64pf+0w
1.190u 0.060s 0:01.24 100.8% 0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.24 100.8% 0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.24 100.8% 0+0k 0+0io 59pf+0w
1.190u 0.060s 0:01.24 100.8% 0+0k 0+0io 59pf+0w
1.190u 0.060s 0:01.24 100.8% 0+0k 0+0io 59pf+0w
1.210u 0.040s 0:01.25 100.0% 0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.25 100.0% 0+0k 0+0io 59pf+0w
crypt.O6
0.510u 0.030s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.500u 0.040s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.490u 0.050s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.490u 0.050s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.490u 0.040s 0:00.53 100.0% 0+0k 0+0io 60pf+0w
0.480u 0.060s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.510u 0.030s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
0.500u 0.040s 0:00.53 101.8% 0+0k 0+0io 60pf+0w
deriv.hlc.gcc
3.600u 0.000s 0:03.64 98.9% 0+0k 0+0io 30pf+0w
3.480u 0.000s 0:03.60 96.6% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2% 0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.58 100.0% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2% 0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0% 0+0k 0+0io 30pf+0w
deriv.hlc.gcc.O6
3.570u 0.000s 0:03.57 100.0% 0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.59 99.7% 0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.57 100.0% 0+0k 0+0io 30pf+0w
3.560u 0.010s 0:03.57 100.0% 0+0k 0+0io 30pf+0w
3.580u 0.020s 0:03.59 100.2% 0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.57 100.2% 0+0k 0+0io 30pf+0w
3.560u 0.010s 0:03.58 99.7% 0+0k 0+0io 30pf+0w
3.590u 0.000s 0:03.59 100.0% 0+0k 0+0io 30pf+0w
deriv.hlc
3.460u 0.010s 0:03.46 100.2% 0+0k 0+0io 30pf+0w
3.470u 0.010s 0:03.48 100.0% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0% 0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0% 0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2% 0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0% 0+0k 0+0io 30pf+0w
3.560u 0.000s 0:03.55 100.2% 0+0k 0+0io 30pf+0w
deriv.hlc.O6
3.570u 0.000s 0:03.56 100.2% 0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.57 100.0% 0+0k 0+0io 30pf+0w
3.650u 0.010s 0:03.66 100.0% 0+0k 0+0io 30pf+0w
3.570u 0.020s 0:03.58 100.2% 0+0k 0+0io 30pf+0w
3.560u 0.000s 0:03.61 98.6% 0+0k 0+0io 30pf+0w
3.680u 0.000s 0:03.74 98.3% 0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.72 95.9% 0+0k 0+0io 30pf+0w
3.600u 0.000s 0:03.63 99.1% 0+0k 0+0io 30pf+0w
deriv
4.590u 0.040s 0:04.62 100.2% 0+0k 0+0io 59pf+0w
4.600u 0.050s 0:04.64 100.2% 0+0k 0+0io 59pf+0w
4.570u 0.050s 0:04.63 99.7% 0+0k 0+0io 59pf+0w
4.580u 0.070s 0:04.64 100.2% 0+0k 0+0io 59pf+0w
4.600u 0.030s 0:04.64 99.7% 0+0k 0+0io 59pf+0w
4.610u 0.040s 0:04.64 100.2% 0+0k 0+0io 59pf+0w
4.580u 0.050s 0:04.63 100.0% 0+0k 0+0io 59pf+0w
4.610u 0.030s 0:04.66 99.5% 0+0k 0+0io 59pf+0w
deriv.O6
3.960u 0.040s 0:04.00 100.0% 0+0k 0+0io 59pf+0w
3.980u 0.040s 0:04.01 100.2% 0+0k 0+0io 59pf+0w
3.950u 0.050s 0:04.00 100.0% 0+0k 0+0io 59pf+0w
3.960u 0.060s 0:04.01 100.2% 0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.00 100.2% 0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.01 100.0% 0+0k 0+0io 59pf+0w
3.960u 0.040s 0:04.00 100.0% 0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.01 100.0% 0+0k 0+0io 59pf+0w
tak.hlc.gcc
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:04.96 101.2% 0+0k 0+0io 25pf+0w
5.000u 0.010s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
tak.hlc.gcc.O6
5.120u 0.000s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0% 0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0% 0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0% 0+0k 0+0io 22pf+0w
tak.hlc
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.02 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1% 0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0% 0+0k 0+0io 25pf+0w
tak.hlc.O6
5.120u 0.000s 0:05.12 100.0% 0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.12 100.0% 0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0% 0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1% 0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0% 0+0k 0+0io 22pf+0w
tak
4.990u 0.020s 0:05.00 100.2% 0+0k 0+0io 57pf+0w
5.000u 0.020s 0:05.01 100.1% 0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.00 100.2% 0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0% 0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0% 0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0% 0+0k 0+0io 57pf+0w
4.980u 0.020s 0:05.00 100.0% 0+0k 0+0io 57pf+0w
5.010u 0.010s 0:05.01 100.1% 0+0k 0+0io 57pf+0w
tak.O6
4.490u 0.010s 0:04.49 100.2% 0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
4.460u 0.030s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
4.460u 0.040s 0:04.49 100.2% 0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0% 0+0k 0+0io 56pf+0w
poly.hlc.gcc
9.570u 0.010s 0:09.58 100.0% 0+0k 0+0io 29pf+0w
9.610u 0.000s 0:09.61 100.0% 0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0% 0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0% 0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.58 100.1% 0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0% 0+0k 0+0io 29pf+0w
9.570u 0.000s 0:09.58 99.8% 0+0k 0+0io 29pf+0w
9.570u 0.010s 0:09.58 100.0% 0+0k 0+0io 29pf+0w
poly.hlc.gcc.O6
9.920u 0.000s 0:09.91 100.1% 0+0k 0+0io 29pf+0w
9.910u 0.000s 0:09.91 100.0% 0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0% 0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0% 0+0k 0+0io 29pf+0w
9.910u 0.000s 0:09.91 100.0% 0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.91 100.1% 0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0% 0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.92 100.0% 0+0k 0+0io 29pf+0w
poly.hlc
9.580u 0.010s 0:09.59 100.0% 0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0% 0+0k 0+0io 29pf+0w
9.600u 0.000s 0:09.59 100.1% 0+0k 0+0io 29pf+0w
9.570u 0.000s 0:09.58 99.8% 0+0k 0+0io 29pf+0w
9.620u 0.000s 0:09.62 100.0% 0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0% 0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.58 100.1% 0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0% 0+0k 0+0io 29pf+0w
poly.hlc.O6
9.910u 0.010s 0:09.91 100.1% 0+0k 0+0io 29pf+0w
10.040u 0.010s 0:10.05 100.0% 0+0k 0+0io 29pf+0w
9.910u 0.010s 0:09.91 100.1% 0+0k 0+0io 29pf+0w
9.950u 0.020s 0:09.98 99.8% 0+0k 0+0io 29pf+0w
9.940u 0.000s 0:09.93 100.1% 0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.91 100.1% 0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.91 99.8% 0+0k 0+0io 29pf+0w
9.970u 0.000s 0:09.98 99.8% 0+0k 0+0io 29pf+0w
poly
11.860u 0.050s 0:11.91 100.0% 0+0k 0+0io 60pf+0w
11.900u 0.030s 0:11.92 100.0% 0+0k 0+0io 58pf+0w
11.860u 0.040s 0:11.90 100.0% 0+0k 0+0io 58pf+0w
11.870u 0.050s 0:11.91 100.0% 0+0k 0+0io 58pf+0w
11.840u 0.050s 0:11.89 100.0% 0+0k 0+0io 58pf+0w
11.860u 0.060s 0:11.91 100.0% 0+0k 0+0io 58pf+0w
11.890u 0.020s 0:11.91 100.0% 0+0k 0+0io 58pf+0w
11.900u 0.020s 0:11.92 100.0% 0+0k 0+0io 58pf+0w
poly.O6
9.210u 0.040s 0:09.25 100.0% 0+0k 0+0io 60pf+0w
9.210u 0.050s 0:09.25 100.1% 0+0k 0+0io 58pf+0w
9.200u 0.040s 0:09.24 100.0% 0+0k 0+0io 58pf+0w
9.210u 0.040s 0:09.25 100.0% 0+0k 0+0io 58pf+0w
9.200u 0.040s 0:09.24 100.0% 0+0k 0+0io 58pf+0w
9.220u 0.030s 0:09.24 100.1% 0+0k 0+0io 58pf+0w
9.220u 0.020s 0:09.24 100.0% 0+0k 0+0io 58pf+0w
9.190u 0.040s 0:09.24 99.8% 0+0k 0+0io 58pf+0w
The following is the hacked-together runtime system
that I am using to test the MLDS back-end.
/* mercury.io.h */
#define MR_GC_MALLOC_WORDS(bytes) \
    ( __builtin_constant_p(bytes) && (bytes) < 16 \
    ? ({ void * temp; \
        /* if size > 1, round up to an even number of words */ \
        Word num_words = ((bytes) < 4 ? 1 : 2 * (((bytes) + 7) / 8));\
        GC_MALLOC_WORDS(temp, num_words); \
        /* return */ temp; \
    }) \
    : GC_MALLOC(bytes) \
    )
#define MR_new_object(type, size, name) \
    ((Word) (type *) MR_GC_MALLOC_WORDS(size))
#define mercury__private_builtin__SIZEOF_WORD sizeof(Word)
#include "mercury_imp.h"
void mercury__io__write_string_3_p_0(String string);
void mercury__io__write_int_3_p_0(Integer i);
/* mercury.io.c */
#include "mercury_imp.h"
#include "mercury.io.h"
#include <stdio.h>
void mercury__io__write_string_3_p_0(String s) {
    printf("%s", s);
}
void mercury__io__write_int_3_p_0(Integer i) {
    /* cast so the format matches even where Integer is wider than int */
    printf("%ld", (long) i);
}
void
MR_init_GC(void)
{
#ifdef CONSERVATIVE_GC
    GC_quiet = TRUE;
    /*
    ** Call GC_INIT() to tell the garbage collector about this DLL.
    ** (This is necessary to support Windows DLLs using gnu-win32.)
    */
    GC_INIT();
#if 0
    /*
    ** call the init_gc() function defined in <foo>_init.c,
    ** which calls GC_INIT() to tell the GC about the main program.
    ** (This is to work around a Solaris 2.X (X <= 4) linker bug,
    ** and also to support Windows DLLs using gnu-win32.)
    */
    (*address_of_init_gc)();
    /*
    ** Double-check that the garbage collector knows about
    ** global variables in shared libraries.
    */
    GC_is_visible(&MR_runqueue_head);
#endif
    /* The following code is necessary to tell the conservative */
    /* garbage collector that we are using tagged pointers */
    {
        int i;
        for (i = 1; i < (1 << TAGBITS); i++) {
            GC_REGISTER_DISPLACEMENT(i);
        }
    }
#endif
}
#
# Makefile -- use in tests/benchmarks
#
GCCF = --gcc-nested-functions --gcc-local-labels --cflags "-O3 -Wno-missing-prototypes"
HLF = --high-level-c --no-line-numbers --cflags -O3 --inline-alloc
MCF = --c-debug
N = 10
#TEST = queens($N)
TEST = tak
TESTOBJS = int.o
PROGS = $(TEST).m printlist.m
OBJS = $(TEST).o printlist.o
HLOBJS = $(TEST).o printlist.o $(TEST)_main.o $(TESTOBJS)
TARGETS = $(TEST).hlc.gcc $(TEST).hlc.gcc.O6 \
	$(TEST).hlc $(TEST).hlc.O6 \
	$(TEST) $(TEST).O6
#TARGETS = $(TEST).hlc.gcc \
#	$(TEST).hlc \
#	$(TEST) $(TEST).O6
.PHONY: all clean
all: $(TARGETS)
$(TARGETS) : $(PROGS)
times.$(TEST): $(TARGETS)
	echo > times.$(TEST)
	for file in $(TARGETS); do \
		echo $$file; \
		time ./$$file; \
		echo $$file >> times.$(TEST); \
		ttime 8 ./$$file >> times.$(TEST); \
		echo >> times.$(TEST); \
	done
clean:
	rm -f $(TARGETS)
$(TEST).hlc.gcc.old: $(TEST)_main.o
	mmc -c $(MCF) $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o
$(TEST).hlc.gcc: $(TEST)_main.o
	mchg -c $(MCF) $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o
$(TEST).hlc.gcc.O6: $(TEST)_main.o
	mchg -c $(MCF) -O6 $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o
$(TEST).hlc: $(TEST)_main.o
	mchg -c $(MCF) $(HLF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o
$(TEST).hlc.O6: $(TEST)_main.o
	mchg -c $(MCF) -O6 $(HLF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o
$(TEST):
	mmc -o $@ $(MCF) $(PROGS)
$(TEST).O6:
	mmc -o $@ $(MCF) -O6 $(PROGS)
$(TEST)_main.o: $(TEST)_main.c
	gcc -c $(TEST)_main.c
--
Fergus Henderson <fjh at cs.mu.oz.au> | "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh> | of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3 | -- the last words of T. S. Garp.