[m-dev.] MLDS back-end performance: a preliminary benchmark

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Nov 11 06:03:39 AEDT 1999


On 09-Nov-1999, Fergus Henderson <fjh at cs.mu.OZ.AU> wrote:
> I ran the N-queens benchmark (for N = 10,
> and finding the first solution only).

The previous benchmark ran too quickly for the times to be reliable.
So I tried again, this time finding all solutions.
This was also after I had implemented tail recursion for the MLDS
back-end.  The MLDS back-end (without GCC nested functions) beats the
asm_fast.gc back-end by 37% at -O6 on both user time and elapsed time.
So it looks like that result is going to hold up.

On the other hand, for `crypt', the MLDS back-end loses by 28%
at -O6 (but wins by 41% at -O2).  For `poly', the MLDS back-end
loses by 4% at -O6 (but wins by 24% at -O2).  Note that I used
`--cflags -O3' for the MLDS back-end throughout, so maybe that
explains why the MLDS back-end often wins at -O2.

For `deriv', the MLDS back-end wins by about 15%.
Interestingly, the MLDS back-end runs faster at `-O2'
than at `-O6' for this benchmark.

For `tak', with `-O2' the times are basically a wash -- the asm_fast.gc
back-end wins by a whisker (0.5%).  But with `-O6' the MLDS back-end
slows down, and the LLDS back-end speeds up;
the LLDS back-end at -O6 beats the MLDS back-end at -O2 by 11%.

Since `-O6' often slows the MLDS back-end code down,
it seems that some of the front-end optimizations which
are good for the LLDS back-end are not good for the MLDS back-end.
I haven't tried other optimization levels yet.
But it looks like we'll need to tweak things a bit.
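
For reference, the percentages quoted above seem to be the ratio of the
slower time to the faster time, minus one.  That reading is an assumption,
but it matches the elapsed times listed below (for `crypt' at -O2, 1.24s
against 0.88s gives about 41%).  A minimal C sketch of the arithmetic,
using hypothetical helper names:

```c
#include <stdio.h>

/*
** Hypothetical helper (not part of the benchmark scripts):
** how much faster the quicker program is, in percent,
** taken as slower/faster - 1.
*/
double percent_faster(double slower_secs, double faster_secs)
{
    return (slower_secs / faster_secs - 1.0) * 100.0;
}

/* Print one comparison line for a benchmark. */
void print_comparison(const char *name, double slower_secs, double faster_secs)
{
    printf("%s: faster by %.1f%%\n", name,
        percent_faster(slower_secs, faster_secs));
}
```

For example, print_comparison("crypt -O2", 1.24, 0.88) compares the
elapsed times of `crypt' and `crypt.hlc'.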

Anyway, what follows is the raw benchmark times,
plus most of the very hacked-together stuff that I used
to run the benchmark.

Cheers,
	Fergus.


The following times are on hg
(a 500MHz Pentium III running Red Hat Linux 6.0).

Name			Options
----			-------
foo.hlc.gcc		--high-level-c --gcc-nested-functions \
			--gcc-local-labels --inline-alloc --cflags -O3
foo.hlc			--high-level-c --inline-alloc --cflags -O3
foo			(none)
foo.hlc.gcc.O6		--high-level-c --gcc-nested-functions \
			--gcc-local-labels -O6 --cflags -O3
foo.hlc.O6		--high-level-c -O6 --cflags -O3
foo.O6			-O6

User   System Elapsed CPU
time   time   time    usage	Page faults
----   ------ ------- -----	-----------

queens10all.hlc.gcc
29.680u 0.000s 0:29.68 100.0%	0+0k 0+0io 33pf+0w
29.670u 0.000s 0:29.70 99.8%	0+0k 0+0io 33pf+0w
29.660u 0.000s 0:29.82 99.4%	0+0k 0+0io 33pf+0w
29.910u 0.030s 0:30.02 99.7%	0+0k 0+0io 33pf+0w
29.710u 0.000s 0:29.79 99.7%	0+0k 0+0io 33pf+0w
29.670u 0.000s 0:29.68 99.9%	0+0k 0+0io 33pf+0w
29.740u 0.000s 0:29.75 99.9%	0+0k 0+0io 33pf+0w
29.730u 0.000s 0:29.73 100.0%	0+0k 0+0io 33pf+0w
queens10all.hlc.gcc.O6
26.640u 0.020s 0:26.68 99.9%	0+0k 0+0io 33pf+0w
26.680u 0.010s 0:26.69 100.0%	0+0k 0+0io 33pf+0w
26.710u 0.000s 0:26.71 100.0%	0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0%	0+0k 0+0io 33pf+0w
26.660u 0.020s 0:26.69 99.9%	0+0k 0+0io 33pf+0w
26.670u 0.020s 0:26.68 100.0%	0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0%	0+0k 0+0io 33pf+0w
26.660u 0.000s 0:26.66 100.0%	0+0k 0+0io 33pf+0w
queens10all.hlc
8.850u 0.000s 0:08.86 99.8%	0+0k 0+0io 32pf+0w
8.860u 0.000s 0:08.90 99.5%	0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0%	0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0%	0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0%	0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.84 100.0%	0+0k 0+0io 32pf+0w
8.840u 0.000s 0:08.85 99.8%	0+0k 0+0io 32pf+0w
8.860u 0.000s 0:08.87 99.8%	0+0k 0+0io 32pf+0w
queens10all.hlc.O6
8.220u 0.010s 0:08.22 100.1%	0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0%	0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0%	0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.22 100.0%	0+0k 0+0io 33pf+0w
8.230u 0.000s 0:08.23 100.0%	0+0k 0+0io 33pf+0w
8.220u 0.000s 0:08.23 99.8%	0+0k 0+0io 33pf+0w
8.220u 0.010s 0:08.23 100.0%	0+0k 0+0io 33pf+0w
8.240u 0.000s 0:08.23 100.1%	0+0k 0+0io 33pf+0w
queens10all
14.420u 0.050s 0:14.48 99.9%	0+0k 0+0io 53pf+0w
14.510u 0.040s 0:14.56 99.9%	0+0k 0+0io 53pf+0w
14.410u 0.050s 0:14.48 99.8%	0+0k 0+0io 53pf+0w
14.660u 0.080s 0:14.86 99.1%	0+0k 0+0io 53pf+0w
14.440u 0.040s 0:14.48 100.0%	0+0k 0+0io 53pf+0w
14.520u 0.020s 0:14.54 100.0%	0+0k 0+0io 53pf+0w
14.440u 0.030s 0:14.48 99.9%	0+0k 0+0io 53pf+0w
14.520u 0.020s 0:14.55 99.9%	0+0k 0+0io 53pf+0w
queens10all.O6
11.330u 0.030s 0:11.37 99.9%	0+0k 0+0io 53pf+0w
11.260u 0.050s 0:11.31 100.0%	0+0k 0+0io 53pf+0w
11.310u 0.030s 0:11.37 99.7%	0+0k 0+0io 53pf+0w
11.270u 0.040s 0:11.30 100.0%	0+0k 0+0io 53pf+0w
11.330u 0.030s 0:11.36 100.0%	0+0k 0+0io 53pf+0w
11.280u 0.030s 0:11.31 100.0%	0+0k 0+0io 53pf+0w
11.310u 0.050s 0:11.36 100.0%	0+0k 0+0io 53pf+0w
11.280u 0.030s 0:11.31 100.0%	0+0k 0+0io 53pf+0w

crypt.hlc.gcc
1.020u 0.010s 0:01.02 100.9%	0+0k 0+0io 30pf+0w
1.020u 0.000s 0:01.02 100.0%	0+0k 0+0io 30pf+0w
1.030u 0.010s 0:01.03 100.9%	0+0k 0+0io 30pf+0w
1.010u 0.010s 0:01.02 100.0%	0+0k 0+0io 30pf+0w
1.020u 0.010s 0:01.02 100.9%	0+0k 0+0io 30pf+0w
1.020u 0.000s 0:01.02 100.0%	0+0k 0+0io 30pf+0w
1.030u 0.010s 0:01.03 100.9%	0+0k 0+0io 30pf+0w
1.420u 0.000s 0:01.42 100.0%	0+0k 0+0io 30pf+0w
crypt.hlc.gcc.O6
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.68 101.4%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.68 101.4%	0+0k 0+0io 28pf+0w
0.690u 0.000s 0:00.69 100.0%	0+0k 0+0io 28pf+0w
crypt.hlc
0.880u 0.000s 0:00.88 100.0%	0+0k 0+0io 28pf+0w
0.890u 0.000s 0:00.88 101.1%	0+0k 0+0io 28pf+0w
0.890u 0.000s 0:00.88 101.1%	0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0%	0+0k 0+0io 28pf+0w
0.880u 0.010s 0:00.88 101.1%	0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0%	0+0k 0+0io 28pf+0w
0.880u 0.000s 0:00.88 100.0%	0+0k 0+0io 28pf+0w
0.880u 0.010s 0:00.88 101.1%	0+0k 0+0io 28pf+0w
crypt.hlc.O6
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.680u 0.010s 0:00.68 101.4%	0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.680u 0.000s 0:00.68 100.0%	0+0k 0+0io 30pf+0w
0.690u 0.000s 0:00.68 101.4%	0+0k 0+0io 30pf+0w
crypt
1.200u 0.040s 0:01.24 100.0%	0+0k 0+0io 64pf+0w
1.190u 0.060s 0:01.24 100.8%	0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.24 100.8%	0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.24 100.8%	0+0k 0+0io 59pf+0w
1.190u 0.060s 0:01.24 100.8%	0+0k 0+0io 59pf+0w
1.190u 0.060s 0:01.24 100.8%	0+0k 0+0io 59pf+0w
1.210u 0.040s 0:01.25 100.0%	0+0k 0+0io 59pf+0w
1.220u 0.030s 0:01.25 100.0%	0+0k 0+0io 59pf+0w
crypt.O6
0.510u 0.030s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.500u 0.040s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.490u 0.050s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.490u 0.050s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.490u 0.040s 0:00.53 100.0%	0+0k 0+0io 60pf+0w
0.480u 0.060s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.510u 0.030s 0:00.53 101.8%	0+0k 0+0io 60pf+0w
0.500u 0.040s 0:00.53 101.8%	0+0k 0+0io 60pf+0w

deriv.hlc.gcc
3.600u 0.000s 0:03.64 98.9%	0+0k 0+0io 30pf+0w
3.480u 0.000s 0:03.60 96.6%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2%	0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.58 100.0%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2%	0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0%	0+0k 0+0io 30pf+0w
deriv.hlc.gcc.O6
3.570u 0.000s 0:03.57 100.0%	0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.59 99.7%	0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.57 100.0%	0+0k 0+0io 30pf+0w
3.560u 0.010s 0:03.57 100.0%	0+0k 0+0io 30pf+0w
3.580u 0.020s 0:03.59 100.2%	0+0k 0+0io 30pf+0w
3.580u 0.000s 0:03.57 100.2%	0+0k 0+0io 30pf+0w
3.560u 0.010s 0:03.58 99.7%	0+0k 0+0io 30pf+0w
3.590u 0.000s 0:03.59 100.0%	0+0k 0+0io 30pf+0w
deriv.hlc
3.460u 0.010s 0:03.46 100.2%	0+0k 0+0io 30pf+0w
3.470u 0.010s 0:03.48 100.0%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0%	0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0%	0+0k 0+0io 30pf+0w
3.460u 0.000s 0:03.46 100.0%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.46 100.2%	0+0k 0+0io 30pf+0w
3.470u 0.000s 0:03.47 100.0%	0+0k 0+0io 30pf+0w
3.560u 0.000s 0:03.55 100.2%	0+0k 0+0io 30pf+0w
deriv.hlc.O6
3.570u 0.000s 0:03.56 100.2%	0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.57 100.0%	0+0k 0+0io 30pf+0w
3.650u 0.010s 0:03.66 100.0%	0+0k 0+0io 30pf+0w
3.570u 0.020s 0:03.58 100.2%	0+0k 0+0io 30pf+0w
3.560u 0.000s 0:03.61 98.6%	0+0k 0+0io 30pf+0w
3.680u 0.000s 0:03.74 98.3%	0+0k 0+0io 30pf+0w
3.570u 0.000s 0:03.72 95.9%	0+0k 0+0io 30pf+0w
3.600u 0.000s 0:03.63 99.1%	0+0k 0+0io 30pf+0w
deriv
4.590u 0.040s 0:04.62 100.2%	0+0k 0+0io 59pf+0w
4.600u 0.050s 0:04.64 100.2%	0+0k 0+0io 59pf+0w
4.570u 0.050s 0:04.63 99.7%	0+0k 0+0io 59pf+0w
4.580u 0.070s 0:04.64 100.2%	0+0k 0+0io 59pf+0w
4.600u 0.030s 0:04.64 99.7%	0+0k 0+0io 59pf+0w
4.610u 0.040s 0:04.64 100.2%	0+0k 0+0io 59pf+0w
4.580u 0.050s 0:04.63 100.0%	0+0k 0+0io 59pf+0w
4.610u 0.030s 0:04.66 99.5%	0+0k 0+0io 59pf+0w
deriv.O6
3.960u 0.040s 0:04.00 100.0%	0+0k 0+0io 59pf+0w
3.980u 0.040s 0:04.01 100.2%	0+0k 0+0io 59pf+0w
3.950u 0.050s 0:04.00 100.0%	0+0k 0+0io 59pf+0w
3.960u 0.060s 0:04.01 100.2%	0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.00 100.2%	0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.01 100.0%	0+0k 0+0io 59pf+0w
3.960u 0.040s 0:04.00 100.0%	0+0k 0+0io 59pf+0w
3.970u 0.040s 0:04.01 100.0%	0+0k 0+0io 59pf+0w

tak.hlc.gcc
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:04.96 101.2%	0+0k 0+0io 25pf+0w
5.000u 0.010s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
tak.hlc.gcc.O6
5.120u 0.000s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0%	0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0%	0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0%	0+0k 0+0io 22pf+0w
tak.hlc
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.02 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.020u 0.000s 0:05.01 100.1%	0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
5.010u 0.000s 0:05.01 100.0%	0+0k 0+0io 25pf+0w
tak.hlc.O6
5.120u 0.000s 0:05.12 100.0%	0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.12 100.0%	0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0%	0+0k 0+0io 22pf+0w
5.110u 0.010s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.120u 0.000s 0:05.11 100.1%	0+0k 0+0io 22pf+0w
5.110u 0.000s 0:05.11 100.0%	0+0k 0+0io 22pf+0w
tak
4.990u 0.020s 0:05.00 100.2%	0+0k 0+0io 57pf+0w
5.000u 0.020s 0:05.01 100.1%	0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.00 100.2%	0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0%	0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0%	0+0k 0+0io 57pf+0w
4.990u 0.020s 0:05.01 100.0%	0+0k 0+0io 57pf+0w
4.980u 0.020s 0:05.00 100.0%	0+0k 0+0io 57pf+0w
5.010u 0.010s 0:05.01 100.1%	0+0k 0+0io 57pf+0w
tak.O6
4.490u 0.010s 0:04.49 100.2%	0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0%	0+0k 0+0io 56pf+0w
4.460u 0.030s 0:04.49 100.0%	0+0k 0+0io 56pf+0w
4.460u 0.040s 0:04.49 100.2%	0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0%	0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0%	0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0%	0+0k 0+0io 56pf+0w
4.470u 0.020s 0:04.49 100.0%	0+0k 0+0io 56pf+0w

poly.hlc.gcc
9.570u 0.010s 0:09.58 100.0%	0+0k 0+0io 29pf+0w
9.610u 0.000s 0:09.61 100.0%	0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0%	0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0%	0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.58 100.1%	0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0%	0+0k 0+0io 29pf+0w
9.570u 0.000s 0:09.58 99.8%	0+0k 0+0io 29pf+0w
9.570u 0.010s 0:09.58 100.0%	0+0k 0+0io 29pf+0w
poly.hlc.gcc.O6
9.920u 0.000s 0:09.91 100.1%	0+0k 0+0io 29pf+0w
9.910u 0.000s 0:09.91 100.0%	0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0%	0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0%	0+0k 0+0io 29pf+0w
9.910u 0.000s 0:09.91 100.0%	0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.91 100.1%	0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.90 100.0%	0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.92 100.0%	0+0k 0+0io 29pf+0w
poly.hlc
9.580u 0.010s 0:09.59 100.0%	0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0%	0+0k 0+0io 29pf+0w
9.600u 0.000s 0:09.59 100.1%	0+0k 0+0io 29pf+0w
9.570u 0.000s 0:09.58 99.8%	0+0k 0+0io 29pf+0w
9.620u 0.000s 0:09.62 100.0%	0+0k 0+0io 29pf+0w
9.580u 0.000s 0:09.58 100.0%	0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.58 100.1%	0+0k 0+0io 29pf+0w
9.590u 0.000s 0:09.59 100.0%	0+0k 0+0io 29pf+0w
poly.hlc.O6
9.910u 0.010s 0:09.91 100.1%	0+0k 0+0io 29pf+0w
10.040u 0.010s 0:10.05 100.0%	0+0k 0+0io 29pf+0w
9.910u 0.010s 0:09.91 100.1%	0+0k 0+0io 29pf+0w
9.950u 0.020s 0:09.98 99.8%	0+0k 0+0io 29pf+0w
9.940u 0.000s 0:09.93 100.1%	0+0k 0+0io 29pf+0w
9.920u 0.000s 0:09.91 100.1%	0+0k 0+0io 29pf+0w
9.900u 0.000s 0:09.91 99.8%	0+0k 0+0io 29pf+0w
9.970u 0.000s 0:09.98 99.8%	0+0k 0+0io 29pf+0w
poly
11.860u 0.050s 0:11.91 100.0%	0+0k 0+0io 60pf+0w
11.900u 0.030s 0:11.92 100.0%	0+0k 0+0io 58pf+0w
11.860u 0.040s 0:11.90 100.0%	0+0k 0+0io 58pf+0w
11.870u 0.050s 0:11.91 100.0%	0+0k 0+0io 58pf+0w
11.840u 0.050s 0:11.89 100.0%	0+0k 0+0io 58pf+0w
11.860u 0.060s 0:11.91 100.0%	0+0k 0+0io 58pf+0w
11.890u 0.020s 0:11.91 100.0%	0+0k 0+0io 58pf+0w
11.900u 0.020s 0:11.92 100.0%	0+0k 0+0io 58pf+0w
poly.O6
9.210u 0.040s 0:09.25 100.0%	0+0k 0+0io 60pf+0w
9.210u 0.050s 0:09.25 100.1%	0+0k 0+0io 58pf+0w
9.200u 0.040s 0:09.24 100.0%	0+0k 0+0io 58pf+0w
9.210u 0.040s 0:09.25 100.0%	0+0k 0+0io 58pf+0w
9.200u 0.040s 0:09.24 100.0%	0+0k 0+0io 58pf+0w
9.220u 0.030s 0:09.24 100.1%	0+0k 0+0io 58pf+0w
9.220u 0.020s 0:09.24 100.0%	0+0k 0+0io 58pf+0w
9.190u 0.040s 0:09.24 99.8%	0+0k 0+0io 58pf+0w
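
Each timing line above looks like the default csh/tcsh `time' format:
user seconds, system seconds, elapsed time, CPU percentage, memory and
I/O counts, then page faults and swaps.  For anyone who wants to
post-process the numbers, here is a small C sketch that pulls the main
fields out of one line.  The struct and function names are made up for
illustration, and elapsed times with an hours field are not handled.

```c
#include <stdio.h>

/* Fields of one line of csh-style `time' output. */
struct time_line {
    double user_secs;     /* e.g. 29.680u */
    double system_secs;   /* e.g. 0.000s  */
    double elapsed_secs;  /* e.g. 0:29.68, converted to seconds */
    double cpu_percent;   /* e.g. 100.0%  */
    int    page_faults;   /* e.g. 33pf    */
};

/*
** Parse a line such as
**   "29.680u 0.000s 0:29.68 100.0%\t0+0k 0+0io 33pf+0w".
** Returns 1 on success, 0 if the line does not match
** (for example, a line holding only a program name).
*/
int parse_time_line(const char *line, struct time_line *out)
{
    int minutes;
    double seconds;
    int n = sscanf(line, "%lfu %lfs %d:%lf %lf%% %*s %*s %dpf",
        &out->user_secs, &out->system_secs, &minutes, &seconds,
        &out->cpu_percent, &out->page_faults);

    if (n != 6) {
        return 0;
    }
    out->elapsed_secs = minutes * 60.0 + seconds;
    return 1;
}
```

The two `%*s' conversions skip the memory and I/O fields, which are
all zero in these runs.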

The following is the hacked-together runtime system
that I am using to test the MLDS back-end.

/* mercury.io.h */
#define MR_GC_MALLOC_WORDS(bytes)                                       \
    ( __builtin_constant_p(bytes) && (bytes) < 16                       \
    ? ({    void * temp;                                                \
            /* if size > 1, round up to an even number of words */      \
            Word num_words = ((bytes) < 4 ? 1 : 2 * (((bytes) + 7) / 8)); \
            GC_MALLOC_WORDS(temp, num_words);                           \
            /* return */ temp;                                          \
      })                                                                \
    : GC_MALLOC(bytes)                                                  \
    )

#define MR_new_object(type, size, name) ((Word) (type *) MR_GC_MALLOC_WORDS(size))
#define mercury__private_builtin__SIZEOF_WORD sizeof(Word)

#include "mercury_imp.h"

void mercury__io__write_string_3_p_0(String string);
void mercury__io__write_int_3_p_0(Integer i);
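
To make the rounding in MR_GC_MALLOC_WORDS easier to follow, here is the
same expression written as a plain function.  The name words_for_bytes
is made up for illustration, and the 4-byte word size it implies is an
assumption (32-bit x86).

```c
#include <stddef.h>

/*
** The word count that MR_GC_MALLOC_WORDS computes for small,
** compile-time-constant sizes, written as an ordinary function.
** Hypothetical helper, for illustration only.
*/
size_t words_for_bytes(size_t bytes)
{
    if (bytes < 4) {
        return 1;                   /* less than one 4-byte word */
    }
    return 2 * ((bytes + 7) / 8);   /* round up to a multiple of 8 bytes */
}
```

As written, a request of exactly 4 bytes yields 2 words, because the
test is `(bytes) < 4' rather than `<= 4'; 8 bytes also yields 2 words,
and 12 bytes yields 4.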

/* mercury.io.c */
#include "mercury_imp.h"
#include "mercury.io.h"
#include <stdio.h>

void mercury__io__write_string_3_p_0(String s) {
	printf("%s", s);
}
void mercury__io__write_int_3_p_0(Integer i) {
	/* Integer may be wider than int; cast so the format matches. */
	printf("%ld", (long) i);
}

void
MR_init_GC(void)
{
#ifdef CONSERVATIVE_GC
	GC_quiet = TRUE;

	/*
	** Call GC_INIT() to tell the garbage collector about this DLL.
	** (This is necessary to support Windows DLLs using gnu-win32.)
	*/
	GC_INIT();

#if 0
	/*
	** call the init_gc() function defined in <foo>_init.c,
	** which calls GC_INIT() to tell the GC about the main program.
	** (This is to work around a Solaris 2.X (X <= 4) linker bug,
	** and also to support Windows DLLs using gnu-win32.)
	*/
	(*address_of_init_gc)();

	/*
	** Double-check that the garbage collector knows about
	** global variables in shared libraries.
	*/
	GC_is_visible(&MR_runqueue_head);
#endif

	/* The following code is necessary to tell the conservative */
	/* garbage collector that we are using tagged pointers */
	{
		int i;

		for (i = 1; i < (1 << TAGBITS); i++) {
			GC_REGISTER_DISPLACEMENT(i);
		}
	}
#endif
}
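
The displacement loop at the end of MR_init_GC is needed because a
tagged pointer does not point at the start of its object.  A minimal
sketch of the idea, assuming 2 tag bits and word-aligned objects (both
are assumptions for illustration; the real TAGBITS comes from the
Mercury runtime headers), with hypothetical helper names:

```c
#include <stdint.h>

#define SKETCH_TAGBITS 2    /* assumed value, for illustration only */
#define SKETCH_TAGMASK (((uintptr_t) 1 << SKETCH_TAGBITS) - 1)

/* Store a small tag in the low bits of an aligned pointer. */
uintptr_t sketch_mktag(void *ptr, unsigned tag)
{
    return (uintptr_t) ptr + (tag & SKETCH_TAGMASK);
}

/* Recover the tag from a tagged word. */
unsigned sketch_tag(uintptr_t word)
{
    return (unsigned) (word & SKETCH_TAGMASK);
}

/* Recover the untagged pointer. */
void *sketch_body(uintptr_t word)
{
    return (void *) (word & ~SKETCH_TAGMASK);
}
```

A word built by sketch_mktag(p, 3) holds the address p + 3, so a
collector that only recognises pointers to the start of an object would
not see it as a reference.  Registering each displacement from 1 to
(1 << TAGBITS) - 1 tells the collector to treat those offsets as valid
pointers to the object.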

#
# Makefile -- use in tests/benchmarks
#
GCCF = --gcc-nested-functions --gcc-local-labels --cflags "-O3 -Wno-missing-prototypes"
HLF = --high-level-c --no-line-numbers --cflags -O3 --inline-alloc
MCF = --c-debug

N = 10
#TEST = queens($N)
TEST = tak
TESTOBJS = int.o

PROGS = $(TEST).m printlist.m
OBJS = $(TEST).o printlist.o 
HLOBJS = $(TEST).o printlist.o $(TEST)_main.o $(TESTOBJS)
TARGETS = $(TEST).hlc.gcc $(TEST).hlc.gcc.O6 \
	  $(TEST).hlc $(TEST).hlc.O6 \
	  $(TEST) $(TEST).O6
#TARGETS = $(TEST).hlc.gcc \
#	  $(TEST).hlc \
#	  $(TEST) $(TEST).O6

.PHONY: all clean
all: $(TARGETS)

$(TARGETS) : $(PROGS)

times.$(TEST): $(TARGETS)
	echo > times.$(TEST)
	for file in $(TARGETS); do			\
		echo $$file;				\
		time ./$$file;				\
		echo $$file >> times.$(TEST);		\
		ttime 8 ./$$file >> times.$(TEST);	\
		echo >> times.$(TEST);			\
	done

clean:
	rm -f $(TARGETS)

$(TEST).hlc.gcc.old: $(TEST)_main.o
	mmc -c $(MCF) $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o

$(TEST).hlc.gcc: $(TEST)_main.o
	mchg -c $(MCF) $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o

$(TEST).hlc.gcc.O6: $(TEST)_main.o
	mchg -c $(MCF) -O6 $(HLF) $(GCCF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o

$(TEST).hlc: $(TEST)_main.o
	mchg -c $(MCF) $(HLF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o

$(TEST).hlc.O6: $(TEST)_main.o
	mchg -c $(MCF) -O6 $(HLF) $(PROGS)
	ml -o $@ $(HLOBJS) mercury.io.o

$(TEST):
	mmc -o $@ $(MCF) $(PROGS)

$(TEST).O6:
	mmc -o $@ $(MCF) -O6 $(PROGS)

$(TEST)_main.o: $(TEST)_main.c
	gcc -c $(TEST)_main.c


-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.