[m-dev.] Summer Student Report

Ralph Becket rafe at cs.mu.OZ.AU
Wed Jan 11 11:24:54 AEDT 2006


Thanks Sam, that's good.

Regarding timing, we need only consider the amount of user time (i.e.,
not wall clock time or system time) in each run or compilation.  Apart
from cache effects, that should probably protect you from too much noise
due to other processes running at the same time.

Just a few comments on your code so far:

> %-----------------------------------------------------------------------------%
> % vim: ft=mercury ts=4 sw=4 et
> %-----------------------------------------------------------------------------%
> % Copyright (C) 2005-2006 The University of Melbourne.
> % This file may only be copied under the terms of the GNU General
> % Public License - see the file COPYING in the Mercury distribution.
> %-----------------------------------------------------------------------------%
> 
> % File: evolve.m.
> % Main author: samrith.
> 
> % This program implements part of a genetic algorithm to determine an
> % optimal set of optimisation flags to be passed to the Mercury compiler
> % for a given program.
> 
> % It expects to be given two pieces of data as input.  The first is a
> % list of genotypes, where each genotype is a set of strings representing
> % optimisation flags.  The second is a list of phenotypes, where each
> % phenotype is a list of benchmarks.  These are read from the files
> % MRFLAGS/$n/genotypes and MRFLAGS/$n/phenotypes, respectively.
> 
> % The program will then determine the next set of genotypes, which it
> % will write in the file MRFLAGS/$n+1/genotypes.  It will also create
> % the file MRFLAGS/$n+1/flags, which contains the flags passed to mmc
> % to compile a benchmark program.
> 
> % Note that this program does not perform the actual benchmarking tests,
> % nor does it control the evolution over multiple generations.  These
> % tasks are handled by the mrflags shell script.
> 
> %-----------------------------------------------------------------------------%

Our coding standards mandate the following block comment style (i.e.,
don't delete the comment character between paragraphs):

%-----------------------------------------------------------------------------%
% 
% File: evolve.m.
% Main author: samrith.
% 
% This program implements part of a genetic algorithm to determine an
% optimal set of optimisation flags to be passed to the Mercury compiler
% for a given program.
% 
% It expects to be given two pieces of data as input.  The first is a
% list of genotypes, where each genotype is a set of strings representing
etc.

This applies throughout your program.

> :- module evolve.
> :- interface.
> 
> :- import_module io.
> 
> :- pred main(io, io).
> :- mode main(di, uo) is det.
> 
> %-----------------------------------------------------------------------------%
> %-----------------------------------------------------------------------------%
> 
> :- implementation.
> 
> :- import_module char.
> :- import_module float.
> :- import_module getopt.
> :- import_module int.
> :- import_module list.
> :- import_module random.

The random module generates very poor pseudo-random sequences.  It's
almost certainly not good enough for GAs.  I recommend something better
like /home/mercury/rafe/mercury/rnd/tausworthe3.m.

> :- import_module require.
> :- import_module set.
> :- import_module string.
> 
> %-----------------------------------------------------------------------------%
> 
> :- type option
> 	--->	generation
> 	;		seed.
> 
> :- type genotype == set(flag).
> 
> :- type flag == string.
> 
> :- type phenotype
> 	--->	phenotype(
> 				compile_times		:: list(compile_time),
> 				executable_sizes	:: list(executable_size),
> 				run_times			:: list(run_time)
> 			).

You have an extra tab after run_times.

> :- type compile_time == float.
> 
> :- type executable_size == int.
> 
> :- type run_time == float.
> 
> :- type fitness == float.
> 
> main(!IO) :-
> 	%
> 	% Process any command line arguments.
> 	%
> 	command_line_arguments(Args, !IO),
> 	OptionOps = option_ops_multi(short_option, long_option, option_default),
> 	process_options(OptionOps, Args, _, Result),

Our standard block commenting style within a clause is like this:
> main(!IO) :-
> 
> 		% Process any command line arguments.
> 		%
> 	command_line_arguments(Args, !IO),
> 	OptionOps = option_ops_multi(short_option, long_option, option_default),
> 	process_options(OptionOps, Args, _, Result),
etc.

Also, it's a Good Idea to module qualify calls to imported predicates:

> 	io.command_line_arguments(Args, !IO),
> 	OptionOps = option_ops_multi(short_option, long_option, option_default),
> 	getopt.process_options(OptionOps, Args, _, Result),
etc.

> 	(
> 		Result = ok(OptionTable),
> 		lookup_int_option(OptionTable, generation, Generation),
> 		lookup_int_option(OptionTable, seed, Seed)
> 	;
> 		Result = error(Message),
> 		progname("evolve", ProgName, !IO),
> 		error(ProgName ++ ": " ++ Message)
> 	),
> 
> 	%
> 	% Read the input files.
> 	%
> 	read_genotypes(Generation, Genotypes, !IO),
> 	read_phenotypes(Generation, Phenotypes, !IO),
> 
> 	%
> 	% Apply the genetic operators to the genotypes.
> 	%
> 	some [!RS] (
> 		random.init(Seed, !:RS),
> 
> 		my_map2(fitness, Genotypes, Phenotypes, Fitness),
> 		selection(Genotypes, Fitness, Mothers, Fathers, !RS),
> 		my_map2_foldl(crossover, Mothers, Fathers, Children, !RS),
> 		map_foldl(mutation, Children, Mutants, !RS),
> 		map(flags, Mutants, Flags),
> 
> 		_ = !.RS							% XXX: suppress a compiler warning.

That comment should be on the same line as the goal above.

> 	),
> 
> 	%
> 	% Write the output files.
> 	%
> 	print_genotypes(Generation + 1, Mutants, !IO),
> 	print_flags(Generation + 1, Flags, !IO).
> 
> %-----------------------------------------------------------------------------%
> %
> % Command line argument parsing.
> %
> 
> %
> % This section contains all the code for the predicates required by
> % getopt.process_options.
> %
> 
> :- pred short_option(char, option).
> :- mode short_option(in, out) is semidet.
> 
> short_option('g', generation).
> short_option('s', seed).
> 
> :- pred long_option(string, option).
> :- mode long_option(in, out) is semidet.
> 
> long_option("generation", generation).
> long_option("seed", seed).
> 
> :- pred option_default(option, option_data).
> :- mode option_default(out, out) is multi.
> 
> option_default(generation, int(1)).
> option_default(seed, int(0)).
> 
> %-----------------------------------------------------------------------------%
> %
> % Replacements for list.map2/4 and list.map2_foldl/6.
> %
> 
> %
> % This section contains predicates to be used instead of list.map2/4 and
> % list.map2_foldl/6.  They are required because the fitness/3 and
> % crossover/3 predicates take two inputs and one output, which does not
> % match any of the declared modes for the standard library predicates.
> %
> % This code was taken mostly verbatim from mercury/library/list.m,
> % although the base cases were split into three clauses so that we can
> % switch on both the first and second input variables.
> %
> % XXX: should find a way to do this using the standard library.
> %
> 
> :- pred my_map2(pred(A, B, C), list(A), list(B), list(C)).
> :- mode my_map2(pred(in, in, out) is det, in, in, out) is det.
> 
> my_map2(_, [],        [],        []).
> my_map2(_, [],        [_H | _T], []).
> my_map2(_, [_H | _T], [],        []).
> my_map2(P, [H0 | T0], [H1 | T1], [H2 | T2]) :-
> 	call(P, H0, H1, H2),

Here you can use 

	P(H0, H1, H2)

instead, which IMHO is better style.

> 	my_map2(P, T0, T1, T2).
> 
> :- pred my_map2_foldl(pred(L, M, N, A, A), list(L), list(M), list(N), A, A).
> :- mode my_map2_foldl(pred(in, in, out, di, uo) is det, in, in, out, di,
> 		uo) is det.
> 
> my_map2_foldl(_, [],        [],        [],        !A).
> my_map2_foldl(_, [],        [_H | _T], [],        !A).
> my_map2_foldl(_, [_H | _T], [],        [],        !A).
> my_map2_foldl(P, [H0 | T0], [H1 | T1], [H2 | T2], !A) :-
> 	call(P, H0, H1, H2, !A),
> 	my_map2_foldl(P, T0, T1, T2, !A).
> 
> %-----------------------------------------------------------------------------%
> %
> % Input/output predicates and genetic operators.
> %
> 
> %
> % This section contains the predicates required by the genetic
> % algorithm, both the read and write genotypes and phenotypes to and from
> % files, and to breed a new generation of genotypes.
> %
> % Note that as there is a degree of randomness to the process of
> % evolution, a random.supply variable needs to be passed to the genetic
> % operators (selection/6, crossover/5 and mutation/4) by the caller.
> %
> 
> 	% read_genotypes(Generation, Genotypes, !IO).
> 	%
> 	% Reads a list of genotypes from the file
> 	% MRFLAGS/Generation/genotypes, and unifies the list with Genotypes.
> 	%
> :- pred read_genotypes(int, list(genotype), io, io).
> :- mode read_genotypes(in, out, di, uo) is det.
> 
> read_genotypes(Generation, Genotypes, !IO) :-
> 	FileName = "MRFLAGS/" ++ int_to_string(Generation) ++ "/genotypes",
> 	open_input(FileName, OpenInputResult, !IO),
> 	(
> 		OpenInputResult = ok(Stream),
> 		read(Stream, ReadResult, !IO),
> 		(
> 			ReadResult = ok(Genotypes)
> 		;
> 			ReadResult = eof,
> 			progname("evolve", ProgName, !IO),
> 			error(ProgName ++ ": unexpected EOF while reading genotype")
> 		;
> 			ReadResult = error(Message, LineNo),
> 			progname("evolve", ProgName, !IO),
> 			error(ProgName ++ ": " ++ int_to_string(LineNo) ++ ": " ++ Message)
> 		),
> 		close_input(Stream, !IO)
> 	;
> 		OpenInputResult = error(ErrorCode),
> 		progname("evolve", ProgName, !IO),
> 		error(ProgName ++ ": " ++ error_message(ErrorCode))
> 	).
> 
> 	% read_phenotypes(Generation, Phenotypes, !IO).
> 	%
> 	% Reads a list of phenotypes from the file
> 	% MRFLAGS/Generation/phenotypes, and unifies the list with Phenotypes.
> 	%
> 	% XXX: lots of code duplicated in read_genotypes/4 above.
> 	%

I think you could replace read_genotypes/read_phenotypes with a single
polymorphic predicate.  io.read should work fine.

> :- pred read_phenotypes(int, list(phenotype), io, io).
> :- mode read_phenotypes(in, out, di, uo) is det.
> 
> read_phenotypes(Generation, Phenotypes, !IO) :-
> 	FileName = "MRFLAGS/" ++ int_to_string(Generation) ++ "/phenotypes",
> 	open_input(FileName, OpenInputResult, !IO),
> 	(
> 		OpenInputResult = ok(Stream),
> 		read(Stream, Result, !IO),
> 		(
> 			Result = ok(Phenotypes)
> 		;
> 			Result = eof,
> 			progname("evolve", ProgName, !IO),
> 			error(ProgName ++ ": unexpected EOF while reading phenotype")
> 		;
> 			Result = error(Message, LineNo),
> 			progname("evolve", ProgName, !IO),
> 			error(ProgName ++ ": " ++ int_to_string(LineNo) ++ ": " ++ Message)
> 		),
> 		close_input(Stream, !IO)
> 	;
> 		OpenInputResult = error(ErrorCode),
> 		progname("evolve", ProgName, !IO),
> 		error(ProgName ++ ": " ++ error_message(ErrorCode))
> 	).
> 
> 	% fitness(Genotype, Phenotype, Fitness).
> 	%
> 	% This predicate evaluates the fitness of a genotype given its
> 	% genotype.
> 	%
> :- pred fitness(genotype, phenotype, fitness).
> :- mode fitness(in, in, out) is det.
> 
> % FIXME: implement this.
> fitness(_, _, 1.0).
> 
> 	% selection(Genotypes, Fitness, Mothers, Fathers, !RS).
> 	%
> 	% This predicate randomly selects pairs of individuals for
> 	% reproduction.
> 	%
> :- pred selection(list(genotype), list(fitness), list(genotype),
> 		list(genotype), supply, supply).
> :- mode selection(in, in, out, out, di, uo) is det.
> 
> % FIXME: implement this.
> selection(Genotypes, _, Genotypes, Genotypes, !RS).
> 
> 	% crossover(Mother, Father, Child, !RS).
> 	%
> 	% This predicate takes two parent genotypes and randomly selects
> 	% optimisation flags from each to create a new child genotype.
> 	%
> :- pred crossover(genotype, genotype, genotype, supply, supply).
> :- mode crossover(in, in, out, di, uo) is det.
> 
> % FIXME: implement this.
> crossover(Mother, _, Mother, !RS).
> 
> 	% mutation(Child, Mutant, !RS).
> 	%
> 	% This predicate randomly mutates a genotype.
> 	%
> :- pred mutation(genotype, genotype, supply, supply).
> :- mode mutation(in, out, di, uo) is det.
> 
> % FIXME: implement this.
> mutation(!Genotype, !RS).
> 
> 	% flags(Genotype, Flags).
> 	%
> 	% Flags contains a shell statement to define a variable
> 	% containing the flags used to compile a benchmark program for a
> 	% given genotype.
> 	%
> :- pred flags(genotype, string).
> :- mode flags(in, out) is det.
> 
> flags(Genotype, Flags) :-
> 	Flags = join_list(" ", to_sorted_list(Genotype)) ++ "\n".
> 
> 	% print_genotypes(Generation, Genotypes, !IO).
> 	%
> 	% This predicate prints the list of genotypes to the file
> 	% MRFLAGS/Generation/genotypes.
> 	%
> :- pred print_genotypes(int, list(genotype), io, io).
> :- mode print_genotypes(in, in, di, uo) is det.
> 
> print_genotypes(Generation, Genotypes, !IO) :-
> 	FileName = "MRFLAGS/" ++ int_to_string(Generation) ++ "/genotypes",
> 	open_output(FileName, Result, !IO),
> 	(
> 		Result = ok(Stream),
> 		print(Stream, Genotypes, !IO),
> 		format(Stream, ".\n", [], !IO),
> 		close_output(Stream, !IO)
> 	;
> 		Result = error(ErrorCode),
> 		error(error_message(ErrorCode))
> 	).
> 
> 	% print_flags(Generation, Flags, !IO).
> 	%
> 	% This predicate prints the list of flags to the file
> 	% MRFLAGS/Generation/flags.
> 	%
> :- pred print_flags(int, list(string), io, io).
> :- mode print_flags(in, in, di, uo) is det.
> 
> print_flags(Generation, Flags, !IO) :-
> 	FileName = "MRFLAGS/" ++ int_to_string(Generation) ++ "/flags",
> 	open_output(FileName, Result, !IO),
> 	(
> 		Result = ok(Stream),
> 		foldl(print(Stream), Flags, !IO),

This foldl will not terminate the output flags with full stops or
newlines.  A better solution is

	io.write_list(Flags, ".\n", io.print(Stream), !IO),
	io.write_string(Stream, ".\n", !IO),

> 		close_output(Stream, !IO)
> 	;
> 		Result = error(ErrorCode),
> 		progname("evolve", ProgName, !IO),
> 		error_message(ErrorCode, ErrorMessage),
> 		error(ProgName ++ ": " ++ ErrorMessage)
> 	).
> 
> %-----------------------------------------------------------------------------%

Otherwise, that's looking good.  Can you show us the shell scripts
you're going to use as well when they're written?

Ta!
-- Ralph
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list