[m-rev.] for review: analysis framework (1/2)
Julien Fischer
juliensf at cs.mu.OZ.AU
Mon Jan 16 18:42:09 AEDT 2006
On Mon, 16 Jan 2006, Peter Wang wrote:
> Estimated hours taken: 30
> Branches: main
>
> Some work on the intermodule analysis framework. The main changes are that
> modules and analysis results have statuses associated with them, which are
> saved into the `.analysis' files, and there is now code to handle intermodule
s/in/into/
> dependency graphs (that record which modules are dependent on a particular
> analysis result).
>
> Automatic recompilation of modules that use out-of-date or invalid analysis
> results from other modules is not handled yet.
>
> analysis/analysis.m:
> analysis/analysis.file.m:
> Remove the `FuncInfo' type variable everywhere. This was originally
> designed to be used by analyses to store "extra" information that
> would be passed from an analysis implementation through the analysis
> framework, back to methods defined by the analysis implementation
> itself. One problem was that `FuncInfo' values were not designed to
> be saved to and restored from disk.
Presumably that could be dealt with by insisting that they be members of the
to_string typeclass though?
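For example, something like this (a made-up FuncInfo that just records an
arity, untested, and assuming string is imported):

    :- type my_func_info
        --->    my_func_info(int).              % e.g. the procedure's arity

    :- instance to_string(my_func_info) where [
        func(to_string/1) is my_func_info_to_string,
        func(from_string/1) is my_func_info_from_string
    ].

    :- func my_func_info_to_string(my_func_info) = string.

    my_func_info_to_string(my_func_info(Arity)) = string.int_to_string(Arity).

    :- func my_func_info_from_string(string) = my_func_info is semidet.

    my_func_info_from_string(Str) = my_func_info(Arity) :-
        string.to_int(Str, Arity).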
> Also, it made two `Call' or two
> `Answer' values hard to compare, as a `FuncInfo' value had to be
> present for a comparison call to be made, and it was not always
> obvious where that `FuncInfo' value would come from. I have changed
> it so that any information which might be stored in a
> `FuncInfo' should be stored in the corresponding `Call' value itself.
>
I don't understand this one; could you provide an example?
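Is the idea something like the following?  (Types made up, just to check my
understanding.)

        % Old arrangement: comparing two call patterns needed a separate
        % FuncInfo value to be supplied from somewhere.
    :- type gadget_func_info ---> gadget_func_info(int).   % e.g. arity
    :- type gadget_call_old ---> gadget_call_old.

        % New arrangement: the call pattern carries that information
        % itself, so two `Call' (or two `Answer') values can be compared
        % directly.
    :- type gadget_call ---> gadget_call(int).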
> Change the format of analysis result files to include an overall
> status for the module and a status for each analysis result. The
> statuses record whether the module or analysis result could be
> improved by further compilation, or if the module or analysis result
> is no longer valid.
>
> Add code to read and write intermodule dependency graphs (IMDGs). The
> IMDG file for module M records all the modules which depend on an
> analysis result for a procedure defined in M.
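Judging by parse_imdg_arc further down, I take it an entry in an `.imdg'
file ends up looking something like this (module name, analysis name,
func id and call pattern all made up):

    "client_module" -> trail_usage(1, "foo/2-0", "any_call").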
>
> Bump analysis file format version numbers as they are incompatible
> with earlier versions.
>
Does the README file in the analysis directory also need updating?
> compiler/mercury_compile.m:
> Update to match changes in the intermodule analysis framework.
>
> compiler/mmc_analysis.m:
> Add the trail usage analysis to the list of analyses to be used with
> the intermodule analysis framework.
> Update the entry for unused argument elimination.
At the moment the results from the intermodule analysis framework should
correspond to the pragmas in the .opt files when compiling with
--intermodule-optimization (although not with
--transitive-intermodule-optimization).
You should check that this is the case.
>
> compiler/add_pragma.m:
> compiler/hlds_module.m:
> compiler/trailing_analysis.m:
> Make the trail usage analysis pass able to make use of the intermodule
> analysis framework. Mainly, functions had to be converted to predicates
> taking I/O states.
>
Delete the last sentence - the main change was adding typeclass
instances.
> Associate each `trailing_status' in the `trailing_info' map with an
> `analysis_status', i.e. whether it is optimal or not.
>
> compiler/unused_args.m:
> Update to match the removal of `FuncInfo' arguments and the
> addition of analysis statuses.
>
> Record the unused argument analysis result for a procedure even if
> all of the procedure's arguments are used, so that callers of the
> procedure will know not to request more precise answers.
>
> Record the dependence of the current module on analysis results from
> other modules.
>
...
> +:- pred read_module_status(analysis_status::out, io::di, io::uo) is det.
> +
> +read_module_status(Status, !IO) :-
> + parser__read_term(TermResult `with_type` read_term, !IO),
> + ( TermResult = term(_, term__functor(term__atom(String), [], _)) ->
> + ( analysis_status_to_string(Status0, String) ->
> + Status = Status0
> + ;
> + error("read_module_status: unknown status " ++ String),
> + throw(invalid_analysis_file)
Isn't the call to throw redundant here?
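i.e. I would expect either the call to error or the throw, but not both.
If the intention is to raise invalid_analysis_file for any malformed status,
something like this (untested) would do:

    read_module_status(Status, !IO) :-
        parser__read_term(TermResult `with_type` read_term, !IO),
        (
            TermResult = term(_, term__functor(term__atom(String), [], _)),
            analysis_status_to_string(Status0, String)
        ->
            Status = Status0
        ;
            % Either the term did not parse or the status atom
            % was not recognised.
            throw(invalid_analysis_file)
        ).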
...
> @@ -163,49 +231,122 @@
> throw(invalid_analysis_file)
> ).
>
> +read_module_imdg(Info, ModuleId, ModuleEntries, !IO) :-
> + read_analysis_file(Info ^ compiler, ModuleId, ".imdg",
Make sure that mmake realclean (clean?) and the corresponding functionality in
mmc --make know how to clean up any new file types you add.
> +:- pred parse_imdg_arc(Compiler::in)
> + `with_type` parse_entry(module_analysis_map(imdg_arc))
> + `with_inst` parse_entry <= compiler(Compiler).
> +
> +parse_imdg_arc(Compiler, Term, Arcs0, Arcs) :-
> + (
> + Term = term.functor(atom("->"),
> + [term.functor(string(DependentModule), [], _), ResultTerm], _),
> + ResultTerm = functor(atom(AnalysisName),
> + [VersionNumberTerm, FuncIdTerm, CallPatternTerm], _),
> + FuncIdTerm = term.functor(term.string(FuncId), [], _),
> + CallPatternTerm = functor(string(CallPatternString), [], _),
> + analysis_type(_ : unit(Call), _ : unit(Answer))
> + = analyses(Compiler, AnalysisName),
> + CallPattern = from_string(CallPatternString) : Call
> + ->
> + (
> + VersionNumber = analysis_version_number(_ : Call, _ : Answer),
> + VersionNumberTerm = term.functor(
> + term.integer(VersionNumber), [], _)
> + ->
> + Arc = 'new imdg_arc'(CallPattern, DependentModule),
> + ( AnalysisArcs0 = map.search(Arcs0, AnalysisName) ->
> + AnalysisArcs1 = AnalysisArcs0
> + ;
> + AnalysisArcs1 = map.init
> + ),
> + ( FuncArcs0 = map.search(AnalysisArcs1, FuncId) ->
> + FuncArcs = [Arc | FuncArcs0]
> + ;
> + FuncArcs = [Arc]
> + ),
> + Arcs = map.set(Arcs0, AnalysisName,
> + map.set(AnalysisArcs1, FuncId, FuncArcs))
> + ;
> + % Ignore results with an out-of-date version number.
> + % XXX: is that the right thing to do?
> + % do we really need a version number for the IMDG?
> + Arcs = Arcs0
> + )
> + ;
> + throw(invalid_analysis_file)
> + ).
> +
> +%-----------------------------------------------------------------------------%
> +
> +:- type read_header(T) == pred(T, io, io).
> +:- inst read_header == (pred(out, di, uo) is det).
> +
> :- type parse_entry(T) == pred(term, T, T).
> :- inst parse_entry == (pred(in, in, out) is det).
>
Maybe these types would be better named read_analysis_header and
parse_analysis_entry? (And likewise for the write preds below.)
...
> Index: analysis/analysis.m
> ===================================================================
> RCS file: /home/mercury1/repository/mercury/analysis/analysis.m,v
> retrieving revision 1.2
> diff -u -r1.2 analysis.m
> --- analysis/analysis.m 5 Apr 2004 05:06:38 -0000 1.2
> +++ analysis/analysis.m 10 Jan 2006 04:33:53 -0000
> :- type analysis_name == string.
>
> :- type analysis_type
> - ---> some [FuncInfo, Call, Answer]
> - analysis_type(unit(FuncInfo), unit(Call), unit(Answer))
> - => analysis(FuncInfo, Call, Answer).
> -
> - % An analysis is defined by a type describing call patterns,
> - % a type defining answer patterns and a type giving information
> - % about the function being analysed (e.g. arity) which should
> - % be provided by the caller.
> -:- typeclass analysis(FuncInfo, Call, Answer) <=
> - (call_pattern(FuncInfo, Call),
> - answer_pattern(FuncInfo, Answer))
> + ---> some [Call, Answer]
> + analysis_type(unit(Call), unit(Answer))
> + => analysis(Call, Answer).
> +
> + % An analysis is defined by a type describing call patterns and
> + % a type defining answer patterns. If the analysis needs to store
> + % more information about the function being analysed (e.g. arity)
> + % it should be stored as part of the type for call patterns.
> + %
> +:- typeclass analysis(Call, Answer) <=
> + (call_pattern(Call),
> + answer_pattern(Answer))
> where
> [
> - func analysis_name(FuncInfo::unused, Call::unused, Answer::unused) =
> + func analysis_name(Call::unused, Answer::unused) =
> (analysis_name::out) is det,
>
> % The version number should be changed when the Call or Answer
> % types are changed so that results which use the old types
> % can be discarded.
> - func analysis_version_number(FuncInfo::unused, Call::unused,
> + func analysis_version_number(Call::unused,
> Answer::unused) = (int::out) is det,
>
> - func preferred_fixpoint_type(FuncInfo::unused, Call::unused,
> - Answer::unused) = (fixpoint_type::out) is det
> + func preferred_fixpoint_type(Call::unused,
> + Answer::unused) = (fixpoint_type::out) is det,
> +
> + % `top' and `bottom' should not really depend on the call pattern.
> + % However some analyses may choose to store extra information about
> + % the function in their `Call' types that might be needed for the
> + % answer pattern.
> + %
> + func bottom(Call) = Answer,
> + func top(Call) = Answer
> ].
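A small example instance in the documentation here might help analysis
implementors.  Something like this made-up one (untested; the corresponding
call_pattern, answer_pattern, partial_order and to_string instances would
also be needed):

    :- type widget_call ---> widget_call.
    :- type widget_answer
        --->    widget_safe
        ;       widget_unsafe.

    :- instance analysis(widget_call, widget_answer) where [
        analysis_name(_, _) = "widget",
        analysis_version_number(_, _) = 1,
        preferred_fixpoint_type(_, _) = greatest_fixpoint,
        bottom(_) = widget_safe,
        top(_) = widget_unsafe
    ].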
>
> :- type fixpoint_type
> @@ -75,18 +91,15 @@
> % Can stop at any time.
> greatest_fixpoint.
>
> -:- typeclass call_pattern(FuncInfo, Call)
> - <= (partial_order(FuncInfo, Call), to_string(Call)) where [].
> +:- typeclass call_pattern(Call)
> + <= (partial_order(Call), to_string(Call)) where [].
>
> -:- typeclass answer_pattern(FuncInfo, Answer)
> - <= (partial_order(FuncInfo, Answer), to_string(Answer)) where [
> - func bottom(FuncInfo) = Answer,
> - func top(FuncInfo) = Answer
> -].
> +:- typeclass answer_pattern(Answer)
> + <= (partial_order(Answer), to_string(Answer)) where [].
>
> -:- typeclass partial_order(FuncInfo, Call) where [
> - pred more_precise_than(FuncInfo::in, Call::in, Call::in) is semidet,
> - pred equivalent(FuncInfo::in, Call::in, Call::in) is semidet
> +:- typeclass partial_order(T) where [
> + pred more_precise_than(T::in, T::in) is semidet,
> + pred equivalent(T::in, T::in) is semidet
> ].
>
> :- typeclass to_string(S) where [
> @@ -94,11 +107,25 @@
> func from_string(string) = S is semidet
> ].
>
> + % A call pattern that can be used by analyses that do not need
> + % finer granularity.
> + %
> :- type any_call ---> any_call.
> -:- instance call_pattern(unit, any_call).
> -:- instance partial_order(unit, any_call).
> +:- instance call_pattern(any_call).
> +:- instance partial_order(any_call).
> :- instance to_string(any_call).
>
> + % The status of a module or a specific analysis result.
> + %
> +:- type analysis_status
> + ---> invalid
> + ; suboptimal
> + ; optimal.
> +
> + % Least upper bound of two analysis_status values.
> + %
> +:- func lub(analysis_status, analysis_status) = analysis_status.
> +
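I take it lub returns the more pessimistic of its two arguments, so that a
single invalid (or suboptimal) result drags the module status down?  If so,
it may be worth spelling that out in the comment; I'm imagining something
like this (untested):

    lub(StatusA, StatusB) = Status :-
        ( ( StatusA = invalid ; StatusB = invalid ) ->
            Status = invalid
        ; ( StatusA = suboptimal ; StatusB = suboptimal ) ->
            Status = suboptimal
        ;
            Status = optimal
        ).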
> % This will need to encode language specific details like
> % whether it is a predicate or a function, and the arity
> % and mode number.
...
> :- type analysis_info
> ---> some [Compiler] analysis_info(
> compiler :: Compiler,
> +
> + % Holds outstanding requests for more specialised
> + % variants of procedures. Requests are added to this
> + % map as analyses proceed and written out to disk
> + % at the end of the compilation of this module.
> + %
> analysis_requests :: analysis_map(analysis_request),
> - analysis_results :: analysis_map(analysis_result)
> +
> + % The overall status of each module.
> + %
> + module_statuses :: map(module_id, analysis_status),
> +
> + % The "old" map stores analysis results read in from
> + % disk. New results generated while analysing the
> + % current module are added to the "new" map. After
> + % all the analyses the two maps are compared to
> + % see which analysis results have changed. Other
> + % modules may need to be marked or invalidated as a
> + % result. Then "new" results are moved into the "old"
> + % map, from where they can be written to disk.
> + %
> + old_analysis_results :: analysis_map(analysis_result),
> + new_analysis_results :: analysis_map(analysis_result),
> +
> + % The Inter-module Dependency Graph records dependences
s/dependencies/dependences/
> + % of an entire module's analysis results on another
> + % module's answer patterns. e.g. assume module M1
> + % contains function F1 that has an analysis result that
> + % used the answer F2:CP2->AP2 from module M2. If AP2
> + % changes then all of M1 will either be marked
> + % `suboptimal' or `invalid'. Finer-grained dependency
> + % tracking would allow only F1 to be recompiled,
> + % instead of all of M1, but we don't do that.
> + %
> + % IMDGs are loaded from disk into the old map.
> + % During analysis any dependences of the current module
> + % on other modules is added into the new map.
> + % At the end of analysis all the arcs which terminate
> + % at the current module are cleared from the old map
> + % and replaced by those in the new map.
> + %
> + % XXX: check if we really need two maps
> + %
> + old_imdg :: analysis_map(imdg_arc),
> + new_imdg :: analysis_map(imdg_arc)
> ) => compiler(Compiler).
>
...
> +record_dependency(CallerModuleId, AnalysisName, CalleeModuleId, FuncId, Call,
> + !Info) :-
> + (if CallerModuleId = CalleeModuleId then
> + % XXX this assertion breaks compiling the standard library with
> + % --analyse-trail-usage at the moment
How is it breaking it?
...
> + % The algorithm is from Nick's thesis, pp. 108-9.
I suggest putting a pointer to analysis/README there, since that file gives
the full details of the thesis.
> + % Or my corruption thereof.
> + %
> + % For each new analysis result (P^M:DP --> Ans_new):
> + % Read in the registry of M if necessary
> + % If there is an existing analysis result (P^M:DP --> Ans_old):
> + % if Ans_new \= Ans_old:
> + % Replace the entry in the registry with P^M:DP --> Ans_new
> + % if Ans_new `more_precise_than` Ans_old
> + % Status = suboptimal
> + % else
> + % Status = invalid
> + % For each entry (Q^N:DQ --> P^M:DP) in the IMDG:
> + % % Mark Q^N:DQ --> _ (_) with Status
> + % Actually, we don't do that. We only mark the
> + % module N's _overall_ status with the
> + % least upper bound of its old status and Status.
> + % Else (P:DP --> Ans_old) did not exist:
> + % Insert result (P:DP --> Ans_new) into the registry.
> + %
> + % Finally, clear out the "new" analysis results map. When we write
> + % out the analysis files we will do it from the "old" results map.
> + %
...
> + % In this procedure we have just finished compiling module ModuleId
> + % and will write out data currently cached in the analysis_info
> + % structure out to disk.
> + %
> +write_analysis_files(ModuleId, ImportedModuleIds, !Info, !IO) :-
> + % The current module was just compiled so we set its status to the
> + % lub of all the new analysis results generated.
> + (if NewResults = !.Info ^ new_analysis_results ^ elem(ModuleId) then
> + ModuleStatus = lub_result_statuses(NewResults)
> + else
> + ModuleStatus = optimal
> + ),
> +
> + update_analysis_registry(!Info, !IO),
> +
> + !:Info = !.Info ^ module_statuses ^ elem(ModuleId) := ModuleStatus,
> +
> + update_intermodule_dependencies(ModuleId, ImportedModuleIds,
> + !Info, !IO),
> + (if map.is_empty(!.Info ^ new_analysis_results) then
> + true
> + else
> + io.print("Warning: new_analysis_results is not empty\n", !IO),
> + io.print(!.Info ^ new_analysis_results, !IO),
> + io.nl(!IO)
> + ),
> +
> + % Write the results for all the modules we know of. For the
> + % module being compiled, its analysis results may have changed.
s/the/its/
...
> +% XXX make this enableable with a command-line option. A problem is that we
> +% don't want to make the analysis directory dependent on anything in the
> +% compiler directory.
> +
> +:- pred debug_msg(pred(io, io)::in(pred(di, uo) is det), io::di, io::uo)
> + is det.
> +
> +debug_msg(_P, !IO) :-
> + % P(!IO),
> + true.
I suggest using a mutable to keep track of whether debugging traces
are enabled, and then exporting a predicate from the analysis library that
client compilers can use to turn debugging on and off.
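For example, something along these lines (untested; assumes the mutable
support with attach_to_io_state is acceptable here, and that bool is
imported):

    :- mutable(debug_analysis, bool, no, ground,
        [untrailed, attach_to_io_state]).

        % Exported from the analysis library so that client compilers
        % can switch tracing on and off.
        %
    :- pred enable_debug_messages(bool::in, io::di, io::uo) is det.

    enable_debug_messages(Debug, !IO) :-
        set_debug_analysis(Debug, !IO).

    debug_msg(P, !IO) :-
        get_debug_analysis(Debug, !IO),
        (
            Debug = yes,
            P(!IO)
        ;
            Debug = no
        ).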
To be continued ...
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------