[m-dev.] Replacing lists of items with a more structured representation

Simon Taylor staylr at gmail.com
Fri Apr 6 11:35:57 AEST 2007


On 02-Apr-2007, Zoltan Somogyi <zs at csse.unimelb.edu.au> wrote:
> I have been working on replacing the representation of the program
> at the parse_tree level from a list of items (which had to be scanned
> far too often and whose invariants were never documented) with a more
> structured representation, in which a parse tree unit consists of
> a list of regions (interface, implementation, and some others imported
> from .int* and .*opt files).

> +:- type unit_region
> +    --->    unit_region(
> +                region_kind             :: region_kind,
> +                region_context          :: term.context,
> +                region_clauses          :: list(item_clause_info),
> +                region_type_defns       :: list(item_type_defn_info),
> +                region_inst_defns       :: list(item_inst_defn_info),
> +                region_mode_defns       :: list(item_mode_defn_info),
> +                region_pred_decls       :: list(item_pred_decl_info),
> +                region_mode_decls       :: list(item_mode_decl_info),
> +                region_pragmas          :: list(item_pragma_info),
> + 		   ....
 
Rather than organising by where the items came from, would it be better to
collect all the parse tree items for a program entity (type, predicate, etc)
in one place, and for each item in a program entity keep track of where that
item came from so we know what to put in interface files?  This would greatly
simplify error checking in modules.m and the make_hlds modules, and allow some
errors (e.g. duplicate predicates) to be detected earlier, and in the right
place.  At the moment we sometimes generate duplicate predicate errors for
declarations in interface files.

That might be too large a change.

:- type parse_tree == map(module_name, module_parse_tree).

:- type module_parse_tree
	---->	module_parse_tree(
			types :: list(type_entity),
			modes :: list(mode_entity),
			insts :: list(inst_entity),
			preds_and_funcs :: list(pred_entity),
			typeclasses :: list(typeclass_entity),
			...
	).

:- type pred_or_func_entity
	--->	pred_or_func_entity(
			item_and_context(item_pred_or_func_decl_info),
			item_and_context_list(item_mode_decl_info),
			item_and_context_list(list(item_pred_pragma_info)
		).

:- type item_and_context(T)
	---> item_and_context(
		item :: T,
		item_context :: term.context,
		item_source :: item_source
	).

:- type item_and_context_list(T) == list(item_and_context(T)).

:- type item_source
	--->	current_module(interface_or_implementation)
	;	imported_module(module_name, interface_type, where_imported)
	;	optimization_interface(module_name)
	.

:- type interface_type
	--->	int0
	;	int
	;	int2
	;	int3
	.

:- type where_imported
	--->	current_module(interface_or_implementation)
	;	imported_module
	;	optimization_interface
	.

> Index: compiler/make_hlds_passes.m
> ===================================================================
> +:- pred collect_mq_info_promise(item_promise_info::in,
> +:- func region_status(region_kind) = item_status.
> +
> +    % XXX In the original version of this code, the marker for
> +    % transitively_imported was ignored, leading to the items in the region
> +    % it started being handled with the same item_status as the items in the
> +    % previous region. This was probably a bug, but I (zs) am not sure that
> +    % the item_status we return below when _IsTrans = is_transitively_imported
> +    % is actually correct.

`transitively_imported' is probably a bad name.  Everything after the
`:- transitively_imported' item cannot be referenced in the current
source file, so module_qual.m could stop processing at that item.
It would be better to have another item_status `may_not_appear'
for these items.

> Index: compiler/module_qual.m
> ===================================================================
> +collect_mq_info_region(Region, !Info) :-
> +    multi_map.keys(IncludeModulesMap, IncludeModuleNames),
> +    SubModuleNames = list.map(parse_tree_unit_name, SubModules),
> +    list.foldl(add_module_defn, IncludeModuleNames, !Info),
> +    list.foldl(add_module_defn, SubModuleNames, !Info),
> +
> +    % XXX zs: the next line follows the original version of this code.
> +    % I don't know whether processing submodules and leaving the final value of
> +    % !:Info in the submodules as the initial value of !.Info in the next
> +    % region was deliberate or a bug.
> +    list.foldl(collect_mq_info_module, SubModules, !Info).
 
module_qual.m was only ever passed a single module (without its nested
sub-modules), so this should never have come up.

> +collect_mq_enter_region(RegionKind, !Info) :-
> +    % The checks for IsTrans are assertions, because collect_mq_info_regions
> +    % should have already checked for that.
> +    %
> +    % XXX zs: why do we not explicitly set may_be_unqualified/must_be_qualified
> +    % for the source regions?

My guess is that the source regions always came first, so the
default of may_be_unqualified set by init_mq_info is OK.
It would be better to set it explicitly.

> +:- pred collect_mq_info_promise(item_promise_info::in,
> +    mq_info::in, mq_info::out) is det.
> +
> +collect_mq_info_promise(PromiseInfo, !Info) :-
> +    % XXX zs: why do we not ignore promises when the import_status is
> +    % mq_status_abstract_imported?
> +    PromiseInfo = item_promise_info(_, _PromiseType, Goal, _ProgVarSet,
> +        _UnivVars),
>      process_assert(Goal, SymNames, Success),

I don't know enough about promises to answer this one.

> Index: compiler/modules.m
> ===================================================================
> +:- pred leak_implementation_into_interface(parse_tree_unit::in,
> +    parse_tree_unit::in, parse_tree_unit::out) is det.

This predicate needs a comment to explain why the implementation is being
leaked into the interface (the old code was poorly documented as well).
I think it's to do with foreign types?

> +leak_implementation_into_interface(ImplementationParseTreeUnit,
> +        !InterfaceParseTreeUnit) :-
...
> +            % XXX We used to include the include_module declarations from the
> +            % implementation section as well.
> +            IntRegions = [IntRegion0, !.AbstractImpRegion],
> +            !:InterfaceParseTreeUnit =
> +                !.InterfaceParseTreeUnit ^ parse_tree_unit_regions
> +                    := IntRegions
>          )

I can't see why they would be needed either.

> @@ -3141,8 +3200,8 @@
>              !IO),
>          module_name_to_file_name(ModuleName, ".java_date", no,
>              JavaDateFileName, !IO),
> -        % XXX Why is the extension hardcoded to .pic_o here?  That looks
> -        % wrong.  It should probably be .$(EXT_FOR_PIC_OBJECT) - juliensf.
> +        % XXX Why is the extension hardcoded to .pic_o here?  That looks wrong.
> +        % It should probably be .$(EXT_FOR_PIC_OBJECT) - juliensf.

I think the reason is that if you have a rule like the one below, make
complains if "module.$O" and "module.$(EXT_FOR_PIC_OBJECT)" expand to
the same thing.  In those cases mmake will never attempt to build
`module.pic_o'.  If we are generating PIC object files, they will always
have extension `.pic_o'.

module.$O module.$(EXT_FOR_PIC_OBJECT): module.c module2.mh
	...

> @@ -4025,50 +4083,52 @@
>      io::di, io::uo) is det.
>  
>  build_deps_map(FileName, ModuleName, DepsMap, !IO) :-
> -    % read in the top-level file (to figure out its module name)
> -    read_mod_from_file(FileName, ".m", "Reading file", no, no, Items, Error,
> -        ModuleName, _, !IO),
> +    % Read in the top-level file (to figure out its module name).
> +    read_mod_from_file(FileName, ".m", "Reading file", no, no, ParseTreeUnit,
> +        Specs1, _, !IO),
> +    ModuleName = ParseTreeUnit ^ parse_tree_unit_name,
>      string.append(FileName, ".m", SourceFileName),
> -    split_into_submodules(ModuleName, Items, SubModuleList, [], Specs),
> -    sort_error_specs(Specs, SortedSpecs),
> +    split_into_submodules(ParseTreeUnit, ModuleList, Specs1, Specs),
>      globals.io_get_globals(Globals, !IO),
> -    write_error_specs(SortedSpecs, Globals, 0, _NumWarnings, 0, _NumErrors,
> -        !IO),
> -    assoc_list.keys(SubModuleList, SubModuleNames),
> -    list.map(init_dependencies(SourceFileName, ModuleName, SubModuleNames,
> -        Error, Globals), SubModuleList, ModuleImportsList),
> +    write_error_specs(Specs, Globals, 0, _NumWarnings, 0, _NumErrors, !IO),
> +    % XXX _NumErrors
> +    ModuleNames = list.map(parse_tree_unit_name, ModuleList),
> +    TopModuleName = ParseTreeUnit ^ parse_tree_unit_name,
> +    % ZZZ Why do pass ModuleNames here?
> +    list.map(init_dependencies(SourceFileName, TopModuleName, ModuleNames,
> +        Specs, Globals), ModuleList, ModuleImportsList),

The nested module names are only used by `mmc --make', so it may not
be necessary to pass them here, because the dependencies generated
here are only used by mmake.  Passing them here can't hurt.

> +region_has_main(Region) = HasMain :-
> +    PredDecls = Region ^ region_pred_decls,
> +    % XXX Should we require that the declaration be in an interface region?

I think so.

> -process_module_long_interfaces(_, _, [], _Ext, _, _,
> -        !IndirectImports, !ImplIndirectImports, !Module, !IO).
> +process_module_long_interfaces(_, _, [], _Extension, _, _,
> +        !IndirectImports, !ImpIndirectImports, !ModuleImports, !IO).
>  process_module_long_interfaces(ReadModules, NeedQualifier, [Import | Imports],
...

> +        maybe_to_bool(!.ModuleImports ^ maybe_timestamps, ReturnTimeStamp),
> +        maybe_read_mod(ReadModules, Import, Extension,
> +            "Reading interface for module", yes,
> +            ReturnTimeStamp, LongIntParseTreeUnit0, LongIntSpecs,
> +            _LongIntFileName, MaybeTimeStamp, !IO),
> +
> +        % XXX In process_module_PRIVATE_interfaces, the dependencies
> +        % are computed from the equivalent of LongIntParseTreeUnit,
> +        % not LongIntParseTreeUnit0.
>
> -        get_dependencies(LongIntItems,
> +        get_region_dependencies(LongIntParseTreeUnit0,
>              IndirectImports1, IndirectUses1,
> -            ImplIndirectImports1, ImplIndirectUses1),
> -        replace_section_decls(IntStatusItem, ImpStatusItem,
> -            LongIntItems, Items),
> -        maybe_add_int_error(LongIntError, ModError0, ModError),
> +            ImpIndirectImports1, ImpIndirectUses1),
> +
> +        replace_region_kinds(IntKind, ImpKind,
> +            LongIntParseTreeUnit0, LongIntParseTreeUnit),

I think the XXX is a bug.  The only time it makes a difference is
when reading the interface files needed by .opt files, which are
implementation dependencies.

> +get_interface_and_implementation_regions([], !IntRegions, !ImpRegions).
> +get_interface_and_implementation_regions([Region | Regions],
> +        !IntRegions, !ImpRegions) :-
> +    get_interface_and_implementation_regions(Regions,
> +        !IntRegions, !ImpRegions),
> +    RegionKind = Region ^ region_kind,
> +    (
> +        RegionKind = source_region(interface_region),
> +        !:IntRegions = [Region | !.IntRegions]
> +    ;
> +        RegionKind = source_region(implementation_region),
> +        !:ImpRegions = [Region | !.ImpRegions]
> +    ;
> +        RegionKind = external_region(_)
> +        % XXX This code used to handle private interfaces and optimization
> +        % regions as if they were part of whatever region came before.
> +        % XXX This code used to ignore Regions if it found an imported or used
> +        % region.
> +    ).

This code assumed that imported items came after all items for the current
module, so as soon as an imported item was found it could stop processing.

> +get_short_interface_region(ShortInterfaceKind, Region0, Region) :-
> +    Region0 = unit_region(RegionKind, Context, _Clauses, TypeDefns0,
> +        InstDefns, ModeDefns, _PredDecls, _ModeDecls, Pragmas0, _Promises,
> +        TypeClasses0, Instances0, _Initialises, _Finalises, _Mutables,
> +        _Externals, Imports, Uses, Includes, SubModules, _Versions),
> +    expect(unify(SubModules, []), this_file,
> +        "submodules in get_short_interface_region"),
> +
> +    TypeDefns = list.map(maybe_make_type_defn_abstract(ShortInterfaceKind),
> +        TypeDefns0),
> +    TypeClasses = list.map(make_typeclass_abstract, TypeClasses0),
> +    % XXX zs: we used to make instances abstract for int2 files,
> +    % but not for int3 files.
> +    Instances = list.map(make_abstract_instance, Instances0),
> +    list.filter(include_pragma_in_short_interface, Pragmas0, Pragmas),

I think you're right to fix that behaviour.

Simon.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at csse.unimelb.edu.au
Administrative Queries: owner-mercury-developers at csse.unimelb.edu.au
Subscriptions:          mercury-developers-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the developers mailing list