[m-rev.] for review: Use consistent integer types for some RTTI fields.

Zoltan Somogyi zoltan.somogyi at runbox.com
Sat Nov 17 17:32:50 AEDT 2018



On Sat, 17 Nov 2018 15:05:58 +1100, Peter Wang <novalazy at gmail.com> wrote:
> diff --git a/compiler/rtti.m b/compiler/rtti.m
> index 43cce7dab..c91e34f17 100644
> --- a/compiler/rtti.m
> +++ b/compiler/rtti.m
>
>  :- type type_ctor_data
>      --->    type_ctor_data(
> -                tcr_version         :: int,
> +                tcr_version         :: int8,
>                  tcr_module_name     :: module_name,
>                  tcr_type_name       :: string,
>                  tcr_arity           :: int,
>                  tcr_unify_pred      :: univ,
>                  tcr_compare_pred    :: univ,
>                  tcr_flags           :: set(type_ctor_flag),
>                  tcr_rep_details     :: type_ctor_details
>              ).

Changing the signedness of fields in the runtime is tricky, because
you will get warnings from the C compiler, which we normally regard
as errors. However, changing the signedness of these fields in the compiler
is trivial. If I were you, for each field you want to change to unsigned
in the runtime, go ahead and change it to unsigned in the *compiler*
right now, since that should minimize disruption. Just take a snapshot
of your changes before you do it, so you can undo it quickly if some parts
of the compiler insist on stuffing a *meaningful* negative value in there anyway.
  
>      % Each of the following values corresponds to one of the
> @@ -185,21 +186,21 @@
>                  foreign_enum_ordinal_table :: map(int, foreign_enum_functor),
>                  foreign_enum_name_table    :: map(string,
>                                                  foreign_enum_functor),
>                  foreign_enum_functor_number_mapping
>                                             :: list(int)
>              )
>      ;       tcd_du(
>                  du_axioms           :: equality_axioms,
>                  du_functors         :: list(du_functor),
>                  du_value_table      :: ptag_map,
> -                du_name_table       :: map(string, map(int, du_functor)),
> +                du_name_table       :: map(string, map(int16, du_functor)),
>                  du_functor_number_mapping
>                                      :: list(int)
>              )

Shouldn't the last field be changed to int16s (or uint16s) as well?

This raises another point. What is the maximum number of function symbols
that a type is allowed to have? This code says that number must fit in 16 bits;
other code says only that it must fit in 32 bits. We should

- decide on one or the other,
- document the decision in the reference manual, and
- define a C type in the runtime and a Mercury type in the compiler
  that aliases uint16_t or uint32_t, and use that consistently.

> @@ -257,21 +258,21 @@
>                  nt_subtype_info     :: functor_subtype_info
>              ).
>  
>      % Descriptor for a functor in a du type.
>      %
>      % This type mostly corresponds to the C type MR_DuFunctorDesc.
>      %
>  :- type du_functor
>      --->    du_functor(
>                  du_name             :: string,
> -                du_orig_arity       :: int,
> +                du_orig_arity       :: int16,
>                  du_ordinal          :: int,
>                  du_rep              :: du_rep,
>                  du_arg_infos        :: list(du_arg_info),
>                  du_exist_info       :: maybe(exist_info),
>                  du_subtype_info     :: functor_subtype_info
>              ).

We should do steps 2 and 3 above for the maximum arity
of a function symbol as well, though I don't think anyone will argue
for allowing arities that don't fit in 16 bits. For predicates, the maximum
arity we allow is either 1023 or 1024 (I don't recall which), and we should
probably use the same number here.

> @@ -774,31 +775,32 @@
> -    % Given a type constructor with the given details, return the number
> -    % of primary tag values used by the type. The return value will be
> -    % negative if the type constructor doesn't reserve primary tags.
> +    % Given a type constructor with the given details, return `yes(NumPtags)'
> +    % where NumPtags is the number of primary tag values used by the type,
> +    % or `no' if the type constructor doesn't reserve primary tags.
>      %
> -:- func type_ctor_details_num_ptags(type_ctor_details) = int.
> +:- func type_ctor_details_num_ptags(type_ctor_details) = maybe(int).

Please also change "reserve" to "use" here.

>  output_enum_functor_defn(Info, RttiTypeCtor, EnumFunctor, !DeclSet, !IO) :-
>      EnumFunctor = enum_functor(FunctorName, Ordinal),
>      output_generic_rtti_data_defn_start(Info,
>          ctor_rtti_id(RttiTypeCtor, type_ctor_enum_functor_desc(Ordinal)),
>          !DeclSet, !IO),
>      io.write_string(" = {\n\t""", !IO),
> +    % MR_enum_functor_name
>      c_util.output_quoted_string_cur_stream(FunctorName, !IO),
>      io.write_string(""",\n\t", !IO),
> +    % MR_enum_functor_ordinal
>      io.write_int(Ordinal, !IO),
>      io.write_string("\n};\n", !IO).

Earlier bug: the ordinal should be int32 (later uint32), not int.

>  output_foreign_enum_functor_defn(Info, RttiTypeCtor, ForeignEnumFunctor,
>          !DeclSet, !IO) :-
>      ForeignEnumFunctor = foreign_enum_functor(FunctorName, FunctorOrdinal,
>          FunctorValue),
>      RttiId = ctor_rtti_id(RttiTypeCtor,
>          type_ctor_foreign_enum_functor_desc(FunctorOrdinal)),
>      output_generic_rtti_data_defn_start(Info, RttiId, !DeclSet, !IO),
>      io.write_string(" = {\n\t""", !IO),
> +    % MR_foreign_enum_functor_name
>      c_util.output_quoted_string_cur_stream(FunctorName, !IO),
>      io.write_string(""",\n\t", !IO),
> +    % MR_foreign_enum_functor_ordinal
>      io.write_int(FunctorOrdinal, !IO),
>      io.write_string(",\n\t", !IO),
> +    % MR_foreign_enum_functor_value
>      io.write_string(FunctorValue, !IO),
>      io.write_string("\n};\n", !IO).

Same here.

> @@ -938,67 +980,77 @@ output_du_functor_defn(Info, RttiTypeCtor, DuFunctor, !DeclSet, !IO) :-
>          SectagAndLocn = sectag_locn_remote_word(StagUint),
>          Locn = "MR_SECTAG_REMOTE_FULL_WORD",
>          Stag = uint.cast_to_int(StagUint),
>          NumSectagBits = 0u8
>      ;
>          SectagAndLocn = sectag_locn_remote_bits(StagUint, NumSectagBits,
>              _Mask),
>          Locn = "MR_SECTAG_REMOTE_BITS",
>          Stag = uint.cast_to_int(StagUint)
>      ),
> +    % MR_du_functor_sectag_locn
>      io.write_string(Locn, !IO),
>      io.write_string(",\n\t", !IO),
> +    % MR_du_functor_primary
>      io.write_uint8(PtagUint8, !IO),
>      io.write_string(",\n\t", !IO),
> +    % MR_du_functor_secondary
>      io.write_int(Stag, !IO),
>      io.write_string(",\n\t", !IO),
> +    % MR_du_functor_ordinal
>      io.write_int(Ordinal, !IO),
>      io.write_string(",\n\t", !IO),

The sectag as well as the ordinal fields should be 32 bits here as well.
In general, it seems that we were writing out 32 bit fields as ints,
because ints *were* 32 bits when we wrote it :-)

All the 32 bit fields will need to be treated as such in rtti_to_mlds.m as well;
I am not commenting on them individually.

> @@ -1182,26 +1207,25 @@ gen_du_ptag_ordered_table(ModuleInfo, RttiTypeCtor, PtagMap, !GlobalData) :-
>  
>  gen_du_ptag_ordered_table_body(_, _, _, [], []).
>  gen_du_ptag_ordered_table_body(ModuleName, RttiTypeCtor, CurPtag,
>          [Ptag - SectagTable | PtagTail], [Initializer | Initializers]) :-
>      expect(unify(Ptag, CurPtag), $module, $pred, "ptag mismatch"),
>      SectagTable = sectag_table(SectagLocn, NumSectagBits, NumSharers,
>          _SectagMap),
>      RttiName = type_ctor_du_ptag_layout(Ptag),
>      RttiId = ctor_rtti_id(RttiTypeCtor, RttiName),
>      Initializer = init_struct(mlds_rtti_type(item_type(RttiId)), [
> -        % XXX ARG_PACK Why isn't the num_sharers field itself unsigned?
> -        gen_init_int(uint.cast_to_int(NumSharers)),
> -        gen_init_sectag_locn(SectagLocn),
> +        gen_init_uint(NumSharers),          % MR_sectag_sharers
> +        gen_init_sectag_locn(SectagLocn),   % MR_sectag_locn
>          gen_init_rtti_name(ModuleName, RttiTypeCtor,
> -            type_ctor_du_stag_ordered_table(Ptag)),
> -        init_obj(ml_const(mlconst_int8(NumSectagBits)))
> +            type_ctor_du_stag_ordered_table(Ptag)), % MR_sectag_alternatives
> +        gen_init_int8(NumSectagBits)        % MR_sectag_numbits
>      ]),
>      CurPtag = ptag(CurPtagUint8),
>      NextPtag = ptag(CurPtagUint8 + 1u8),
>      gen_du_ptag_ordered_table_body(ModuleName, RttiTypeCtor, NextPtag,
>          PtagTail, Initializers).

NumSharers should be 32 bit as well (and in rtti_out.m as well, if it isn't 32 bits
there already), provided we want to allow types with more than 2^16 function
symbols.

> @@ -718,22 +720,22 @@ make_du_functors(ModuleInfo, [CtorRepn | CtorRepns],
>          MaybeExistInfo = no
>      ;
>          MaybeExistConstraints = exist_constraints(ExistConstraints),
>          ExistConstraints = cons_exist_constraints(ExistTVars, _, _, _),
>          module_info_get_class_table(ModuleInfo, ClassTable),
>          generate_exist_info(ExistConstraints, ClassTable, ExistInfo),
>          MaybeExistInfo = yes(ExistInfo)
>      ),
>      list.map_foldl(generate_du_arg_info(TypeArity, ExistTVars),
>          ConsArgRepns, ArgInfos, functor_subtype_none, FunctorSubtypeInfo),
> -    DuFunctor = du_functor(FunctorName, Arity, CurOrdinal, DuRep,
> -        ArgInfos, MaybeExistInfo, FunctorSubtypeInfo),
> +    DuFunctor = du_functor(FunctorName, int16.det_from_int(Arity), CurOrdinal,
> +        DuRep, ArgInfos, MaybeExistInfo, FunctorSubtypeInfo),

Eventually (i.e. maybe not as part of this change), the source of Arity
should give it to us as 16 bits. We should only check for the maximum arity
of a function symbol, or the maximum number of function symbols in a type,
should be in the front end, where we can generate an error message if the
limit is exceeded.

>  get_functor_du(TypeCtorRep, TypeInfo, TypeCtorInfo, FunctorNumber,
>          FunctorName, Arity, PseudoTypeInfoList, Names) :-
>      TypeFunctors = get_type_ctor_functors(TypeCtorInfo),
>      DuFunctorDesc = TypeFunctors ^ du_functor_desc(TypeCtorRep, FunctorNumber),
>  
>      FunctorName = DuFunctorDesc ^ du_functor_name,
> -    Arity = DuFunctorDesc ^ du_functor_arity,
> +    Arity = int16.to_int(DuFunctorDesc ^ du_functor_arity),

Any reason why you convert this to int here (and elsewhere in this module)?

> --- a/runtime/mercury_type_info.h
> +++ b/runtime/mercury_type_info.h
> @@ -738,22 +738,22 @@ typedef const MR_TypeClassConstraintStruct      *MR_TypeClassConstraint;
>          ((MR_TypeClassConstraint) &((p).MR_tc_constr_type_class_info))
>  
>  // The argument number field gives the offset in the cell (in a form in which
>  // it can be given to the MR_field macro directly) of either of the typeinfo
>  // itself or of the typeclassinfo containing the typeinfo. If the former,
>  // the offset field will be negative; otherwise, it will be an integer
>  // which can be given as a second argument to the MR_typeclass_info_type_info
>  // macro.
>  
>  typedef struct {
> -    MR_int_least16_t    MR_exist_arg_num;
> -    MR_int_least16_t    MR_exist_offset_in_tci;
> +    MR_int_least16_t    MR_exist_arg_num;       // XXX make unsigned?
> +    MR_int_least16_t    MR_exist_offset_in_tci; // negative is possible
>  } MR_DuExistLocn;

If I were you, I would replace the "XXX" with something like
"XXX MAKE_FIELD_UNSIGNED" everywhere in this diff.
  
> @@ -881,21 +882,21 @@ typedef struct {
>      // so we should consider changing this. However, any such change
>      // would require a nontrivial bootstrapping sequence.
>      //
>      // In the special case of subword-sized arguments that are packed
>      // next to a remote secondary tag (MR_SECTAG_REMOTE_BITS), the value of
>      // the MR_arg_offset field will be -1.
>      // In the special case of subword-sized arguments that are packed
>      // next to a local secondary tag (MR_SECTAG_LOCAL_BITS), the value of
>      // the MR_arg_offset field will be -2.
>  
> -    MR_int_least8_t         MR_arg_shift;
> +    MR_int_least8_t         MR_arg_shift;   // XXX make unsigned?
>      MR_int_least8_t         MR_arg_bits;
>      // If MR_arg_bits is 0, then the argument occupies the entire word
>      // at the offset given by MR_arg_offset within the term's memory cell.
>      // In this case, which is the usual case, MR_arg_shift is not relevant.
>      //
>      // Nonzero values of MR_arg_bits mean that the size of the argument
>      // is not the same as the size of one word. These nonzero values
>      // fall into two categories: positive and negative.
>      //
>      // A strictly positive value of MR_arg_bits means that the argument

While making MR_arg_shift unsigned would be useful, any later change
touching this type should replace the MR_arg_bits field with two fields:
an arg_kind enum (encoding "word-sized", "subword sized enum", and the meanings
of the current special negative values), and a uint8 giving the size of the enum.
Or we could encode all the possible enum sizes (1 to 63) in the arg_kind enum.

> @@ -957,32 +958,32 @@ typedef struct {
>  typedef struct {
>      MR_ConstString          MR_du_functor_name;
> -    MR_int_least16_t        MR_du_functor_orig_arity;
> -    MR_int_least16_t        MR_du_functor_arg_type_contains_var;
> +    MR_int_least16_t        MR_du_functor_orig_arity;   // XXX make unsigned?
> +    MR_uint_least16_t       MR_du_functor_arg_type_contains_var;
>      MR_Sectag_Locn          MR_du_functor_sectag_locn;
> -    MR_int_least8_t         MR_du_functor_primary;
> -    MR_int_least32_t        MR_du_functor_secondary;
> -    MR_int_least32_t        MR_du_functor_ordinal;
> +    MR_uint_least8_t        MR_du_functor_primary;
> +    MR_int_least32_t        MR_du_functor_secondary;    // -1 = no sectag
> +    MR_int_least32_t        MR_du_functor_ordinal;      // XXX make unsigned?
>      const MR_PseudoTypeInfo *MR_du_functor_arg_types;
>      const MR_ConstString    *MR_du_functor_arg_names;
>      const MR_DuArgLocn      *MR_du_functor_arg_locns;
>      const MR_DuExistInfo    *MR_du_functor_exist_info;
>      MR_FunctorSubtype       MR_du_functor_subtype;
> -    MR_int_least8_t         MR_du_functor_num_sectag_bits;
> +    MR_uint_least8_t        MR_du_functor_num_sectag_bits;
>  } MR_DuFunctorDesc;

The value of MR_du_functor_sectag_locn says whether there is a sectag,
so I think we could make MR_du_functor_secondary unsigned as well.

> @@ -1207,32 +1207,32 @@ typedef const MR_Integer        *MR_FunctorNumberMap;
>  struct MR_TypeCtorInfo_Struct {
> -    MR_Integer          MR_type_ctor_arity;
> -    MR_int_least8_t     MR_type_ctor_version;
> -    MR_int_least8_t     MR_type_ctor_num_ptags;         // if DU
> +    MR_Integer          MR_type_ctor_arity;         // XXX make unsigned?
> +    MR_int_least8_t     MR_type_ctor_version;       // XXX make unsigned?
> +    MR_int_least8_t     MR_type_ctor_num_ptags;     // negative if not DU
>      MR_TypeCtorRepInt   MR_type_ctor_rep_CAST_ME;
>      MR_ProcAddr         MR_type_ctor_unify_pred;
>      MR_ProcAddr         MR_type_ctor_compare_pred;
>      MR_ConstString      MR_type_ctor_module_name;
>      MR_ConstString      MR_type_ctor_name;
>      MR_TypeFunctors     MR_type_ctor_functors;
>      MR_TypeLayout       MR_type_ctor_layout;
> -    MR_int_least32_t    MR_type_ctor_num_functors;
> -    MR_int_least16_t    MR_type_ctor_flags;
> +    MR_int_least32_t    MR_type_ctor_num_functors;  // negative if no symbols
> +    MR_uint_least16_t   MR_type_ctor_flags;
>      MR_FunctorNumberMap MR_type_ctor_functor_number_map;

Again, it should be possible to make MR_type_ctor_num_functors unsigned,
since MR_type_ctor_rep should say whether the field is meaningful.

Apart from the above concerns, the diff is fine.

Any reason for sending this diff with more than the standard three lines of context?

Zoltan.


More information about the reviews mailing list