[m-rev.] for review: pack sub-word-sized integers and dummies

Zoltan Somogyi zoltan.somogyi at runbox.com
Sun Apr 22 06:05:33 AEST 2018



On Sat, 21 Apr 2018 12:06:50 +1000, Peter Wang <novalazy at gmail.com> wrote:
> > +    % Compute AllowDoubleWordInts and AllowNoneForDummy.
> 
> AllowPackingDummies

Fixed. (BTW, AllowNoneForDummy was the initial name of that option.)

> > +    % The calls to decide_complex_du_ctor_args_loop decide the representation
> > +    % only of the last category. FirstArgWordNum measures offset with respect
> > +    % to the last category only. FirstCellWordNum, which we should start
> > +    % recording (in a separate field) once we can calculate it correctly,
> > +    % would measure offset with respect to the start of the cell.
> 
> the offset

Fixed.

> >      decide_complex_du_ctor_args_loop(ModuleInfo, Params, map.init,
> > -        0, CtorArgs, CtorArgRepnsBase),
> > +        FirstArgWordNum, FirstCellWordNum, FirstShift,
> > +        CtorArgs, CtorArgRepnsBase),
> >      decide_complex_du_ctor_args_loop(ModuleInfo, Params, ComponentTypeMap,
> > -        0, CtorArgs, CtorArgRepnsPacked),
> > +        FirstArgWordNum, FirstCellWordNum, FirstShift,
> > +        CtorArgs, CtorArgRepnsPacked),
> >      WorthPacking = worth_arg_packing(CtorArgRepnsBase, CtorArgRepnsPacked),
> >      (
> >          WorthPacking = no,
> 
> See the comment about may_pack_arg_type below.

I have changed this code to pass to the invocation of decide_complex_du_ctor_args_loop
that computes CtorArgRepnsBase a version of Params that disables the packing
of both enums and sub-word-sized ints.

> > +            ArgPosWidth0 = apw_none_shifted(_, _),
> > +            % We represent a dummy argument as apw_none_shifted
> > +            % only if it is packed with other sub-word arguments both
> > +            % before it and after it. The "before it" part was tested above.
> > +            % Here we test the "after it" part.
> > +            ( if
> > +                ArgRepns = [NextArgRepn | _],
> > +                NextArgRepn ^ car_pos_width =
> > +                    apw_partial_shifted(_, _, _, _, _, _)
> 
> May be followed by apw_none_shifted as well?

Thanks for catching that. I will add a test case with more than one
consecutive dummy argument.

> Sub-word-sized integers will be packed (if the option is enabled) even
> if the overall cell size would not be reduced. Is that deliberate?

Not after the change I mention above.

> (The worth_arg_packing test might not be worth keeping, that's fine.)

I also suspect that the test is probably not worth keeping, but the right time
to test it is not now, but *after* I finish an optimization that makes it possible
to copy the unchanged fields of a term after a field update *without* taking
the words containing packed arguments apart and then reassembling them.
 
> > +        StartPieces = [words("The arguments of the constructor"),
> > +            unqual_sym_name_and_arity(CtorSymNameArity),
> > +            words("could be packed more tightly."),
> > +            words("Here is one arrangement for the arguments"),
> > +            words("which take up less than one wor each"),
> 
> word

Fixed.

> > diff --git a/compiler/foreign.m b/compiler/foreign.m
> > index e79a1ba..f152abc 100644
> > --- a/compiler/foreign.m
> > +++ b/compiler/foreign.m
> > @@ -54,6 +54,11 @@
> >      %
> >  :- func to_exported_type(module_info, mer_type) = exported_type.
> >  
> > +    % A version of to_exported_type which requires the given Mercury type
> > +    % to be a builtin.
> > +    %
> 
> I suggest:
> 
> A version of to_exported_type where the given Mercury type must be a
> builtin type.

OK.

Does anyone know any specific reason why the mercury_type functor
of the mlds_type type requires most Mercury types to be given *twice*,
once for the first argument, and once (if not implemented as a foreign type)
as part of the third? The second occurrence costs only a pointer in terms of space,
but it is inconvenient when writing code generating mercury_types, and seems
to be error prone. (What would it mean if the two mer_types inside a mercury_type
term were different?)

> > diff --git a/compiler/ml_code_util.m b/compiler/ml_code_util.m
> > index d61cf96..5d8427c 100644
> > --- a/compiler/ml_code_util.m
> > +++ b/compiler/ml_code_util.m
> ...
> > +        % For the MLDS->C back-end, we need to handle constant floats,
> > +        % int64s and uint64s specially. Boxed floats, inst64s and uint64s
> 
> int64s

Fixed.

> ml_rshift doesn't handle a boxed value like ml_lshift; should it?

No. The only rvals ml_rshift is given are field references, which are
not of that that form. I have added comment to that effect.

> > @@ -3276,6 +3503,45 @@ ml_cons_id_to_tag(Info, ConsId, Tag) :-
> >      ml_gen_info_get_module_info(Info, ModuleInfo),
> >      Tag = cons_id_to_tag(ModuleInfo, ConsId).
> >  
> > +:- pred is_apw_full(arg_pos_width::in) is semidet.
> > +
> > +is_apw_full(apw_full(_, _)).
> > +
> > +:- pred allocate_consecutive_ctor_arg_repns(int::in,
> > +    list(mer_type)::in, list(constructor_arg_repn)::out) is det.
> 
> I suggest allocate_consecutive_full_word_ctor_arg_repns.

Done.

> > +
> > +allocate_consecutive_ctor_arg_repns(_, [], []).
> > +allocate_consecutive_ctor_arg_repns(CurOffset,
> > +        [Type | Types], [ArgRepn | ArgRepns]) :-
> > +    % Tuples and extra fields are word-sized, and have no extra type_infos
> > +    % and/or typeclass_infos in front of them.
> 
> This comment makes more sense near the call sites.

Done.

> > diff --git a/compiler/prog_data.m b/compiler/prog_data.m
> > index cbabf70..eb27868 100644
> > --- a/compiler/prog_data.m
> > +++ b/compiler/prog_data.m
> ...
> > +:- type arg_only_offset
> > +    --->    arg_only_offset(int).
> > +            % The offset of the word from the first part of the memory cell
> > +            % that contains arguments. In other words, the first argument word
> > +            % is at offset 0, even if it is preceded in the memory cell
> > +            % by a remote secondary tag, or by type_infos and/or
> > +            % typeclass_infos added by polymorphism.
> 
> Mention what values are used when arg_only_offset needs to be provided
> for extra args.

They can be anything. I have added a comment explaining why.

> > diff --git a/compiler/unify_gen.m b/compiler/unify_gen.m
> > index d82dc42..547a264 100644
> > --- a/compiler/unify_gen.m
> > +++ b/compiler/unify_gen.m
> > @@ -897,7 +989,7 @@ generate_and_pack_one_cons_word([VarWidth | VarsWidths], [ArgMode | ArgModes],
> >              % and its type is not a dummy type. The rval next to this is real.
> >              % (The reason why we don's store the rval as an argument of
> 
> don't

Fixed.

> > +        ( RightWidth = apw_none_nowhere
> > +        ; RightWidth = apw_none_shifted(_, _)
> > +        ),
> > +        % The value being assign is of a dummy type, so no assignment
> > +        % is actually necessary.
> > +        Code = empty
> >      ).
> 
> assigned

Fixed.

> > +        Shift = arg_shift(ShiftInt),
> > +        Mask = arg_mask(MaskInt),
> > +        % XXX ARG_PACK In the usual case where the heap cell we are assigning
> > +        % to is freshly created, this code is *seriously* suboptimal.
> >          LeftLval = field(yes(LeftPtag), LeftBaseRval,
> >              const(llconst_int(LeftOffset))),
> > -        ComplementMask = const(llconst_int(\(Mask << Shift))),
> > +        ComplementMask = const(llconst_int(\ (MaskInt << ShiftInt))),
> >          MaskOld = binop(bitwise_and(int_type_int),
> >              lval(LeftLval), ComplementMask),
> 
> int_type_uint?

You are right; I have made the change.

> > -        ShiftNew = maybe_left_shift_rval(RightRval, Shift),
> > -        Combined = binop(bitwise_or(int_type_int), MaskOld, ShiftNew),
> > -        AssignCode = singleton(llds_instr(assign(LeftLval, Combined),
> > +        ShiftedRightRval = left_shift_rval(RightRval, Shift),
> > +        CombinedRval = or_two_rvals(MaskOld, ShiftedRightRval),
> > +        AssignCode = singleton(llds_instr(assign(LeftLval, CombinedRval),
> >              "Update part of word"))
> > +    ;
> > +        ( LeftWidth = apw_none_nowhere
> > +        ; LeftWidth = apw_none_shifted(_, _)
> > +        ),
> > +        % The value being assign is of a dummy type, so no assignment
> 
> assigned

Fixed.
 
> > -:- pred generate_ground_term_args(assoc_list(prog_var, arg_width)::in,
> > +:- pred generate_ground_term_args(assoc_list(prog_var, arg_pos_width)::in,
> >      list(typed_rval)::out,
> >      active_ground_term_map::in, active_ground_term_map::out) is det.
> >  
> >  generate_ground_term_args([], [], !ActiveMap).
> >  generate_ground_term_args([ArgVarWidth | ArgVarsWidths],
> >          [TypedRval | TypedRvals], !ActiveMap) :-
> > -    ArgVarWidth = ArgVar - ArgWidth,
> > +    ArgVarWidth = ArgVar - ArgPosWidth,
> >      map.det_remove(ArgVar, ArgTypedRval, !ActiveMap),
> >      (
> > -        ArgWidth = full_word,
> > +        ArgPosWidth = apw_full(_, _),
> >          TypedRval = ArgTypedRval,
> >          generate_ground_term_args(ArgVarsWidths, TypedRvals, !ActiveMap)
> >      ;
> > -        ArgWidth = double_word,
> > +        ArgPosWidth = apw_double(_, _, DoubleWordKind),
> >          % Though a standalone float might have needed to boxed,
> >          % it may be stored in unboxed form as a constructor argument.
> 
> Update this comment.

Done.

> > +            case -8:
> > +            case -9:
> > +                // This is an int32 (-6) or uint32 (-9) argument.
> 
> -8

Good catch.

> > +                bits_to_or = (((MR_Unsigned) arg_data) & 0xffffffff);
> > +                MR_field(ptag, new_data, sectag01 + locn->MR_arg_offset)
> > +                    |= (bits_to_or << locn->MR_arg_shift);
> > +                break;
> > +
> > +            case -10:
> > +                // This is a dummy argument, which does not need setting.
> > +                break;
> > +
> > +            default:
> > +                if (locn->MR_arg_bits > 0) {
> > +                    MR_field(ptag, new_data, sectag01 + locn->MR_arg_offset)
> > +                        |= (arg_data << locn->MR_arg_shift);
> > +                } else {
> > +                    MR_fatal_error(""unknown MR_arg_bits value"");
> > +                }
> > +                break;
> 
> I think this should also handle MR_arg_bits==0.

You are right; fixed. I need to add a test case for this too.

> > diff --git a/library/store.m b/library/store.m
> > index dc6e379..2a66c9a 100644
> > --- a/library/store.m
> > +++ b/library/store.m
> > @@ -714,6 +714,8 @@ copy_ref_value(Ref, Val) -->
> >          MR_offset_incr_hp_msg(ArgRef, MR_SIZE_SLOT_SIZE,
> >              MR_SIZE_SLOT_SIZE + 1, MR_ALLOC_ID, ""store.ref/2"");
> >          MR_define_size_slot(0, ArgRef, 1);
> > +        // XXX I (zs) don't think this will work for arguments
> > +        // that are stored unboxed in two words.
> >          * (MR_Word *) ArgRef = MR_arg_value(arg_ref, arg_locn);
> >      } else {
> >          ArgRef = (MR_Word) arg_ref;
> > @@ -780,6 +782,8 @@ copy_ref_value(Ref, Val) -->
> >          MR_offset_incr_hp_msg(ArgRef, MR_SIZE_SLOT_SIZE,
> >              MR_SIZE_SLOT_SIZE + 1, MR_ALLOC_ID, ""store.ref/2"");
> >          MR_define_size_slot(0, ArgRef, 1);
> > +        // XXX I (zs) don't think this will work for arguments
> > +        // that are stored unboxed in two words.
> >          * (MR_Word *) ArgRef = MR_arg_value(arg_ref, arg_locn);
> >      } else if (arg_ref == &Val) {
> >          /*
> 
> Right, I think there are a lot more problems with that API.

Shouldn't any limitations be documented in the module header?

> Looks fine otherwise.

Thanks for the review.

Zoltan.


More information about the reviews mailing list