[m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.

Peter Wang novalazy at gmail.com
Tue May 11 18:07:07 AEST 2021

Previous message: [m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.
Next message: [m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, 11 May 2021 14:14:20 +1000 "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> 
> 2021-05-11 13:45 GMT+10:00 "Peter Wang" <novalazy at gmail.com>:
> > I can make the same change to output_binop_for_csharp for consistency.
> 
> Please do.

Done.

> > The compiler fails to compile in the asm_fast.gc.profdeep.stseg grade on
> > our (virtual) test server because the C file for mlds_to_java_data.m
> > causes gcc 7.5.0 (at the usual -O2 optimisation level) to use more than
> > 2 GB RAM. I can also reproduce the problem with gcc 8.3.0 at least.
> 
> Do you think it would be worthwhile to submit a problem report?
> The branches of the big switch are independent, and gcc's
> control flow graph should show that, so the complexity of
> the optimization should be roughly the same whether or not
> the arms in separate C functions are not. The fact that the
> complexities are not the same seems to me to be a performance bug.

I think we'd need to cut the file down to a reasonable test case, and
should avoid using non-local gotos, as I don't think gcc developers
would take kindly to that (even if it turns out to be irrelevant,
I can see it being a reason to ignore a bug report).
I'm not really interested, so take it up if you like.

> > +        ( Op = int_add(_)
> > +        ; Op = int_sub(_)
> > +        ; Op = int_mul(_)
> > +        ; Op = int_div(_)
> > +        ; Op = int_mod(_)
> > +        ; Op = unchecked_left_shift(_, _)
> > +        ; Op = unchecked_right_shift(_, _)
> > +        ; Op = bitwise_and(_)
> > +        ; Op = bitwise_or(_)
> > +        ; Op = bitwise_xor(_)
> > +        ; Op = int_lt(_)
> > +        ; Op = int_gt(_)
> > +        ; Op = int_le(_)
> > +        ; Op = int_ge(_)
> > +        ),
> > +        % Handle these in a separate switch to reduce gcc memory requirements,
> > +        % particularly when building in deep profiling grades.
> > +        output_int_binop_for_java(Info, Stream, Op, X, Y, !IO)
> 
> It seems that  while output_binary_op_for_java is called from
> several different places, it is called for any given binop from only
> *one* place. And it never does anything more complex than
> choosing  a string that implements a binop, and then prints it.
> This means that it is possible to add to each Op = ... line above
> a piece of code that sets OpStr (as the switch arm for str_{ne,le,...} does),
> *without* violating the don't repeat yourself principle. I think
> this should be done both here, and in all the other switch arms.
> In this case, that means output_int_binop_for_java should take
> OpStr, not Op.

output_int_binop_for_java needs to handle various cases that require
extra casts/masking, so I don't think it's that simple. I left it as-is.

> The diff is fine, though I would also consider putting
> no_inline pragmas on the newly created predicates
> to guard against the problem returning when someone tries
> to deep profile an optimized compiler.

I added the pragma no_inlines.

Peter

Previous message: [m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.
Next message: [m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the reviews mailing list