[m-rev.] for review: Reduce memory requirement to build compiler in profdeep grade.

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue May 11 14:14:20 AEST 2021


2021-05-11 13:45 GMT+10:00 "Peter Wang" <novalazy at gmail.com>:
> I can make the same change to output_binop_for_csharp for consistency.

Please do.

> The compiler fails to compile in the asm_fast.gc.profdeep.stseg grade on
> our (virtual) test server because the C file for mlds_to_java_data.m
> causes gcc 7.5.0 (at the usual -O2 optimisation level) to use more than
> 2 GB RAM. I can also reproduce the problem with gcc 8.3.0 at least.

Do you think it would be worthwhile to submit a problem report?
The branches of the big switch are independent, and gcc's
control flow graph should show that, so the complexity of
the optimization should be roughly the same whether or not
the arms in separate C functions are not. The fact that the
complexities are not the same seems to me to be a performance bug.

I don't recall any equivalent of "mmc -VS" for gcc that would tell us
which gcc optimization is the culprit here, but the gcc maintainers
have to have one.

> +        ( Op = int_add(_)
> +        ; Op = int_sub(_)
> +        ; Op = int_mul(_)
> +        ; Op = int_div(_)
> +        ; Op = int_mod(_)
> +        ; Op = unchecked_left_shift(_, _)
> +        ; Op = unchecked_right_shift(_, _)
> +        ; Op = bitwise_and(_)
> +        ; Op = bitwise_or(_)
> +        ; Op = bitwise_xor(_)
> +        ; Op = int_lt(_)
> +        ; Op = int_gt(_)
> +        ; Op = int_le(_)
> +        ; Op = int_ge(_)
> +        ),
> +        % Handle these in a separate switch to reduce gcc memory requirements,
> +        % particularly when building in deep profiling grades.
> +        output_int_binop_for_java(Info, Stream, Op, X, Y, !IO)

It seems that  while output_binary_op_for_java is called from
several different places, it is called for any given binop from only
*one* place. And it never does anything more complex than
choosing  a string that implements a binop, and then prints it.
This means that it is possible to add to each Op = ... line above
a piece of code that sets OpStr (as the switch arm for str_{ne,le,...} does),
*without* violating the don't repeat yourself principle. I think
this should be done both here, and in all the other switch arms.
In this case, that means output_int_binop_for_java should take
OpStr, not Op.

The diff is fine, though I would also consider putting
no_inline pragmas on the newly created predicates
to guard against the problem returning when someone tries
to deep profile an optimized compiler.

Zoltan.


More information about the reviews mailing list