[m-rev.] for review: colour in error messages

Julien Fischer jfischer at opturion.com
Wed Apr 24 22:14:49 AEST 2024


On Wed, 24 Apr 2024, Zoltan Somogyi wrote:

> For review by anyone. The capability is disabled for now,
> but the attached two .err files show what the output looks like.
> (Those were generated by a version of mmc in which I enabled
> the capability.)
>
> There are several questions raised by this diff.
>
> 1. Programmers should be able to enable or disable colour.
> Once we have robust suport for it, which should be the default?

Some thoughts about coloured messages:

* I think it should be possible to set the default via an option to
   the configure script -- being able to do so is potentially useful for
   packagers.

* When colours are enabled, the compiler should respect the NO_COLOR
   environment variable (<https://no-color.org/>) and disable them
   if that environment variable is set appropriately.

* Many programs that write coloured output, only do so when writing
   to a terminal and disable coloured output when the relevant output
   stream is redirected to a file. On Unix-like systems, this is done
   via a call to isatty() (or _isatty() on Windows). (I don't know how
   such a thing can easily be supported when the compiler is built in
   the Java or C# grades.)

* The terminal's background color will affect this.

> What should be the name of the option that controls it?
> Something like --enable-colo{u}r?

--enable-colo{u}r-messages or --enable-colo{u}r-diagnostics?

> 2. Programmers who enable colour should be able to control
> the actual colours used. What should the option names be?
> --set-correct-colo{u}r-to 45?

I suggest naming these options according to some sort of scheme,
e.g. somehting like:

    --set-colo{u}r-FOO

where FOO is the feature whose colour is to be set.

> 3. Should we support something beyond 8-bit color (such as that 45)?
> I have no idea how widespread the support for 24 bit rgb is in terminals
> (as opposed to actual display devices).
>
> If so, what should the option names be?

I would not worry about it initially.

> 4. How should we test this capability? We could pick a standard
> set of colours (the default set should work) and update the .err_exp
> files to contain the resulting terminal escape sequences. Or we could
> add the expected .err files with the escape sequences as new .err_exp2 files,
> leaving the original .err_exp file without those sequences.
> Neither of those would enable us consistently keep testing error messages
> both with colour enabled and with colour disabled.
>
> We could design a mechanism for executing each test case in invalid* and
> in warning with color both enabled and disabled, with a separate set of
> .err_exp files for each case.
>
> 5. At the moment, the diff supports two colours, intended for correct
> and incorrect code respectively. In error messsages of the form
> "expected x, got y", the intention is that x would be in the correct colour,
> and the y in the incorrect colour.
>
> Should we have a separate logical colour for "this may be caused by"
> messages?

I think so.  More generally, we probably need to go thorugh the expected
outputs in tests/invalid and tests/warnings and work out where colour
could be useful.

> If so, what should be the default actual colour?

I'm not sure.

> 6. The bike-shed question: it is clear that option names should have
> two versions, one with "color" and one with "colour", but which spelling
> we should use internally? The american version is slightly less likely to cause
> a line to be too long, but graph_colour.m and set_of_var.m use brit spelling.

Based on a grep of the compiler source code (and a review of your diff),
I think that's already been "decided": colour.


> Add prelim support for colour in error messages.
> 
> compiler/error_spec.m:
>     Add and export two functions, which promise to make a given piece list
>     will be printed in colour (if colour capability is enabled).
>     Support two colours for now: one each indicating correct and incorrect
>     code. The intention is to draw programmers eyes to these parts of
>     error messages.
>
>     The new capability extends the format_piece type, but in a way that
>     asks users to leave its use to code inside error_spec.m itself
>     (at least for now).
>
>     As a trial, change the implementation of change_hunk_to_pieces,
>     which transforms the output of diff algorithms into parts of
>     error_specs. We use this capability to point out inconsistencies
>     between the order in which preds/funcs are declared and the order
>     in which they are defined.
> 
> compiler/write_error_spec.m:
>     Add code to turn the semantic colour names defined in error_spec.m
>     into actual colors we can print using the escape sequences supported
>     by many (most?) kinds of terminals, if this capability is enabled.
>
>     Extend each stage of the multi-stage machinery we use to convert
>     format_piece sequences into error lines to handle colour changes.
>
>     This required changes in how we handle string lengths and spaces.
>     Previously, we could compute the length of a string on the screen
>     by calling string.count_code_points, but this does not work when
>     some of those code points represent terminal control sequences
>     (that set a new current colour) that take no space on screen.
>     Also, we could just put a space between each pair of strings on a line
>     when those string were each one word, but that would lead to incorrect
>     output when some of those strings are terminal control sequences.
>     This required adding support for setting up an initial colour
>     on each line, and resetting the colour at the end of each line.
>
>     Change the representation of unlimited length lines. Instead of the
>     representation being "no", i.e. the absence of a limit, it is now
>     int.max_int, a limit that is impossible to reach in practice.
>     The reason for this is that we used to compute the length of a line
>     using string.count_code_points if MaybeLimit was "no", and as mentioned
>     above, this does not work anymore.
>
>     This new machinery has been tested manually, and it works.
>     It is disabled for now, 
>
>     - until we figure out how we want to control it,
>     - until we figure out how we want to test it, and
>     - until we start using it in more commonly seen error messages.

...


> diff --git a/compiler/write_error_spec.m b/compiler/write_error_spec.m
> index 949668257..6f24084e4 100644
> --- a/compiler/write_error_spec.m
> +++ b/compiler/write_error_spec.m

...

> -:- pred get_line_of_words(int::in, string::in, list(string)::in,
> -    indent::in, int::out, list(string)::out, list(string)::out) is det.
> -
> -get_line_of_words(AvailLen, FirstWord, LaterWords, Indent, LineWordsLen,
> -        LineWords, RestWords) :-
> -    string.count_code_points(FirstWord, FirstWordLen),
> +% The rules for colour starts and ends:
> +%
> +% When we see one, we always look for and process all following consecutive
> +% color starts and ends, and handle their aggregrate effect. These "spans"
> +% of colour changes are always both preceded and followed by sc_str units;
> +% if they were preceded or followed by colour change units, those units
> +% would have been included in the span.
> +% 
> +% If code always creates colour changes using the colour_pieces_as_X
> +% functions in error_spec.m, then a colour change span whose first unit
> +% is sc_colour_start will end up with a stack that is at least as high
> +% as it started with, while a span whose first unit is sc_colour_end
> +% will end up with a stack stack that is at most as high as it started with.
> +% Therefore the total effect of each span is given by its initial unit.
> +%
> +% We want to limit colour changes to apply to just the pieces they are
> +% supposed to cover, not to any spaces around them. We therefore added
> +% the NextSpace parameter to get_later_words, which should contain
> +% either zero spaces or one space, to allow the addition of spaces
> +% to be delayed. We exploit this capability using the following rules
> +% for how spaces should be handled around colour changes:
> +%
> +% at left margin (get_line_of_words)
> +% str:          add the string, NextSpace := 1
> +% colour_start: add new colour, NextSpace := 0
> +% colour_end:   add new colour, NextSpace := 0
> +%
> +% not at left margin (get_later_words)
> +% str:          add NextSpace spaces, add the string, NextSpace := 1
> +% colour_start: check NextSpace=1, add 1 space, add top colour, NextSpace := 0
> +% colour_end:   check NextSpace=1, add top colour, NextSpace := 1
> +%
> +% Note how with colour_start, get_later_words add the space *before*
> +% the colour change, while with colour_end, it adds the space *after*.
> +%
> +% (The checks for NextSpace=1 in get_later_words when the first unit
> +% is a colour change must succeed, because the first unit is a colour change
> +% *only* if was not included in a previously-started span of colour changes,
> +% which means that the previous unit (which was handled by get_later_words)
> +% was sc_str.)
> +
> +:- pred get_line_of_words(int::in, sc_unit::in, list(sc_unit)::in,
> +    indent::in, int::out, list(string)::out, list(sc_unit)::out,
> +    colour_stack::in, colour_stack::out, line_end_reset::out) is det.
> +
> +get_line_of_words(AvailLen, FirstSCUnit, LaterSCUnits0, Indent, LineWordsLen,
> +        LineStrs, RestSCUnits, !ColourStack, LineEndReset) :-
>      AvailLeft = AvailLen - uint.cast_to_int(Indent * indent2_increment),
> -    get_later_words(AvailLeft, LaterWords, FirstWordLen, LineWordsLen,
> -        cord.singleton(FirstWord), LineWordsCord, RestWords),
> -    LineWords = cord.list(LineWordsCord).
> -
> -:- pred get_later_words(int::in, list(string)::in, int::in, int::out,
> -    cord(string)::in, cord(string)::out, list(string)::out) is det.
> -
> -get_later_words(_, [], CurLen, FinalLen, LineWords, LineWords, []) :-
> +    (
> +        FirstSCUnit = sc_str(FirstStr),
> +        LineEndReset1 = line_end_reset_nothing,
> +        NextSpace = " ",
> +        string.count_code_points(FirstStr, LenSoFar),
> +        LaterSCUnits0 = LaterSCUnits
> +    ;
> +        ( FirstSCUnit = sc_colour_start(_Colour)
> +        ; FirstSCUnit = sc_colour_end
> +        ),
> +        LineEndReset1 = line_end_reset_colour,
> +        NextSpace = "",
> +        merge_adjacent_colour_changes(FirstSCUnit, LaterSCUnits0, LaterSCUnits,
> +            !ColourStack),
> +        FirstStr = top_colour_to_string(!.ColourStack),
> +        LenSoFar = 0
> +    ),
> +    LineStrsCord0 = cord.singleton(FirstStr),
> +    get_later_words(AvailLeft, NextSpace, LaterSCUnits, LenSoFar, LineWordsLen,
> +        LineStrsCord0, LineStrsCord, RestSCUnits, !ColourStack,
> +        LineEndReset1, LineEndReset),
> +    LineStrs = cord.list(LineStrsCord).
> +
> +:- pred get_later_words(int::in, string::in, list(sc_unit)::in,
> +    int::in, int::out, cord(string)::in, cord(string)::out, list(sc_unit)::out,
> +    colour_stack::in, colour_stack::out,
> +    line_end_reset::in, line_end_reset::out) is det.
> +
> +get_later_words(_, _, [], CurLen, FinalLen,
> +        LineStrs, LineStrs, [], !ColourStack, LineEndReset, LineEndReset) :-
>      FinalLen = CurLen.
> -get_later_words(Avail, [Word | Words], CurLen, FinalLen,
> -        LineWords0, LineWords, RestWords) :-
> -    string.count_code_points(Word, WordLen),
> -    NextLen = CurLen + 1 + WordLen,
> +get_later_words(Avail, NextSpace0, [SCUnit | SCUnits0], CurLen, FinalLen,
> +        LineStrs0, LineStrs, RestSCUnits, !ColourStack,
> +        LineEndReset0, LineEndReset) :-
> +    (
> +        SCUnit = sc_str(Str0),
> +        LineEndReset1 = LineEndReset0,
> +        SCUnits = SCUnits0,
> +        FirstStr = NextSpace0 ++ Str0,
> +        string.count_code_points(FirstStr, FirstStrLen),
> +        NextLen = CurLen + FirstStrLen,
> +        NextSpace = " "
> +    ;
> +        SCUnit = sc_colour_start(_),
> +        LineEndReset1 = line_end_reset_colour,
> +        expect(unify(NextSpace0, " "), $pred,
> +            "NextSpace0 != 1 for colour start"),
> +        merge_adjacent_colour_changes(SCUnit, SCUnits0, SCUnits, !ColourStack),
> +        % We put the space between the previous actual word and the next one
> +        % *before* the colour change.
> +        FirstStr = NextSpace0 ++ top_colour_to_string(!.ColourStack),
> +        NextLen0 = CurLen + 1, % The 1 is for NextSpace0.
> +        NextSpace = "",
> +        % Check whether the next actual word fits on the line.
> +        PeekWordLen = peek_and_find_len_of_next_word(SCUnits),
> +        ( if NextLen0 + PeekWordLen =< Avail then
> +            % It does, so let the recursive call proceed.
> +            NextLen = NextLen0
> +        else
> +            % It does not, so force the check against Avail to fail *now*,
> +            % *not* in the recursive call, so that we change the colour
> +            % just before the next word.
> +            NextLen = Avail + 1
> +        )
> +    ;
> +        SCUnit = sc_colour_end,
> +        LineEndReset1 = line_end_reset_colour,
> +        expect(unify(NextSpace0, " "), $pred, "NextSpace0 != 1 for colour end"),
> +        merge_adjacent_colour_changes(SCUnit, SCUnits0, SCUnits, !ColourStack),
> +        % We put the space between the previous actual word and the next one
> +        % *after* the colour change. (Actually, the recursive call takes
> +        % cate of that, *if* there are any words on the rest of the line.)

s/cate/care/

...


> @@ -1445,11 +1718,54 @@ find_matching_rp([HeadLine0 | TailLines0], !MidLinesCord, !MidLinesLen,
>          )
>      ).
> 
> +%---------------------------------------------------------------------------%
> +%
> +% Colour management.
> +%
> +
> +:- type colour
> +    --->    colour_8bit(int)

Why not a uint8?

> +    ;       colour_reset.
> +
> +:- type colour_db
> +    --->    no_colour_db
> +            % Never use any colour, and ignore all colour changes.
> +    ;       colour_db(colour_name_map).
> +            % Do use colour.
> +
> +:- type colour_name_map
> +    --->    colour_name_map(
> +                cnm_incorrect   :: colour,
> +                cnm_correct     :: colour
> +            ).
> +
> +:- func init_colour_db(option_table) = colour_db.
> +
> +init_colour_db(_OptionTable) = ColourDb :-
> +%   % We should get these from option values.
> +%   % XXX What should the names of the options be?
> +%   ColourIncorrect = colour_8bit(203),     % This is red.
> +%   ColourCorrect =   colour_8bit(40),      % This is green.
> +%   ColourNameMap = colour_name_map(ColourIncorrect, ColourCorrect),
> +%   ColourDb = colour_db(ColourNameMap).
> +    ColourDb = no_colour_db.
> +
> +    % The terminal control codes we use here are described by
> +    % https://en.wikipedia.org/wiki/ANSI_escape_code#Colors.
> +    %
> +:- func colour_to_string(colour) = string.
> +
> +colour_to_string(colour_8bit(ColourNum)) = Str :-
> +    string.format("\033\[38;5;%dm", [i(ColourNum)], Str).
> +colour_to_string(colour_reset) = Str :-
> +    Str =  "\033\[39;49m".

Perhaps we should support the \e escape that several Prolog implementations
provide?

A this is not enabled by default, I'm happy for you to go ahead and commit 
it.

Julien.


More information about the reviews mailing list