[m-dev.] quoting procedure names when necessary in mdb output

Fergus Henderson fjh at cs.mu.OZ.AU
Wed Aug 20 16:08:30 AEST 2003


On 20-Aug-2003, Peter Moulder <Peter.Moulder at infotech.monash.edu.au> wrote:
> Appended is a patch that has mdb quote procedure names when necessary
> in its output.  This is useful for front-ends that use mdb, such as the
> emacs interface.  In pathalogical cases it can be useful even when
> humans parse the output.  (Actually, `/' (and to a lesser extent `-') are
> fairly common cases where it's at least a bit clearer to read with
> quoting.)

That seems like a good idea.

> The quoting style used in the patch is the same as the syntax for
> quoted names in Mercury code.  For example, a newline is written as `\n'
> (with the name being enclosed by single quotes, 'like\nso'), and octal
> escapes have a trailing backslash ('\154\ike \163\o').
> 
> However, I've noticed when testing this that mdb uses a different
> quoting mechanism
> (trace/mercury_trace_internal.c:MR_trace_break_off_one_word;
> MR_trace_continue_line is relevant to a lesser degree).  It doesn't
> support octal escapes or \n etc., though backslash-apostrophe can be
> used to enter names containing an apostrophe.
> 
> So I wonder whether I should modify the patch to use mdb's existing
> quoting style, or modify MR_trace_break_off_one_word to use
> Mercury-name-style quoting, or leave both as they are.

I think the second of these (modify MR_trace_break_off_one_word
to use Mercury-name-style quoting) is preferable.

However, I think the patch is OK for now; changing that can be a
separate patch.

> At the moment my feeling is that it's OK for the patch to retain its
> Mercury-name quoting, and that it would be good to have a separate patch
> that changed MR_trace_break_off_one_word to accept Mercury-name-style
> escape sequences.

My thoughts exactly ;-)

> The patch doesn't do any quoting for module names at the moment.

That's OK; util/mkinit.c doesn't support module names containing
special characters, and they would probably also cause trouble for
Mmake too.

> Index: runtime/mercury_stack_trace.c
...
> @@ -1029,6 +1029,84 @@
>  
> +/* Like islower, but ignoring locale. */
> +static MR_bool
> +MR_is_lower(unsigned c)

s/islower/islower()/

Having both MR_is_lower() and MR_islower() (see runtime/mercury_std.h)
is confusing.  So perhaps this one should be named
MR_is_lower_ignoring_locale() or something like that.

> +{
> +    return ((c >= 'a')
> +            && (c <= 'z')
> +            && (c - ('i' + 1u) >= 'j' - ('i' + 1u))    /* In ebcdic, 'j' - 'i' != 1. */
> +            && (c - ('r' + 1u) >= 's' - ('r' + 1u)));  /* In ebcdic, 's' - 'r' != 1. */
> +    /* (For ascii, I'm fairly sure that gcc will optimize away those
> +       last two conjuncts, but I'm not so sure if we did something
> +       like ((c <= 'i') || (c >= 'j')).  (('i' + 1 == 'j') || ...)
> +       should be fine though. */
> +}

Ick.

How about just writing it like this?

static MR_bool
MR_is_lower(unsigned c)
{
	switch(c) {
	case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': case 'g':
	case 'h': case 'i': case 'j': case 'k': case 'l': case 'm': case 'n':
	case 'o': case 'p': case 'q': case 'r': case 's': case 't': case 'u':
	case 'v': case 'w': case 'x': case 'y': case 'z':
		return MR_TRUE;
	default:
		return MR_FALSE;
	}
}

I think that is much more obviously correct.

Note that gcc 3.2 generates identical code for both of these.
(gcc 2.95.4 generates slightly worse code for the switch, but the
difference is quite small and anyway this code is not
performance-sensitive.)

Also, IMHO it would be better to use `unsigned char' or just `char'
rather than `unsigned'.

> +static MR_bool
> +MR_is_alnum_or_underscore(unsigned c)
> +{
> +    return (MR_is_lower(c)
> +            || (c == '_')
> +            || ((c >= 'A')
> +                && (c <= 'Z')
> +                && (c - ('I' + 1u) >= 'J' - ('I' + 1u))   /* In ebcdic, 'J' - 'I' != 1. */
> +                && (c - ('R' + 1u) >= 'S' - ('R' + 1u)))  /* In ebcdic, 'S' - 'R' != 1. */
> +            || ((c >= '0') && (c <= '9')));
> +}

Here I think it would be nicer to split out `MR_is_upper' as a subroutine
(implemented similarly to MR_islower).

> +static MR_bool
> +MR_needs_quoting(char const *name)
> +{
> +    unsigned c = (unsigned char) *name;
> +    if(!MR_is_lower(c))
> +        return MR_TRUE;
> +    while((c = (unsigned char) *++name) != '\0')
> +        if(!MR_is_alnum_or_underscore(c))
> +            return MR_TRUE;
> +    return MR_FALSE;
> +}

Missing curly braces.  See
<http://www.cs.mu.oz.au/mercury/information/developers/c_coding_standard.html>:
"Always use curlies, even when there's only one statement in the block."

> +static void
> +MR_maybe_quote(FILE *fp, char const *name)
> +{
> +    if(!MR_needs_quoting(name))
> +        fputs(name, fp);

Likewise.

> +        /* todo: Add some test cases trying to establish that all strings not
> +           containing NUL bytes are round-trippable. */

s/todo/XXX TODO/
Fix comment formatting.


-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list