[m-dev.] some issues I noticed while working on string.m

Peter Wang novalazy at gmail.com
Mon Nov 17 11:10:46 AEDT 2014


Hi Zoltan,

On Sun, 16 Nov 2014 18:39:48 +1100 (EST), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> While reordering the contents of string.m, I noticed some
> things that I would like your opinions on.
> 
> Thing 1
> 
> Peter, you originally wrote the comments that say
> 
>     % NOTE: in future the same treatment may be afforded
>     % surrogate code points.
> 
> Can you please tell me whether this means that the predicates on which
> these comments appear may in the future start throwing exceptions
> when they find surrogate code points? And if not, what DO they mean?

Yes, that's what it means.  Rereading the comments, I agree they are ambiguous.
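
For reference, the surrogate code points are the range U+D800..U+DFFF.
A minimal sketch of the kind of check such a predicate would need before
throwing an exception (the function name is hypothetical, it is not
something in the Mercury runtime):

```c
#include <stdbool.h>
#include <stdint.h>

/*
** Hypothetical helper, for illustration only.
** Returns true if c lies in the Unicode surrogate range U+D800..U+DFFF.
*/
bool is_surrogate_code_point(uint32_t c)
{
    return c >= 0xD800 && c <= 0xDFFF;
}
```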

> 
> Thing 2
> 
> In the names of predicates, we use code_unit (with an underscore)
> but codepoints (without an underscore). Peter, was this difference
> intentional?

It was intentional.  "codepoint" is nonstandard, if somewhat common, so
it may have been the wrong choice.  I wouldn't strongly oppose renaming
the predicates (keeping the old names for backwards compatibility); my
only reservation is the extra library churn.

> 
> Thing 3
> 
> The MR_set_code_unit function was added (as MR_set_char) by Fergus
> exactly 14 years ago, on 16 Nov 2000. For speed, I want to remove
> the workaround that this function represents. Who has the oldest
> gcc version installed on their machine? If that doesn't need
> this workaround, then I think we can safely delete it. On my
> machine, it is gcc 4.6.
> 
> Thing 4
> 
> Do the types line and text_file really belong in string.m? To me,
> they seem to belong more in string.m, next to the comment that
> says "Line oriented streams". Any opinions?

I agree with Julien.

> 
> Thing 5
> 
> At the moment, when a .m file contains a foreign_decl pragma,
> we write out the contents of that pragma with a #line directive, like this:
> 
>     #line 1296 "string.m"
> 
>     #include <ctype.h>
>     #include <string.h>
>     #include <stdio.h>
> 
>     #include "mercury_string.h"   /* for MR_allocate_aligned_string*() etc. */
>     #include "mercury_tags.h"     /* for MR_list_cons*() */
> 
> The #line directive is nice for users if the foreign code declaration
> has a problem, but when the position of the foreign_decl in its source file
> changes, the resulting change in the #line directive causes the recompilation
> of every .c file that includes that .mh file, even when those recompilations
> are otherwise unnecessary.
> 
> We can avoid those unnecessary recompilations by either
> 
> - adding an option to foreign_decl directives that allows users to tell the
>   compiler that it can leave out the #line directive, probably because
>   the foreign code is unlikely to yield error messages; or
> - simply never writing out #line directives in such cases, figuring
>   that this is true for a large majority of the things people put in
>   foreign_decl pragmas.
> 
> Any opinions?

I think it should be controlled by a command-line option, e.g.
--line-numbers-for-foreign-decl, off by default.
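
To illustrate what the directive buys us when it is enabled, here is a
minimal C sketch (the function name is made up for the example). After
the #line directive, the C compiler numbers the following lines as if
they came from string.m, which is why error messages point back at the
Mercury source, and also why the generated text changes whenever the
foreign_decl moves:

```c
/*
** Hypothetical helper, for illustration only.
** The #line directive makes the compiler treat the next source line
** as line 1296 of "string.m".
*/
int reported_line(void)
{
#line 1296 "string.m"
    return __LINE__;    /* the compiler sees this as string.m:1296 */
}
```

Without the directive, __LINE__ would instead be the line's real position
in the generated C file, and that text would not depend on where the
foreign_decl sits in the .m file.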

Peter


