[m-dev.] some issues I noticed while working on string.m

Zoltan Somogyi zoltan.somogyi at runbox.com
Sun Nov 16 18:39:48 AEDT 2014


While reordering the contents of string.m, I noticed some
things that I would like your opinions on.

Thing 1

Peter, you originally wrote the comments that say

    % NOTE: in future the same treatment may be afforded
    % surrogate code points.

Can you please tell me whether this means that the predicates on which
these comments appear may in the future start throwing exceptions
when they find surrogate code points? And if not, what DO they mean?

Thing 2

In the names of predicates, we use code_unit (with an underscore)
but codepoints (without an underscore). Peter, was this difference
intentional?

Thing 3

The MR_set_code_unit function was added (as MR_set_char) by Fergus
exactly 14 years ago, on 16 Nov 2000. For speed, I want to remove
the workaround that this function represents. Who has the oldest
gcc version installed on their machine? If that doesn't need
this workaround, then I think we can safely delete it. On my
machine, it is gcc 4.6.

Thing 4

Do the types line and text_file really belong in string.m? To me,
they seem to belong more in stream.m, next to the comment that
says "Line oriented streams". Any opinions?

Thing 5

At the moment, when a .m file contains a foreign_decl pragma,
we write out the contents of that pragma with a #line directive, like this:

    #line 1296 "string.m"

    #include <ctype.h>
    #include <string.h>
    #include <stdio.h>

    #include "mercury_string.h"   /* for MR_allocate_aligned_string*() etc. */
    #include "mercury_tags.h"     /* for MR_list_cons*() */

The #line directive is nice for users if the foreign code declaration
has a problem, but when the position of the foreign_decl in its source file
changes, the resulting change in the #line directive causes the recompilation
of every .c file that includes that .mh file, even when those recompilations
are otherwise unnecessary.
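
For anyone who wants to see the effect of the directive in isolation, here
is a minimal C sketch. It is not taken from the compiler's output; the
function names are hypothetical and exist only for illustration. It shows
that after a #line directive, the C compiler attributes the following
source line to the given line number and file name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the line number that the C compiler
 * attributes to the source line immediately after the directive. */
static int reported_line(void)
{
#line 1296 "string.m"
    return __LINE__;    /* treated as line 1296 of "string.m" */
}

/* Hypothetical helper: the directive above stays in effect for the rest
 * of the file, so __FILE__ still names the .m file here. */
static const char *reported_file(void)
{
    return __FILE__;
}

/* Hypothetical helper: checks and prints the position that diagnostics
 * for the code above would refer to. */
static void show_reported_position(void)
{
    assert(reported_line() == 1296);
    assert(strcmp(reported_file(), "string.m") == 0);
    printf("%s:%d\n", reported_file(), reported_line());
}
```

Calling show_reported_position() from main prints the position that an
error message would point at. If the pragma moved down by one line in
string.m, the directive would say 1297 instead, so the text of the
generated header would differ even though the declarations themselves
did not change.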

We can avoid those unnecessary recompilations by either

- adding an option to foreign_decl pragmas that allows users to tell the
  compiler that it can leave out the #line directive, probably because
  the foreign code is unlikely to yield error messages; or
- simply never writing out #line directives in such cases, figuring
  that error messages are unlikely for a large majority of the things
  people put in foreign_decl pragmas.

Any opinions?


