[m-rev.] for review: add examples of the C data passing conventions
Zoltan Somogyi
zoltan.somogyi at runbox.com
Mon Aug 22 15:11:29 AEST 2022
2022-08-22 13:59 GMT+10:00 "Julien Fischer" <jfischer at opturion.com>:
> +% Mercury's int type corresponds to the C type MR_Integer.
> +% MR_Integer is a typedef defined by the Mercury runtime for a signed
> +% word-sized integral type.
"for" seems strange here (and in later copies).
I would go with something like "that expands to", and would also mention
that the expanded-to C type is autoconfigured, and may differ between
target systems.
> + % int_add(A, B) = C:
> + %
> + % This function computes the sum of two Mercury ints using C code.
> + %
> +:- func int_add(int, int) = int.
> +:- pragma foreign_proc("C",
> + int_add(A::in, B::in) = (C::out),
> + [promise_pure, will_not_call_mercury, thread_safe],
> +"
> + C = A + B;
> +").
I don't think this example, and similar ones later on, are as effective
as you would want them to be, because they don't demonstrate
the use of the C types corresponding to Mercury types.
I would go with C code such as
    MR_Integer AB = A + B;
    D = AB + C;
where the predicate computes D from A, B and C.
I would also explain, in this first example, what the promise_pure,
will_not_call_mercury, and thread_safe annotations mean.
Without that, the kinds of readers who need this introduction
can take them as "incantations to appease the great god mmc",
and include them even in foreign_procs in which they are not appropriate.
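To make the suggested int example concrete outside a foreign_proc, here is
a rough sketch in plain C. It is not taken from the diff: my_integer is a
hypothetical stand-in for MR_Integer (assumed here to be intptr_t, purely
because that is a signed, pointer-sized type), and int_add3 is an invented
name for the three-input version of the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for MR_Integer. ASSUMPTION: intptr_t is used only because it is
 * a signed, pointer-sized type. The real typedef comes from the Mercury
 * runtime headers and is chosen when Mercury is configured.
 */
typedef intptr_t my_integer;

static_assert(sizeof(my_integer) == sizeof(void *),
    "the stand-in integer type should be word sized");

/* Hypothetical body of int_add3(A, B, C) = D, written as a plain function. */
static my_integer int_add3(my_integer A, my_integer B, my_integer C)
{
    my_integer AB = A + B;  /* intermediate declared with the integer typedef */
    my_integer D = AB + C;
    return D;
}
```

The point of the intermediate variable is only that the reader sees the
typedef being used in a declaration, which the one-line "C = A + B" body
never requires.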
> + % Return the largest uint64 value.
> + %
> +:- func big_uint64 = uint64.
> +:- pragma foreign_proc("C",
> + big_uint64 = (A::out),
> + [promise_pure, will_not_call_mercury, thread_safe],
> +"
> + // We could also write: A = UINT64_MAX;
> + A = UINT64_C(18446744073709551615);
> +").
Given that you mentioned special requirements for 64 bit code above,
I would have used an example that illuminates those requirements.
> +%----------------------------------------------------------------------------%
> +%
> +% Floats.
> +%
> +
> +% Mercury's float type corresponds to the C type MR_Float.
> +% MR_Float is a typedef defined by the Mercury runtime.
> +% In spf (single-precision) float grades, it is a typedef for C's float type.
> +% In other grades, it is a typedef for C's double type.
Parentheses in wrong place: should be "(single precision float)".
> +% C code can test whether the macro MR_USE_SINGLE_PREC_FLOAT is defined to
> +% check if an spf grade is being used.
> +
> + % add_floats(A, B) = C:
> + % This function computes the sum of two Mercury floats using C code.
> + %
> +:- func add_floats(float, float) = float.
> +:- pragma foreign_proc("C",
> + add_floats(A::in, B::in) = (C::out),
> + [promise_pure, will_not_call_mercury, thread_safe],
> +"
> + C = A + B;
> +").
I would use an operation that does not make sense for integers,
such as sqrt.
> +%----------------------------------------------------------------------------%
> +%
> +% Characters.
> +%
> +
> +% Mercury's char type corresponds to the C type MR_Char.
> +% MR_Char is a typedef defined by the Mercury runtime for a signed 32-bit
> +% integral type.
> +%
> +% A Mercury char represents a Unicode code point and valid values must be in
> +% the range [0, 0x10ffff]. Mercury's foreign language interface does *not*
> +% check that characters passed back to Mercury are within this range.
Add a comma after "point". And an explanation for the absence of this check
may also be useful.
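Since the interface does not check the range, C code that computes a
character has to do so itself. A minimal sketch of such a check, assuming
only what the quoted text states (a signed 32-bit type and the range
[0, 0x10ffff]); my_char and is_valid_code_point are hypothetical names,
not part of the Mercury runtime:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for MR_Char: the quoted text describes a signed 32-bit type. */
typedef int32_t my_char;

static_assert(sizeof(my_char) == 4, "the stand-in char type should be 32 bits");

#define MAX_CODE_POINT 0x10ffff

/* Returns 1 if c lies in the range the quoted text gives, 0 otherwise. */
static int is_valid_code_point(my_char c)
{
    return c >= 0 && c <= MAX_CODE_POINT;
}
```

A foreign_proc could run a check like this before assigning to a char
output argument, and handle the failure case however the surrounding
code prefers.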
> +% Mercury's string type corresponds to the C type MR_String.
> +% MR_String is a typedef defined by the Mercury runtime for a pointer to char
> +% (i.e. char *).
Again, "for".
> +% Mercury's list.list/1 type corresponds to the C type MR_Word.
In llds grades it does, but I thought in mlds grades we usually call it MR_Box.
> +% MR_Word is a typedef declared in the Mercury runtime for an unsigned integral
> +% type whose size is the same size as a pointer.
> +%
> +% The Mercury runtime defines the following function-like macros for
> +% manipulating Mercury lists in C code:
> +%
> +% MR_bool MR_list_is_empty(MR_Word list);
> +% MR_Word MR_list_head(MR_Word list);
> +% MR_Word MR_list_tail(MR_Word tail);
> +% MR_Word MR_list_empty(void);
> +% MR_Word MR_list_cons(MR_Word head, MR_Word tail);
> +%
> +% When an element is extracted from a list using the MR_list_head() macro, that
> +% element will also have the type MR_Word. How you convert that MR_Word value
> +% to the actual element type depends on what the element type is. For most
> +% element types you can insert a cast to the appropriate type. If the element
> +% type is float, int64 or uint64 you might need to arrange for the element to
> +% be unboxed -- see the following two sections for further details.
This won't make sense to readers who don't know what "boxing" means in this
context.
> In the
> +% following examples, we have lists of int, so adding a cast to MR_Integer will
> +% suffice.
And this won't make sense either, unless you tell readers that data types
whose sizes are one word or less are never boxed.
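The "a cast suffices for one-word elements" idea can be sketched in plain
C. Everything named here is hypothetical: my_word and my_integer stand in
for MR_Word and MR_Integer, and the cell struct is an invented cons cell,
NOT the representation the Mercury runtime uses for lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t my_word;     /* stand-in for MR_Word: unsigned, pointer sized */
typedef intptr_t  my_integer;  /* stand-in for MR_Integer: signed, word sized */

static_assert(sizeof(my_word) == sizeof(my_integer),
    "an integer element must fit in one word for the cast below to be enough");

/* Invented cons cell; each element is stored as an untyped word. */
typedef struct cell {
    my_word      head;
    struct cell *tail;
} cell;

/* Sum a list of integers stored as words. */
static my_integer sum_int_list(const cell *list)
{
    my_integer sum = 0;
    for (; list != NULL; list = list->tail) {
        /* The element fits in a word, so it is stored unboxed: just cast. */
        sum += (my_integer) list->head;
    }
    return sum;
}
```

With the real macros, the loop would use MR_list_is_empty, MR_list_head
and MR_list_tail in place of the NULL test and the field accesses.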
> +% Because the size of a Mercury float might exceed a word, floats contained in
> +% Mercury data structures might be boxed. That is, they are passed around as a
> +% pointer to a slot on the heap where the actual float is stored.
I wouldn't say "might exceed": I would specify exactly when it would exceed
a word, and when it wouldn't. It is not too complicated for users.
> +% When manipulating Mercury data structures that contain floats in C code,
> +% you must account for the possibility that floats are boxed.
"that your code will be compiled on 32 bit machines in grades in which
floats are boxed".
> +% Data structures containing 64-bit integers.
Same comments here.
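The condition for boxing floats and 64-bit integers can be stated in one
line of C, which may be the easiest way to explain it. The sketch below is
only an illustration: my_word is a hypothetical stand-in for MR_Word,
NEEDS_BOXING, box_double and unbox_double are invented names, and malloc
stands in for whatever allocator the Mercury runtime actually uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t my_word;  /* stand-in for MR_Word: unsigned, pointer sized */

/* A value must be boxed exactly when it does not fit in one word. */
#define NEEDS_BOXING(type) (sizeof(type) > sizeof(my_word))

/*
 * Box a double: store it in a heap slot and pass the slot's address around
 * as a word. ASSUMPTION: malloc is used only to keep the sketch standalone.
 */
static my_word box_double(double d)
{
    double *slot = malloc(sizeof *slot);
    assert(slot != NULL);
    *slot = d;
    return (my_word) slot;
}

/* Unbox: follow the pointer held in the word to get the value back. */
static double unbox_double(my_word w)
{
    return *(double *) w;
}
```

On a target with 4-byte words, NEEDS_BOXING(double) and
NEEDS_BOXING(int64_t) are both true; with 8-byte words both are false,
and the values can be stored directly.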
> +% Foreign types.
> +%
> +
> +% In this section we illustrate how to use a type defined in C from Mercury.
> +
> + % Here is a C type that we wish to use in Mercury.
> + %
> +:- pragma foreign_decl("C", "
> +
> + // A C structure representing a vector in 3-dimensional space.
> + //
> + typedef struct {
> + double i;
> + double j;
> + double k;
> + } c_vector;
> +").
Why not x, y, z?
> + % A declaration for the vector/0 type.
> + %
> + % The foreign_type pragma we use below does not act as a type declaration,
> + % so the following abstract type declaration serves that purpose.
That explains things from a compiler writer's point of view, which users
don't care about.
> + % When using foreign types we must provide this even if the type is not
> + % exported from its defining module.
> + %
> +:- type vector.
I would be more direct: say that the declaration must be in the interface section
iff the type is exported, but the foreign_type pragma must be in the impl section
regardless of whether the type is exported.
> + // The macro MR_GC_NEW() is used to allocate memory using the garbage
> + // collector.
I know why gc is involved in allocation, but most readers who need this
won't know that. The point you want to get across is that memory allocated
via MR_GC_NEW will be deallocated automatically by the Mercury runtime,
and that it need not and *should* not be deallocated manually.
> It allocates space sufficient for an object of the type named
> + // by the argument.
by ITS argument
> +% The type io.state/0 is what is known as "dummy type". The Mercury compiler
> +% does not generate code that passes around values of dummy types.
> +% Nevertheless, foreign_proc arguments of type io.state/0 are manifested in the
> +% foreign_proc bodies as local variables of type MR_Word.
> +%
> +% Because the Mercury compiler will emit warnings for foreign_proc arguments
> +% that not referred to by the body of the foreign_proc, you must use one of
> +% the following approaches to handling io.state/0 arguments.
> +
> + % This example illustrates the first (and preferred) approach to dealing
> + % with arguments of type io.state/0 in foreign_procs: ignore them.
> + % The Mercury compiler does not require that foreign_proc arguments whose
> + % name begins with an underscore be referred to in the foreign_proc body.
> + %
> +:- pred say_hello(io::di, io::uo) is det.
> +:- pragma foreign_proc("C",
> + say_hello(_IO0::di, _IO::uo),
> + [promise_pure, will_not_call_mercury],
> +"
> + puts(\"Hello!\\n\");
> +").
> +
> + % If foreign_proc arguments of type io.state/0 are not ignored then they
Add comma after "ignored".
> + % will manifest in the foreign_proc body as local variables of the same
> + % names as the arguments.
> + %
> + % Any value assigned to the variable IO in the following block of code will
> + % be ignored.
... by the Mercury code that invoked the foreign_proc.
> The convention is to assign the initial io.state/0 argument
> + % to the final io.state/0 argument at the end of the foreign_proc body.
> + %
This convention does not make sense unless you state that a foreign_proc
whose body does not mention one of its arguments will get a warning,
*unless* that argument's name starts with _. In other words, move the last
two lines from the previous example here, in suitably mutated form.
> +% In Mercury, an enumeration is a discriminated union type where none of the
> +% data constructors has any arguments. Mercury enumeration types correspond
> +% to the C type MR_Integer.
This is true only for loose senses of the word "correspond". The straightforward
corresponding C type is a C enum. You want to say that Mercury passes values
of Mercury enums to C code as values of type MR_Integer.
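The distinction between "the corresponding C type" and "the type the value
is passed as" can be sketched in plain C. The names are all hypothetical:
my_integer stands in for MR_Integer (assumed to be intptr_t), and colour is
an invented enumeration.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t my_integer;  /* stand-in for MR_Integer: signed, word sized */

/* The natural C counterpart of an enumeration is a C enum ... */
typedef enum { COLOUR_RED, COLOUR_GREEN, COLOUR_BLUE } colour;

/* ... but the value crosses the language boundary as a plain integer. */
static my_integer colour_to_integer(colour c)
{
    return (my_integer) c;
}

/* Convert back, refusing integers that name no constructor. */
static colour integer_to_colour(my_integer v)
{
    assert(v >= COLOUR_RED && v <= COLOUR_BLUE);
    return (colour) v;
}
```

C code that receives such an integer has to know which values are
meaningful, which is where the enum pragmas discussed below come in.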
> +% This example illustrates how to use Mercury's foreign_enum pragma to assign
> +% the values by which each constructor of a Mercury enumeration is represented
> +% in C code.
I don't think this says what you want to say. The point of foreign_enum pragmas
is that Mercury conforms to the name->representation mapping set by C,
unlike foreign_export_enum pragmas, which do the opposite.
> +% The MR_ArrayType structure has two fields. The first is named "size" and
> +% has type MR_Integer. Its value gives the number of elements in the array.
> +% The second is named "elements" and is the underlying array of elements.
> +% (The actual definition of this second field varies depending on whether a C
> +% compiler that supports variable-length arrays is being used or not.)
I would use "whether the configured C compiler ..."
Other than all of that nitpicking, the diff is fine :-)
Zoltan.