[m-dev.] foreign type syntax

Fergus Henderson fjh at cs.mu.OZ.AU
Wed Oct 31 20:03:42 AEDT 2001


On 29-Oct-2001, Tyson Dowd <trd at cs.mu.OZ.AU> wrote:
> If you specify the types using C syntax, you are going to have a bit of
> a hard time marshalling them to and from asm (you won't know how to
> generate type specifications in gcc's tree representation without
> parsing the type specifications).
> 
> But if you specify types using Mercury syntax, you can do this just
> fine, and generate both C and gcc tree representation types.

What's the advantage *for Mercury users* of us inventing a new way of
writing C types in Mercury syntax?  If the Mercury compiler really needs
to understand the structure of C types, then maybe the best way to do
that is to write a parser for (a subset of) C syntax.  From the
perspective of a programmer who is trying to interface Mercury and C,
I think it would be easier to write the type names in C syntax rather
than having to learn some new Mercury syntax for writing C type names
and using that.

But regardless of whether it is done by inventing new Mercury syntax
for C types, or by writing a parser for C, it is going to be very
difficult to handle types such as `FILE' or `pid_t' which are defined
as typedefs in header files.  The problem is that it is hard for the
Mercury compiler to find the C header files for this system, and these
may not even be in standard C syntax.

So the only reasonable way I can see of handling typedefs such as those
is to not try to interpret anything about the structure of the C type.
That can be done using the approach I described earlier (representing
foreign types as `MR_Box', i.e. `void *', and boxing/unboxing them in
the C wrapper functions).
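
To make the boxing approach concrete, here is a minimal sketch of what
such a wrapper could look like.  The names `Box', `box_pid', `unbox_pid',
`next_pid' and `next_pid_wrapper' are all hypothetical, chosen only for
illustration; `Box' stands in for `MR_Box', and plain `malloc' stands in
for whatever allocator the runtime would really use.  The point is that
only the generic pointer type crosses the interface, so nothing needs to
know how `pid_t' is defined.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>   /* pid_t: an opaque typedef from a system header */

/* Hypothetical stand-in for the generic boxed representation (`MR_Box'). */
typedef void *Box;

/* Box a pid_t: copy it into a heap cell and return the generic pointer. */
static Box box_pid(pid_t value)
{
    pid_t *cell = malloc(sizeof *cell);
    assert(cell != NULL);
    *cell = value;
    return cell;
}

/* Unbox: read the value back out of the generic pointer. */
static pid_t unbox_pid(Box box)
{
    return *(pid_t *) box;
}

/* The foreign code the user wants to call; a toy function for this sketch. */
static pid_t next_pid(pid_t p)
{
    return p + 1;
}

/*
 * A wrapper of the kind that might be generated: its signature mentions
 * only Box, and the conversion to and from pid_t happens inside it.
 */
static Box next_pid_wrapper(Box arg)
{
    return box_pid(next_pid(unbox_pid(arg)));
}
```

The only place the C type name appears is inside the wrapper and the
box/unbox helpers, which the C compiler processes with the real header
files in scope.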

This approach does have some drawbacks.  It costs efficiency, because we
always box/unbox non-word-sized types across the foreign interface.  It
complicates the performance model, since implicit boxing/unboxing now
affects types defined using `pragma foreign_type' as well as polymorphic
types.  And it makes using gdb on the generated code harder, because the
C type names aren't included in the generated code.
However, I think these drawbacks are much less important than
having the construct work for types such as `FILE' and `pid_t',
which are exactly the kind of C types that people will want to
interface with.
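
The efficiency drawback can be illustrated with a small sketch,
under the assumption that a value which already fits in a word (such as
a `FILE *' pointer) can be carried in a `void *' with a cast, while a
wider value needs a heap cell.  The type `wide_t' and all the function
names here are hypothetical, `Box' again stands in for `MR_Box', and
`malloc' is just a placeholder allocator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef void *Box;   /* hypothetical stand-in for `MR_Box' */

/* Hypothetical foreign type that is wider than a machine word. */
typedef struct { double re; double im; } wide_t;

/* A pointer already fits in a word: conversion is a cast, no allocation. */
static Box box_file(FILE *f)
{
    return (Box) f;
}

static FILE *unbox_file(Box b)
{
    return (FILE *) b;
}

/* A wide value needs a heap cell: each boxing allocates and copies. */
static Box box_wide(wide_t w)
{
    wide_t *cell = malloc(sizeof *cell);
    assert(cell != NULL);
    *cell = w;
    return cell;
}

static wide_t unbox_wide(Box b)
{
    return *(wide_t *) b;
}
```

In this sketch the cost falls only on `wide_t'; the pointer case is free,
which is why the overhead is limited to non-word-sized types.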

For the Java back-end, I think the same thing applies: what's the
advantage *for Mercury users* of us inventing a new way of writing Java
types in Mercury syntax?  I don't see any.  I think our users would
prefer to write Java type names using Java syntax.
So for the Java back-end, I think we should use Java syntax for `pragma
foreign_type', and if we need to understand the structure of those types
(as would be the case if we're compiling directly to Java byte code
rather than to Java source) we can easily write a parser for Java types.

For the .NET back-end, we could use IL or C#/MC++/... syntax.
In the short term, we could treat these as uninterpreted strings,
which means we'd need to require the user to give
`pragma foreign_type' declarations for both IL and
whatever other foreign languages these types are used with.
For example:

	- if a foreign type `foo' is used as an argument of a procedure
	  defined using `pragma foreign_proc(csharp, ...)',
	  then you'd need to give both
	  `pragma foreign_type(foo, il, "...")' and
	  `pragma foreign_type(foo, csharp, "...")' declarations.

	- if a foreign type `bar' is used as an argument of a procedure
	  defined using `pragma foreign_proc(managed_cplusplus, ...)',
	  then you'd need to give both
	  `pragma foreign_type(bar, il, "...")' and
	  `pragma foreign_type(bar, managed_cplusplus, "...")' declarations.

In the longer term, we could write a parser for C# type syntax, and
then you'd only need to give the C# `pragma foreign_type' declaration.
We could then go even further and write a parser for IL syntax too,
and then you'd only need to give *either* the IL *or* the C# `pragma
foreign_type' declaration.

More generally, speaking about interfacing with an arbitrary language:
I think it would be easier for Mercury users if the `pragma foreign_type'
syntax used the type syntax from the foreign language being interfaced with.

This does not require the Mercury implementation to be able to parse
that syntax.  In some cases (e.g. for the Java back-end, compiling
to Java source), the Mercury implementation may be able to use the name
directly in the generated code.  In other cases, the Mercury implementation
can use a generic type and box/unbox in the foreign_code wrappers if needed.
Or it can parse the foreign language type syntax, which would avoid the
need to represent foreign types using a generic type.  Whether or not
it does so is a quality-of-implementation issue.

This approach has the following advantages:

	- it's easy for users (they don't need to learn a new syntax,
	  they can just use the syntax of the language that they are
	  interfacing with)

	- it's easy to implement in an OK manner
	  (mapping foreign types to a generic type),
	  without needing to parse the foreign language type syntax

	- it's possible to implement in an ideal manner
	  (although this may require parsing the foreign language type
	  syntax)

In contrast, the approach of inventing Mercury syntax for foreign
language types is easy to implement in an ideal manner, but this comes
at the expense of complicating the Mercury language and making things
harder for Mercury users.  I don't think that is a good trade-off.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  | "... it seems to me that 15 years of
The University of Melbourne         | email is plenty for one lifetime."
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- Prof. Donald E. Knuth