[m-dev.] foreign type syntax

Tyson Dowd trd at cs.mu.OZ.AU
Thu Nov 1 14:04:51 AEDT 2001


On 31-Oct-2001, Fergus Henderson <fjh at cs.mu.OZ.AU> wrote:
> On 29-Oct-2001, Tyson Dowd <trd at cs.mu.OZ.AU> wrote:
> > If you specify the types using C syntax, you are going to have a bit of
> > a hard time marshalling them to and from asm (you won't know how to
> > generate type specifications in gcc's tree representation without
> > parsing the type specifications).
> > 
> > But if you specify types using Mercury syntax, you can do this just
> > fine, and generate both the C and the gcc tree representations of
> > the type.
> 
> What's the advantage *for Mercury users* of us inventing a new way of
> writing C types in Mercury syntax?  If the Mercury compiler really needs
> to understand the structure of those types, all that is required is to
> write a parser for (a subset of) C syntax.  From the perspective of a
> programmer who is trying to interface Mercury and C,
> I think it would be easier to write the type names in C syntax rather
> than having to learn some new Mercury syntax for writing C type names
> and using that.

This is fair enough; I just wanted to hear someone say they believe
writing a type parser is a better solution.

Of course the user is going to have to be very careful about such
things.

If you write:

:- pragma foreign_decl("C", 
"
	typedef struct foo {
		int x;
	} bar;
").

:- pragma foreign_type(bar, c("bar")).

you cannot expect the type parser to do much, since the string it is
given is just the typedef name `bar'.


> But regardless of whether it is done by inventing new Mercury syntax
> for C types, or by writing a parser for C, it is going to be very
> difficult to handle types such as `FILE' or `pid_t' which are defined
> as typedefs in header files.  The problem is that it is hard for the
> Mercury compiler to find the C header files for this system, and these
> may not even be in standard C syntax.
> 
> So the only reasonable way I can see of handling typedefs such as those
> is to not try to interpret anything about the structure of the C type.
> That can be done using the approach I described earlier (representing
> foreign types as `MR_Box', i.e. `void *', and boxing/unboxing them in
> the C wrapper functions).
> 
> This approach does have some drawbacks: for efficiency, because we
> always box/unbox non-word-sized types across the foreign interface;
> for simplicity of the performance model, since the issue of implicit
> boxing/unboxing now affects types defined using `pragma foreign_type'
> as well as polymorphic types; for using gdb on the generated code,
> because the C type names aren't included in the generated code.
> However, I think these drawbacks are much less important than
> having the construct work for types such as `FILE' and `pid_t',
> which are exactly the kind of C types that people will want to
> interface with.
> 
> For the Java back-end, I think the same thing applies: what's the
> advantage *for Mercury users* of us inventing a new way of writing Java
> types in Mercury syntax?  I don't see any.  I think our users would
> prefer to write Java type names using Java syntax.
> So for the Java back-end, I think we should use Java syntax for `pragma
> foreign_type', and if we need to understand the structure of those types
> (as would be the case if we're compiling directly to Java byte code
> rather than to Java source) we can easily write a parser for Java types.
> 
> For the .NET back-end, we could use IL or C#/MC++/... syntax.
> In the short term, we treat these as uninterpreted strings,
> which means we'd need to require the user to give
> `pragma foreign_type' declarations for both IL and
> whatever other foreign languages these types are used with.

Since you cannot know in advance which foreign languages a type is used
with, you either demand all of them, or you throw an exception if one is
demanded but not available.
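
To make the second option concrete, here is a rough sketch in C of the
kind of lookup involved.  Everything in it is made up for illustration
and is not how the compiler does it: the struct, the table contents, the
placeholder name strings and the function name lookup_foreign_type are
all hypothetical.  C has no exceptions, so a NULL result stands in for
"demanded but not available", and the caller would report the error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One `pragma foreign_type' declaration (hypothetical representation):
   the Mercury type, the foreign language, and the uninterpreted name. */
struct foreign_type_decl {
    const char *mercury_type;
    const char *language;
    const char *foreign_name;
};

/* Made-up declarations: `foo' has IL and C# names but no MC++ name.
   The name strings are placeholders, not real IL or C# syntax. */
static const struct foreign_type_decl decls[] = {
    { "foo", "il",     "il-name-of-foo" },
    { "foo", "csharp", "csharp-name-of-foo" },
};

/* Return the foreign name of `mercury_type' in `language', or NULL if
   no declaration was given for that language. */
static const char *lookup_foreign_type(const char *mercury_type,
                                       const char *language)
{
    size_t i;

    assert(mercury_type != NULL && language != NULL);
    for (i = 0; i < sizeof decls / sizeof decls[0]; i++) {
        if (strcmp(decls[i].mercury_type, mercury_type) == 0 &&
            strcmp(decls[i].language, language) == 0) {
            return decls[i].foreign_name;
        }
    }
    return NULL;
}
```

With this table, asking for `foo' in managed_cplusplus gives NULL, which
is the case where the error has to be reported at the point of use
rather than at the declaration.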

>
> For example:
> 
> 	- if a foreign type `foo' is used as an argument of a procedure
> 	  defined using `pragma foreign_proc(csharp, ...)',
> 	  then you'd need to give both
> 	  `pragma foreign_type(foo, il, "...")' and
> 	  `pragma foreign_type(foo, csharp, "...")' declarations.
> 
> 	- If a foreign type `bar' is used as an argument of a procedure
> 	  defined using `pragma foreign_proc(managed_cplusplus, ...)',
> 	  then you'd need to give both
> 	  `pragma foreign_type(bar, il, "...")' and
> 	  `pragma foreign_type(bar, managed_cplusplus, "...")' declarations.
> 
> In the longer term, we could write a parser for C# type syntax, and
> then you'd only need to give the C# `pragma foreign_type' declaration.
> We could then go even further and also write a parser for IL syntax too,
> and then you'd only need to give *either* the IL *or* the C# `pragma
> foreign_type' declaration.
> 
> More generally, speaking about interfacing with an arbitrary language:
> I think it would be easier for Mercury users if the `pragma foreign_type'
> syntax used the type syntax from the foreign language being interfaced with.
> 
> This does not require the Mercury implementation to be able to parse
> that syntax.  In some cases (e.g. for the Java back-end, compiling
> to Java source), the Mercury implementation may be able to use the name
> directly in the generated code.  In other cases, the Mercury implementation
> can use a generic type and box/unbox in the foreign_code wrappers if needed.
> Or it can parse the foreign language type syntax, which would avoid the
> need to represent foreign types using a generic type.  Whether or not
> it does so is a quality-of-implementation issue.
> 
> This approach has the following advantages:
> 
> 	- it's easy for users (they don't need to learn a new syntax,
> 	  they can just use the syntax of the language that they are
> 	  interfacing with)
> 
> 	- it's easy to implement in an OK manner
> 	  (mapping foreign types to a generic type),
> 	  without needing to parse the foreign language type syntax
> 
> 	- it's possible to implement in an ideal manner
> 	  (although this may require parsing the foreign language type
> 	  syntax)
> 
> In contrast, the approach of inventing Mercury syntax for foreign
> language types is easy to implement in an ideal manner, but this comes
> at the expense of complicating the Mercury language and making things
> harder for Mercury users.  I don't think that is a good trade-off.
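
For reference, the generic-type approach described above (foreign types
represented as `void *' and boxed/unboxed in the wrapper functions) can
be sketched roughly as follows.  This is only an illustration under
stated assumptions: `Box' is a stand-in for `MR_Box', the helper names
box_value, unbox_value, make_bar and bar_x are hypothetical, and malloc
is used in place of whatever allocator the runtime would really use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the generic word-sized type; an assumption, not the
   runtime's actual definition. */
typedef void *Box;

/* A foreign type whose structure the Mercury compiler never inspects. */
typedef struct foo {
    int x;
} bar;

/* Box a value of any size: copy it into a heap cell, return the cell. */
static Box box_value(const void *value, size_t size)
{
    Box box = malloc(size);
    assert(box != NULL);    /* sketch only: real code would report this */
    memcpy(box, value, size);
    return box;
}

/* Unbox: copy the value back out of the heap cell. */
static void unbox_value(Box box, void *out, size_t size)
{
    memcpy(out, box, size);
}

/* Wrapper functions: only these mention the C type name `bar'; the
   rest of the generated code would pass the Box around untouched. */
static Box make_bar(int x)
{
    bar b;
    b.x = x;
    return box_value(&b, sizeof b);
}

static int bar_x(Box boxed)
{
    bar b;
    unbox_value(boxed, &b, sizeof b);
    return b.x;
}
```

The point is that sizeof is evaluated by the C compiler, which has seen
the typedef, so nothing on the Mercury side needs to know the structure
of `bar'.  The cost is the extra copy and allocation on every crossing
of the foreign interface.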

The idea with the .NET backend was to try to invent syntax that was
pretty close to the Mercury syntax for foreign language types (e.g. what
that type would map into in Mercury).  This was trying to go along with
the general idea of .NET -- you don't necessarily have to learn a
foreign language to interface with it.  

But I guess this conflicts with the other backends where there is just
one foreign language, and so there is a definitive syntax for types.
So I am happy to go along with using the foreign language's own type
syntax for the moment.  I wanted to support it anyway.

So I will work on changing the syntax and implementation to support this
approach.

-- 
       Tyson Dowd           # 
                            #  Surreal humour isn't everyone's cup of fur.
     trd at cs.mu.oz.au        # 
http://www.cs.mu.oz.au/~trd #