[mercury-users] Addendum to: need help: Mercury+C/C++?
Fergus Henderson
fjh at cs.mu.OZ.AU
Wed Sep 2 13:04:24 AEST 1998
On 01-Sep-1998, tcklnbrg <at at ingenuity-sw.com> wrote:
> tcklnbrg wrote:
> > The lexer.c file contains one global var, "char * tokenStr",
>
> Addendum: The above is wrong: it is "string * tokenStr"
>
> This means that lexer.c, and presumably et.al.
> must be compiled with a c++ compiler,
> in this case g++ rather than gcc. The "string", above is the
> g++/std/ string.
OK, that complicates things a little. The Mercury implementation does
not (yet) support a direct C++ interface, it only has a C interface.
So if you want to interface with C++ code from Mercury, you need to
build a C wrapper around your C++ code.
To compile the C++ file lexer.c, you need to add the following to
your Mmakefile:
	CXX = g++
	CXXFLAGS = -Wall

	lexer.o : lexer.c
		$(CXX) $(CXXFLAGS) -c lexer.c
Actually I suggest calling it `lexer.cxx' or `lexer.cc'
rather than `lexer.c' if it is C++. Then you could add
a suffix rule to your Mmakefile
	.SUFFIXES: .cxx
	.cxx.o:
		$(CXX) $(CXXFLAGS) -c $<
and there'd be no need for the explicit rule for lexer.o.
(Suffix rules, `.SUFFIXES', and `$<' are features of standard Make.
Mmake is built on top of GNU Make, so you could alternatively use
GNU Make's pattern rules instead.)
Also, you need to build a C wrapper for the C++ code in lexer.c.
This will look something like this:
	/* lexer_interface.h */
	#ifdef __cplusplus
	extern "C" {
	#endif

	const char * C_tokenStr(void);
	char * C_tokenize(char *);
		/* maybe you should have `const' in there somewhere? */

	#ifdef __cplusplus
	}
	#endif
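	/* lexer_interface.cxx */
	#include <string>
	#include "lexer_interface.h"

	// Defined in the lexer; a pointer, per the addendum quoted above.
	extern std::string *tokenStr;
	// Signature assumed from the wrapper below; adjust to match the lexer.
	extern char *tokenize(char *);

	extern "C" const char * C_tokenStr() {
		// string::c_str() converts a C++ string to a C string
		return tokenStr->c_str();
	}
	extern "C" char * C_tokenize(char *s) {
		return tokenize(s);
	}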
The Mercury interface to the C++ function `tokenize'
and the C++ global variable `tokenStr' will look something like this:
	:- pred tokenize(string::in, string::out,
			io__state::di, io__state::uo) is det.
	:- pred tokenStr(string::out, io__state::di, io__state::uo) is det.
These can be implemented using Mercury's `pragma c_code' to
interface with the C interface that you wrapped around the C++ code.
	:- pragma c_header_code("#include ""lexer_interface.h""").

	:- pragma c_code(tokenize(Stuff::in, Token::out, IO0::di, IO::uo),
		will_not_call_mercury,
	"
		Token = make_aligned_string_copy(C_tokenize(Stuff));
		update_io(IO0, IO);
	").

	:- pragma c_code(tokenStr(Result::out, IO0::di, IO::uo),
		will_not_call_mercury,
	"
		const char *c_tokenStr = C_tokenStr();
		Result = make_aligned_string_copy(c_tokenStr);
		/* no free() here: the buffer belongs to the C++ string */
		update_io(IO0, IO);
	").
Now there is one tricky part of the above code that I have not yet explained,
and that is the calls to make_aligned_string_copy().
The values returned from C_tokenStr() and C_tokenize()
are not suitable for use as Mercury strings because
- they do not have the appropriate lifetime; that is,
the memory they are stored in may be deallocated
- they are not guaranteed to remain constant
- they are not guaranteed to be word-aligned
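The second point is easy to overlook, so here is a minimal sketch of it in plain C++ (not Mercury runtime code). The function `next_token_static` is a hypothetical stand-in for a lexer that reuses one internal buffer, and `snapshot` is just an illustrative name for "take a copy":

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a lexer that returns a pointer into a single
// static buffer and overwrites that buffer on every call.
const char *next_token_static(const char *input, std::size_t start,
                              std::size_t len) {
    static char buffer[64];
    if (len >= sizeof buffer) {
        len = sizeof buffer - 1;   // truncate rather than overflow
    }
    std::memcpy(buffer, input + start, len);
    buffer[len] = '\0';
    return buffer;
}

// Take a private copy of a C string so that later calls cannot change it.
std::string snapshot(const char *s) {
    return std::string(s);
}
```

If you call `next_token_static("foo bar", 0, 3)`, take a `snapshot` of the result, and then call `next_token_static("foo bar", 4, 3)`, the first pointer now reads "bar" while the snapshot still holds "foo". A Mercury string must never change like that, hence the copy.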
For C_tokenStr(), the return value is obtained from string::c_str().
According to the C++ standard, that pointer is only guaranteed to
remain valid until the next time the string is modified, so we need
to make a copy. The macro make_aligned_string_copy()
creates a copy on the Mercury heap, appropriately aligned.
(The Mercury implementation requires that Mercury strings be word-aligned so
that it can use the bottom two or three bits of the pointer as tag bits.)
This macro was introduced in Mercury version 0.7.2.
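To make the alignment requirement concrete, here is a rough sketch in plain C++ of copying a string into word-sized storage and then using the low bits of the pointer as a tag. It is not the Mercury macro: it allocates with `new[]` rather than on the Mercury heap, all of the names (`aligned_string_copy`, `add_tag`, and so on) are made up for illustration, and it assumes the usual behaviour of converting pointers to integers and back on mainstream platforms.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Low bits that are zero in any word-aligned address
// (3 on a 32-bit machine, 7 on a 64-bit machine).
const std::uintptr_t TAG_MASK = sizeof(std::uintptr_t) - 1;

// Copy a C string into storage allocated in whole machine words,
// so that the copy starts on a word boundary.
char *aligned_string_copy(const char *s) {
    std::size_t words =
        (std::strlen(s) + sizeof(std::uintptr_t)) / sizeof(std::uintptr_t);
    std::uintptr_t *storage = new std::uintptr_t[words];
    char *copy = reinterpret_cast<char *>(storage);
    std::strcpy(copy, s);
    return copy;
}

// Release storage obtained from aligned_string_copy().
void release_aligned_copy(char *copy) {
    delete [] reinterpret_cast<std::uintptr_t *>(copy);
}

bool is_word_aligned(const void *p) {
    return (reinterpret_cast<std::uintptr_t>(p) & TAG_MASK) == 0;
}

// Store a small tag in the otherwise unused low bits of the pointer.
std::uintptr_t add_tag(const char *p, std::uintptr_t tag) {
    return reinterpret_cast<std::uintptr_t>(p) | (tag & TAG_MASK);
}

std::uintptr_t tag_of(std::uintptr_t word) {
    return word & TAG_MASK;
}

// Recover the original pointer by masking the tag bits off again.
const char *strip_tag(std::uintptr_t word) {
    return reinterpret_cast<const char *>(word & ~TAG_MASK);
}
```

The point of the sketch is that `strip_tag(add_tag(p, t))` only gives back `p` when `p` was word-aligned to begin with. A pointer into the middle of some C++ buffer carries no such guarantee.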
For C_tokenize(), it's a bit more complicated, since you did not
specify the lifetime of the return value from tokenize(). If the
return value is allocated on the C or C++ heap, then it may be the
caller's responsibility to deallocate that memory, so you may
need to add a call to free() or `delete' to avoid a memory leak there.
If you need to deallocate the memory using `delete' (or `delete []')
then you will need to add a C interface to the C++ delete, e.g.
	extern "C" void C_delete_char_array(char *s) {
		delete [] s;
	}
in lexer_interface.cxx, with a matching prototype
`void C_delete_char_array(char *);' in lexer_interface.h.
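As a sketch of the copy-then-release pattern for the case where the caller owns the result: `tokenize_owned` below is a hypothetical tokenizer that allocates with `new []` (I don't know how your real tokenize() allocates), and `std::string` stands in for the copy that make_aligned_string_copy() would make on the Mercury side.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C++ tokenizer: returns the first whitespace-delimited token
// in an array allocated with new[], which the caller must delete[].
char *tokenize_owned(const char *input) {
    std::size_t len = std::strcspn(input, " \t\n");
    char *token = new char[len + 1];
    std::memcpy(token, input, len);
    token[len] = '\0';
    return token;
}

// C-callable wrappers around the C++ function and the C++ deallocation.
extern "C" char *C_tokenize_owned(const char *s) {
    return tokenize_owned(s);
}

extern "C" void C_delete_char_array(char *s) {
    delete [] s;
}

// The pattern: fetch the token, copy it, then release the original
// through the same language runtime that allocated it.
std::string take_token(const char *input) {
    char *raw = C_tokenize_owned(input);
    std::string copy(raw);
    C_delete_char_array(raw);
    return copy;
}
```

The order matters: the copy has to be made before the original is released. Memory from `new []` must go back through `delete []`, not free(), which is why the wrapper is needed at all.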
--
Fergus Henderson <fjh at cs.mu.oz.au> | "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh> | of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3 | -- the last words of T. S. Garp.