[m-dev.] Re: \r as whitespace

Fergus Henderson fjh at cs.mu.OZ.AU
Fri May 11 22:21:04 AEST 2001


On 20-Mar-2001, Paul Massey <pma at miscrit.be> wrote:
> 
> Our web server written in Mercury has a DLL at the front of
> it to package requests from the browsers. Netscape appears
> to convert \r\n into \n before sending it to the DLL (from
> what I can see), but IE doesn't (unsurprisingly). The DLL
> then sends the request via a socket to the server which then
> uses read_term to recover the query structure (if the string
> contains \r as well as \n then the read_term fails and
> leaves the server in a strange state with a partial query on
> the input stream).

How are you reading the term in from the socket?

If you are reading it in using Mercury's io__read_char etc., or C's
getchar() etc., then the \r\n should get converted into \n automatically,
provided you opened the C FILE* in text mode.
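
If the bytes never pass through a text-mode FILE* (for example, if they
are read raw from the socket), nothing does that conversion for you.
A minimal sketch of doing it by hand follows; crlf_to_lf is a
hypothetical helper name, not something in Mercury or the C library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert every "\r\n" pair in buf[0..len) to "\n", in place, and
 * return the new length.  A '\r' that is not followed by '\n' is kept.
 * Hypothetical helper for illustration only.
 */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in;
    size_t out = 0;

    assert(buf != NULL || len == 0);
    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n') {
            continue;   /* drop the '\r'; the '\n' is copied on the next iteration */
        }
        buf[out++] = buf[in];
    }
    return out;
}
```

Note that this only looks within one buffer: a \r\n pair split across
two separate reads would not be caught, so you would want to apply it
to the whole request, not chunk by chunk.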

> Also, I have programs I develop under both windows and
> Linux. I've run into problems when \r is not recognised by
> the compiler as a white space because files copied across
> naively have \r at the ends of lines, etc. (I could have
> used dos2unix or whatever but why make it difficult?).

You'd still need to use dos2unix on your Mmakefiles,
and for C code embedded in Mercury code that uses line continuations, e.g.

	:- pragma c_header_code("
		#define DECLARE_FOO \\
			int foo;
	").

Also Mercury string constants that extend over more than one line, e.g.

	main --> io__write_string("
		A long message
		goes here
	").

would get the wrong semantics if you don't use dos2unix or equivalent,
since the string literal would now contain extra carriage returns.
I think it's much better to get a compiler error up-front, rather than
having the compiler accept it and generate subtly incorrect code.

> Treating \r as white space would simplify both these
> problems.

I think it would only mask some of the problems.
Other problems like the ones mentioned above would remain.
Using dos2unix or equivalent seems like a better solution.

> I'm not sure how correct it is, but
> 
> http://www.sics.se/isl/sicstus/docs/latest/html/sicstus.html#Full%20Syntax
[...]

The relevant part of that is

   layout-char
             These are character codes 0..32 and 127. This includes
             characters such as <TAB>, <LFD>, and <SPC>.

In other words, SICStus treats *all* control characters as whitespace.
That seems like a pretty dumb thing to do, IMHO.

> Maybe we could check against a real copy (I don't have one
> around) of the ISO Prolog standard.

The corresponding part of the Prolog standard says

	6.5.4 Layout characters

	layout char (* 6.5.4 *)
	  = space char (* 6.5.4 *)
	  | horizontal tab char (* 6.5.4 *)
	  | new line char (* 6.5.4 *) ;

	space char (* 6.5.4 *) = " " ;
	horizontal tab char (* 6.5.4 *) = implementation dependent ;
	new line char (* 6.5.4 *) = implementation dependent ;

The Prolog standard says that an implementation may as an extension add
extra syntax rules, so long as syntactically correct Prolog programs
remain valid.  So SICStus' extension is not of itself non-conforming.
However, a conforming Prolog implementation must "offer a strictly
conforming mode which shall reject the use of an implementation specific
feature in Prolog text".  I don't think SICStus Prolog does that.

Programs that rely on such extensions are not strictly conforming Prolog
programs and may not be portable to other Prolog implementations.
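
To make the difference concrete, here is a rough sketch of the two
definitions as C predicates.  The function names are made up for
illustration, the character set is assumed to be ASCII, and since the
standard leaves the tab and new line characters implementation
dependent, '\t' and '\n' are my assumption for those.

```c
#include <assert.h>

/* ISO 6.5.4 layout char, ASSUMING the implementation-dependent
   horizontal tab and new line characters are '\t' and '\n'. */
int is_layout_iso(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* SICStus layout-char as quoted above: codes 0..32 and 127. */
int is_layout_sicstus(int c)
{
    return (c >= 0 && c <= 32) || c == 127;
}

/* Count the codes in 0..127 on which the two definitions disagree. */
int layout_disagreements(void)
{
    int c;
    int count = 0;

    for (c = 0; c <= 127; c++) {
        /* under the assumptions above, the ISO set is a subset */
        assert(!is_layout_iso(c) || is_layout_sicstus(c));
        if (is_layout_iso(c) != is_layout_sicstus(c)) {
            count++;
        }
    }
    return count;
}
```

Under those assumptions '\r' is layout for is_layout_sicstus but not
for is_layout_iso, which is exactly the case this thread is about.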

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
                                    |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


