[mercury-users] Re: Lisp-like syntax for Mercury (Mercury & macros)

Richard A. O'Keefe ok at cs.otago.ac.nz
Wed Jun 12 13:25:19 AEST 2002


schachte at cs.mu.OZ.AU (Peter Schachte) wrote:
	Unfortunately, character_escapes must be explicitly enabled, and in
	general there's no way to tell by looking at it if a file is meant to
	be compiled with character escapes on or off.

Agreed.

	It's also rather a nuisance, and error-prone to boot, to turn
	on character escapes for a single file.

This is indeed a defect.  It's also a problem that various style checks
cannot readily be set for a single file.  There *ought* to have been a
directive

:- compile_flags([Flag...]).

where the compiler would have pushed the old values of the flags on a
stack at that point, then popped them off when it finished processing
the file.  Hindsight is a wonderful thing.
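The mechanics such a directive implies are simple: a stack of saved flag sets, pushed when the directive is seen and popped when the file ends. A minimal C sketch, purely for illustration (the flag names are invented, and this is not an actual Mercury or Quintus feature):

```c
#include <assert.h>

/* Hypothetical per-file flag handling: a :- compile_flags([...])
 * directive pushes the current flag set; finishing the file pops it.
 * The individual flags here are invented for the example. */

enum { MAX_DEPTH = 32 };

typedef struct {
    int character_escapes;    /* invented flag */
    int strict_style_checks;  /* invented flag */
} flag_set;

static flag_set stack[MAX_DEPTH];
static int depth = 0;
static flag_set current = { 0, 0 };

/* Called when the compiler sees :- compile_flags([...]). */
void push_flags(flag_set new_flags) {
    assert(depth < MAX_DEPTH);
    stack[depth++] = current;    /* save the old values */
    current = new_flags;
}

/* Called when the compiler finishes processing the file. */
void pop_flags(void) {
    assert(depth > 0);
    current = stack[--depth];    /* restore the old values */
}
```

Nesting comes for free: an included file could push its own flags and the enclosing file's settings would be restored automatically.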

	It's unfortunate that character escapes weren't made an always
	on feature, with the prohibition on unescaped newlines.
	
Backwards compatibility was one of QP's selling points.

I've still got the Sun-3/50 I used at Quintus, with 4MB of memory.
What a lot it seemed back then, and what a lot we didn't do because
we wanted the system to be seen (rightly) as small.

	> But the real secret is "in order to move backwards, tokenise FORWARDS".
	
	Of course, this isn't easy in a text editor.  You pretty much have to
	rely on the rule that all and only clause heads begin in column 1.

It all depends on how smart you are willing to be.
If you want to do syntax colouring in ANY language, then
 - either you keep tight control over what the user does,
 - or you occasionally let syntax colouring get out of synch
   (Alpha seems to do this)
 - or you do something clever like recording some kind of syntax state
   at the beginning of each line, scanning back to the last line whose
   start state you know, and parsing forward from there
 - or you use a layout heuristic for finding useful starting points.

	Except for % comment lines.

Which are very properly ignored by my editor.

	And there is a performance hit involved
	in finding the beginning of the clause and tokenizing it forward to
	the current location every time you do anything.
	
Even on my 250MHz machine, I would be astonished if anyone noticed that
overhead.  With 1.7 GHz machines out there, well...

I had heard that current Emacs implementations use a doubly linked list
of lines.  If you tag each line with its start state, then you normally
don't have to scan back any further than the beginning of the current
line.
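The per-line start-state idea can be sketched in a few lines. This is a deliberately tiny lexer, not a real Prolog or Mercury tokenizer: it knows only %-comments and quoted atoms, and ignores character escapes. The point is just that scanning one line from a known start state yields the start state of the next line, so an editor only rescans the current line:

```c
/* Minimal per-line lexer-state tagging.  The state set and the
 * language subset (quoted atoms and %-comments only, no escapes)
 * are illustrative, not a full Prolog/Mercury tokenizer. */

typedef enum { CODE, IN_QUOTE } lex_state;

/* Scan one line given the state at its start; return the state at
 * the start of the next line.  A %-comment ends at the newline, so
 * it never carries state across lines; an unclosed quoted atom does. */
lex_state scan_line(const char *line, lex_state start) {
    lex_state s = start;
    for (const char *p = line; *p; p++) {
        if (s == CODE) {
            if (*p == '%') break;        /* rest of line is a comment */
            if (*p == '\'') s = IN_QUOTE;
        } else {                         /* IN_QUOTE */
            if (*p == '\'') s = CODE;
        }
    }
    return s;
}
```

An editor keeping a list of lines tags each line with the `lex_state` returned for the line above; editing line N invalidates tags from N+1 on, but they usually converge again within a line or two.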

	> Newline at same indentation: <LF>.
	> Increase indentation: <ESC><Ctrl-I>
	> Decrease indentation: <ESC><Ctrl-U>
	> 
	> Given these keys, I find getting the indentation right manually
	> is practically effortless
	
	You are still forced to think about your program as text.  I'd rather
	think about it as predicates and terms.  Only the comments are text.
	
Well, I think literate programming is a great idea, so my programs ARE text
meant for people to read.  In many cases, good layout (putting extra spaces
in to make things that _are_ tabular _look_ tabular) is part of "quality"
editing for readability.  I also like writing programs that write programs.
For example, several of my recent C programs contain what looks like
embedded XML.  There's a preprocessor that turns the XML bits into C code
using the DVM library to build suitable data structures.  There isn't ANY
indenting editor I can trust to get that right, because something like

xml person(xml p, int want_e_mail_and_or_id) {
    char const *e_mail = value(p, "e-mail", 0);

.   <span>
      if (want_e_mail_and_or_id >= 0) {
.       <a name=value(p, "id", 0) id=value(p, "id", 0)>
.         ^first_child_named(p, "name")
.       </a>
      } else {
.       ^first_child_named(p, "name")
      }
      if (first_child_named(p, "org")) {
.       ", " ^first_child_named(p, "org")
      }
.     ".\n"
      if (want_e_mail_and_or_id > 0 && e_mail != 0) {
.       <a href=str_printf("mailto:%s", e_mail)> =e_mail </a>
      }
.   </span>
    return x_pop();
}

doesn't follow normal C rules.  It doesn't follow normal XML rules either.
By thinking of my program (C in this case) as text, I am enabled to think
of automatically turning it into _different_ text for compilation.

	My point was that "." followed by whitespace does not necessarily end
	a clause, and telling whether it does or not is difficult.

It does whenever "." is the start of a token.
It is not difficult if you parse forward.
On a file of any reasonable size, the time required to parse forward is just
not going to be noticed.  (When I was at Quintus, it was a couple of
years before I stopped using C Prolog as a preliminary syntax checker,
because it read Prolog *fast*.  The parser in my editor works in basically
the same way as the C Prolog parser.  It's fast.)
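To make the "parse forward" point concrete, here is a simplified sketch of the end-of-clause test. It applies the rule above: a "." ends a clause only when it starts a token (i.e. is not glued to other symbol characters, as in `=..`) and is followed by whitespace or end of input. Quoted atoms and %-comments are skipped wholesale. Character escapes, 0'c literals, and block comments are deliberately omitted; this is not the C Prolog parser, just an illustration of the technique:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for Prolog symbol characters that can glue onto a "." */
static int is_symbol_char(int c) {
    return strchr("+-*/\\^<>=~:.?@#&$", c) != NULL;
}

/* Tokenize forward through s, counting clause-terminating "."s.
 * Simplified: no character escapes, no block comments. */
int count_clause_ends(const char *s) {
    int count = 0;
    int prev_symbol = 0;   /* previous char was a symbol char */
    for (const char *p = s; *p; p++) {
        if (*p == '%') {                       /* skip line comment */
            while (*p && *p != '\n') p++;
            if (!*p) break;
            prev_symbol = 0;
        } else if (*p == '\'') {               /* skip quoted atom */
            p++;
            while (*p && *p != '\'') p++;
            if (!*p) break;                    /* unterminated quote */
            prev_symbol = 0;
        } else if (*p == '.' && !prev_symbol &&
                   (p[1] == '\0' || isspace((unsigned char)p[1]))) {
            count++;                           /* "." starts a token and
                                                  is followed by layout */
            prev_symbol = 0;
        } else {
            prev_symbol = is_symbol_char(*p);
        }
    }
    return count;
}
```

Note how `X =.. L.` yields one clause end, not three: the first two dots are glued to a preceding symbol character, so neither starts a token.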

	In fact, because of character escapes, it's not in general
	possible to tell just by looking at the file.
	
Operator declarations too.  One reason I don't greatly care for having
lots of operators.  I wish Quintus had approved my "syntax table" proposal
for ISO instead of sitting on me and telling me not to get involved.
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------


