[m-rev.] for review: better error messages for lambda exprs

Zoltan Somogyi zoltan.somogyi at runbox.com
Fri May 6 16:53:02 AEST 2016



On Fri, 6 May 2016 16:02:12 +1000, Peter Wang <novalazy at gmail.com> wrote:
> On Thu, 05 May 2016 22:28:04 +1000 (AEST), "Zoltan Somogyi" <zoltan.somogyi at runbox.com> wrote:
> > For review by anyone. If possible, I would like a review of the
> > change to NEWS by more than one person.
> > 
> > Like several of my diffs in the last few months, this one improves
> > the error messages we generate for syntax errors, but restricts
> > the set of legal Mercury programs. If anyone objects to the restrictions
> > imposed by this diff, please speak up.
> 
> Hi Zoltan,
> 
> I object to the restriction.

Is this objection philosophical or practical? And to which parts
of the change? The parts for lambda expressions, the parts for
the ^, :=, : or @ operators, or all of them?
 
> Firstly, on the grounds of language design.  IMHO Prolog syntax is well
> designed, flexible and general.

It is certainly flexible and general, but I don't think it is well designed.
In fact, I don't think it was consciously designed at all in programming
language terms, being more of an incidental aspect of the exploration
of "can one use Horn clause logic as a programming language?", with
all the focus of the designers being on the other aspects of the language.
It was certainly not designed with software engineering principles in mind,
since most of those didn't become widely known until much later, in the
late 1970s or even the 80s.

Prolog was first implemented around 1970. PL/I, the most hyped language
at the time, had no reserved words. Since its keywords could be used as
identifiers, programmers could write statements such as this:

\Red{IF} THEN \Red{THEN} THEN = ELSE; \Red{ELSE} ELSE = THEN;

with the \Red{}-wrapped words acting as keywords and the other words as
variable names. (The example is from my lecture slides.) This liberality of allowing
programs to use words (or in our case, operators) that have a meaning
in the language for other, completely unrelated purposes has been shown
by experience to be a bad thing. There is a reason why languages designed
from the late 70s onward do NOT have this liberality. And (this is
a key point) people who care about the clarity of their code don't take
advantage of it even when the language allows it.

The data I was trying to acquire by sampling the (admittedly biased)
set of Mercury developers was "what fraction of Mercury code is written
in a style that reuses language keywords and/or operators for unrelated
purposes?". Is that fraction negligible or not?

> My feeling is that we should continue
> on those foundations, so that any syntactically valid, well-typed data
> term should be writable in a Mercury program.

Sure. The question is about what terms we accept as syntactically valid.

> Second, practically.  Occasionally the Mercury language "takes over"
> new words, sometimes common words like "event" and "for".  For programs
> that were using such words (or wish to), it would have been an annoyance
> when the words were added to the operator table, but a relatively
> minor annoyance as only simple names needed to be parenthesized.  As I
> understand it, this would no longer be the case.  Any future version of the
> language may require some constructors to be renamed, not just make them
> harder to type.

That is a fact of life for EVERY language. I remember when I was evaluating
whether to buy some HP servers for the department for use by students
(sometime in the mid 1990s), I tested the HP C compiler, and found that
it could not compile my perfectly standard C program, because it was a
compiler that could compile both C and C++, and it considered "new" to be
a keyword even when compiling C. And this couldn't be fixed by putting
the variable name "new" in parentheses either.

The only languages that NEVER have any new keywords added to them
are the ones that don't have any, and the ones that no-one uses.

> It can be convenient to use term syntax to persist data structures
> (perhaps using custom operator tables to protect against new operators).
> If a program is forced to rename a constructor then the data format would
> implicitly change.  To retain compatibility with an old constructor
> name, the program would need to implement versions of term_to_type/
> type_to_term that translate between the old and new names.  That potential
> for problems down the track detracts from the utility of the term syntax
> (which, as I said, I regard as a good design).

If anyone has file-stored terms which would be affected by this change,
they can write a simple translation program, compiled with a compiler
from BEFORE this change, that reads in those terms, constructs from them
structurally identical terms in another type that has the problematic function
symbols renamed, and writes them out again. No need to rewrite term_to_type.
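
A minimal sketch of such a translation program might look like the
following. The module name, both types, and the function symbols
old_name and new_name are hypothetical placeholders, with old_name
standing in for whichever function symbol the change affects. It assumes
the stored file is a sequence of terms, each terminated by a full stop,
as io.read expects.

```mercury
:- module rename_terms.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module list.
:- import_module string.

    % Hypothetical type as stored on disk. old_name stands in for
    % the function symbol that the change would no longer accept.
:- type old_item
    --->    old_name(int, string)
    ;       unaffected(int).

    % Structurally identical type; only the problematic symbol is renamed.
:- type new_item
    --->    new_name(int, string)
    ;       unaffected(int).

:- func translate(old_item) = new_item.

translate(old_name(N, S)) = new_name(N, S).
translate(unaffected(N)) = unaffected(N).

    % Read terms from standard input until end-of-file, writing
    % each translated term to standard output in readable form.
main(!IO) :-
    io.read(Result, !IO),
    (
        Result = ok(Old),
        io.write(translate(Old), !IO),
        io.write_string(".\n", !IO),
        main(!IO)
    ;
        Result = eof
    ;
        Result = error(Msg, LineNumber),
        io.stderr_stream(Stderr, !IO),
        io.format(Stderr, "line %d: %s\n",
            [i(LineNumber), s(Msg)], !IO),
        io.set_exit_status(1, !IO)
    ).
```

Built with a pre-change compiler, this would be run once per data file,
for example as "rename_terms < old.data > new.data" (file names are
illustrative).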

We could add a pointer to this technique to the NEWS file.

Zoltan.



