[m-dev.] Re: user-defined operators

Peter Schachte schachte at cs.mu.OZ.AU
Sat Jul 10 13:23:58 AEST 1999


On Fri, Jul 09, 1999 at 12:20:39PM +1000, Fergus Henderson wrote:
> Oh, here's another problem: how do you create the interface files?
> Currently this is done by reading in and parsing the file,
> and then writing out the interface section.
> But module interfaces are allowed to be mutually recursive!

Hmmm.  Yes, this can be a problem if, e.g., two mutually dependent
modules each define an operator used as a type name in the other
module.

> So if you can't parse the file until you've read in the interfaces
> of the imported modules, because you need the operator declarations,
> then you're in a bit of a bind.
> 
> A solution to this would be to have an additional pre-pass which reads
> in and tokenizes each module, parsing any operator declarations
> in the module and writing them out to separate files named
> `<module>.ops'.

The only way I can see to handle this correctly is, as you suggest, to
make sure that all op declarations are handled before trying to parse
any .int files.  I think you could fold the generation of the .ops
files into the extraction of .int files, though, by constraining the
form of op declarations such that they could be recognized from just
the list of tokens.  Basically, you just tokenize each item in the
file; if it's an :- interface or :- implementation declaration, then
you turn on/off the collection of the token lists for the .int file.
If it's an op declaration, you stick it in the .ops file.  The trick
is not to try to parse the token lists.
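
The token-level extraction described above can be sketched roughly as
follows.  This is only an illustration in Python, not Mercury compiler
code: the tokenizer is deliberately crude, every "." token is assumed to
end an item, and the `:- op(Priority, Type, Name)` form is the proposed
declaration under discussion, so its exact shape is an assumption.  The
function names (tokenize, split_items, extract) are hypothetical.

```python
import re

# Hypothetical, much-simplified tokenizer: ":-", names, integers, quoted
# atoms, and single punctuation characters.  Real Mercury lexing is richer.
TOKEN_RE = re.compile(r"\s*(:-|[A-Za-z_][A-Za-z0-9_]*|\d+|'[^']*'|[^\sA-Za-z0-9_])")

def tokenize(text):
    """Return the flat list of tokens in text."""
    return TOKEN_RE.findall(text)

def split_items(tokens):
    """Group tokens into items, assuming every "." token ends an item."""
    item = []
    for tok in tokens:
        if tok == ".":
            if item:
                yield item
            item = []
        else:
            item.append(tok)

def extract(text):
    """Return (int_items, op_items) as token lists; nothing is parsed."""
    int_items, op_items = [], []
    in_interface = False
    for item in split_items(tokenize(text)):
        if item[:2] == [":-", "interface"]:
            in_interface = True            # start collecting for the .int file
        elif item[:2] == [":-", "implementation"]:
            in_interface = False           # stop collecting
        elif item[:3] == [":-", "op", "("]:
            op_items.append(item)          # goes to the .ops file
        elif in_interface:
            int_items.append(item)         # goes to the .int file, unparsed
    return int_items, op_items

source = """
:- module m.
:- interface.
:- op(700, xfx, ===>).
:- type t ---> a ; b.
:- implementation.
:- pred p(int).
"""
int_items, op_items = extract(source)
```

Because op declarations are recognized purely from their leading tokens,
this pass never needs the operator table of any other module.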

> [*] AGNM = "agh, not more files!" :-),

Shouldn't that be ANMF?  :-)

> > > > I guess there is still the question of how to handle importation of
> > > > two different modules that have incompatible operator declarations
> > > 
> > > One possible approach is to use a partial order rather than a total
> > > order for the operator precedences.
> > 
> > This would give you some of the power of a real (context-free)
> > grammar, but also much of the pain of one.
> 
> I think the complexity of operator precedence parsing, even with
> partially ordered precedences, would be considerably less than
> the complexity needed to support arbitrary user-defined grammar rules.

Certainly changing to full context-free parsing, even without
user-defined grammar rules, would be a lot of work.  I suppose
efficient parsing with user-defined grammar rules would take some
work, but a simple nondeterministic recursive descent parser with
user-defined grammar rules without any {semantic actions} would be
pretty easy.
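
To give a feel for how small such a parser can be, here is a minimal
sketch of a nondeterministic recursive descent recognizer driven by a
user-supplied rule table, with no semantic actions.  It is written in
Python for illustration only; the grammar representation (a dict from
nonterminal to a list of alternative bodies) and the function names are
assumptions, and left-recursive rules would make it loop forever.

```python
def parse(grammar, symbol, tokens, pos):
    """Yield every position where `symbol` can end when started at `pos`."""
    if symbol not in grammar:
        # Anything without a rule is treated as a terminal token.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield pos + 1
        return
    for body in grammar[symbol]:           # try each alternative in turn
        yield from parse_seq(grammar, body, tokens, pos)

def parse_seq(grammar, body, tokens, pos):
    """Yield every end position for the sequence of symbols in `body`."""
    if not body:
        yield pos
        return
    for mid in parse(grammar, body[0], tokens, pos):
        yield from parse_seq(grammar, body[1:], tokens, mid)

def recognizes(grammar, start, tokens):
    """True if some derivation of `start` consumes all of `tokens`."""
    return any(end == len(tokens) for end in parse(grammar, start, tokens, 0))

# A user-defined grammar: right-recursive, so plain backtracking terminates.
grammar = {
    "expr": [["term", "+", "expr"], ["term"]],
    "term": [["x"], ["(", "expr", ")"]],
}
```

For example, recognizes(grammar, "expr", ["x", "+", "(", "x", ")"]) holds,
while the truncated input ["x", "+"] is rejected.  Backtracking like this
can take exponential time, which is where the "some work" for efficient
parsing would go.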

> > In fact, maybe more,
> > because you'd have to remember to declare a new operator's
> > relationship to *all* the other operators it's related to.
> 
> No, operator precedence relationships would be transitive, so typically
> you would only need to specify that the precedence of operator "foo"
> is the same as this other operator "bar", or that the precedence
> of "foo" is between that of "bar" and "baz".

Ah, I didn't realize you allowed statements of equivalence.  In that
case I agree it would be reasonably convenient.  Probably no worse
than the Prolog approach.
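
A partial order with equivalence statements might be modelled along the
following lines.  This is a hedged sketch in Python, not a proposal for
the actual Mercury implementation: the class and method names are
hypothetical, "tighter" means "binds more tightly than", and two
operators with no declared relationship simply compare as None
(unordered).

```python
class PrecedenceOrder:
    """Partially ordered operator precedences with equivalence classes."""

    def __init__(self):
        self.parent = {}    # union-find forest for "same precedence as"
        self.edges = set()  # (a, b) means a binds tighter than b

    def _find(self, op):
        self.parent.setdefault(op, op)
        while self.parent[op] != op:
            op = self.parent[op]
        return op

    def same(self, a, b):
        """Declare that a and b have the same precedence."""
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self.parent[ra] = rb

    def tighter(self, a, b):
        """Declare that a binds tighter than b."""
        self.edges.add((a, b))

    def _reaches(self, src, dst):
        # Depth-first search over equivalence classes; this is what makes
        # the declared relation transitive.
        seen, stack = {src}, [src]
        while stack:
            cur = stack.pop()
            for x, y in self.edges:
                if self._find(x) == cur:
                    nxt = self._find(y)
                    if nxt == dst:
                        return True
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
        return False

    def compare(self, a, b):
        """Return "same", "tighter", "looser", or None if unordered."""
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return "same"
        if self._reaches(ra, rb):
            return "tighter"
        if self._reaches(rb, ra):
            return "looser"
        return None

order = PrecedenceOrder()
order.same("foo", "bar")       # foo has the same precedence as bar
order.tighter("baz", "bar")    # baz binds tighter than bar
order.tighter("bar", "qux")    # bar binds tighter than qux
```

With these three declarations, compare("baz", "qux") is derived by
transitivity, and "foo" inherits every relationship of "bar", so a new
operator needs only one or two statements to be placed in the order.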

> Allowing user-written grammar rules of the kind shown above
> would require rather drastic changes to the way that Mercury
> syntax is currently defined.  Basically you'd be going from the current
> three-level grammar (tokens -> terms -> parse tree) to a two-level grammar
> (tokens -> parse tree).

Agreed.  I don't think this is a bad thing, but I agree it is a big
change.

> User-defined operators already let you change the tokens -> terms mapping.
> It might be possible to allow the same kind of effect as user-written
> grammar rules without such a drastic change to the way Mercury syntax
> is defined by providing (a) distfix operators
> and/or (b) some way for users to affect the terms -> parse tree
> mapping.

These would be nice (particularly distfix syntax), but wouldn't solve
the problem that you'd like to be able to use operators in different
ways depending on context.  But I don't want to argue that that
benefit of user-defined grammar rules is worth the upheaval of
introducing that feature.

> > Not if your term reading library has a basic operation to read a
> > module file and all the interfaces it depends on.  The user can always
> > ignore the interface information if they don't need it.
> 
> No, that wouldn't solve the problem.  If I want to write a preprocessor
> that adds new syntax to the language, then I won't be able to use the
> standard library operation which reads a Mercury module file and all the
> interfaces that it depends on, because the input to my preprocessor
> won't be a valid Mercury module, it will just be a file containing
> Mercury terms.

Ok, that's a particularly bad example.  For that, as you say, you'll
have to read the file an item at a time.  But I think what I'm
suggesting would work for the most common applications that want to
read code.

Anyway, this example would be better handled with cpp.


-- 
Peter Schachte                     True development puts first those that
mailto:schachte at cs.mu.OZ.AU        society puts last.
http://www.cs.mu.oz.au/~schachte/      -- Mahatma Gandhi 
PGP: finger schachte at 128.250.37.3  