[m-users.] A naming and access problem

Zoltan Somogyi zoltan.somogyi at runbox.com
Tue May 18 22:28:58 AEST 2021


2021-05-18 18:00 GMT+10:00 "Sean Charles (emacstheviking)" <objitsu at gmail.com>:
> Hi, I have a token definition like this:
> 
> :- type location
>            ---> pos(index::int, line::int, col::int).
> 
> % for ast detection later.
> :- type token
>            ---> tk(location, string)
>            ; '('(location) ; ')'(location)
>            ; '['(location) ; ']'(location)
>            ; '{'(location) ; '}'(location)
>            ; c1(location, string) ; cn(location, string)
>            ; s1(location, string) ; s2(location, string)
>            .
> 
> Given the above, I have now realised that I can't easily (?) access the location part of any particular instance of the token type because it doesn't have a name and thus no field reader will be generated. As I understand it, I would have to give a unique name.
> 
> I tried:
> 
>     arg(Tok1, canonicalize, 0, Out),
> 
> and was rewarded with this message from the compiler,
> 
> repl.m:199: In clause for predicate `run_lex'/3:
> repl.m:199: in argument 1 of call to predicate `lexer.on_string'/4:
> repl.m:199: in argument 2 of functor `string.between/3':
> repl.m:199: in unification of argument
> repl.m:199: and term `index(Out)':
> repl.m:199: type error in argument of functor `index'/1.
> repl.m:199: Argument 1 (Out) has type `(some [ArgT] ArgT)',
> repl.m:199: expected type was `lexer.location'.
> 
> Which is fair enough but how do I do it? That message is currently above my pay grade.
> Have I overlooked something blindingly obvious?

Using 'arg' here is the wrong thing to do for several reasons.

First, it would be fragile in the face of future development that would add
some field other than the location in front of what is now the first arg in some
of the token type's function symbols.

Second, the first argument of the different function symbols can be different.
In this case, they are not, but 'arg' cannot rely on that: it has to work even on types
whose function symbols have different types in the same argument positions.
This is why its type signature says "I am returning a value, but the type of that
value is not known at compile time". (That is what "some [ArgT] ArgT" means.)

Third, that is an existential type, and as I said earlier, that is not something
people new to Mercury should start with.

Your problem has a much simpler solution. You can simply write

:- func get_token_location(token) = location.

get_token_location(tk(Locn, _)) = Locn.
get_token_location(c1(Locn, _)) = Locn.
etc.
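
Spelled out for every function symbol of the token type quoted above, the
function might look like the sketch below. The module name token_location is
made up for illustration, the type definitions are copied from the original
question, and the code has not been run through the compiler here.

```mercury
:- module token_location.
:- interface.

:- type location
    --->    pos(index :: int, line :: int, col :: int).

:- type token
    --->    tk(location, string)
    ;       '('(location) ; ')'(location)
    ;       '['(location) ; ']'(location)
    ;       '{'(location) ; '}'(location)
    ;       c1(location, string) ; cn(location, string)
    ;       s1(location, string) ; s2(location, string).

    % Return the location stored in a token, whichever function symbol
    % it was built with.
:- func get_token_location(token) = location.

:- implementation.

% One clause per function symbol. Together the clauses form a complete
% switch on the argument, so the function is det.
get_token_location(tk(Locn, _)) = Locn.
get_token_location('('(Locn)) = Locn.
get_token_location(')'(Locn)) = Locn.
get_token_location('['(Locn)) = Locn.
get_token_location(']'(Locn)) = Locn.
get_token_location('{'(Locn)) = Locn.
get_token_location('}'(Locn)) = Locn.
get_token_location(c1(Locn, _)) = Locn.
get_token_location(cn(Locn, _)) = Locn.
get_token_location(s1(Locn, _)) = Locn.
get_token_location(s2(Locn, _)) = Locn.
```

With this in place, the index that the failing call was after should be
available as get_token_location(Tok1) ^ index, with no need for 'arg'. If a
new function symbol is later added to the token type, the compiler should
report the missing clause as a determinism error rather than letting the
omission pass silently.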

By the way, I advise you against using character constants as function symbols.
Mercury supports this because its syntax was originally the same as the syntax
of Prolog (the Mercury parser started out as an existing parser for Prolog), but
just because you can do something, does not mean you should. Using e.g.
lparen and rparen instead of '(' and ')' as function symbol names would
essentially eliminate the chance of code not being parsed right without
quoting, and would make error messages much easier to read. As you can see,
all the function symbols in the token type in Mercury's own lexer contain nothing
but lower case letters and underscores.
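
As a rough sketch of that advice applied to the token type from the question:
lparen and rparen are the names suggested above, while lbracket, rbracket,
lbrace, rbrace and the module name token_names are assumptions picked here to
follow the same pattern.

```mercury
:- module token_names.
:- interface.

:- type location
    --->    pos(index :: int, line :: int, col :: int).

    % Same shape as the original token type, but every function symbol
    % is a plain name that never needs quoting.
:- type token
    --->    tk(location, string)
    ;       lparen(location)        % was '('
    ;       rparen(location)        % was ')'
    ;       lbracket(location)      % was '['
    ;       rbracket(location)      % was ']'
    ;       lbrace(location)        % was '{'
    ;       rbrace(location)        % was '}'
    ;       c1(location, string)
    ;       cn(location, string)
    ;       s1(location, string)
    ;       s2(location, string).

:- implementation.
```

Code that matches on these tokens can then be written as lparen(Locn) rather
than '('(Locn), and any error message that mentions a token names it without
nested quotes.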

Zoltan.