[mercury-users] Bug in lexer.m
Nicholas Nethercote
njn at csse.unimelb.edu.au
Tue Jul 15 14:57:24 AEST 2008
Hi,
lexer.m has this predicate:
:- pred string_ungetchar(string::in, posn::in, posn::out) is det.

string_ungetchar(String, Posn0, Posn) :-
    Posn0 = posn(LineNum0, LineOffset0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        LineNum = LineNum0 - 1,
        Posn = posn(LineNum, Offset, Offset)
    ;
        Posn = posn(LineNum0, LineOffset0, Offset)
    ).
In a 'posn', the first argument is the current line number, the second is
the offset of the start of the current line, and the third is the current
offset into the string being parsed.
When the above code goes back over a newline, the 2nd argument is
incorrectly set -- it gets set to 'Offset', which is the offset of the
newline character that ended the line. It should be set to the offset of
the beginning of that line (which will be a number of characters earlier
than 'Offset'). But by this point the offset for the start of that line has
been lost and cannot be easily recreated.
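To illustrate what recreating the lost offset would involve, here is a minimal sketch that scans backwards from the newline to find where the preceding line begins. The names find_line_start and string_ungetchar_fixed and the local posn type are made up for this example and are not part of lexer.m. The scan is linear in the length of the line, so this is only one possible approach.

```mercury
:- module ungetchar_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module string.

    % Local stand-in for the lexer's position type (an assumption):
    % posn(LineNum, LineOffset, Offset).
:- type posn
    --->    posn(int, int, int).

    % Hypothetical helper: return the offset of the first character of
    % the line that contains the character at Offset, by walking
    % backwards until the previous newline or the start of the string.
:- pred find_line_start(string::in, int::in, int::out) is det.

find_line_start(String, Offset, LineStart) :-
    ( Offset =< 0 ->
        LineStart = 0
    ;
        string.unsafe_index(String, Offset - 1, PrevChar),
        ( PrevChar = '\n' ->
            LineStart = Offset
        ;
            find_line_start(String, Offset - 1, LineStart)
        )
    ).

    % Same shape as string_ungetchar, except that the second field is
    % recomputed when stepping back over a newline.
:- pred string_ungetchar_fixed(string::in, posn::in, posn::out) is det.

string_ungetchar_fixed(String, Posn0, Posn) :-
    Posn0 = posn(LineNum0, LineOffset0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        LineNum = LineNum0 - 1,
        find_line_start(String, Offset, LineOffset),
        Posn = posn(LineNum, LineOffset, Offset)
    ;
        Posn = posn(LineNum0, LineOffset0, Offset)
    ).

main(!IO) :-
    % In "ab\ncd", line 1 starts at offset 0 and line 2 at offset 3.
    % Start just after the newline and step back over it.
    string_ungetchar_fixed("ab\ncd", posn(2, 3, 3), Posn),
    io.write(Posn, !IO),
    io.nl(!IO).
```

For this input the result should have line number 1, line-start offset 0 and offset 2, whereas the predicate quoted above would leave the line-start offset at 2.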
I suspect this has been wrong for a very long time, but nobody has noticed
because the lexer doesn't ever use the 2nd field anywhere (it could be used
to report column numbers for lexing errors, but it isn't). So perhaps the
3-argument 'posn' type could be replaced with a 2-argument type, and some
memory would be saved.
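A rough sketch of what the 2-argument version might look like, assuming a hypothetical line_posn type that holds only the line number and the current offset. Neither the type nor the predicate name exists in the library; they are invented here for illustration.

```mercury
:- module ungetchar_two_field.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module string.

    % Hypothetical two-field position: line_posn(LineNum, Offset).
    % With no line-start field there is nothing to get wrong when
    % stepping back over a newline.
:- type line_posn
    --->    line_posn(int, int).

:- pred string_ungetchar2(string::in, line_posn::in, line_posn::out)
    is det.

string_ungetchar2(String, Posn0, Posn) :-
    Posn0 = line_posn(LineNum0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        Posn = line_posn(LineNum0 - 1, Offset)
    ;
        Posn = line_posn(LineNum0, Offset)
    ).

main(!IO) :-
    % Step back over the newline in "ab\ncd", from line 2 to line 1.
    string_ungetchar2("ab\ncd", line_posn(2, 3), Posn),
    io.write(Posn, !IO),
    io.nl(!IO).
```

If column numbers were ever wanted for error messages, they could be computed on demand from the string and the offset rather than being carried around in every position value.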
Nick