[mercury-users] Bug in lexer.m

Nicholas Nethercote njn at csse.unimelb.edu.au
Tue Jul 15 14:57:24 AEST 2008


Hi,

lexer.m has this predicate:


:- pred string_ungetchar(string::in, posn::in, posn::out) is det.

string_ungetchar(String, Posn0, Posn) :-
     Posn0 = posn(LineNum0, LineOffset0, Offset0),
     Offset = Offset0 - 1,
     string.unsafe_index(String, Offset, Char),
     ( Char = '\n' ->
         LineNum = LineNum0 - 1,
         Posn = posn(LineNum, Offset, Offset)
     ;
         Posn = posn(LineNum0, LineOffset0, Offset)
     ).


In a 'posn', the first argument is the current line number, the second is 
the offset of the start of the current line, and the third is the current 
offset into the string being parsed.

When the above code goes back over a newline, the 2nd argument is 
incorrectly set -- it gets set to 'Offset', which is the offset of the 
newline character that ended the line.  It should be set to the offset of 
the beginning of that line (which will be a number of characters earlier 
than 'Offset').  But by this point the offset for the start of that line has 
been lost and cannot be easily recreated.
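
One way to recreate it, at the cost of a backward scan over the line, is 
sketched below.  This is an untested illustration rather than a patch: the 
module name, the 'lex_posn' type (a local stand-in for 'posn') and the 
helper 'find_line_start' are all made up for the example.  It assumes the 
second field should hold the offset of the first character of the line.

```mercury
:- module unget_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

    % Hypothetical stand-in for 'posn':
    % lex_posn(LineNum, LineStartOffset, Offset).
:- type lex_posn
    --->    lex_posn(int, int, int).

:- pred string_ungetchar(string::in, lex_posn::in, lex_posn::out) is det.

string_ungetchar(String, Posn0, Posn) :-
    Posn0 = lex_posn(LineNum0, LineOffset0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        LineNum = LineNum0 - 1,
        % Offset is the newline that ends the previous line, so scan
        % backwards from it to find where that line starts.
        find_line_start(String, Offset, LineOffset),
        Posn = lex_posn(LineNum, LineOffset, Offset)
    ;
        Posn = lex_posn(LineNum0, LineOffset0, Offset)
    ).

    % find_line_start(String, Offset, LineStart):
    % LineStart is the offset just after the nearest newline strictly
    % before Offset, or 0 if there is no such newline.
:- pred find_line_start(string::in, int::in, int::out) is det.

find_line_start(String, Offset, LineStart) :-
    ( Offset =< 0 ->
        LineStart = 0
    ;
        Prev = Offset - 1,
        string.unsafe_index(String, Prev, PrevChar),
        ( PrevChar = '\n' ->
            LineStart = Offset
        ;
            find_line_start(String, Prev, LineStart)
        )
    ).

main(!IO) :-
    % Offsets: a=0 b=1 \n=2 c=3 d=4 \n=5 e=6 f=7.
    % Start just after the second newline and step back over it.
    string_ungetchar("ab\ncd\nef", lex_posn(3, 6, 6), Posn),
    Posn = lex_posn(LineNum, LineStart, Offset),
    io.format("%d %d %d\n", [i(LineNum), i(LineStart), i(Offset)], !IO).
```

Stepping back over the newline at offset 5 should give a line start of 3 
(the 'c'), where the current code would give 5.  The scan makes ungetting a 
newline linear in the length of the line, which may or may not matter in 
practice.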

I suspect this has been wrong for a very long time, but nobody has noticed
because the lexer doesn't ever use the 2nd field anywhere (it could be used 
to report column numbers for lexing errors, but it isn't).  So perhaps the 
3-argument 'posn' type could be replaced with a 2-argument type, and some 
memory would be saved.
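
As a rough sketch of what the 2-argument version might look like (again 
untested; 'posn2' and 'string_ungetchar2' are names invented for this 
example, and any other code that builds or takes apart 'posn' values would 
need the same change):

```mercury
:- module posn2_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module list.
:- import_module string.

    % Hypothetical two-field position: posn2(LineNum, Offset).
    % The line start offset is dropped entirely.
:- type posn2
    --->    posn2(int, int).

:- pred string_ungetchar2(string::in, posn2::in, posn2::out) is det.

string_ungetchar2(String, Posn0, Posn) :-
    Posn0 = posn2(LineNum0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        % Going back over a newline only needs the line number adjusted.
        Posn = posn2(LineNum0 - 1, Offset)
    ;
        Posn = posn2(LineNum0, Offset)
    ).

main(!IO) :-
    % Offsets: a=0 b=1 \n=2 c=3 d=4.  Start at 'c' on line 2.
    string_ungetchar2("ab\ncd", posn2(2, 3), Posn),
    Posn = posn2(LineNum, Offset),
    io.format("%d %d\n", [i(LineNum), i(Offset)], !IO).
```

With only two fields there is nothing left to get wrong when backing over a 
newline, though column numbers would then have to be computed on demand if 
anyone ever wanted them.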

Nick