[mercury-users] Bug in lexer.m

Nicholas Nethercote njn at csse.unimelb.edu.au
Tue Jul 15 21:40:30 AEST 2008


On Tue, 15 Jul 2008, Nicholas Nethercote wrote:

> :- pred string_ungetchar(string::in, posn::in, posn::out) is det.
>
> string_ungetchar(String, Posn0, Posn) :-
>    Posn0 = posn(LineNum0, LineOffset0, Offset0),
>    Offset = Offset0 - 1,
>    string.unsafe_index(String, Offset, Char),
>    ( Char = '\n' ->
>        LineNum = LineNum0 - 1,
>        Posn = posn(LineNum, Offset, Offset)
>    ;
>        Posn = posn(LineNum0, LineOffset0, Offset)
>    ).
>
>
> In a 'posn', the first arg is the current line number, the 3rd arg is the 
> current offset into the string being parsed, and the 2nd arg is the offset of 
> the start of the current line.
>
> When the above code goes back over a newline, the 2nd argument is incorrectly 
> set -- it gets set to 'Offset', which is the offset of the newline character 
> that ended the line.  It should be set to the offset of the beginning of that 
> line (which will be a number of characters earlier than 'Offset').  But by 
> this point the offset for the start of that line has been lost and cannot be 
> easily recreated.
>
> I suspect this has been wrong for a very long time, but nobody has noticed
> because the lexer doesn't ever use the 2nd field anywhere (it could be used 
> to report column numbers for lexing errors, but it isn't).  So perhaps the 
> 3-argument 'posn' type could be replaced with a 2-argument type, and some 
> memory would be saved.

And some more savings could be made if that 2-argument type were split into 
its two parts, which would then be passed around separately.  That adds extra 
arguments to lots of lexing predicates, but avoids the deconstruct/construct 
pair which currently occurs on every character.
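
As a rough sketch of what that might look like for string_ungetchar (this is
not the actual lexer.m change; the module name, the argument order and the
little main predicate are just assumptions for illustration), the line-start
field is dropped and the line number and offset are threaded as two separate
ints:

```mercury
:- module unget_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module int.
:- import_module string.

    % Hypothetical variant of string_ungetchar: no posn term at all.
    % The line number and the offset are separate in/out argument pairs,
    % so nothing is deconstructed or constructed per character.
    % The caller must ensure Offset0 > 0, since unsafe_index does no
    % bounds checking.
:- pred string_ungetchar(string::in, int::in, int::out, int::in, int::out)
    is det.

string_ungetchar(String, LineNum0, LineNum, Offset0, Offset) :-
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        % Stepping back over a newline returns us to the previous line.
        LineNum = LineNum0 - 1
    ;
        LineNum = LineNum0
    ).

main(!IO) :-
    % In "ab\ncd" the newline is at offset 2.  Start just past it
    % (offset 3, line 2) and step back one character.
    string_ungetchar("ab\ncd", 2, LineNum, 3, Offset),
    io.write_int(LineNum, !IO),
    io.nl(!IO),
    io.write_int(Offset, !IO),
    io.nl(!IO).
```

Stepping back over the newline here should print 1 and then 2.  With the
second field gone there is no line-start offset to get wrong, and the same
two argument pairs could be written with state variables (!LineNum, !Offset)
if the extra arguments get too noisy.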

I just made a change like this to the Zinc lexer, which is structured 
similarly, and got a nice speedup (I think it was about 20--30%, although I 
didn't measure it all that carefully).

Nick