[mercury-users] Bug in lexer.m

Paul Bone pbone at csse.unimelb.edu.au
Thu Jul 17 12:57:35 AEST 2008


On Tue, Jul 15, 2008 at 09:40:30PM +1000, Nicholas Nethercote wrote:
> On Tue, 15 Jul 2008, Nicholas Nethercote wrote:
> 
> >:- pred string_ungetchar(string::in, posn::in, posn::out) is det.
> >
> >string_ungetchar(String, Posn0, Posn) :-
> >   Posn0 = posn(LineNum0, LineOffset0, Offset0),
> >   Offset = Offset0 - 1,
> >   string.unsafe_index(String, Offset, Char),
> >   ( Char = '\n' ->
> >       LineNum = LineNum0 - 1,
> >       Posn = posn(LineNum, Offset, Offset)
> >   ;
> >       Posn = posn(LineNum0, LineOffset0, Offset)
> >   ).
> >
> >
> >In a 'posn', the first arg is the current line number, the 3rd arg is the 
> >current offset into the string being parsed, and the 2nd arg is the offset 
> >of the start of the current line.
> >
> >When the above code goes back over a newline, the 2nd argument is 
> >incorrectly set -- it gets set to 'Offset', which is the offset of the 
> >newline character that ended the line.  It should be set to the offset of 
> >the beginning of that line (which will be a number of characters earlier 
> >than 'Offset').  But by this point the offset for the start of that line 
> >has been lost and cannot be easily recreated.
> >
> >I suspect this has been wrong for a very long time, but nobody has noticed
> >because the lexer doesn't ever use the 2nd field anywhere (it could be 
> >used to report column numbers for lexing errors, but it isn't).  So 
> >perhaps the 3-argument 'posn' type could be replaced with a 2-argument 
> >type, and some memory would be saved.
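
If the field were kept rather than dropped, one way to repair it is to
scan backwards for the previous newline whenever string_ungetchar steps
over one. The following is only a minimal sketch, not a patch against
lexer.m: the module name and the find_line_start helper are hypothetical,
and the posn type is redeclared locally so the fragment stands alone.

```mercury
:- module unget_sketch.
:- interface.

    % posn(LineNum, LineOffset, Offset), with the field meanings
    % described above. Redeclared here so the sketch stands alone.
:- type posn
    --->    posn(int, int, int).

:- pred string_ungetchar(string::in, posn::in, posn::out) is det.

:- implementation.
:- import_module int.
:- import_module string.

string_ungetchar(String, Posn0, Posn) :-
    Posn0 = posn(LineNum0, LineOffset0, Offset0),
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        LineNum = LineNum0 - 1,
        % Offset is the index of the newline that ends the line we
        % are returning to, so search backwards from there.
        find_line_start(String, Offset, LineOffset),
        Posn = posn(LineNum, LineOffset, Offset)
    ;
        Posn = posn(LineNum0, LineOffset0, Offset)
    ).

    % find_line_start(String, Offset, LineStart):
    % Hypothetical helper. LineStart is the index just after the
    % nearest newline strictly before Offset, or 0 if there is none.
:- pred find_line_start(string::in, int::in, int::out) is det.

find_line_start(String, Offset, LineStart) :-
    ( Offset =< 0 ->
        LineStart = 0
    ;
        Prev = Offset - 1,
        string.unsafe_index(String, Prev, PrevChar),
        ( PrevChar = '\n' ->
            LineStart = Offset
        ;
            find_line_start(String, Prev, LineStart)
        )
    ).
```

The backward scan costs time proportional to the length of the line, but
only when a newline is pushed back, which should be rare. Dropping the
unused field, as suggested, avoids the question altogether.
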
> 
> And some more savings could be made if that 2-argument type was split into 
> two parts which were passed around separately.  It results in extra 
> arguments to lots of lexing predicates, but avoids the 
> deconstruct/construct pair which currently occurs on every character.
> 
> I just made a change like this to the Zinc lexer, which is structured 
> similarly, and got a nice speedup (I think it was about 20--30%, although I 
> didn't measure it all that carefully).
> 
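
For concreteness, this is roughly what I understand the split version of
string_ungetchar to look like. It is a sketch under my own assumptions
(the line-start field already dropped, a made-up module name, and an
argument order I picked arbitrarily), not the actual Zinc code.

```mercury
:- module split_posn_sketch.
:- interface.

    % string_ungetchar(String, LineNum0, LineNum, Offset0, Offset):
    % The line number and offset are threaded as two separate int
    % pairs instead of being wrapped in a posn term.
:- pred string_ungetchar(string::in, int::in, int::out,
    int::in, int::out) is det.

:- implementation.
:- import_module int.
:- import_module string.

string_ungetchar(String, LineNum0, LineNum, Offset0, Offset) :-
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        LineNum = LineNum0 - 1
    ;
        LineNum = LineNum0
    ).
```

With this shape there is no posn term to deconstruct on entry or to
construct on exit; the price is one extra pair of arguments on every
predicate that threads the position.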

Isn't there a compiler optimization or two that is supposed to help
here, by packing and unpacking tuples?
