[mercury-users] Bug in lexer.m

Julien Fischer juliensf at csse.unimelb.edu.au
Thu Jul 17 13:45:00 AEST 2008

Previous message: [mercury-users] Bug in lexer.m
Next message: [mercury-users] Bug in lexer.m
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, 17 Jul 2008, Paul Bone wrote:

> On Tue, Jul 15, 2008 at 09:40:30PM +1000, Nicholas Nethercote wrote:
>> On Tue, 15 Jul 2008, Nicholas Nethercote wrote:
>>
>>> :- pred string_ungetchar(string::in, posn::in, posn::out) is det.
>>>
>>> string_ungetchar(String, Posn0, Posn) :-
>>>   Posn0 = posn(LineNum0, LineOffset0, Offset0),
>>>   Offset = Offset0 - 1,
>>>   string.unsafe_index(String, Offset, Char),
>>>   ( Char = '\n' ->
>>>       LineNum = LineNum0 - 1,
>>>       Posn = posn(LineNum, Offset, Offset)
>>>   ;
>>>       Posn = posn(LineNum0, LineOffset0, Offset)
>>>   ).
>>>
>>>
>>> In a 'posn', the first arg is the current line number, the 3rd arg is the
>>> current offset into the string being parsed, and the 2nd arg is the offset
>>> of the start of the current line.
>>>
>>> When the above code goes back over a newline, the 2nd argument is
>>> incorrectly set -- it gets set to 'Offset', which is the offset of the
>>> newline character that ended the line.  It should be set to the offset of
>>> the beginning of that line (which will be a number of characters earlier
>>> than 'Offset').  But by this point the offset for the start of that line
>>> has been lost and cannot be easily recreated.
>>>
>>> I suspect this has been wrong for a very long time, but nobody has noticed
>>> because the lexer doesn't ever use the 2nd field anywhere (it could be
>>> used to report column numbers for lexing errors, but it isn't).  So
>>> perhaps the 3-argument 'posn' type could be replaced with a 2-argument
>>> type, and some memory would be saved.
>>
>> And some more savings could be made if that 2-argument type was split into
>> two parts which were passed around separately.  It results in extra
>> arguments to lots of lexing predicates, but avoids the
>> deconstruct/construct pair which currently occurs on every character.
>>
>> I just made a change like this to the Zinc lexer, which is structured
>> similarly, and got a nice speedup (I think it was about 20--30%, although I
>> didn't measure it all that carefully).
>>
>
> Isn't there a compiler optimization or two that are supposed to help
> here?  By packing and unpacking tuples?

There is, but it is not enabled by default, it doesn't really work
across module boundaries, and it requires feedback to work out where
to do the tupling.
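
For reference, here is a minimal sketch of the manual change Nicholas
describes above: the line number and the offset are passed as separate
arguments, so nothing is deconstructed or reconstructed per character.
The module name and the argument order are assumptions made for
illustration; this is not the actual lexer.m code, and the unused
line-start field is simply dropped.

```mercury
% Sketch only: hypothetical module, not the library's lexer.m.
:- module unget_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char, int, string.

    % string_ungetchar(String, LineNum0, LineNum, Offset0, Offset):
    % step back one character.  The caller must ensure Offset0 > 0,
    % since unsafe_index does no bounds checking.
:- pred string_ungetchar(string::in, int::in, int::out, int::in, int::out)
    is det.

string_ungetchar(String, LineNum0, LineNum, Offset0, Offset) :-
    Offset = Offset0 - 1,
    string.unsafe_index(String, Offset, Char),
    ( Char = '\n' ->
        % Stepped back over a newline: we are on the previous line again.
        LineNum = LineNum0 - 1
    ;
        LineNum = LineNum0
    ).

main(!IO) :-
    % Offset 2 in "a\nb" is just past the newline, i.e. the start of line 2.
    string_ungetchar("a\nb", 2, LineNum, 2, Offset),
    io.write_int(LineNum, !IO),
    io.write_string(" ", !IO),
    io.write_int(Offset, !IO),
    io.nl(!IO).
```

The cost is two extra arguments on every lexing predicate that threads
the position, which is the trade-off Nicholas mentions.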

Julien.
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to:       mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions:          mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
