[mercury-users] a question on efficient parsing of a file

Ralph Becket rafe at csse.unimelb.edu.au
Fri May 21 10:23:43 AEST 2010


Hi Vladimir,

Vladimir Gubarkov, Wednesday, 19 May 2010:
>    I believe that method used by 'CL = to_char_list_s_fs(S):list(string)'
>    is similar to used in parsing_utils (or, maybe I'm wrong?).
>    But it appears, that turning string to list of chars in this way is a
>    way slowly then by built-in string.to_char_list

parsing_utils doesn't turn the string into a list of chars at all,
it just looks up characters directly in the string (a parsing_utils
parser takes a string (the source, or 'src') and an offset (the parser
state, or 'ps', which is just an int) and, on success, produces a new
offset (or 'ps').  The only memory allocation performed is that needed
to return any results from a parser.

Splitting a Mercury string into its first character and the rest of the
string is very expensive, because it's an O(n) operation.

Turning a Mercury string into a list of characters is also expensive
because each char will require a list cell consisting of the char itself
(one word of memory) and the pointer to the tail of the list (another
word of memory).  On a 64 bit machine this means that an N character
string will require 16N bytes (and N memory allocations) to represent as
a list.

The whole list-of-chars representation is one reason nobody ever does
bulk string processing this way in Prolog!

-- Ralph
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to:       mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions:          mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the users mailing list