[mercury-users] Destructive list operations
Michael Day
mikeday at yeslogic.com
Sun Jan 4 19:06:46 AEDT 2009
Hi,
We use a lot of lists, and list operations are starting to take up a
significant portion of our total run time, so I have a few questions
about list operations that perform destructive update for speed.
To begin with, there is the following mode for append:
:- mode list.append(di, di, uo) is det.
How is this implemented by the compiler? One might assume that it
performs a traversal of the first list and destructively updates the
final cons tail to be the second list, taking linear time and constant
stack space. But is this actually the case?
Also, we generally use the ++ function everywhere, and only call append
explicitly if we are using it in a reverse mode. Would it make sense to
add a di/di/uo mode for the ++ function?
Secondly, we often need to traverse a list many times, but only modify
it occasionally, when some condition is satisfied. For example, consider
taking a list of ints and replacing negative numbers with zero:
Xs = [1, 2, -17, 3, 4],
Ys = map((func(X) = ( if X < 0 then 0 else X )), Xs),
% Ys is now [1, 2, 0, 3, 4]
The problem with this code is that it rebuilds the list each time, even
if nothing needs to be changed. When the list is long and there are many
passes with few changes this is a lot of wasted work, including the time
taken to perform unnecessary allocations and copies, the extra garbage
produced that must be cleaned up later, and unfriendly cache behaviour.
Ideally, the standard library would include a version of map that
updated the list in place to reuse the list skeleton, and the compiler
could specialise it for the given function and avoid touching the list
at all in the common case where no change is necessary. But since we do
not yet live in such a declarative nirvana, is it possible to write a
predicate by hand that does this?
I've tried writing it this way:
:- pred filter_neg(list(int), list(int)).
:- mode filter_neg(di, uo) is det.
filter_neg([], []).
filter_neg([X|Xs], Ys) :-
filter_neg(Xs, Ys0),
( if X < 0 then
Ys = [0|Ys0]
else
Ys = [X|Ys0]
).
The compiler accepts it, but doesn't appear to optimise it to the point
of not allocating new cons cells, as far as I can tell by looking at the
generated code.
(Actually, With -O5 the compiler completely optimised away the call to
filter_neg in my test program and substituted a constant value, which
was rather clever; I had to change the test to read an unknown input
list at runtime. Other than that though, no other optimisations).
Any hints on compiler flags or coding tricks to destructively update
lists without copying them unnecessarily? Our only other option seems to
be implementing a new list type in C and threading the IO state through
for safety, also known as the "open a portal to hell" technique.
Best regards,
Michael
--
Print XML with Prince!
http://www.princexml.com
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to: mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions: mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
More information about the users
mailing list