[mercury-users] Destructive list operations

Michael Day mikeday at yeslogic.com
Sun Jan 4 19:06:46 AEDT 2009


Hi,

We use a lot of lists, and list operations are starting to take up a 
significant portion of our total run time, so I have a few questions 
about list operations that perform destructive update for speed.

To begin with, there is the following mode for append:

:- mode list.append(di, di, uo) is det.

How is this implemented by the compiler? One might assume that it 
performs a traversal of the first list and destructively updates the 
final cons tail to be the second list, taking linear time and constant 
stack space. But is this actually the case?

Also, we generally use the ++ function everywhere, and only call append 
explicitly if we are using it in a reverse mode. Would it make sense to 
add a di/di/uo mode for the ++ function?

Secondly, we often need to traverse a list many times, but only modify 
it occasionally, when some condition is satisfied. For example, consider 
taking a list of ints and replacing negative numbers with zero:

     Xs = [1, 2, -17, 3, 4],
     Ys = map((func(X) = ( if X < 0 then 0 else X )), Xs),
     % Ys is now [1, 2, 0, 3, 4]

The problem with this code is that it rebuilds the list each time, even 
if nothing needs to be changed. When the list is long and there are many 
passes with few changes this is a lot of wasted work, including the time 
taken to perform unnecessary allocations and copies, the extra garbage 
produced that must be cleaned up later, and unfriendly cache behaviour.

Ideally, the standard library would include a version of map that 
updated the list in place to reuse the list skeleton, and the compiler 
could specialise it for the given function and avoid touching the list 
at all in the common case where no change is necessary. But since we do 
not yet live in such a declarative nirvana, is it possible to write a 
predicate by hand that does this?

I've tried writing it this way:

:- pred filter_neg(list(int), list(int)).
:- mode filter_neg(di, uo) is det.

filter_neg([], []).
filter_neg([X|Xs], Ys) :-
     filter_neg(Xs, Ys0),
     ( if X < 0 then
         Ys = [0|Ys0]
     else
         Ys = [X|Ys0]
     ).

The compiler accepts it, but doesn't appear to optimise it to the point 
of not allocating new cons cells, as far as I can tell by looking at the 
generated code.

(Actually, With -O5 the compiler completely optimised away the call to 
filter_neg in my test program and substituted a constant value, which 
was rather clever; I had to change the test to read an unknown input 
list at runtime. Other than that though, no other optimisations).

Any hints on compiler flags or coding tricks to destructively update 
lists without copying them unnecessarily? Our only other option seems to 
be implementing a new list type in C and threading the IO state through 
for safety, also known as the "open a portal to hell" technique.

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com
--------------------------------------------------------------------------
mercury-users mailing list
Post messages to:       mercury-users at csse.unimelb.edu.au
Administrative Queries: owner-mercury-users at csse.unimelb.edu.au
Subscriptions:          mercury-users-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the users mailing list