[mercury-users] unsafe_set_char

Julien Fischer juliensf at csse.unimelb.edu.au
Sun May 30 18:54:40 AEST 2010


On Sun, 30 May 2010, Paul Bone wrote:

> On Fri, May 28, 2010 at 03:00:33PM +0100, James Cussens wrote:
>> Consider the following from the documentation for the string.m library
>> module:
>>   % string.unsafe_set_char(Char, Index, String0, String):
>>  ...
>>   % This version is constant time, whereas string.set_char_det
>>   % may be linear in the length of the string. Use with care!
>> But I can't see how it can be constant time when it involves a call to
>> strcpy. I feel I must be missing something here. What's going on?
>> More generally, being able to destructively update strings would be
>> useful, I think. I'm using Mercury to implement hidden Markov models
>> (which spit out strings) and currently make millions of calls to
>> string.unsafe_set_char, so if anyone has tips on quick (perhaps dirty)
>> methods of generating strings that would be appreciated. (My initial
>> experiments seem to show that using string is better than array(char),
>> btw.)
> You're right, there is a bug here in the documentation.  I've filed it as bug
> 151.
> I don't think we have a destructive update predicate for strings in the
> standard library,

We do: unsafe_set_char has a commented-out mode, (in, in, di, uo), that
does the update in constant time.  (Indeed, it is that mode to which
the existing documentation refers.)  That mode is currently commented
out because the compiler may place string constants in static data,
even when they may be updated.
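
To make the cost difference concrete, here is a minimal C sketch (not
the library's code) of a copying update next to a destructive one.  The
names set_char_copy and set_char_in_place are hypothetical and chosen
only for illustration.  The in-place version is only safe on writable
memory; in C, writing to a string literal is undefined behaviour, which
is the same kind of hazard as the static-data problem described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copying update (hypothetical name): allocates a fresh string and
   copies the old one into it, so the cost is linear in the length.
   Returns NULL if allocation fails; the caller frees the result. */
char *set_char_copy(char ch, size_t index, const char *str0)
{
    size_t len = strlen(str0);
    char *str;

    assert(index < len);
    str = malloc(len + 1);
    if (str == NULL) {
        return NULL;
    }
    memcpy(str, str0, len + 1);     /* O(len): this is the expensive part */
    str[index] = ch;
    return str;
}

/* Destructive update (hypothetical name): overwrites one byte of the
   caller's buffer, so the cost is constant.  The caller must guarantee
   that the buffer is writable, that index is in range, and that nobody
   still needs the old contents. */
char *set_char_in_place(char ch, size_t index, char *str0)
{
    str0[index] = ch;
    return str0;
}
```

Given a writable array such as char buf[] = "hello", calling
set_char_copy('j', 0, buf) leaves buf unchanged and returns a new
string, whereas set_char_in_place('j', 0, buf) returns buf itself with
its first character overwritten.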

(For Mercury developers: one possible workaround here, at least for the
C grades, might be to use MR_in_heap_range to test where a string is
stored, and then only do the O(1) update if it is stored on the heap;
another more drastic solution to this problem would be to use the tag
bits on strings -- remember we insist on them being aligned** -- to
indicate where they are stored.  The disadvantage of doing this is
that it would break the existing C interface for strings.)

** This of course assumes that strings that are stored as static data
are also aligned.
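
The two ideas above can be sketched in plain C.  This is an
illustration under stated assumptions, not the Mercury runtime: the
fixed arena stands in for the heap, in_arena_range is a hypothetical
analogue of a heap-range test (it is not MR_in_heap_range), and 8-byte
alignment is assumed for the tag-bit variant.  The tagging helpers also
rely on pointer/integer round trips behaving as they do on common
platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the heap: one fixed, 8-byte-aligned arena. */
#define ARENA_SIZE 1024
static uint64_t arena_words[ARENA_SIZE / 8];
static size_t arena_used = 0;

/* Bump-allocate an 8-byte-aligned copy of s in the arena (NULL if full). */
char *arena_strdup(const char *s)
{
    size_t need = (strlen(s) + 1 + 7) & ~(size_t)7;   /* round up to 8 */
    char *p;

    if (need > ARENA_SIZE - arena_used) {
        return NULL;
    }
    p = (char *)arena_words + arena_used;
    arena_used += need;
    strcpy(p, s);
    return p;
}

/* Assumed analogue of a heap-range test: does p point into the arena? */
int in_arena_range(const char *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)arena_words;

    return a >= lo && a - lo < ARENA_SIZE;
}

/* Update in place (O(1)) only when the string lives in the arena;
   otherwise copy it into the arena first (O(length)). */
char *set_char_checked(char ch, size_t index, char *str0)
{
    char *str = in_arena_range(str0) ? str0 : arena_strdup(str0);

    if (str != NULL) {
        str[index] = ch;
    }
    return str;
}

/* Tag-bit variant: with 8-byte alignment the low three bits of a string
   pointer are zero, so one of them can record "stored on the heap". */
#define TAG_HEAP ((uintptr_t)1)

uintptr_t tag_string(char *p, int on_heap)
{
    assert(((uintptr_t)p & 7) == 0);    /* requires an aligned pointer */
    return (uintptr_t)p | (on_heap ? TAG_HEAP : 0);
}

char *untag_string(uintptr_t t)
{
    return (char *)(t & ~(uintptr_t)7);
}

int tagged_on_heap(uintptr_t t)
{
    return (t & TAG_HEAP) != 0;
}
```

The tag-bit helpers show why that approach would break an existing C
interface: every consumer of a tagged value has to call something like
untag_string before treating it as a char pointer.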

> As for 'dirty' ways to achieve this, you could write your own foreign
> code that uses destructive update.

The following is equivalent to the commented-out code in the library
(although it omits the check for the nul character):

:- pred very_unsafe_set_char(char::in, int::in, string::in, string::out)
     is det.

:- pragma foreign_proc("C",
     very_unsafe_set_char(Ch::in, Index::in, Str0::in, Str::out),
     [will_not_call_mercury, promise_pure, thread_safe,
      will_not_modify_trail, does_not_affect_liveness],
"
     /* Alias the input string, then overwrite one character in place. */
     Str = Str0;
     MR_set_char(Str, Index, Ch);
").

