[m-rev.] for review: mercury implementation of string.m
Simon Taylor
stayl at cs.mu.OZ.AU
Sat Jun 15 06:39:46 AEST 2002
On 14-Jun-2002, Peter Ross <pro at missioncriticalit.com> wrote:
> > > @@ -1116,6 +1165,20 @@
> > > Index = WholeString->IndexOf(SubString);
> > > }").
> > >
> > > +string__sub_string_search(String, SubString, Index) :-
> > > + string__sub_string_search_2(String, SubString, 0, Index).
> > > +
> > > +:- pred sub_string_search_2(string::in, string::in,
> > > + int::in, int::out) is semidet.
> > > +
> > > +sub_string_search_2(String, SubString, CurrentIndex, Index) :-
> > > + ( string__prefix(String, SubString) ->
> > > + Index = CurrentIndex
> > > + ;
> > > + string__first_char(String, _, Rest),
> > > + sub_string_search_2(Rest, SubString, CurrentIndex + 1, Index)
> > > + ).
> >
> > Same here.
> >
> I haven't done this because I couldn't easily change the algorithm to
> use string__index. In reality we should implement a proper string
> searching algorithm.
> I have added an XXX.
What's so hard about writing a brute force string matcher?
I'm not happy for this to be committed as is.
> > > :- pragma foreign_proc("C",
> > > string__to_int_list(Str::in, IntList::out),
> > > [will_not_call_mercury, promise_pure, thread_safe], "{
> > > @@ -1727,45 +1824,6 @@
> > > MR_PROC_LABEL);
> > > }
> > > }").
> > > -
> > > -:- pragma foreign_proc("C",
> > > - string__to_int_list(Str::out, IntList::in),
> > > - [will_not_call_mercury, promise_pure, thread_safe], "{
> > > - /* mode (out, in) is det */
> > > - MR_Word int_list_ptr;
> > > - size_t size;
> > > - MR_Word str_ptr;
> > > -/*
> > > -** loop to calculate list length + sizeof(MR_Word) in `size' using list in
> > > -** `int_list_ptr'
> > > -*/
> > > - size = sizeof(MR_Word);
> > > - int_list_ptr = IntList;
> > > - while (! MR_list_is_empty(int_list_ptr)) {
> > > - size++;
> > > - int_list_ptr = MR_list_tail(int_list_ptr);
> > > - }
> >
> > > +string__to_int_list(String, IntList) :-
> > > + string__to_char_list(String, CharList),
> > > + IntList = list__map(char__to_int, CharList).
> >
> > Why not just remove the foreign implementation altogether?
>
> Well the foreign implementation should be faster then the Mercury
> implementation if we are using the foreign_code version of
> string__to_char_list.
Where is it used that needs the extra speed? The only place I can
see is string__hash, which is very sub-optimal anyway. The historical
reason for the foreign code verison was the lack of mode-specific
clauses.
> > > @@ -1853,6 +1902,13 @@
> > > Ch = Str->get_Chars(Index);
> > > }
> > > ").
> > > +string__index(Str, Index, Char) :-
> > > + string__first_char(Str, First, Rest),
> > > + ( Index = 0 ->
> > > + Char = First
> > > + ;
> > > + string__index(Rest, Index - 1, Char)
> > > + ).
> >
> > It would be better to call `sorry' here. If string__index
> > doesn't work, string__first_char isn't likely to.
> >
> No I don't think so. The idea behind this change is to provide as much
> functionality in the string library as possible so as to minimize the
> amount of work required to get a working version of the library on a new
> backend. Yes it is very inefficient, but it is one less piece of work
> that has to be done intitially.
That's fine, but you've chosen the wrong primitives to base the rest
of the operations on. Mercury strings are not lists. string__first_char
is an unnatural operation for strings, and expressing algorithms which
iterate over strings in terms of it will cause unacceptable inefficiency.
A better choice of basic operations would be string__alloc (this
would allocate memory for a string given the length), string__length,
string__unsafe_index and string__unsafe_set_char. Operations based
on these primitives should be fairly close in efficiency to foreign
code implementations (if not, we need to improve the compiler).
> > > +:- pragma promise_pure(string__length/2).
> > > +string__length(Str::in, Len::uo) :-
> > > + string__length_2(Str, Len).
> > > +string__length(Str0::ui, Len::uo) :-
> > > + copy(Str0, Str),
> > > + string__length_2(Str, Len).
> > > +
> > > +:- pred string__length_2(string::in, int::uo) is det.
> > > +string__length_2(Str, Length) :-
> > > + ( string__first_char(Str, _First, Rest) ->
> > > + string__length(Rest, Length0),
> > > + Length = Length0 + 1
> > > + ;
> > > + Length = 0
> > > + ).
> >
> > It would be better to just call `sorry' here.
> >
> No I think it is better to have a Mercury version even if it very
> inefficient.
That isn't just very inefficient, it's ridiculously inefficient.
Simon.
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list