[m-rev.] for review: mercury implementation of string.m
Peter Ross
pro at missioncriticalit.com
Fri Jun 14 19:34:40 AEST 2002
On Fri, Jun 14, 2002 at 02:12:57AM +1000, Simon Taylor wrote:
> > Index: string.m
> > ===================================================================
> > @@ -743,6 +742,56 @@
> > Str[size] = '\\0';
> > }").
> >
> > +:- pragma foreign_proc("MC++", string__to_char_list(Str::in, CharList::uo),
> > + [will_not_call_mercury, promise_pure, thread_safe], "{
> > + MR_Integer length, i;
> > + MR_Word tmp;
> > + MR_Word prev;
> > +
> > + length = Str->get_Length();
> > +
> > + MR_list_nil(prev);
> > +
> > + for (i = length - 1; i >= 0; i--) {
> > + MR_list_cons(tmp, __box(Str->get_Chars(i)), prev);
>
> You should get someone else (probably Tyson) to review the Managed C++
> part of the code, but the reference to `__box' here rings alarm bells.
>
I didn't actually write this code, I think I must have just moved it
around or diff got confused.
> > +string__to_char_list(Str::in, CharList::uo) :-
> > + ( string__first_char(Str, First, Rest) ->
> > + string__to_char_list(Rest, CharList0),
> > + CharList = [First | CharList0]
> > + ;
> > + CharList = []
> > + ).
>
> Don't use string__first_char, it's horrendously inefficient.
> Base the algorithm on string__unsafe_index instead.
>
Done.
> > @@ -1116,6 +1165,20 @@
> > Index = WholeString->IndexOf(SubString);
> > }").
> >
> > +string__sub_string_search(String, SubString, Index) :-
> > + string__sub_string_search_2(String, SubString, 0, Index).
> > +
> > +:- pred sub_string_search_2(string::in, string::in,
> > + int::in, int::out) is semidet.
> > +
> > +sub_string_search_2(String, SubString, CurrentIndex, Index) :-
> > + ( string__prefix(String, SubString) ->
> > + Index = CurrentIndex
> > + ;
> > + string__first_char(String, _, Rest),
> > + sub_string_search_2(Rest, SubString, CurrentIndex + 1, Index)
> > + ).
>
> Same here.
>
I haven't done this because I couldn't easily change the algorithm to
use string__index. In reality we should implement a proper string
searching algorithm.
I have added an XXX.
> > /*
> > :- pred string__to_int_list(string, list(int)).
> > :- mode string__to_int_list(in, out) is det.
> > -:- mode string__to_int_list(out, in) is det.
> > */
> > -
> > :- pragma foreign_proc("C",
> > string__to_int_list(Str::in, IntList::out),
> > [will_not_call_mercury, promise_pure, thread_safe], "{
> > @@ -1727,45 +1824,6 @@
> > MR_PROC_LABEL);
> > }
> > }").
> > -
> > -:- pragma foreign_proc("C",
> > - string__to_int_list(Str::out, IntList::in),
> > - [will_not_call_mercury, promise_pure, thread_safe], "{
> > - /* mode (out, in) is det */
> > - MR_Word int_list_ptr;
> > - size_t size;
> > - MR_Word str_ptr;
> > -/*
> > -** loop to calculate list length + sizeof(MR_Word) in `size' using list in
> > -** `int_list_ptr'
> > -*/
> > - size = sizeof(MR_Word);
> > - int_list_ptr = IntList;
> > - while (! MR_list_is_empty(int_list_ptr)) {
> > - size++;
> > - int_list_ptr = MR_list_tail(int_list_ptr);
> > - }
>
> > +string__to_int_list(String, IntList) :-
> > + string__to_char_list(String, CharList),
> > + IntList = list__map(char__to_int, CharList).
>
> Why not just remove the foreign implementation altogether?
>
Well the foreign implementation should be faster then the Mercury
implementation if we are using the foreign_code version of
string__to_char_list.
> > @@ -1817,6 +1859,13 @@
> > [will_not_call_mercury, promise_pure, thread_safe], "
> > SUCCESS_INDICATOR = (Str->IndexOf(Ch) != -1);
> > ").
> > +string__contains_char(String, Char) :-
> > + string__first_char(String, FirstChar, RestOfString),
> > + ( FirstChar = Char ->
> > + true
> > + ;
> > + string__contains_char(RestOfString, Char)
> > + ).
>
> Again, don't use string__first_char.
>
> > @@ -1853,6 +1902,13 @@
> > Ch = Str->get_Chars(Index);
> > }
> > ").
> > +string__index(Str, Index, Char) :-
> > + string__first_char(Str, First, Rest),
> > + ( Index = 0 ->
> > + Char = First
> > + ;
> > + string__index(Rest, Index - 1, Char)
> > + ).
>
> It would be better to call `sorry' here. If string__index
> doesn't work, string__first_char isn't likely to.
>
No I don't think so. The idea behind this change is to provide as much
functionality in the string library as possible so as to minimize the
amount of work required to get a working version of the library on a new
backend. Yes it is very inefficient, but it is one less piece of work
that has to be done intitially.
> > @@ -1866,6 +1922,9 @@
> > [will_not_call_mercury, promise_pure, thread_safe], "
> > Ch = Str->get_Chars(Index);
> > ").
> > +string__unsafe_index(_, _, _) :-
> > + % This version is only for if there is not a foreign_proc version.
> > + private_builtin__sorry("string__unsafe_index").
>
> This should default to using string__index.
>
Done.
> > @@ -2021,6 +2087,22 @@
> > Length = Str->get_Length();
> > ").
> >
> > +:- pragma promise_pure(string__length/2).
> > +string__length(Str::in, Len::uo) :-
> > + string__length_2(Str, Len).
> > +string__length(Str0::ui, Len::uo) :-
> > + copy(Str0, Str),
> > + string__length_2(Str, Len).
> > +
> > +:- pred string__length_2(string::in, int::uo) is det.
> > +string__length_2(Str, Length) :-
> > + ( string__first_char(Str, _First, Rest) ->
> > + string__length(Rest, Length0),
> > + Length = Length0 + 1
> > + ;
> > + Length = 0
> > + ).
>
> It would be better to just call `sorry' here.
>
No I think it is better to have a Mercury version even if it very
inefficient.
> > +string__split(Str, Count, Left, Right) :-
> > + ( Count =< 0 ->
> > + Left = "",
> > + Right = Str
> > + ;
> > + string__to_char_list(Str, List),
> > + Len = list__length(List),
>
> string__length is likely to be faster than list__length, unless
> deforestation can merge the call to list__length into the call to
> string__to_char_list.
>
Changed.
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list