[m-dev.] For review: added string__words/2

Ralph Becket rbeck at microsoft.com
Fri Apr 14 01:36:43 AEST 2000


Fergus suggested a word-wrapping facility for the pretty printer, which
in turn needs something that splits a string into a list of words.
Since that seemed useful enough in its own right, I've put an
implementation of it into string.m

Question: I did `cvs diff -u string.m' to get the diff below, and it
clearly shows that my previous commit of string__length/1 didn't go
through.  The command I used was
$ cvs commit -F RALPHS_CHANGES_20000406 {array,std_util,string}.m
Did I do something wrong?

Ralph


Estimated hours taken: 0.5

Added function to string.m to split strings up into lists of `words'.

library/string.m
        Added function string__words/2.

Index: string.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/string.m,v
retrieving revision 1.120
diff -u -u -r1.120 string.m
--- string.m    2000/03/20 09:01:49     1.120
+++ string.m    2000/04/13 14:59:11
@@ -214,6 +214,15 @@
 %              list__foldl(Closure, Chars, Acc0, Acc)
 %      but is implemented more efficiently.)

+:- func words(pred(char), string) = list(string).
+:- mode words(pred(in) is semidet, in) = out is det.
+%       words(SepP, String) returns the list of non-empty substrings of String
+%       (in first to last order) that are delimited by non-empty sequences of
+%       chars matched by SepP.  For example,
+%
+%       words(char__is_whitespace, " the cat  sat on the  mat") =
+%               ["the", "cat", "sat", "on", "the", "mat"]
+
 :- pred string__split(string, int, string, string).
 :- mode string__split(in, in, out, out) is det.
 %      string__split(String, Count, LeftSubstring, RightSubstring):
@@ -1988,6 +1997,8 @@

 :- interface.

+:- func string__length(string) = int.
+
 :- func string__append(string, string) = string.

 :- func string__char_to_string(char) = string.
@@ -2045,6 +2056,9 @@

 :- implementation.

+string__length(S) = L :-
+       string__length(S, L).
+
 string__append(S1, S2) = S3 :-
        string__append(S1, S2, S3).

@@ -2123,6 +2137,48 @@

 string__format(S1, PT) = S2 :-
        string__format(S1, PT, S2).
+
+%----------------------------------------------------------------------------%
+
+words(SepP, String) = words_0(SepP, String, I, [])
+:-
+    I = preceding_boundary(isnt(SepP), String, string__length(String) - 1).
+
+%----------------------------------------------------------------------------%
+
+:- func words_0(pred(char), string, int, list(string)) = list(string).
+:- mode words_0(pred(in) is semidet, in, in, in) = out is det.
+
+words_0(SepP, String, WordEnd, Words0) = Words
+:-
+    ( if WordEnd < 0 then
+        Words = Words0
+      else
+        WordPre = preceding_boundary(SepP, String, WordEnd),
+        Word = string__unsafe_substring(String, WordPre + 1, WordEnd - WordPre),
+        PrevWordEnd = preceding_boundary(isnt(SepP), String, WordPre),
+        Words = words_0(SepP, String, PrevWordEnd, [Word | Words0])
+    ).
+
+%----------------------------------------------------------------------------%
+
+        % preceding_boundary(SepP, String, I) returns the largest index J =< I
+        % in String of a char satisfying SepP, or min(-1, I) if there is no
+        % such J.  preceding_boundary/3 is intended for finding (in reverse)
+        % consecutive maximal sequences of chars satisfying some property.
+        % Note that I *must not* exceed the largest valid index for String.
+
+:- func preceding_boundary(pred(char), string, int) = int.
+:- mode preceding_boundary(pred(in) is semidet, in, in) = out is det.
+
+preceding_boundary(SepP, String, I) =
+    ( if I < 0 then
+        I
+      else if SepP(string__unsafe_index(String, I)) then
+        I
+      else
+        preceding_boundary(SepP, String, I - 1)
+    ).

 %----------------------------------------------------------------------------%
 %----------------------------------------------------------------------------%
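
For anyone who wants to check the intended behaviour without building
the library, here is a rough model of the right-to-left scan used in
the diff above.  It is written in Python rather than Mercury purely so
that it can be run stand-alone; it is a sketch of the same idea, not
the code being committed.  The names mirror the patch, Python's
str.isspace stands in for char__is_whitespace, and the local helper
not_sep stands in for isnt(SepP); both substitutions are assumptions
made for illustration.

```python
from typing import Callable, List


def preceding_boundary(pred: Callable[[str], bool], s: str, i: int) -> int:
    """Largest index j <= i with pred(s[j]) true, or min(-1, i) if none.

    As in the patch, i must not exceed the largest valid index of s.
    """
    while i >= 0 and not pred(s[i]):
        i -= 1
    return i


def words(is_sep: Callable[[str], bool], s: str) -> List[str]:
    """Non-empty substrings of s delimited by runs of separator chars."""

    def not_sep(c: str) -> bool:
        # Stand-in for isnt(SepP) in the patch (an assumption).
        return not is_sep(c)

    result: List[str] = []
    # Skip trailing separators to find the end of the last word.
    word_end = preceding_boundary(not_sep, s, len(s) - 1)
    while word_end >= 0:
        # Index of the separator just before the current word, or -1.
        word_pre = preceding_boundary(is_sep, s, word_end)
        # Prepend, mirroring [Word | Words0], so no final reverse is needed.
        result.insert(0, s[word_pre + 1 : word_end + 1])
        # Skip the separator run to find the end of the previous word.
        word_end = preceding_boundary(not_sep, s, word_pre)
    return result


print(words(str.isspace, " the cat  sat on the  mat"))
# prints ['the', 'cat', 'sat', 'on', 'the', 'mat']
```

Scanning from the end of the string means each word can be consed onto
the front of the accumulator, so the result comes out in first-to-last
order without reversing.  Empty and all-separator strings give the
empty list, since the initial scan for a word end falls off the front
of the string.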

--
Ralph Becket      |      MSR Cambridge      |      rbeck at microsoft.com 


