[m-rev.] for review: add string.word_wrap/2
Ian MacLarty
maclarty at cs.mu.OZ.AU
Mon Mar 28 02:10:52 AEST 2005
On Mon, Mar 14, 2005 at 09:48:21AM +1100, Julien Fischer wrote:
>
> On Sun, 13 Mar 2005, Ian MacLarty wrote:
>
> > On Sun, Mar 13, 2005 at 03:41:23PM +1100, Julien Fischer wrote:
> >
> > Okay:
> > Index: NEWS
> > ===================================================================
> > RCS file: /home/mercury1/repository/mercury/NEWS,v
> > retrieving revision 1.375
> > diff -u -r1.375 NEWS
> > --- NEWS 25 Feb 2005 08:02:09 -0000 1.375
> > +++ NEWS 13 Mar 2005 08:36:56 -0000
> > @@ -1,3 +1,8 @@
> > +NEWS since Mercury release 0.12.0:
> > +----------------------------------
>
> There cannot yet be any NEWS since release 0.12.0 because that's still
> in the future. I suggest saying since the 0.12 fork.
>
Okay.
> > > > + % Wrapped is Str with newlines inserted between words so that at most
> > > > + % N characters appear on a line and each line contains as many
> > > > + % whole words as possible. If any one word exceeds N characters in
> > > > + % length then it will be broken over two (or more) lines.
> > > I think that you should be able to insert a hyphen between the two parts
> > > of the word in this case. (Perhaps, make this an optional argument).
> >
> > I don't think this'll be particularly useful since words that are longer than
> > a line typically aren't real words anyway.
>
> That rather depends on how large the line width is. What if the user
> sets it to, say, 16 (which is sensible in some contexts, e.g. a small text
> window, but still smaller than some real words)?
>
> > > > + string.length(Str) =< N
> > > It may be worth keeping track of the length of the strings as you go,
> > > rather than recomputing it all the time.
> > >
> >
> > I doubt it, since it won't be very often that a word needs to be broken up over
> > multiple lines (with English text and a sensible line width at least).
>
> Again, the notion of sensible line width is application dependent. Since
> you are putting this in the standard library, as opposed to the debugger,
> I think that it should be as general as possible.
>
Alright then, here is the new version of word_wrap with an optional word
separator argument:
Index: library/string.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/string.m,v
retrieving revision 1.230
diff -u -r1.230 string.m
--- library/string.m 28 Feb 2005 03:39:38 -0000 1.230
+++ library/string.m 27 Mar 2005 16:02:19 -0000
+
+ % word_wrap(Str, N) = Wrapped.
+ % Wrapped is Str with newlines inserted between words so that at most
+ % N characters appear on a line and each line contains as many
+ % whole words as possible. If any one word exceeds N characters in
+ % length then it will be broken over two (or more) lines.
+ % Sequences of whitespace characters are replaced by a single space.
+ %
+:- func string__word_wrap(string, int) = string.
+
+ % word_wrap(Str, N, WordSeparator) = Wrapped.
+ % word_wrap/3 is like word_wrap/2, except that words that need to be
+ % broken up over multiple lines have WordSeparator inserted between
+ % each piece. If the length of WordSeparator is greater than or equal
+ % to N, then no separator is used.
+ %
+:- func string__word_wrap(string, int, string) = string.
+
%-----------------------------------------------------------------------------%
:- implementation.
@@ -4714,6 +4727,120 @@
%-----------------------------------------------------------------------------%
+string__word_wrap(Str, N) = string__word_wrap(Str, N, "").
+
+string__word_wrap(Str, N, WordSep) = Wrapped :-
+ Words = string.words(char.is_whitespace, Str),
+ SepLen = string.length(WordSep),
+ ( SepLen < N ->
+ string.word_wrap_2(Words, WordSep, SepLen, 1, N, [], Wrapped)
+ ;
+ string.word_wrap_2(Words, "", 0, 1, N, [], Wrapped)
+ ).
+
+:- pred word_wrap_2(list(string)::in, string::in, int::in, int::in, int::in,
+ list(string)::in, string::out) is det.
+
+word_wrap_2([], _, _, _, _, RevStrs,
+ string.join_list("", list.reverse(RevStrs))).
+
+ % Col is the column where the next character should be written if there
+ % is space for a whole word.
+word_wrap_2([Word | Words], WordSep, SepLen, Col, N, Prev, Wrapped) :-
+ WordLen = string.length(Word),
+ (
+ % We are on the first column and the length of the word
+ % is less than the line length.
+ Col = 1, WordLen < N
+ ->
+ NewCol = Col + WordLen,
+ WrappedRev = [Word | Prev],
+ NewWords = Words
+ ;
+ % The word takes up the whole line.
+ Col = 1, WordLen = N
+ ->
+ %
+ % We only put a newline if there are more words to follow.
+ %
+ NewCol = 1,
+ (
+ Words = [],
+ WrappedRev = [Word | Prev]
+ ;
+ Words = [_ | _],
+ WrappedRev = ["\n", Word | Prev]
+ ),
+ NewWords = Words
+ ;
+ % If we add a space and the current word to the line we'll
+ % still be within the line length limit.
+ Col + WordLen < N
+ ->
+ NewCol = Col + WordLen + 1,
+ WrappedRev = [Word, " " | Prev],
+ NewWords = Words
+ ;
+ % Adding the word and a space takes us to the end of the
+ % line exactly.
+ Col + WordLen = N
+ ->
+ %
+ % We only put a newline if there are more words to follow.
+ %
+ NewCol = 1,
+ (
+ Words = [],
+ WrappedRev = [Word, " " | Prev]
+ ;
+ Words = [_ | _],
+ WrappedRev = ["\n", Word, " " | Prev]
+ ),
+ NewWords = Words
+ ;
+ %
+ % Adding the word would take us over the line limit.
+ %
+ (
+ Col = 1
+ ->
+ %
+ % Break up words that are too big to fit on a line.
+ %
+ RevPieces = break_up_string_reverse(Word, N - SepLen,
+ []),
+ (
+ RevPieces = [LastPiece | Rest]
+ ;
+ RevPieces = [],
+ error("string__word_wrap_2: no pieces")
+ ),
+ RestWithSep = list.map(func(S) = S ++ WordSep ++ "\n",
+ Rest),
+ NewCol = 1,
+ WrappedRev = list.append(RestWithSep, Prev),
+ NewWords = [LastPiece | Words]
+ ;
+ NewCol = 1,
+ WrappedRev = ["\n" | Prev],
+ NewWords = [Word | Words]
+ )
+ ),
+ word_wrap_2(NewWords, WordSep, SepLen, NewCol, N, WrappedRev, Wrapped).
+
+:- func break_up_string_reverse(string, int, list(string)) = list(string).
+
+break_up_string_reverse(Str, N, Prev) = Strs :-
+ (
+ string.length(Str) =< N
+ ->
+ Strs = [Str | Prev]
+ ;
+ string.split(Str, N, Left, Right),
+ Strs = break_up_string_reverse(Right, N, [Left | Prev])
+ ).
+
+%-----------------------------------------------------------------------------%
:- end_module string.
%------------------------------------------------------------------------------%
Index: tests/general/string_test.exp
===================================================================
RCS file: /home/mercury1/repository/tests/general/string_test.exp,v
retrieving revision 1.3
diff -u -r1.3 string_test.exp
--- tests/general/string_test.exp 4 Feb 2005 05:55:16 -0000 1.3
+++ tests/general/string_test.exp 27 Mar 2005 15:49:43 -0000
@@ -17,3 +17,71 @@
aaa|1111111| 1,300,000.00
b | | 9,999.00
cc | 333|123,456,789.99
+
+Wrapped string:
+*aaaaaaaaa
+aaaaaaaaaa
+a* bbbbb
+bbb b
+ccccc c c
+c cccc c c
+c c ccccc
+ccc cccc c
+ccc ccc
+ccc
+*ddddddddd
+dddddddddd
+dddddddddd
+dddddddddd
+dddddddddd
+ddddd* eee
+Wrapped string with hyphens:
+*aaaaaaaa-
+aaaaaaaaa-
+aaa* bbbbb
+bbb b
+ccccc c c
+c cccc c c
+c c ccccc
+ccc cccc c
+ccc ccc
+ccc
+*dddddddd-
+ddddddddd-
+ddddddddd-
+ddddddddd-
+ddddddddd-
+ddddddddd-
+d* eee
+Wrapped string with dots:
+*a...
+aa...
+aa...
+a*
+bbbbb
+bbb b
+ccccc
+c c c
+cccc
+c c c
+c
+ccccc
+ccc
+cccc
+c ccc
+ccc
+ccc
+*d...
+dd...
+dd...
+dd...
+dd...
+dd...
+dd...
+d*
+eee
+Wrapped string where seperator is too long:
+wh
+at
+ev
+er
\ No newline at end of file
Index: tests/general/string_test.m
===================================================================
RCS file: /home/mercury1/repository/tests/general/string_test.m,v
retrieving revision 1.5
diff -u -r1.5 string_test.m
--- tests/general/string_test.m 4 Feb 2005 05:55:16 -0000 1.5
+++ tests/general/string_test.m 27 Mar 2005 15:48:09 -0000
@@ -62,6 +62,33 @@
right(["1111111", "", "333"]), right(["1,300,000.00",
"9,999.00", "123,456,789.99"])], "|") ++ "\n" },
write_string(Table),
+ { Wrapped = string.word_wrap("*aaaaaaaaaaaaaaaaaaaa* bbbbb bbb b\t"
+ ++ " ccccc c c c cccc c c c c ccccc ccc cccc c ccc ccc ccc "
+ ++ "*dddddddddddddddddddddddddddddddddddddddddddddddddddddd*"
+ ++ " eee",
+ 10) },
+ { WrappedHyphen =
+ string.word_wrap("*aaaaaaaaaaaaaaaaaaaa* bbbbb bbb b\t"
+ ++ " ccccc c c c cccc c c c c ccccc ccc cccc c ccc ccc ccc "
+ ++ "*dddddddddddddddddddddddddddddddddddddddddddddddddddddd*"
+ ++ " eee",
+ 10, "-") },
+ { WrappedDots =
+ string.word_wrap("*aaaaaa* bbbbb bbb b\t"
+ ++ " ccccc c c c cccc c c c c ccccc ccc cccc c ccc ccc ccc "
+ ++ "*dddddddddddddd*"
+ ++ " eee",
+ 5, "...") },
+ { SepTooLong =
+ string.word_wrap("whatever", 2, "...") },
+ write_string("\nWrapped string:\n"),
+ write_string(Wrapped),
+ write_string("\nWrapped string with hyphens:\n"),
+ write_string(WrappedHyphen),
+ write_string("\nWrapped string with dots:\n"),
+ write_string(WrappedDots),
+ write_string("\nWrapped string where seperator is too long:\n"),
+ write_string(SepTooLong),
[].
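
As a usage sketch (not part of the patch): assuming the diff above is
applied as posted, so that string exports both word_wrap/2 and word_wrap/3,
a small caller might look like the following. The module name wrap_demo is
hypothetical. The outputs noted for the second and third calls follow the
expected test output above; the first is traced by hand through
word_wrap_2.

```mercury
:- module wrap_demo.    % hypothetical module name, for illustration only
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module string.

main(!IO) :-
    % word_wrap/2: "aaa bbb" fills a 7-column line exactly, so the
    % newline goes before "ccc". Prints "aaa bbb" then "ccc".
    io.write_string(string.word_wrap("aaa bbb ccc", 7), !IO),
    io.nl(!IO),

    % word_wrap/3: the 8-character word is cut into pieces of
    % N - length("...") = 2 characters, and every piece except the last
    % gets the separator. Prints "*a...", "aa...", "aa...", "a*".
    io.write_string(string.word_wrap("*aaaaaa*", 5, "..."), !IO),
    io.nl(!IO),

    % The separator is not shorter than N here, so it is dropped and the
    % word is cut into pieces of N = 2. Prints "wh", "at", "ev", "er".
    io.write_string(string.word_wrap("whatever", 2, "..."), !IO),
    io.nl(!IO).
```

Note that pieces are cut to N minus the separator length, so a wrapped
line never exceeds N columns even with the separator attached.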
Ian.