[m-rev.] for review: fix bugs in parsing_utils.offset_to_line_number_and_position

Ian MacLarty maclarty at csse.unimelb.edu.au
Tue Aug 18 14:13:14 AEST 2009


For review by Ralph.

Estimated hours taken: 4
Branches: main

Fix bugs in parsing_utils when converting offsets to line numbers and
positions.  One problem was that it tried to index a zero-length array
when the source string contained only one line.  The other was that it
used the wrong offset to compute the position.

library/parsing_utils.m:
	Add a comment explaining what the line_numbers array is meant to
	contain.

	Include the last line in the line_numbers array, so we never end
	up with an empty array.

	Fix the calculation of the position by computing the offset of
	the beginning of the line containing the offset in question and
	using that to work out the position (instead of using the offset
	of the following line).

tests/general/test_parsing_utils.exp:
tests/general/test_parsing_utils.m:
	Test offset_to_line_number_and_position.
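
The intended behaviour can be modelled in a few lines of Python.  This is
only a sketch for checking the arithmetic, not the Mercury code in the
patch: the function names mirror the Mercury ones, but the use of
bisect_left in place of the library's own binary search is an assumption
made for brevity.

```python
from bisect import bisect_left


def src_to_line_numbers(src):
    """Offsets of all newline characters in src + "\n".

    Appending len(src) stands in for the newline of the extra "\n",
    so the result is never empty, even for "" or a one-line string.
    """
    return [i for i, ch in enumerate(src) if ch == "\n"] + [len(src)]


def offset_to_line_number_and_position(line_nos, offset):
    """Map a 0-based offset to a 1-based (line, position) pair."""
    # lo indexes the newline that terminates the line containing offset
    # (assumption: bisect_left replaces the hand-written binary search).
    lo = bisect_left(line_nos, offset)
    # The line begins just after the previous newline, or at offset 0
    # when it is the first line.
    line_begin = 0 if lo == 0 else line_nos[lo - 1] + 1
    return lo + 1, 1 + offset - line_begin
```

With this model, offset 14 in "123456789\n123456789\n" maps to line 2,
position 5, and offset 0 in "" maps to line 1, position 1, which agrees
with the first and last lines added to test_parsing_utils.exp.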

Index: library/parsing_utils.m
===================================================================
RCS file: /home/mercury1/repository/mercury/library/parsing_utils.m,v
retrieving revision 1.2
diff -u -r1.2 parsing_utils.m
--- library/parsing_utils.m	17 Jun 2009 03:14:41 -0000	1.2
+++ library/parsing_utils.m	18 Aug 2009 03:55:25 -0000
@@ -327,6 +327,9 @@
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
+    % For a source string Src, the following array contains the positions
+    % of all the newline characters in the string Src ++ "\n".
+    %
 :- type line_numbers == array(int).
 
 %-----------------------------------------------------------------------------%
@@ -338,7 +341,7 @@
     F = ( func(I, Ns) =
         ( if string.unsafe_index(Str, I) = ('\n') then [I | Ns] else Ns )
     ),
-    LineNosList = int.fold_down(F, Lo, Hi, []),
+    LineNosList = int.fold_down(F, Lo, Hi, [Src ^ input_length]),
     LineNos = array(LineNosList).
 
 %-----------------------------------------------------------------------------%
@@ -368,9 +371,16 @@
                 LineNo, Pos)
         )
       else
-        LoOffset = LineNos ^ elem(Lo),
+        % Lo is the index of the newline that terminates the line that Offset
+        % is on.  We compute LineBegin as the offset of the first character
+        % of the line Offset is on.
+        ( if Lo = 0 then
+            LineBegin = 0
+          else
+            LineBegin = LineNos ^ elem(Lo - 1) + 1
+        ),
         LineNo = 1 + Lo,
-        Pos = 1 + Offset - LoOffset
+        Pos = 1 + Offset - LineBegin
     ).
 
 %-----------------------------------------------------------------------------%
Index: tests/general/test_parsing_utils.exp
===================================================================
RCS file: /home/mercury1/repository/tests/general/test_parsing_utils.exp,v
retrieving revision 1.2
diff -u -r1.2 test_parsing_utils.exp
--- tests/general/test_parsing_utils.exp	17 Jun 2009 03:14:41 -0000	1.2
+++ tests/general/test_parsing_utils.exp	18 Aug 2009 03:40:08 -0000
@@ -230,3 +230,13 @@
 pass: one_or_more(int_with_state) on "1 2 3"
 	returned [3, 2, 1] as expected
 	[5 chars consumed]
+Line = 2, Pos = 5
+Line = 2, Pos = 3
+Line = 7, Pos = 1
+Line = 4, Pos = 2
+Line = 3, Pos = 1
+Line = 1, Pos = 3
+Line = 1, Pos = 1
+Line = 2, Pos = 10
+Line = 3, Pos = 1
+Line = 1, Pos = 1
Index: tests/general/test_parsing_utils.m
===================================================================
RCS file: /home/mercury1/repository/tests/general/test_parsing_utils.m,v
retrieving revision 1.2
diff -u -r1.2 test_parsing_utils.m
--- tests/general/test_parsing_utils.m	17 Jun 2009 03:14:41 -0000	1.2
+++ tests/general/test_parsing_utils.m	18 Aug 2009 03:39:54 -0000
@@ -30,7 +30,17 @@
 %-----------------------------------------------------------------------------%
 
 main(!IO) :-
-    unsorted_aggregate(run_test, io.write_string, !IO).
+    unsorted_aggregate(run_test, io.write_string, !IO),
+    test_pos("123456789\n123456789\n", 14, !IO),
+    test_pos("\n123456789\n123456789\n\n\n\n\n\n", 3, !IO),
+    test_pos("\n1234\n12\n\n\nfewefwef\nwwfwe\n\n", 20, !IO),
+    test_pos("123456789\n123456789\n\n1234567890", 22, !IO),
+    test_pos("123456789\n123456789\n\n1234567890", 20, !IO),
+    test_pos("123456789", 2, !IO),
+    test_pos("123456789", 0, !IO),
+    test_pos("123456789\n123456789\n\n", 19, !IO),
+    test_pos("123456789\n123456789\n\n", 20, !IO),
+    test_pos("", 0, !IO).
 
 %-----------------------------------------------------------------------------%
 
@@ -343,6 +353,16 @@
 
 %-----------------------------------------------------------------------------%
 
+:- pred test_pos(string::in, int::in, io::di, io::uo) is det.
+
+test_pos(Str, OS, !IO) :-
+    new_src_and_ps(Str, Src, _),
+    offset_to_line_number_and_position(src_to_line_numbers(Src), OS, Line,
+        Pos),
+    io.format("Line = %d, Pos = %d\n", [i(Line), i(Pos)], !IO).
+
+%-----------------------------------------------------------------------------%
+
 :- pred stringify_state(
         pred(src, T, list(S), list(S), ps, ps)::
             in(pred(in, out, in, out, in, out) is semidet),