for review: samples/diff update

Andrew Bromage bromage at cs.mu.OZ.AU
Mon Sep 14 12:52:44 AEST 1998


G'day all.

If I get no objections in the next few days I'll just commit this, so
speak up now.

Cheers,
Andrew Bromage


Estimated hours taken: 30

Modified files
--------------

samples/diff/Mmakefile:
	Minor documentation update.

samples/diff/README:
samples/diff/TODO:
	Update stuff that's now done, do a couple of minor wording
	changes.

samples/diff/diff.m:
	Fix case of identical filenames (which implicitly assumed
	no_diff_implies_no_output).  Add new match pass, call the
	new diff algorithm.

samples/diff/diff_out.m:
	Add --cvs-merge-conflict output style.  Slight reorganisation
	of top-level predicate.  Lots of small fixes to use better
	syntax (e.g. functional style for integer maths operations).

samples/diff/difftype.m:
	Added first_mentioned_positions/3, last_mentioned_positions/3,
	add_edit/4.

samples/diff/file.m:
	Use io__read_line_as_string.

samples/diff/filter.m:
	Minor syntax/wording changes.

samples/diff/options.m:
	Update all the newly handled options.

New files
---------

samples/diff/myers.m:
	New diff algorithm.

samples/diff/match.m:
	New pass to match common lines in the files to be diffed.

Removed file
------------

samples/diff/lcss.m:
	Functionality replaced by myers.m.


Index: Mmakefile
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/Mmakefile,v
retrieving revision 1.4
diff -u -t -u -r1.4 Mmakefile
--- Mmakefile	1998/01/13 00:51:48	1.4
+++ Mmakefile	1998/09/13 07:50:34
@@ -16,7 +16,7 @@
 # in 2.7.2.1, so feel free to comment out this line if you're using
 # an unbuggy compiler.
 #
-# BTW, the predicate which isn't compiled correctly is diff_out__show_file
+# BTW, the predicate which isn't compiled correctly is diff_out__show_file_2
 # in diff_out.m.
 MGNUCFLAGS=-O0
 
Index: README
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/README,v
retrieving revision 1.6
diff -u -t -u -r1.6 README
--- README	1998/01/13 00:51:50	1.6
+++ README	1998/09/13 07:51:40
@@ -7,7 +7,7 @@
 
         - We now accept command-line options.  In particular, we
           recognise all options that are accepted by GNU diff,
-          though a lot of them result in error reports and a few
+          though some of them result in error reports and a few
           which have do nothing to do with the output format or
           semantics, but are merely for efficiency, are accepted
           and ignored.
@@ -15,16 +15,19 @@
         - We support different output formats, in particular all
           of the output formats supported by GNU diff. There are
           a number of modifiers to the output formats (for example,
-          --expand-tabs) which we don't yet support.
+          --show-function-line) which we don't yet support.
 
-        - Just about everything (except the actual diff algorithm)
-          has been modified to support the above changes.
+        - We have a new diff algorithm, based on the one by Eugene
+          Myers.  See myers.m for details.
+
+        - Just about everything has been modified to support the
+          above changes.
 
         - Lots of cleanups, lots more documentation.
 
 Examine the file TODO to see what's still missing.
 
-Andrew Bromage  8 Jan 1998
+Andrew Bromage  13 September 1998
 
 ===========================================================================
 
Index: TODO
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/TODO,v
retrieving revision 1.1
diff -u -t -u -r1.1 TODO
--- TODO	1998/01/13 00:51:51	1.1
+++ TODO	1998/09/13 07:53:32
@@ -1,29 +1,27 @@
 
 Things which should be straightforward:
 
- * The following options should be supported but aren't:
-
-        --expand-tabs
-        --initial-tab
-        --ignore-case
-        --ignore-all-space
-        --ignore-space-change
-        --sdiff-merge-assist
+ * The `--sdiff-merge-assist' option is accepted but does nothing.  What
+   precisely does it do anyway?
 
  * Optimise the case of the --brief output style, where a full-blown diff
-   isn't necessary but we currently do it anyway.  Similarly, if no diff
-   implies no output (which it does for some output styles) we could
-   avoid a full diff sometimes.
+   isn't necessary but we currently do it.
 
  * We currently aren't careful about noticing the difference between a
    file which has a return/new line on the last line and one which
-   doesn't.  Admittedly this distinction has never made a difference to
-   any diffing I've done, but if we're going try to be compliant...
+   doesn't.  In "robust" output styles, this should result in a warning.
+   Admittedly this distinction has never made a difference to any 
+   diffing I've done, but if we're going try to be compliant...
+
+ * We do produce the minimal diff (at least if --minimal is enabled we
+   do), but we don't produce the "prettiest" diff.  We should post-
+   process our diffs to make them prettier.
 
 
 Things which need a bit more work:
 
- * Implement a more efficient diff algorithm.
+ * Implement --speed-large-files (or at least examine whether or not
+   it's worth it to do so).
 
  * Support diffing of binary files.  Mostly this just requires being
    more careful than we currently are.
Index: diff.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/diff.m,v
retrieving revision 1.9
diff -u -t -u -r1.9 diff.m
--- diff.m	1998/01/13 00:51:53	1.9
+++ diff.m	1998/09/13 07:54:56
@@ -4,8 +4,7 @@
 % Public License - see the file COPYING in the Mercury distribution.
 %-----------------------------------------------------------------------------%
 
-% Main author: bromage
-% Simplified by Marnix Klooster <marnix at worldonline.nl>
+% Main authors: bromage, Marnix Klooster <marnix at worldonline.nl>
 
 % Something very similar to the standard diff utility.  Sort of.  :-)
 
@@ -21,7 +20,7 @@
 %-----------------------------------------------------------------------------%
 
 :- implementation.
-:- import_module options, lcss, diff_out, globals, filter.
+:- import_module options, myers, diff_out, globals, filter, match.
 :- import_module string, list, file, std_util, require, getopt.
 
 %-----------------------------------------------------------------------------%
@@ -81,8 +80,15 @@
 main_2([Fname1 | Rest]) -->
         ( { Rest = [Fname2 | _] },
                 ( { Fname1 = Fname2 } ->
-                % There are no differences between identical files.
-                        []
+                        % Not sure why anyone would want to diff two
+                        % files with the same name, but just in case...
+                        ( { Fname1 = "-" } ->
+                                file__read_input(Fname1, Contents1),
+                                { Contents1 = Contents2 }
+                        ;
+                                file__read_file(Fname1, Contents1),
+                                { Contents1 = Contents2 }
+                        )
                 ;
                         % If either file is "-", simply use standard input.
                         % (Note: Both can't be "-" since that was dealt with
@@ -97,17 +103,17 @@
                         % Otherwise read the files normally.
                                 file__read_file(Fname1, Contents1),
                                 file__read_file(Fname2, Contents2)
-                        ),
-                        % Now do the diff.
-                        ( { Contents1 = ok(File1), Contents2 = ok(File2) } ->
-                                diff__do_diff(File1, File2)
-                        ; { Contents1 = error(Msg) } ->
-                                usage_io_error(Msg)
-                        ; { Contents2 = error(Msg) } ->
-                                usage_io_error(Msg)
-                        ;
-                                { error("main2") }
                         )
+                ),
+                % Now do the diff.
+                ( { Contents1 = ok(File1), Contents2 = ok(File2) } ->
+                        diff__do_diff(File1, File2)
+                ; { Contents1 = error(Msg) } ->
+                        usage_io_error(Msg)
+                ; { Contents2 = error(Msg) } ->
+                        usage_io_error(Msg)
+                ;
+                        { error("main2") }
                 )
         ; { Rest = [] },
                 usage_error("missing operand")
@@ -118,10 +124,13 @@
         % diff__do_diff takes the files plus all the command
         % line options and determines what to do with them.
         %
-        % At the moment, we're organised into three passes:
+        % At the moment, we're organised into four passes:
         %
-        %       - diff_by_lcss takes the two files and produces
-        %         a diff using the LCSS algorithm.
+        %       - build_matches determines which lines from the
+        %         input files match (using the appropriate command-
+        %         line options).
+        %       - diff_by_myers takes the matches produced and
+        %         computes a diff between them.
         %       - filter_diff analyses the diff, filtering out
         %         any edits which the user said that they didn't
         %         want to see (using the appropriate command-line
@@ -129,17 +138,11 @@
         %       - display_diff outputs the diff in whatever output
         %         format the user chose.
         %
-        % TO DO: Options like --ignore-case are probably best handled
-        %        by a pass taking place _before_ the diff algorithm is
-        %        run.  This pass would have the benefit of determining
-        %        whether or not there are any differences or not, in
-        %        the case where the output style chosen doesn't require
-        %        output if there is no diff.  It would also speed up
-        %        the --brief output style.
 :- pred diff__do_diff(file, file, io__state, io__state).
 :- mode diff__do_diff(in, in, di, uo) is det.
 diff__do_diff(File1, File2) -->
-        { diff_by_lcss(File1, File2, Diff0) },
+        build_matches(File1, File2, FileX, FileY),
+        diff_by_myers(FileX, FileY, Diff0),
         filter_diff(Diff0, File1, File2, Diff),
         display_diff(File1, File2, Diff).
 
Index: diff_out.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/diff_out.m,v
retrieving revision 1.1
diff -u -t -u -r1.1 diff_out.m
--- diff_out.m	1998/01/13 00:51:55	1.1
+++ diff_out.m	1998/09/13 07:41:49
@@ -20,8 +20,8 @@
 
 %-----------------------------------------------------------------------------%
 
-:- type diff_out__output_style --->
-                normal
+:- type diff_out__output_style
+        --->    normal
         ;       help_only
         ;       version_only
         ;       context(int)
@@ -31,9 +31,28 @@
         ;       rcs
         ;       ifdef(string)
         ;       brief
-        ;       side_by_side.
+        ;       side_by_side
+        ;       cvs_merge_conflict.
 
-:- pred diff_out__default_output_style(diff_out__output_style :: out) is det.
+        % The default output style.
+:- pred diff_out__default_output_style(diff_out__output_style).
+:- mode diff_out__default_output_style(out) is det.
+
+        % Succeeds if, for this output style, an absence of differences
+        % means that no output should be generated.
+:- pred diff_out__no_diff_implies_no_output(diff_out__output_style).
+:- mode diff_out__no_diff_implies_no_output(in) is semidet.
+
+        % Succeeds if the user only wants to know about the presence
+        % of any differences, not what they actually are.
+:- pred diff_out__full_diff_not_required(diff_out__output_style).
+:- mode diff_out__full_diff_not_required(in) is semidet.
+
+        % Succeeds if the output style is "robust", that is, the
+        % absence of a newline at the end of the file actually
+        % matters.
+:- pred diff_out__robust(diff_out__output_style).
+:- mode diff_out__robust(in) is semidet.
 
         % display_diff takes a diff and displays it
         % in the user's specified output format.
@@ -51,6 +70,30 @@
 
 %-----------------------------------------------------------------------------%
 
+diff_out__no_diff_implies_no_output(normal).
+diff_out__no_diff_implies_no_output(context(_)).
+diff_out__no_diff_implies_no_output(unified(_)).
+diff_out__no_diff_implies_no_output(ed).
+diff_out__no_diff_implies_no_output(forward_ed).
+diff_out__no_diff_implies_no_output(rcs).
+diff_out__no_diff_implies_no_output(brief).
+
+%-----------------------------------------------------------------------------%
+
+diff_out__full_diff_not_required(brief).
+
+%-----------------------------------------------------------------------------%
+
+diff_out__robust(normal).
+diff_out__robust(context(_)).
+diff_out__robust(unified(_)).
+diff_out__robust(rcs).
+diff_out__robust(ifdef(_)).
+diff_out__robust(side_by_side).
+diff_out__robust(cvs_merge_conflict).
+
+%-----------------------------------------------------------------------------%
+
         % diff_out__show_file shows the segment of the file
         % from Low to High, with each line preceeded by
         % the Prefix characher and a space.  The diff(1)
@@ -59,17 +102,31 @@
         % lines effected in the second file should be
         % flagged by '>'.
         %
+:- pred diff_out__show_file(file, string, pos, pos, io__state, io__state).
+:- mode diff_out__show_file(in, in, in, in, di, uo) is det.
+
+diff_out__show_file(File, Prefix, Low, High) -->
+        globals__io_lookup_bool_option(expand_tabs, ExpandTabs),
+        diff_out__show_file_2(ExpandTabs, File, Prefix, Low, High).
+
         % NOTE: GCC 2.7.2 under Digital Unix 3.2 doesn't compile
         %       this predicate correctly with optimisation turned on.
-:- pred diff_out__show_file(file, string, segment, io__state, io__state).
-:- mode diff_out__show_file(in, in, in, di, uo) is det.
+:- pred diff_out__show_file_2(bool, file, string, pos, pos,
+                io__state, io__state).
+:- mode diff_out__show_file_2(in, in, in, in, in, di, uo) is det.
 
-diff_out__show_file(File, Prefix, Low - High) -->
+diff_out__show_file_2(ExpandTabs, File, Prefix, Low, High) -->
         ( { Low < High } ->
                 ( { file__get_line(File, Low, Line) } ->
-                        { Low1 is Low + 1 },
-                        io__write_strings([Prefix, Line]),
-                        diff_out__show_file(File, Prefix, Low1 - High)
+                        io__write_string(Prefix),
+                        ( { ExpandTabs = yes },
+                                { string__to_char_list(Line, LineList) },
+                                diff_out__expand_tabs(LineList, 0)
+                        ; { ExpandTabs = no },
+                                io__write_string(Line)
+                        ),
+                        diff_out__show_file_2(ExpandTabs, File, Prefix,
+                                        Low + 1, High)
                 ;
                         { error("diff_out_show_file: file ended prematurely") }
                 )
@@ -77,9 +134,24 @@
                 []
         ).
 
+:- pred diff_out__expand_tabs(list(char), int, io__state, io__state).
+:- mode diff_out__expand_tabs(in, in, di, uo) is det.
+
+diff_out__expand_tabs([], _) --> [].
+diff_out__expand_tabs([C | Cs], Pos) -->
+        ( { C = '\t' } ->
+                { Spaces = tab_width - (Pos rem tab_width) },
+                put_spaces(Spaces, Pos, NewPos),
+                diff_out__expand_tabs(Cs, NewPos)
+        ;
+                io__write_char(C),
+                diff_out__expand_tabs(Cs, Pos + 1)
+        ).
+
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
+
         % display_diff: Determine which output style to use, then call
         % the predicate to display that output.
         %
@@ -88,40 +160,63 @@
         % reach here.  In those cases, we just call error/1.
 display_diff(File1, File2, Diff) -->
         globals__io_get_output_style(OutputStyle),
-        ( { OutputStyle = normal },
-                display_diff_normal(File1, File2, Diff)
-        ; { OutputStyle = help_only },
-                { error("display_diff: help_only") }
-        ; { OutputStyle = version_only },
-                { error("display_diff: version_only") }
-        ; { OutputStyle = context(Context) },
-                display_context_diff(Context, File1, File2, Diff)
-        ; { OutputStyle = unified(Context) },
-                display_unified_diff(Context, File1, File2, Diff)
-        ; { OutputStyle = ed },
-                display_diff_ed(File1, File2, Diff)
-        ; { OutputStyle = forward_ed },
-                display_diff_forward_ed(File1, File2, Diff)
-        ; { OutputStyle = rcs },
-                display_diff_rcs(File1, File2, Diff)
-        ; { OutputStyle = ifdef(Sym) },
-                display_diff_ifdef(Sym, File1, File2, Diff)
-        ; { OutputStyle = brief },
-                % XXX For this output style, we really don't need to
-                %     perform a complete diff.  This should be handled
-                %     higher up for efficiency.
-                ( { Diff \= [] } ->
-                        { file__get_file_name(File1, FileName1) },
-                        { file__get_file_name(File2, FileName2) },
-                        io__write_strings(["Files ", FileName1, " and ",
-                                FileName2, " differ\n"])
-                ;
-                        []
-                )
-        ; { OutputStyle = side_by_side },
-                display_diff_side_by_side(File1, File2, Diff)
+        (
+                { Diff = [],
+                  diff_out__no_diff_implies_no_output(OutputStyle)
+                }
+        ->
+                []
+        ;
+                display_diff_2(OutputStyle, File1, File2, Diff)
         ).
 
+
+:- pred display_diff_2(diff_out__output_style, file, file, diff,
+                        io__state, io__state).
+:- mode display_diff_2(in, in, in, in, di, uo) is det.
+
+display_diff_2(normal, File1, File2, Diff) -->
+        display_diff_normal(File1, File2, Diff).
+
+display_diff_2(help_only, _File1, _File2, _Diff) -->
+        { error("display_diff: help_only") }.
+
+display_diff_2(version_only, _File1, _File2, _Diff) -->
+        { error("display_diff: version_only") }.
+
+display_diff_2(context(Context), File1, File2, Diff) -->
+        display_context_diff(Context, File1, File2, Diff).
+
+display_diff_2(unified(Context), File1, File2, Diff) -->
+        display_unified_diff(Context, File1, File2, Diff).
+
+display_diff_2(ed, File1, File2, Diff) -->
+        display_diff_ed(File1, File2, Diff).
+
+display_diff_2(forward_ed, File1, File2, Diff) -->
+        display_diff_forward_ed(File1, File2, Diff).
+
+display_diff_2(rcs, File1, File2, Diff) -->
+        display_diff_rcs(File1, File2, Diff).
+
+display_diff_2(ifdef(Sym), File1, File2, Diff) -->
+        display_diff_ifdef(Sym, File1, File2, Diff).
+
+display_diff_2(brief, File1, File2, _Diff) -->
+        % XXX For this output style, we really don't need to
+        %     perform a complete diff.  This should be handled
+        %     higher up for efficiency.
+        { file__get_file_name(File1, FileName1) },
+        { file__get_file_name(File2, FileName2) },
+        io__write_strings(["Files ", FileName1, " and ",
+                FileName2, " differ\n"]).
+
+display_diff_2(side_by_side, File1, File2, Diff) -->
+        display_diff_side_by_side(File1, File2, Diff).
+
+display_diff_2(cvs_merge_conflict, File1, File2, Diff) -->
+        display_diff_cvs_merge_conflict(File1, File2, Diff).
+
 %-----------------------------------------------------------------------------%
 
         % display_diff_normal takes a diff and displays it
@@ -129,21 +224,38 @@
 :- pred display_diff_normal(file, file, diff, io__state, io__state).
 :- mode display_diff_normal(in, in, in, di, uo) is det.
 
-display_diff_normal(_, _, []) --> [].
-display_diff_normal(File1, File2, [SingDiff | Diff]) -->
+display_diff_normal(File1, File2, Diff) -->
+        globals__io_lookup_bool_option(initial_tab, InitialTab),
+        { InitialTab = no,
+                FromStr = "< ",
+                ToStr = "> "
+        ; InitialTab = yes,
+                FromStr = "<\t",
+                ToStr = ">\t"
+        },
+        display_diff_normal_2(File1, File2, Diff, FromStr, ToStr).
+
+        % display_diff_normal takes a diff and displays it
+        % in the standard diff(1) output format.
+:- pred display_diff_normal_2(file, file, diff, string, string,
+                        io__state, io__state).
+:- mode display_diff_normal_2(in, in, in, in, in, di, uo) is det.
+
+display_diff_normal_2(_, _, [], _, _) --> [].
+display_diff_normal_2(File1, File2, [SingDiff | Diff], FromStr, ToStr) -->
         ( { SingDiff = add(X, Y1 - Y2) },
                 diff_out__write_command(X - X, 'a', Y1 - Y2),
-                diff_out__show_file(File2, "> ", Y1 - Y2)
+                diff_out__show_file(File2, ToStr, Y1, Y2)
         ; { SingDiff = delete(X1 - X2, Y) },
                 diff_out__write_command(X1 - X2, 'd', Y - Y),
-                diff_out__show_file(File1, "< ", X1 - X2)
+                diff_out__show_file(File1, FromStr, X1, X2)
         ; { SingDiff = change(X1 - X2, Y1 - Y2) },
                 diff_out__write_command(X1 - X2, 'c', Y1 - Y2),
-                diff_out__show_file(File1, "< ", X1 - X2),
+                diff_out__show_file(File1, FromStr, X1, X2),
                 io__write_string("---\n"),
-                diff_out__show_file(File2, "> ", Y1 - Y2)
+                diff_out__show_file(File2, ToStr, Y1, Y2)
         ),
-        display_diff_normal(File1, File2, Diff).
+        display_diff_normal_2(File1, File2, Diff, FromStr, ToStr).
 
 
         % diff_out__write_command displays a diff(1) command.
@@ -187,18 +299,14 @@
 display_diff_rcs(_File1, _File2, []) --> [].
 display_diff_rcs(File1, File2, [Cmd | Diff]) -->
         ( { Cmd = add(X, Y1 - Y2) },
-                { Y is Y2 - Y1 },
-                diff_out__write_command_rcs('a', X, Y),
-                diff_out__show_file(File2, "", Y1 - Y2)
+                diff_out__write_command_rcs('a', X, Y2-Y1),
+                diff_out__show_file(File2, "", Y1, Y2)
         ; { Cmd = delete(X1 - X2, _Y) },
-                { X is X2 - X1 },
-                diff_out__write_command_rcs('d', X1, X)
+                diff_out__write_command_rcs('d', X1, X2-X1)
         ; { Cmd = change(X1 - X2, Y1 - Y2) },
-                { X is X2 - X1 },
-                { Y is Y2 - Y1 },
-                diff_out__write_command_rcs('d', X1, X),
-                diff_out__write_command_rcs('a', X1, Y),
-                diff_out__show_file(File2, "", Y1 - Y2)
+                diff_out__write_command_rcs('d', X1, X2-X1),
+                diff_out__write_command_rcs('a', X1, Y2-Y1),
+                diff_out__show_file(File2, "", Y1, Y2)
         ),
         display_diff_rcs(File1, File2, Diff).
 
@@ -209,9 +317,8 @@
 :- mode diff_out__write_command_rcs(in, in, in, di, uo) is det.
 
 diff_out__write_command_rcs(C, X, Y) -->
-        { X1 is X + 1 },                % Convert from pos to line number
         io__write_char(C),
-        io__write_int(X1),
+        io__write_int(X + 1),   % Convert from pos to line number
         io__write_char(' '),
         io__write_int(Y),
         io__write_char('\n').
@@ -228,13 +335,13 @@
         display_diff_ed(File1, File2, Diff),
         ( { Cmd = add(X, Y1 - Y2) },
                 diff_out__write_command_ed(X - X, 'a'),
-                diff_out__show_file(File2, "", Y1 - Y2),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_string(".\n")
         ; { Cmd = delete(X, _Y) },
                 diff_out__write_command_ed(X, 'd')
-        ; { Cmd = change(X, Y) },
+        ; { Cmd = change(X, Y1 - Y2) },
                 diff_out__write_command_ed(X, 'c'),
-                diff_out__show_file(File2, "", Y),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_string(".\n")
         ).
 
@@ -269,18 +376,21 @@
 display_diff_forward_ed(File1, File2, [Cmd | Diff]) -->
         ( { Cmd = add(X, Y1 - Y2) },
                 diff_out__write_command_forward_ed(X - X, 'a'),
-                diff_out__show_file(File2, "", Y1 - Y2),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_string(".\n")
         ; { Cmd = delete(X, _Y) },
                 diff_out__write_command_forward_ed(X, 'd')
-        ; { Cmd = change(X, Y) },
+        ; { Cmd = change(X, Y1 - Y2) },
                 diff_out__write_command_forward_ed(X, 'c'),
-                diff_out__show_file(File2, "", Y),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_string(".\n")
         ),
         display_diff_forward_ed(File1, File2, Diff).
 
-        % diff_out__write_command_ed displays a forward ed(1) command.
+        % diff_out__write_command_forward_ed displays a forward ed(1)
+        % command.  The difference between this and write_command_ed is
+        % that the command char comes first here.  Who comes up with
+        % these dumb formats anyway?
 :- pred diff_out__write_command_forward_ed(segment, char, io__state, io__state).
 :- mode diff_out__write_command_forward_ed(in, in, di, uo) is det.
 diff_out__write_command_forward_ed(X - X2, C) -->
@@ -320,25 +430,25 @@
 
 display_diff_ifdef_2(Prev, _Sym, File1, _File2, []) -->
         { file__get_numlines(File1, SegEnd) },
-        diff_out__show_file(File1, "", Prev - SegEnd).
+        diff_out__show_file(File1, "", Prev, SegEnd).
 display_diff_ifdef_2(Prev, Sym, File1, File2, [Edit | Diff]) -->
         { first_mentioned_positions(Edit, StartOfEdit, _) },
-        diff_out__show_file(File1, "", Prev - StartOfEdit),
-        ( { Edit = add(X, Seg2) },
+        diff_out__show_file(File1, "", Prev, StartOfEdit),
+        ( { Edit = add(X, Y1 - Y2) },
                 io__write_strings(["#ifdef ", Sym, "\n"]),
-                diff_out__show_file(File2, "", Seg2),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_strings(["#endif /* ", Sym, " */\n"]),
                 { Next = X }
         ; { Edit = delete(X1 - X2, _) },
                 io__write_strings(["#ifndef ", Sym, "\n"]),
-                diff_out__show_file(File1, "", X1 - X2),
+                diff_out__show_file(File1, "", X1, X2),
                 io__write_strings(["#endif /* not ", Sym, " */\n"]),
                 { Next = X2 }
         ; { Edit = change(X1 - X2, Y1 - Y2) },
                 io__write_strings(["#ifndef ", Sym, "\n"]),
-                diff_out__show_file(File1, "", X1 - X2),
+                diff_out__show_file(File1, "", X1, X2),
                 io__write_strings(["#else /* ", Sym, " */\n"]),
-                diff_out__show_file(File2, "", Y1 - Y2),
+                diff_out__show_file(File2, "", Y1, Y2),
                 io__write_strings(["#endif /* ", Sym, " */\n"]),
                 { Next = X2 }
         ),
@@ -347,6 +457,56 @@
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
+        % display_diff_cvs_merge_conflict writes out the files in a
+        % unified diff, using CVS merge conflict marks around each edit.
+        %
+:- pred display_diff_cvs_merge_conflict(file, file, diff, io__state, io__state).
+:- mode display_diff_cvs_merge_conflict(in, in, in, di, uo) is det.
+
+display_diff_cvs_merge_conflict(File1, File2, Diff) -->
+        display_diff_cvs_merge_conflict_2(0, File1, File2, Diff).
+
+        % Argument 1 (prev) is the last pos displayed before
+        % the current edit (or end of edits, in the base case).
+        % This is important for when we have to display the
+        % "non-diffed" text between edits.
+:- pred display_diff_cvs_merge_conflict_2(int, file, file, diff,
+                io__state, io__state).
+:- mode display_diff_cvs_merge_conflict_2(in, in, in, in, di, uo) is det.
+
+display_diff_cvs_merge_conflict_2(Prev, File1, _File2, []) -->
+        { file__get_numlines(File1, SegEnd) },
+        diff_out__show_file(File1, "", Prev, SegEnd).
+display_diff_cvs_merge_conflict_2(Prev, File1, File2, [Edit | Diff]) -->
+        { first_mentioned_positions(Edit, StartOfEdit, _) },
+        diff_out__show_file(File1, "", Prev, StartOfEdit),
+        { file__get_file_name(File1, FileName1) },
+        { file__get_file_name(File2, FileName2) },
+        ( { Edit = add(X, Y1 - Y2) },
+                io__write_strings(["<<<<<<< ", FileName1, "\n"]),
+                diff_out__show_file(File2, "", Y1, Y2),
+                io__write_string("=======\n"),
+                io__write_strings([">>>>>>> ", FileName2, "\n"]),
+                { Next = X }
+        ; { Edit = delete(X1 - X2, _) },
+                io__write_strings(["<<<<<<< ", FileName1, "\n"]),
+                io__write_string("=======\n"),
+                diff_out__show_file(File1, "", X1, X2),
+                io__write_strings([">>>>>>> ", FileName2, "\n"]),
+                { Next = X2 }
+        ; { Edit = change(X1 - X2, Y1 - Y2) },
+                io__write_strings(["<<<<<<< ", FileName1, "\n"]),
+                diff_out__show_file(File1, "", X1, X2),
+                io__write_string("=======\n"),
+                diff_out__show_file(File2, "", Y1, Y2),
+                io__write_strings([">>>>>>> ", FileName2, "\n"]),
+                { Next = X2 }
+        ),
+        display_diff_cvs_merge_conflict_2(Next, File1, File2, Diff).
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
         % Types for context/unified diffs.
 
         % A context diff is a bit more complicated than a "standard"
@@ -371,10 +531,15 @@
 diff_to_context_diff(_Xsize, _Ysize, _Context, [], []).
 diff_to_context_diff(Xsize, Ysize, Context, [Edit | Diff], CDiff) :-
         diff_to_context_diff(Xsize, Ysize, Context, Diff, CDiff0),
+
+                % Work out how far the context of this edit reaches.
         first_mentioned_positions(Edit, Xfirst0, Yfirst0),
+        int__max(Xfirst0 - Context, 0, Xfirst),
+        int__max(Yfirst0 - Context, 0, Yfirst),
         last_mentioned_positions(Edit, Xlast0, Ylast0),
-        adjust_context(Context, Xsize, Xfirst0, Xlast0, Xfirst, Xlast),
-        adjust_context(Context, Ysize, Yfirst0, Ylast0, Yfirst, Ylast),
+        int__min(Xlast0 + Context, Xsize, Xlast),
+        int__min(Ylast0 + Context, Ysize, Ylast),
+
         ( CDiff0 = [],
                 CDiff = [context_edit(Xfirst - Xlast, Yfirst - Ylast, [Edit])]
         ; CDiff0 = [context_edit(XsegLo - XsegHi, YsegLo - YsegHi, DDiff) |
@@ -393,37 +558,6 @@
                 )
         ).
 
-:- pred first_mentioned_positions(edit :: in, pos :: out, pos :: out) is det.
-
-first_mentioned_positions(add(X, Y - _), X, Y).
-first_mentioned_positions(delete(X - _, Y), X, Y).
-first_mentioned_positions(change(X - _, Y - _), X, Y).
-
-:- pred last_mentioned_positions(edit :: in, pos :: out, pos :: out) is det.
-
-last_mentioned_positions(add(X, _ - Y), X, Y).
-last_mentioned_positions(delete(_ - X, Y), X, Y).
-last_mentioned_positions(change(_ - X, _ - Y), X, Y).
-
-        % Adjust a range to incorporate a given number of lines
-        % of context.  Ensure that the new range stays within the
-        % size of the file being considered.
-:- pred adjust_context(int :: in, int :: in, int :: in, int :: in,
-                int :: out, int :: out) is det.
-adjust_context(Context, Size, First0, Last0, First, Last) :-
-        First1 is First0 - Context,
-        Last1 is Last0 + Context,
-        ( First1 < 0 ->
-                First = 0
-        ;
-                First = First1
-        ),
-        ( Last1 > Size ->
-                Last = Size
-        ;
-                Last = Last1
-        ).
-
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
@@ -437,47 +571,58 @@
         { diff_to_context_diff(Size1, Size2, Context, Diff, CDiff) },
         { file__get_file_name(File1, Name1) },
         { file__get_file_name(File2, Name2) },
-                % XXX Should also print out file dates.  But how??
+                % XXX Should also print out file dates.  But how?
         io__write_strings(["--- ", Name1, "\n"]),
         io__write_strings(["+++ ", Name2, "\n"]),
-        display_unified_diff_2(File1, File2, CDiff).
-
-:- pred display_unified_diff_2(file, file, context_diff, io__state, io__state).
-:- mode display_unified_diff_2(in, in, in, di, uo) is det.
+        globals__io_lookup_bool_option(initial_tab, InitialTab),
+        { InitialTab = no,
+                NoneStr = " ",
+                AddStr = "+",
+                DelStr = "-"
+        ; InitialTab = yes,
+                NoneStr = "\t",
+                AddStr = "+\t",
+                DelStr = "-\t"
+        },
+        display_unified_diff_2(File1, File2, CDiff, NoneStr, AddStr, DelStr).
 
-display_unified_diff_2(_File1, _File2, []) --> [].
-display_unified_diff_2(File1, File2, [Edit | CDiff]) -->
+:- pred display_unified_diff_2(file, file, context_diff, string, string, string,
+                                io__state, io__state).
+:- mode display_unified_diff_2(in, in, in, in, in, in, di, uo) is det.
+
+display_unified_diff_2(_File1, _File2, [], _, _, _) --> [].
+display_unified_diff_2(File1, File2, [Edit | CDiff],
+                        NoneStr, AddStr, DelStr) -->
         { Edit = context_edit(Xlow - Xhigh, Ylow - Yhigh, Diff) },
-        { Xlow1 is Xlow + 1 },
-        { Ylow1 is Ylow + 1 },
-        { Xsize is Xhigh - Xlow },
-        { Ysize is Yhigh - Ylow },
         io__format("@@ -%d,%d +%d,%d @@\n",
-                [i(Xlow1), i(Xsize), i(Ylow1), i(Ysize)]),
-        display_unified_diff_3(Xlow, Xhigh, File1, File2, Diff),
-        display_unified_diff_2(File1, File2, CDiff).
+                [i(Xlow + 1), i(Xhigh - Xlow), i(Ylow + 1), i(Yhigh - Ylow)]),
+        display_unified_diff_3(Xlow, Xhigh, File1, File2, Diff,
+                        NoneStr, AddStr, DelStr),
+        display_unified_diff_2(File1, File2, CDiff, NoneStr, AddStr, DelStr).
 
 :- pred display_unified_diff_3(int, int, file, file, diff,
-                io__state, io__state).
-:- mode display_unified_diff_3(in, in, in, in, in, di, uo) is det.
+                                string, string, string, io__state, io__state).
+:- mode display_unified_diff_3(in, in, in, in, in, in, in, in, di, uo) is det.
 
-display_unified_diff_3(Prev, Size1, File1, _File2, []) -->
-        diff_out__show_file(File1, " ", Prev - Size1).
-display_unified_diff_3(Prev, Size1, File1, File2, [Edit | Diff]) -->
+display_unified_diff_3(Prev, Size1, File1, _File2, [], NoneStr, _, _) -->
+        diff_out__show_file(File1, NoneStr, Prev, Size1).
+display_unified_diff_3(Prev, Size1, File1, File2, [Edit | Diff],
+                        NoneStr, AddStr, DelStr) -->
         { first_mentioned_positions(Edit, StartOfEdit, _) },
-        diff_out__show_file(File1, " ", Prev - StartOfEdit),
-        ( { Edit = add(X, Seg2) },
-                diff_out__show_file(File2, "+", Seg2),
+        diff_out__show_file(File1, NoneStr, Prev, StartOfEdit),
+        ( { Edit = add(X, Y1 - Y2) },
+                diff_out__show_file(File2, AddStr, Y1, Y2),
                 { Next = X }
         ; { Edit = delete(X1 - X2, _) },
-                diff_out__show_file(File1, "-", X1 - X2),
+                diff_out__show_file(File1, DelStr, X1, X2),
                 { Next = X1 }
         ; { Edit = change(X1 - X2, Y1 - Y2) },
-                diff_out__show_file(File1, "-", X1 - X2),
-                diff_out__show_file(File2, "+", Y1 - Y2),
+                diff_out__show_file(File1, DelStr, X1, X2),
+                diff_out__show_file(File2, AddStr, Y1, Y2),
                 { Next = X1 }
         ),
-        display_unified_diff_3(Next, Size1, File1, File2, Diff).
+        display_unified_diff_3(Next, Size1, File1, File2, Diff,
+                                NoneStr, AddStr, DelStr).
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
@@ -495,74 +640,97 @@
                 % XXX Should also print out file dates.  But how??
         io__write_strings(["*** ", Name1, "\n"]),
         io__write_strings(["--- ", Name2, "\n"]),
-        display_context_diff_2(File1, File2, CDiff).
 
-:- pred display_context_diff_2(file, file, context_diff, io__state, io__state).
-:- mode display_context_diff_2(in, in, in, di, uo) is det.
+        globals__io_lookup_bool_option(initial_tab, InitialTab),
+        { InitialTab = no,
+                NoneStr = "  ",
+                AddStr = "+ ",
+                DelStr = "- ",
+                ChgStr = "! "
+        ; InitialTab = yes,
+                NoneStr = "\t",
+                AddStr = "+\t",
+                DelStr = "-\t",
+                ChgStr = "!\t"
+        },
+        display_context_diff_2(File1, File2, CDiff,
+                        NoneStr, AddStr, DelStr, ChgStr).
 
-display_context_diff_2(_File1, _File2, []) --> [].
-display_context_diff_2(File1, File2, [Edit | CDiff]) -->
+:- pred display_context_diff_2(file, file, context_diff,
+                string, string, string, string, io__state, io__state).
+:- mode display_context_diff_2(in, in, in, in, in, in, in, di, uo) is det.
+
+display_context_diff_2(_File1, _File2, [], _, _, _, _) --> [].
+display_context_diff_2(File1, File2, [Edit | CDiff],
+                NoneStr, AddStr, DelStr, ChgStr) -->
         { Edit = context_edit(Xlow - Xhigh, Ylow - Yhigh, Diff) },
-        { Xlow1 is Xlow + 1 },
-        { Ylow1 is Ylow + 1 },
         io__write_string("***************\n"),
-        io__format("*** %d,%d ****\n", [i(Xlow1), i(Xhigh)]),
+        io__format("*** %d,%d ****\n", [i(Xlow + 1), i(Xhigh)]),
 
                 % Don't display the "context from" lines if there's
                 % nothing deleted or changed.
-        ( { list__member(AddEdit, Diff) => AddEdit = add(_, _) } ->
+        ( { all [AEdit] list__member(AEdit, Diff) => AEdit = add(_, _) } ->
                 []
         ;
-                display_context_diff_left(Xlow, Xhigh, File1, Diff)
+                display_context_diff_left(Xlow, Xhigh, File1, Diff,
+                                NoneStr, DelStr, ChgStr)
         ),
-        io__format("--- %d,%d ----\n", [i(Ylow1), i(Yhigh)]),
+        io__format("--- %d,%d ----\n", [i(Ylow + 1), i(Yhigh)]),
 
                 % Don't display the "context to" lines if there's
                 % nothing added or changed.
-        ( { list__member(DelEdit, Diff) => DelEdit = delete(_, _) } ->
+        ( { all [DEdit] list__member(DEdit, Diff) => DEdit = delete(_, _) } ->
                 []
         ;
-                display_context_diff_right(Ylow, Yhigh, File2, Diff)
+                display_context_diff_right(Ylow, Yhigh, File2, Diff,
+                                NoneStr, AddStr, ChgStr)
         ),
-        display_context_diff_2(File1, File2, CDiff).
-
-:- pred display_context_diff_left(int, int, file, diff, io__state, io__state).
-:- mode display_context_diff_left(in, in, in, in, di, uo) is det.
+        display_context_diff_2(File1, File2, CDiff,
+                        NoneStr, AddStr, DelStr, ChgStr).
 
-display_context_diff_left(Prev, Size1, File1, []) -->
-        diff_out__show_file(File1, "  ", Prev - Size1).
-display_context_diff_left(Prev, Size1, File1, [Edit | Diff]) -->
+:- pred display_context_diff_left(int, int, file, diff, string, string, string,
+                        io__state, io__state).
+:- mode display_context_diff_left(in, in, in, in, in, in, in, di, uo) is det.
+
+display_context_diff_left(Prev, Size1, File1, [], NoneStr, _, _) -->
+        diff_out__show_file(File1, NoneStr, Prev, Size1).
+display_context_diff_left(Prev, Size1, File1, [Edit | Diff],
+                        NoneStr, DelStr, ChgStr) -->
         { first_mentioned_positions(Edit, StartOfEdit, _) },
-        diff_out__show_file(File1, "  ", Prev - StartOfEdit),
+        diff_out__show_file(File1, NoneStr, Prev, StartOfEdit),
         ( { Edit = add(X, _) },
                 { Next = X }
         ; { Edit = delete(X1 - X2, _) },
-                diff_out__show_file(File1, "- ", X1 - X2),
+                diff_out__show_file(File1, DelStr, X1, X2),
                 { Next = X2 }
         ; { Edit = change(X1 - X2, _) },
-                diff_out__show_file(File1, "! ", X1 - X2),
+                diff_out__show_file(File1, ChgStr, X1, X2),
                 { Next = X2 }
         ),
-        display_context_diff_left(Next, Size1, File1, Diff).
+        display_context_diff_left(Next, Size1, File1, Diff,
+                        NoneStr, DelStr, ChgStr).
 
-:- pred display_context_diff_right(int, int, file, diff, io__state, io__state).
-:- mode display_context_diff_right(in, in, in, in, di, uo) is det.
-
-display_context_diff_right(Prev, Size2, File2, []) -->
-        diff_out__show_file(File2, "  ", Prev - Size2).
-display_context_diff_right(Prev, Size2, File2, [Edit | Diff]) -->
+:- pred display_context_diff_right(int, int, file, diff,
+                        string, string, string, io__state, io__state).
+:- mode display_context_diff_right(in, in, in, in, in, in, in, di, uo) is det.
+
+display_context_diff_right(Prev, Size2, File2, [], NoneStr, _, _) -->
+        diff_out__show_file(File2, NoneStr, Prev, Size2).
+display_context_diff_right(Prev, Size2, File2, [Edit | Diff],
+                        NoneStr, AddStr, ChgStr) -->
         { first_mentioned_positions(Edit, StartOfEdit, _) },
-        diff_out__show_file(File2, "  ", Prev - StartOfEdit),
+        diff_out__show_file(File2, NoneStr, Prev, StartOfEdit),
         ( { Edit = add(_, Y1 - Y2) },
-                diff_out__show_file(File2, "+ ", Y1 - Y2),
+                diff_out__show_file(File2, AddStr, Y1, Y2),
                 { Next = Y2 }
         ; { Edit = delete(_, Y) },
                 { Next = Y }
         ; { Edit = change(_, Y1 - Y2) },
-                diff_out__show_file(File2, "! ", Y1 - Y2),
+                diff_out__show_file(File2, ChgStr, Y1, Y2),
                 { Next = Y2 }
         ),
-        display_context_diff_right(Next, Size2, File2, Diff).
+        display_context_diff_right(Next, Size2, File2, Diff,
+                                NoneStr, AddStr, ChgStr).
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
@@ -631,8 +799,8 @@
                 []
         ).
 display_diff_side_by_side_2(Prev, SBS, File1, File2, [Edit | Diff]) -->
-        { first_mentioned_positions(Edit, StartOfEdit, _) },
         { SBS = side_by_side_info(_, _, _, Suppress, _) },
+        { first_mentioned_positions(Edit, StartOfEdit, _) },
         ( { Suppress = no } ->
                 show_sbs_same_lines(File1, SBS, Prev - StartOfEdit)
         ;
@@ -650,19 +818,10 @@
                 % and display "changed" lines for the minimum of these
                 % sizes.  Then we display "added" or "deleted" lines for
                 % whatever is left over.
-                {
-                        SizeX is X2 - X1,
-                        SizeY is Y2 - Y1,
-                        int__min(SizeX, SizeY, Size)
-
-                },
+                { int__min(X2 - X1, Y2 - Y1, Size) },
                 show_sbs_changed_lines(File1, File2, SBS, X1, Y1, Size),
-                {
-                        NewX1 is X1 + Size,
-                        NewY1 is Y1 + Size
-                },
-                show_sbs_deleted_lines(File1, SBS, NewX1 - X2),
-                show_sbs_added_lines(File2, SBS, NewY1 - Y2),
+                show_sbs_deleted_lines(File1, SBS, (X1 + Size) - X2),
+                show_sbs_added_lines(File2, SBS, (Y1 + Size) - Y2),
                 { Next = X2 }
         ),
         display_diff_side_by_side_2(Next, SBS, File1, File2, Diff).
@@ -683,14 +842,12 @@
                         print_half_line(Chars1, SBS, 0, 0, Width, OutPos),
                         tab_to_column(OutPos, Width),
                         io__write_string("|"),
-                        { Width1 is Width + 1 },
-                        { Width2 is Width + 2 },
-                        tab_to_column(Width1, Width2),
+                        tab_to_column(Width + 1, Width + 2),
                         { string__to_char_list(Line2, Chars2) },
                         print_half_line(Chars2, SBS, 0, 0, Width, _),
                         io__write_string("\n"),
-                        { X2 is X1 + 1, Y2 is Y1 + 1, Size1 is Size - 1 },
-                        show_sbs_changed_lines(File1, File2, SBS, X2, Y2, Size1)
+                        show_sbs_changed_lines(File1, File2, SBS,
+                                        X1 + 1, Y1 + 1, Size - 1)
                 ;
                         { error("show_sbs_changed_lines: file ended prematurely") }
                 )
@@ -706,7 +863,6 @@
         ( { Low < High } ->
                 ( { file__get_line(File, Low, Line) } ->
                         { SBS = side_by_side_info(Width, _, LeftCol, _, _) },
-                        { Low1 is Low + 1 },
                         { string__to_char_list(Line, Chars) },
                         print_half_line(Chars, SBS, 0, 0, Width, OutPos),
 
@@ -716,12 +872,11 @@
                                 tab_to_column(OutPos, Width),
                                 io__write_string("(")
                         ;
-                                { Width2 is Width + 2 },
-                                tab_to_column(OutPos, Width2),
+                                tab_to_column(OutPos, Width + 2),
                                 print_half_line(Chars, SBS, 0, 0, Width, _)
                         ),
                         io__write_string("\n"),
-                        show_sbs_same_lines(File, SBS, Low1 - High)
+                        show_sbs_same_lines(File, SBS, (Low + 1) - High)
                 ;
                         { error("show_sbs_same_lines: file ended prematurely") }
                 )
@@ -737,13 +892,12 @@
         ( { Low < High } ->
                 ( { file__get_line(File, Low, Line) } ->
                         { SBS = side_by_side_info(Width, _, _, _, _) },
-                        { Low1 is Low + 1 },
                         { string__to_char_list(Line, Chars) },
                         tab_to_column(0, Width),
                         io__write_string("> "),
                         print_half_line(Chars, SBS, 0, 0, Width, _),
                         io__write_string("\n"),
-                        show_sbs_added_lines(File, SBS, Low1 - High)
+                        show_sbs_added_lines(File, SBS, (Low + 1) - High)
                 ;
                         { error("show_sbs_added_lines: file ended prematurely") }
                 )
@@ -759,12 +913,11 @@
         ( { Low < High } ->
                 ( { file__get_line(File, Low, Line) } ->
                         { SBS = side_by_side_info(Width, _, _, _, _) },
-                        { Low1 is Low + 1 },
                         { string__to_char_list(Line, Chars) },
                         print_half_line(Chars, SBS, 0, 0, Width, OutPos),
                         tab_to_column(OutPos, Width),
                         io__write_string("<\n"),
-                        show_sbs_deleted_lines(File, SBS, Low1 - High)
+                        show_sbs_deleted_lines(File, SBS, (Low + 1) - High)
                 ;
                         { error("show_sbs_deleted_lines: file ended prematurely") }
                 )
@@ -772,8 +925,8 @@
                 []
         ).
 
-:- pred tab_width(int :: out) is det.
-tab_width(8).
+:- func tab_width = int.
+tab_width = 8.
 
         % Put a number of spaces on the output stream.  Update
         % the output column as we go.
@@ -785,9 +938,7 @@
                 { OutPos = OutPos0 }
         ;
                 io__write_char(' '),
-                { Spaces1 is Spaces - 1 },
-                { OutPos1 is OutPos0 + 1 },
-                put_spaces(Spaces1, OutPos1, OutPos)
+                put_spaces(Spaces - 1, OutPos0 + 1, OutPos)
         ).
 
         % Given a "from" column and a "to" column, put sufficient
@@ -797,13 +948,11 @@
 :- mode tab_to_column(in, in, di, uo) is det.
 
 tab_to_column(From, To) -->
-        { tab_width(Tab) },
-        { AfterTab is From + Tab - (From mod Tab) },
+        { AfterTab is From + tab_width - (From rem tab_width) },
         ( { AfterTab > To } ->
                 ( { From < To } ->
                         io__write_char(' '),
-                        { From1 is From + 1 },
-                        tab_to_column(From1, To)
+                        tab_to_column(From + 1, To)
                 ;
                         []
                 )
@@ -831,10 +980,8 @@
 print_half_line([], _SBS, _InPos, OutPos, _OutBound, OutPos) --> [].
 print_half_line([C | Cs], SBS, InPos0, OutPos0, OutBound, OutPos) -->
         ( { C = '\t' } ->
-                { tab_width(Tab) },
-
                         % Calculate how many spaces this tab is worth.
-                { Spaces is Tab - InPos0 mod Tab },
+                { Spaces is tab_width - InPos0 rem tab_width },
                 ( { InPos0 = OutPos0 } ->
                         globals__io_lookup_bool_option(expand_tabs, ExpandTabs),
                         ( { ExpandTabs = yes } ->
@@ -846,8 +993,7 @@
                                 ;
                                         TabStop = TabStop0
                                 },
-                                { WriteSpaces is TabStop - OutPos0 },
-                                put_spaces(WriteSpaces, OutPos0, OutPos1)
+                                put_spaces(TabStop - OutPos0, OutPos0, OutPos1)
                         ;
                                 % If we're not exanding tabs, just print it and
                                 % hope everything lines up okay.
@@ -873,7 +1019,7 @@
         ***********/
         ;
                 % The default case.  Print and be done with it.
-                { InPos is InPos0+1 },
+                { InPos is InPos0 + 1 },
                 ( { InPos < OutBound } ->
                         { OutPos1 = InPos },
                         io__write_char(C)
Index: difftype.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/difftype.m,v
retrieving revision 1.1
diff -u -t -u -r1.1 difftype.m
--- difftype.m	1998/01/13 00:51:58	1.1
+++ difftype.m	1998/09/13 08:07:19
@@ -5,7 +5,7 @@
 %-----------------------------------------------------------------------------%
 
 % Main author: bromage
-% Based on lcsstype.m, written by bromage and simplified by
+% Based heavily on lcsstype.m, written by bromage and simplified by
 % Marnix Klooster <marnix at worldonline.nl>
 
 % This module contains the type of a diff.
@@ -44,6 +44,82 @@
         % Invariant: The edits must be in order, and must
         % not overlap or touch.
 :- type diff == list(edit).
+
+%-----------------------------------------------------------------------------%
+
+:- pred first_mentioned_positions(edit :: in, pos :: out, pos :: out) is det.
+
+:- pred last_mentioned_positions(edit :: in, pos :: out, pos :: out) is det.
+
+%-----------------------------------------------------------------------------%
+
+        % Add an edit to the start of a diff, producing a new diff.
+        % This predicate determines what kind of edit this is, and
+        % merges with the adjacent edits if appropriate.
+:- pred difftype__add_edit(segment, segment, diff, diff).
+:- mode difftype__add_edit(in, in, in, out) is det.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+:- import_module int.
+
+first_mentioned_positions(add(X, Y - _), X, Y).
+first_mentioned_positions(delete(X - _, Y), X, Y).
+first_mentioned_positions(change(X - _, Y - _), X, Y).
+
+last_mentioned_positions(add(X, _ - Y), X, Y).
+last_mentioned_positions(delete(_ - X, Y), X, Y).
+last_mentioned_positions(change(_ - X, _ - Y), X, Y).
+
+%-----------------------------------------------------------------------------%
+
+difftype__add_edit(X1 - X2, Y1 - Y2, [], Diff) :-
+        ( X1 = X2 ->
+                ( Y1 = Y2 ->
+                        Diff = []
+                ;
+                        Diff = [add(X1, Y1 - Y2)]
+                )
+        ;
+                ( Y1 = Y2 ->
+                        Diff = [delete(X1 - X2, Y1)]
+                ;
+                        Diff = [change(X1 - X2, Y1 - Y2)]
+                )
+        ).
+difftype__add_edit(X1 - X2, Y1 - Y2, [Edit0 | Diff0], Diff) :-
+        ( Edit0 = add(X2, Y2 - Y3) ->
+                ( X1 = X2 ->
+                        Diff = [add(X1, Y1 - Y3) | Diff0]
+                ;
+                        Diff = [change(X1 - X2, Y1 - Y3) | Diff0]
+                )
+        ; Edit0 = delete(X2 - X3, Y2) ->
+                ( Y1 = Y2 ->
+                        Diff = [delete(X1 - X3, Y1) | Diff0]
+                ;
+                        Diff = [change(X1 - X3, Y1 - Y2) | Diff0]
+                )
+        ; Edit0 = change(X2 - X3, Y2 - Y3) ->
+                Diff = [change(X1 - X3, Y1 - Y3) | Diff0]
+        ;
+                % This is just copied from the base case.  Pretty much.
+                ( X1 = X2 ->
+                        ( Y1 = Y2 ->
+                                Diff = [Edit0 | Diff0]
+                        ;
+                                Diff = [add(X1, Y1 - Y2), Edit0 | Diff0]
+                        )
+                ;
+                        ( Y1 = Y2 ->
+                                Diff = [delete(X1 - X2, Y1), Edit0 | Diff0]
+                        ;
+                                Diff = [change(X1 - X2, Y1 - Y2), Edit0 | Diff0]
+                        )
+                )
+        ).
 
 %-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
Index: file.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/file.m,v
retrieving revision 1.10
diff -u -t -u -r1.10 file.m
--- file.m	1998/01/13 00:52:00	1.10
+++ file.m	1998/09/13 08:17:38
@@ -104,14 +104,12 @@
                 io__state, io__state).
 :- mode file__read_stream2(in, in, array_uo, di, uo) is det.
 file__read_stream2(Stream, LineNo, File) -->
-        io__read_line(Stream, Res),
+        io__read_line_as_string(Stream, Res),
         ( { Res = eof },
                 { array__init(LineNo, "", File) }
         ; { Res = ok(Line) },
-                { string__from_char_list(Line, Line1) },
-                { LineNo1 is LineNo + 1 },
-                file__read_stream2(Stream, LineNo1, File1),
-                { array__set(File1, LineNo, Line1, File) }
+                file__read_stream2(Stream, LineNo + 1, File1),
+                { array__set(File1, LineNo, Line, File) }
         ; { Res = error(Error) },
                 { io__error_message(Error, Msg) },
                 { error(Msg) }
@@ -122,9 +120,8 @@
 file__get_line(file(_, Contents), LineNo, Line) :-
         array__semidet_lookup(Contents, LineNo, Line).
 
-file__get_numlines(file(_, Contents), NumLines) :-
-        array__bounds(Contents, _, NumLines1),
-        NumLines is NumLines1 + 1.
+file__get_numlines(file(_, Contents), NumLines1 + 1) :-
+        array__bounds(Contents, _, NumLines1).
 
 %-----------------------------------------------------------------------------%
 
Index: filter.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/filter.m,v
retrieving revision 1.1
diff -u -t -u -r1.1 filter.m
--- filter.m	1998/01/13 00:52:01	1.1
+++ filter.m	1998/09/08 02:06:27
@@ -13,10 +13,11 @@
 % This causes edits to be dropped if they contain changes which only
 % add, delete or change blank lines.
 
-% TO DO: What is a blank line, exactly, and does its definition change
+% TO DO: What exactly is a blank line, and does its definition change
 %        if --ignore-space-change or --ignore-all-space have been
 %        specified?  At the moment, we define a blank line to be a line
-%        containing zero or more whitespace characters.
+%        containing zero or more whitespace characters.  Check if this is
+%        correct or not.
 
 %-----------------------------------------------------------------------------%
 
@@ -87,13 +88,12 @@
                 (
                         file__get_line(File, First, Line),
                         string__to_char_list(Line, Chars),
-                        (
+                        all [C] (
                                 list__member(C, Chars)
                         =>
                                 char__is_whitespace(C)
                         ),
-                        Next is First + 1,
-                        range_has_only_blank_lines(Next, Last, File)
+                        range_has_only_blank_lines(First + 1, Last, File)
                 )
         ).
 
Index: options.m
===================================================================
RCS file: /home/staff/zs/imp/mercury/samples/diff/options.m,v
retrieving revision 1.1
diff -u -t -u -r1.1 options.m
--- options.m	1998/01/13 00:52:08	1.1
+++ options.m	1998/09/08 02:57:40
@@ -49,14 +49,15 @@
         ;       brief
         ;       ifdef
         ;       side_by_side
+        ;       cvs_merge_conflict
 
         % Output options
         ;       show_c_function         % Not handled (and unlikely to be soon)
         ;       show_function_line      % Not handled (and unlikely to be soon)
         ;       label
         ;       width
-        ;       expand_tabs             % Not handled
-        ;       initial_tab             % Not handled
+        ;       expand_tabs
+        ;       initial_tab
         ;       paginate                % Not handled (and unlikely to be soon)
         ;       left_column
         ;       suppress_common_lines
@@ -65,16 +66,16 @@
         % Matching options
         ;       new_file                % Not handled (and unlikely to be soon)
         ;       unidirectional_new_file % Not handled (and unlikely to be soon)
-        ;       ignore_case             % Not handled
-        ;       ignore_all_space        % Not handled
-        ;       ignore_space_change     % Not handled
+        ;       ignore_case
+        ;       ignore_all_space
+        ;       ignore_space_change
 
         % Diff options
-        ;       minimal                 % Accepted but ignored
+        ;       minimal
         ;       speed_large_files       % Accepted but ignored
         ;       file_split_speed_hack   % Accepted but ignored (GNU diff
-                                        % does this too, so let's not feel
-                                        % too bad about it)
+                                        % ignores this too, so let's not
+                                        % feel too bad about it)
 
         % Change filter options
         ;       ignore_matching_lines   % Not handled (and unlikely to be soon)
@@ -148,6 +149,7 @@
 long_option("ignore-all-space",         ignore_all_space).
 long_option("exclude",                  exclude).
 long_option("side-by-side",             side_by_side).
+long_option("cvs-merge-conflict",       cvs_merge_conflict).
 long_option("left-column",              left_column).
 long_option("suppress-common-lines",    suppress_common_lines).
 long_option("sdiff-merge-assist",       sdiff_merge_assist).
@@ -218,6 +220,7 @@
 option_defaults(brief,                          bool(no)).
 option_defaults(ifdef,                          maybe_string(no)).
 option_defaults(side_by_side,                   bool(no)).
+option_defaults(cvs_merge_conflict,             bool(no)).
 
         % Output options
 option_defaults(show_c_function,                bool_special).
@@ -225,7 +228,7 @@
 option_defaults(label,                          accumulating([])).
 option_defaults(width,                          int(130)).
 option_defaults(expand_tabs,                    bool(no)).
-option_defaults(initial_tab,                    bool_special).
+option_defaults(initial_tab,                    bool(no)).
 option_defaults(paginate,                       bool_special).
 option_defaults(left_column,                    bool(no)).
 option_defaults(suppress_common_lines,          bool(no)).
@@ -234,9 +237,9 @@
         % Matching options
 option_defaults(new_file,                       bool_special).
 option_defaults(unidirectional_new_file,        bool_special).
-option_defaults(ignore_case,                    bool_special).
-option_defaults(ignore_all_space,               bool_special).
-option_defaults(ignore_space_change,            bool_special).
+option_defaults(ignore_case,                    bool(no)).
+option_defaults(ignore_all_space,               bool(no)).
+option_defaults(ignore_space_change,            bool(no)).
 
         % Diff options
 option_defaults(minimal,                        bool(no)).
@@ -247,7 +250,7 @@
 option_defaults(ignore_matching_lines,          string_special).
 option_defaults(ignore_blank_lines,             bool(no)).
 
-        % Directory comparison options
+        % Directory comparison options (none of these are handled)
 option_defaults(starting_file,                  string_special).
 option_defaults(recursive,                      bool_special).
 option_defaults(report_identical_files,         bool_special).
@@ -296,10 +299,6 @@
         Msg = "Option not handled: --show-c-function".
 special_handler(show_function_line, _, _, error(Msg)) :-
         Msg = "Option not handled: --show-function-line".
-special_handler(expand_tabs, _, _, error(Msg)) :-
-        Msg = "Option not handled: --expand-tabs".
-special_handler(initial_tab, _, _, error(Msg)) :-
-        Msg = "Option not handled: --initial-tab".
 special_handler(paginate, _, _, error(Msg)) :-
         Msg = "Option not handled: --paginate".
 special_handler(sdiff_merge_assist, _, _, error(Msg)) :-
@@ -308,12 +307,6 @@
         Msg = "Option not handled: --new-file".
 special_handler(unidirectional_new_file, _, _, error(Msg)) :-
         Msg = "Option not handled: --unidirectional-new-file".
-special_handler(ignore_case, _, _, error(Msg)) :-
-        Msg = "Option not handled: --ignore-case".
-special_handler(ignore_all_space, _, _, error(Msg)) :-
-        Msg = "Option not handled: --ignore-all-space".
-special_handler(ignore_space_change, _, _, error(Msg)) :-
-        Msg = "Option not handled: --ignore-space-change".
 special_handler(speed_large_files, _, _, error(Msg)) :-
         Msg = "Option not handled: --speed-large-files".
 special_handler(ignore_matching_lines, _, _, error(Msg)) :-
@@ -392,44 +385,47 @@
                 map__search(OptionTable, rcs, bool(UseRCS)),
                 map__search(OptionTable, brief, bool(UseBrief)),
                 map__search(OptionTable, ifdef, maybe_string(UseIfdef)),
-                map__search(OptionTable, side_by_side, bool(UseSideBySide))
+                map__search(OptionTable, side_by_side, bool(UseSideBySide)),
+                map__search(OptionTable, cvs_merge_conflict, bool(CVS))
         ->
                 postprocess_output_style_2(UseHelp, UseVersion, UseContext,
                         UseUnified, UseEd, UseForwardEd, UseRCS, UseBrief,
-                        UseIfdef, UseSideBySide,
+                        UseIfdef, UseSideBySide, CVS,
                         Style)
         ;
                 error("postprocess_output_style")
         ).
 
 :- pred postprocess_output_style_2(bool, bool, maybe(int), maybe(int), bool,
-                bool, bool, bool, maybe(string), bool,
+                bool, bool, bool, maybe(string), bool, bool,
                 diff_out__output_style).
-:- mode postprocess_output_style_2(in, in, in, in, in, in, in, in, in, in,
+:- mode postprocess_output_style_2(in, in, in, in, in, in, in, in, in, in, in,
                 out) is semidet.
 
-postprocess_output_style_2(no, no, no, no, no, no, no, no, no, no,
+postprocess_output_style_2(no, no, no, no, no, no, no, no, no, no, no,
                                         normal).
-postprocess_output_style_2(yes, no, no, no, no, no, no, no, no, no,
+postprocess_output_style_2(yes, no, no, no, no, no, no, no, no, no, no,
                                         help_only).
-postprocess_output_style_2(no, yes, no, no, no, no, no, no, no, no,
+postprocess_output_style_2(no, yes, no, no, no, no, no, no, no, no, no,
                                         version_only).
-postprocess_output_style_2(no, no, yes(C), no, no, no, no, no, no, no,
+postprocess_output_style_2(no, no, yes(C), no, no, no, no, no, no, no, no,
                                         context(C)).
-postprocess_output_style_2(no, no, no, yes(U), no, no, no, no, no, no,
+postprocess_output_style_2(no, no, no, yes(U), no, no, no, no, no, no, no,
                                         unified(U)).
-postprocess_output_style_2(no, no, no, no, yes, no, no, no, no, no,
+postprocess_output_style_2(no, no, no, no, yes, no, no, no, no, no, no,
                                         ed).
-postprocess_output_style_2(no, no, no, no, no, yes, no, no, no, no,
+postprocess_output_style_2(no, no, no, no, no, yes, no, no, no, no, no,
                                         forward_ed).
-postprocess_output_style_2(no, no, no, no, no, no, yes, no, no, no,
+postprocess_output_style_2(no, no, no, no, no, no, yes, no, no, no, no,
                                         rcs).
-postprocess_output_style_2(no, no, no, no, no, no, no, yes, no, no,
+postprocess_output_style_2(no, no, no, no, no, no, no, yes, no, no, no,
                                         brief).
-postprocess_output_style_2(no, no, no, no, no, no, no, no, yes(Sym), no,
+postprocess_output_style_2(no, no, no, no, no, no, no, no, yes(Sym), no, no,
                                         ifdef(Sym)).
-postprocess_output_style_2(no, no, no, no, no, no, no, no, no, yes,
+postprocess_output_style_2(no, no, no, no, no, no, no, no, no, yes, no,
                                         side_by_side).
+postprocess_output_style_2(no, no, no, no, no, no, no, no, no, no, yes,
+                                        cvs_merge_conflict).
 
 %-----------------------------------------------------------------------------%
 
@@ -465,7 +461,6 @@
         io__write_string("\nMatching options:\n"),
         io__write_string("\t-d, --minimal\n"),
         io__write_string("\t\tTry hard to find as small a set of changes as possible.\n"),
-        io__write_string("\t\t(Currently a no-op --- we always produce the minimal set.)\n"),
         io__write_string("\t-B, --ignore-blank-lines\n"),
         io__write_string("\t\tIgnore changes whose lines are all blank.\n").
 

New File: match.m
===================================================================
%-----------------------------------------------------------------------------%
% Copyright (C) 1998 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%

% Main author: bromage

% This module contains code to match common lines before diffing, based on
% the command-line options presented.  The important command-line options
% are --ignore-case, --ignore-all-space and --ignore-space-change.

% The output of build_matches is two arrays of integers, where any two
% lines are assigned the same integer iff they are identical (modulo case,
% space and/or space change depending on the command line options).  An
% added benefit of doing this here is that the diff algorithm (myers.m)
% only has to compare integers instead of strings.

% TO DO: We should collapse sequences of lines which only appear in one
%        file and pretend the whole sequence is just one line.  (GNU
%        diff does the same thing a slightly different way, but this
%        approach seems a bit more Mercury-esque.)  Since Myers'
%	 algorithm runs in O(ND) time, and performing this pre-filtering
%	 here would reduce the value of D (by quite a lot in real-world
%	 cases), things should speed up.

%-----------------------------------------------------------------------------%

:- module match.

:- interface.
:- import_module file, io, array.

:- pred build_matches(file :: in, file :: in,
		array(int) :: out, array(int) :: out,
		io__state :: di, io__state :: uo) is det.

%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%

:- implementation.
:- import_module globals, options.
:- import_module bool, list, int, std_util, string, char, map, require.

:- type match_options
	--->	match_options(
			bool,		% No options set
			bool,		% --ignore-case
			bool,		% --ignore-all-space
			bool		% --ignore-space-change
		).

build_matches(File1, File2, FileX, FileY) -->
	globals__io_lookup_bool_option(ignore_case, IgnCase),
	globals__io_lookup_bool_option(ignore_all_space, IgnAllSpc),
	globals__io_lookup_bool_option(ignore_space_change, IgnSpcChg),
	{
		bool__or_list([IgnCase, IgnAllSpc, IgnSpcChg], AnyOpts),
		bool__not(AnyOpts, NoOpts),
		Opts = match_options(NoOpts, IgnCase, IgnAllSpc, IgnSpcChg),
		map__init(MatchMap0),
		file__get_numlines(File1, SizeX),
		array__init(SizeX, -1, FileX0),
		build_matches_for_file(Opts, File1, SizeX - 1, MatchMap0,
			MatchMap1, 0, ID1, FileX0, FileX),
		file__get_numlines(File2, SizeY),
		array__init(SizeY, -1, FileY0),
		build_matches_for_file(Opts, File2, SizeY - 1, MatchMap1, _,
			ID1, _, FileY0, FileY)
	}.

:- pred build_matches_for_file(match_options, file, int,
	map(string, int), map(string, int), int, int, array(int), array(int)).
:- mode build_matches_for_file(in, in, in, in, out, in, out,
	array_di, array_uo) is det.

build_matches_for_file(Opts, OrigFile, I, MatchMap0, MatchMap, ID0, ID,
		File0, File) :-
	( I < 0 ->
		MatchMap = MatchMap0,
		ID = ID0,
		File = File0
	;
		( file__get_line(OrigFile, I, Line0) ->
			Line1 = Line0
		;
			error("build_matches_for_file")
		),
		Opts = match_options(NoOpts, IgnCase, IgnAllSpc, IgnSpcChg),
		( NoOpts = yes ->
			Line = Line1
		;
			string__to_char_list(Line1, Chars0),
			normalise_line(no, IgnCase, IgnAllSpc, IgnSpcChg,
				Chars0, Chars1),
			string__from_char_list(Chars1, Line)
		),
		( map__search(MatchMap0, Line, MaybeID) ->
			array__set(File0, I, MaybeID, File1),
			MatchMap1 = MatchMap0,
			ID1 = ID0
		;
			array__set(File0, I, ID0, File1),
			map__det_insert(MatchMap0, Line, ID0, MatchMap1),
			ID1 is ID0 + 1
		),
		build_matches_for_file(Opts, OrigFile, I - 1, MatchMap1,
			MatchMap, ID1, ID, File1, File)
	).

:- pred normalise_line(bool, bool, bool, bool, list(char), list(char)).
:- mode normalise_line(in, in, in, in, in, out) is det.

normalise_line(_, _, _, _, [], []).
normalise_line(LastSpace, IgnCase, IgnAllSpc, IgnSpcChg, [C0 | Cs0], Cs) :-
	( IgnCase = yes ->
		char__to_lower(C0, C)
	;
		C = C0
	),
	(
		char__is_whitespace(C),
		(
			IgnAllSpc = yes
		->
			normalise_line(LastSpace, IgnCase, IgnAllSpc, IgnSpcChg,
					Cs0, CsX)
		;
			IgnSpcChg = yes
		->
			( LastSpace = yes ->
				normalise_line(yes, IgnCase, IgnAllSpc,
						IgnSpcChg, Cs0, CsX)
			;
				normalise_line(yes, IgnCase, IgnAllSpc,
						IgnSpcChg, Cs0, Cs1),
				CsX = [' ' | Cs1]
				
			)
		;
			fail
		)
	->
		Cs = CsX
	;
		normalise_line(no, IgnCase, IgnAllSpc, IgnSpcChg,
				Cs0, Cs1),
		Cs = [C | Cs1]
	).

%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%

New File: myers.m
===================================================================
%-----------------------------------------------------------------------------%
% Copyright (C) 1998 The University of Melbourne.
% This file may only be copied under the terms of the GNU General
% Public License - see the file COPYING in the Mercury distribution.
%-----------------------------------------------------------------------------%

% Main author: bromage

% TO DO: We should implement the big-snake heuristic (a.k.a.
%	--speed-large-files).
%
% ALSO TO DO: Gene Myers et al have since produced another algorithm
%	which takes O(NP) time where P is the number of deletions in
%	the edit script.  If the `too expensive' heuristic can be
%	retro-fitted onto that algorithm easily enough, we should try
%	out this algorithm and see how fast it runs.  In theory, we
%	should be looking at about a 2x speedup.

%-----------------------------------------------------------------------------%

:- module myers.

:- interface.
:- import_module difftype, array, io.

:- pred diff_by_myers(array(int), array(int), diff, io__state, io__state).
:- mode diff_by_myers(in, in, out, di, uo) is det.

%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%

:- implementation.
:- import_module options, globals.
:- import_module map, require, std_util, int, list, char, bool.

% The basic algorithm is described in:
%	"An O(ND) Difference Algorithm and its Variations", Eugene Myers,
%	Algorithmica Vol. 1 No. 2, 1986, pp. 251-266.
%
% This uses the variation in section 4b.

diff_by_myers(FileX, FileY, Diff) -->
	globals__io_lookup_bool_option(minimal, Minimal),
	{
		array__size(FileX, SizeX),
		array__size(FileY, SizeY),
		SizeMax = SizeX + SizeY + 3,
		DOffset = SizeY + 1,

		% If we don't insist on --minimal, calculate the
		% approximate square root of the input size for
		% the "too expensive" heuristic.  The effect of
		% this is to limit the amount of work to about
		% O(n ** (1.5 log n)) at the expense of finding a
		% possibly non-minimal diff.

		( Minimal = yes,
			Heur = none
		; Minimal = no,
			int__log2(SizeMax, SizeLog2),
			int__max(minimum_too_expensive, 1 << (SizeLog2 // 2),
					SizeHeuristic),
			Heur = too_expensive(SizeHeuristic)
		),

			% Fill the arrays with nondescript numbers which
			% the algorithm shouldn't produce.  (For debugging
			% purposes.)
		array__init(SizeMax, -65537, Fwd),
		array__init(SizeMax, -65537, Bwd),
		myers__bsearch(DOffset, FileX, FileY, 0, SizeX, 0, SizeY,
			Heur, Fwd, _, Bwd, _, [], Diff)
	}.

	% XXX This lower bound is a guess.  Need to do some measurements
	%     to see if it's good or not.
:- func minimum_too_expensive = int.
minimum_too_expensive = 256.

:- pred myers__bsearch(int, array(int), array(int), int, int, int, int, heur,
		array(int), array(int), array(int), array(int),
		diff, diff).
:- mode myers__bsearch(in, in, in, in, in, in, in, in,
		array_di, array_uo, array_di, array_uo,
		in, out) is det.

myers__bsearch(DOffset, FileX, FileY, Xlow0, Xhigh0, Ylow0, Yhigh0, Heur,
			Fwd0, Fwd, Bwd0, Bwd, Diff0, Diff) :-
	myers__scan_forward(FileX, FileY, Xhigh0, Yhigh0,
		Xlow0, Xlow, Ylow0, Ylow),
	myers__scan_backward(FileX, FileY, Xlow, Ylow,
		Xhigh0, Xhigh, Yhigh0, Yhigh),

	(
		( Xlow >= Xhigh
		; Ylow >= Yhigh
		)
	->
		Fwd = Fwd0, Bwd = Bwd0,
		difftype__add_edit(Xlow - Xhigh, Ylow - Yhigh, Diff0, Diff)
	;
		myers__find_middle(DOffset, FileX, FileY,
			Xlow, Xhigh, Ylow, Yhigh, Heur,
			Fwd0, Fwd1, Bwd0, Bwd1, Xmid, Ymid, Cost,
			LeftHeur - RightHeur),
		(
			Cost > 0
		->
			myers__bsearch(DOffset, FileX, FileY,
				Xmid, Xhigh, Ymid, Yhigh, LeftHeur,
				Fwd1, Fwd2, Bwd1, Bwd2, Diff0, Diff1),
			myers__bsearch(DOffset, FileX, FileY,
				Xlow, Xmid, Ylow, Ymid, RightHeur,
				Fwd2, Fwd, Bwd2, Bwd, Diff1, Diff)
		;
			error("myers__bsearch")
		)
	).

:- type myers_constants
	--->	constants(
			int,		% DOffset
			array(int),	% X
			array(int),	% Y
			int,		% Xlow
			int,		% Xhigh
			int,		% Ylow
			int,		% Yhigh
			int,		% Dmin
			int,		% Dmax
			bool,		% DeltaOdd
			heur		% "Too expensive" heuristic.
		).

:- type heur
	--->	too_expensive(int)
	;	none.

	% The best part about this algorithm is: We don't actually
	% need to find the middle of the diff.  We only have to find
	% an estimate to it.  If we don't find the exact middle,
	% we will have a correct diff, but it won't necessarily be
	% minimal.
:- pred myers__find_middle(int, array(int), array(int), pos, pos, pos, pos,
		heur,
		array(int), array(int), array(int), array(int),
		pos, pos, int, pair(heur)).
:- mode myers__find_middle(in, in, in, in, in, in, in, in,
		array_di, array_uo, array_di, array_uo,
		out, out, out, out) is det.

myers__find_middle(DOffset, FileX, FileY, Xlow, Xhigh, Ylow, Yhigh, Heur,
		Fwd0, Fwd, Bwd0, Bwd, Xmid, Ymid, Cost, HeurReq) :-

	Dmin = Xlow - Yhigh,
	Dmax = Xhigh - Ylow,

	Fmid = Xlow - Ylow,
	array__set(Fwd0, Fmid + DOffset, Xlow, Fwd1),
	Bmid = Xhigh - Yhigh,
	array__set(Bwd0, Bmid + DOffset, Xhigh, Bwd1),

	( 1 = (Fmid - Bmid) /\ 1 ->
		DeltaOdd = yes
	;
		DeltaOdd = no
	),

	Constants = constants(
		DOffset, FileX, FileY, Xlow, Xhigh, Ylow, Yhigh,
		Dmin, Dmax, DeltaOdd, Heur
	),

	myers__find_middle_2(Constants, Fwd1, Fwd, Bwd1, Bwd,
		Fmid, Fmid, Bmid, Bmid, 1, Cost, Xmid - Ymid, HeurReq).


:- pred myers__find_middle_2(myers_constants,
		array(int), array(int), array(int), array(int),
		int, int, int, int, int, int, pair(pos), pair(heur)).
:- mode myers__find_middle_2(in, array_di, array_uo, array_di, array_uo,
		in, in, in, in, in, out, out, out) is det.

myers__find_middle_2(Constants, Fwd0, Fwd, Bwd0, Bwd,
		Fmin, Fmax, Bmin, Bmax, Cost0, Cost, Mid, HeurReq) :-
	Constants = constants(DOffset, _, _, _, _, _, _, Dmin, Dmax, _, _),
	( Fmin > Dmin ->
		Fmin1 = Fmin - 1,
		array__set(Fwd0, Fmin1 + DOffset - 1, -1, Fwd1)
	;
		Fmin1 = Fmin + 1,
		Fwd1 = Fwd0
	),
	( Fmax < Dmax ->
		Fmax1 = Fmax + 1,
		array__set(Fwd1, Fmax1 + DOffset + 1, -1, Fwd2)
	;
		Fmax1 = Fmax - 1,
		Fwd2 = Fwd1
	),
	myers__find_forward_reaching_path(Constants, Fwd2, Fwd, Bwd0, Bwd,
		Fmin1, Fmax1, Bmin, Bmax, Fmax1, Cost0, Cost, Mid, HeurReq).


:- pred myers__find_forward_reaching_path(myers_constants,
		array(int), array(int), array(int), array(int),
		int, int, int, int, int, int, int, pair(pos), pair(heur)).
:- mode myers__find_forward_reaching_path(in, array_di, array_uo,
		array_di, array_uo, in, in, in, in, in, in, out, out, out)
				is det.

myers__find_forward_reaching_path(Constants, Fwd0, Fwd, Bwd0, Bwd,
		Fmin, Fmax, Bmin, Bmax, SearchCost, Cost0, Cost, Mid,
		HeurReq) :-
	( SearchCost < Fmin ->
		Constants = constants(DOffset, _, _, _, _, _, _, Dmin, Dmax, _,
					_),
		int__max_int(MaxInt),
		( Bmin > Dmin ->
			Bmin1 = Bmin - 1,
			array__set(Bwd0, Bmin1 + DOffset - 1, MaxInt, Bwd1)
		;
			Bmin1 = Bmin + 1,
			Bwd1 = Bwd0
		),
		( Bmax < Dmax ->
			Bmax1 = Bmax + 1,
			array__set(Bwd1, Bmax1 + DOffset + 1, MaxInt, Bwd2)
		;
			Bmax1 = Bmax - 1,
			Bwd2 = Bwd1
		),
		myers__find_backward_reaching_path(Constants,
			Fwd0, Fwd, Bwd2, Bwd, Fmin, Fmax, Bmin1, Bmax1,
			Bmax1, Cost0, Cost, Mid, HeurReq)
	;
		Constants = constants(DOffset, _, _, _, _, _, _, _, _, _, _),
		array__lookup(Fwd0, SearchCost + DOffset - 1, Tlo),
		array__lookup(Fwd0, SearchCost + DOffset + 1, Thi),
		( Tlo >= Thi ->
			X0 = Tlo + 1
		;
			X0 = Thi
		),
		Y0 = X0 - SearchCost,
		Constants = constants(_, FileX, FileY, _, Xhigh, _, Yhigh,
			_, _, _, _),
		myers__scan_forward(FileX, FileY, Xhigh, Yhigh, X0, X, Y0, Y),
		array__set(Fwd0, SearchCost + DOffset, X, Fwd1),

		Constants = constants(_, _, _, _, _, _, _, _, _, DeltaOdd, _),
		(
			DeltaOdd = yes,
			Bmin =< SearchCost,
			SearchCost =< Bmax,
			array__lookup(Bwd0, SearchCost + DOffset, BB),
			BB =< X
		->
			Mid = X - Y,
			Cost = 2 * Cost0 + 1,
			Fwd = Fwd1,
			Bwd = Bwd0,
			HeurReq = none - none
		;
			myers__find_forward_reaching_path(Constants,
				Fwd1, Fwd, Bwd0, Bwd, Fmin, Fmax, Bmin, Bmax,
				SearchCost - 2, Cost0, Cost, Mid, HeurReq)
		)
	).


:- pred myers__find_backward_reaching_path(myers_constants,
		array(int), array(int), array(int), array(int),
		int, int, int, int, int, int, int, pair(pos), pair(heur)).
:- mode myers__find_backward_reaching_path(in, array_di, array_uo,
		array_di, array_uo, in, in, in, in, in, in,
		out, out, out) is det.

myers__find_backward_reaching_path(Constants, Fwd0, Fwd, Bwd0, Bwd,
		Fmin, Fmax, Bmin, Bmax, SearchCost, Cost0, Cost, Mid,
		HeurReq) :-
	( SearchCost < Bmin ->
		myers__try_heuristics(Constants, Fwd0, Fwd, Bwd0, Bwd,
			Fmin, Fmax, Bmin, Bmax, Cost0, Cost, Mid, HeurReq)
	;
		Constants = constants(DOffset, _, _, _, _, _, _, _, _, _, _),
		array__lookup(Bwd0, SearchCost + DOffset - 1, Tlo),
		array__lookup(Bwd0, SearchCost + DOffset + 1, Thi),
		( Tlo < Thi ->
			X0 = Tlo
		;
			X0 = Thi - 1
		),
		Y0 = X0 - SearchCost,
		Constants = constants(_, FileX, FileY, Xlow, _, Ylow, _,
			_, _, _, _),
		myers__scan_backward(FileX, FileY, Xlow, Ylow, X0, X, Y0, Y),
		array__set(Bwd0, SearchCost + DOffset, X, Bwd1),

		Constants = constants(_, _, _, _, _, _, _, _, _, DeltaOdd, _),
		(
			DeltaOdd = no,
			Fmin =< SearchCost,
			SearchCost =< Fmax,
			array__lookup(Fwd0, SearchCost + DOffset, FF),
			X =< FF
		->
			Mid = X - Y,
			Cost = 2 * Cost0,
			Fwd = Fwd0,
			Bwd = Bwd1,
			HeurReq = none - none
		;
			myers__find_backward_reaching_path(Constants,
				Fwd0, Fwd, Bwd1, Bwd, Fmin, Fmax, Bmin, Bmax,
				SearchCost - 2, Cost0, Cost, Mid, HeurReq)
		)
	).


	% Try applying some heuristics to see if we can avoid some work.
:- pred myers__try_heuristics(myers_constants,
		array(int), array(int), array(int), array(int),
		int, int, int, int, int, int, pair(pos), pair(heur)).
:- mode myers__try_heuristics(in, array_di, array_uo,
		array_di, array_uo, in, in, in, in, in, out, out, out) is det.

myers__try_heuristics(Constants, Fwd0, Fwd, Bwd0, Bwd,
		Fmin, Fmax, Bmin, Bmax, Cost0, Cost, Mid, HeurReq) :-
	Constants = constants(_, _, _, _, _, _, _, _, _, _, Heur),
	(
		Heur = too_expensive(Cutoff),
		Cost0 >= Cutoff
	->
			% If we've done too much work, stop here.
		Fwd = Fwd0, Bwd = Bwd0,
		myers__too_expensive_heuristic(Constants, Fwd, Bwd,
			Fmin, Fmax, Bmin, Bmax, Cost0, Cost, Mid, HeurReq)
	;
		% Can't apply heuristic, so try looking for a diff of size
		% Cost0 + 1.

		myers__find_middle_2(Constants, Fwd0, Fwd, Bwd0, Bwd,
			Fmin, Fmax, Bmin, Bmax, Cost0 + 1, Cost, Mid, HeurReq)
	).

%-----------------------------------------------------------------------------%

	% We've done too much work, so make our best guess.
:- pred myers__too_expensive_heuristic(myers_constants, array(int), array(int),
		int, int, int, int, int, int, pair(pos), pair(heur)).
:- mode myers__too_expensive_heuristic(in, array_ui, array_ui,
		in, in, in, in, in, out, out, out) is det.

myers__too_expensive_heuristic(Constants, Fwd, Bwd,
		Fmin, Fmax, Bmin, Bmax, Cost0, Cost, Mid, HeurReq) :-
	% Find the best diagonal that we can, take the end of
	% that diagonal as the "middle".  Do not apply the
	% heuristic recursively to that best diagonal.

	Constants = constants(DOffset, _, _, Xlow, Xhigh, Ylow, Yhigh,
			_, _, _, Heur),

		% Find the best forward diagonal.
	myers__find_best_forward_diagonal(Fmax, Fmin, Fwd,
			Xhigh, Yhigh, DOffset, -1, FXYBest, 0, FXBest),

		% Find the best backward diagonal.
	int__max_int(MaxInt),
	myers__find_best_backward_diagonal(Bmax, Bmin, Bwd,
			Xlow, Ylow, DOffset, MaxInt, BXYBest, 0, BXBest),

		% Choose which of these diagonals is the better one
		% and return that as the "middle" point.
	(
		FXYBest - (Xhigh + Yhigh) < (Xlow + Ylow) - BXYBest
	->
		Xmid = FXBest,
		Ymid = FXYBest - FXBest,
		HeurReq = none - Heur
	;
		Xmid = BXBest,
		Ymid = BXYBest - BXBest,
		HeurReq = Heur - none
	),
	Mid = Xmid - Ymid,
	Cost = 2 * Cost0 - 1.

:- pred myers__find_best_forward_diagonal(int, int, array(int), int, int, int,
			int, int, int, int).
:- mode myers__find_best_forward_diagonal(in, in, array_ui, in, in, in,
			in, out, in, out) is det.

myers__find_best_forward_diagonal(D, Fmin, Fwd, Xhigh, Yhigh, DOffset,
			FXYBest0, FXYBest, FXBest0, FXBest) :-
	( D < Fmin ->
		FXYBest = FXYBest0,
		FXBest = FXBest0
	;
		array__lookup(Fwd, D + DOffset, X0),
		int__min(Xhigh, X0, X1),
		Y0 = X1 - D,

		( Yhigh < Y0 ->
			X = Yhigh + D,
			Y = Yhigh
		;
			X = X1,
			Y = Y0
		),

		NewFXY = X + Y,
		( FXYBest0 < NewFXY ->
			myers__find_best_forward_diagonal(D - 2, Fmin, Fwd,
				Xhigh, Yhigh, DOffset, NewFXY, FXYBest,
				X, FXBest)
		;
			myers__find_best_forward_diagonal(D - 2, Fmin, Fwd,
				Xhigh, Yhigh, DOffset, FXYBest0, FXYBest,
				FXBest0, FXBest)
		)
	).

:- pred myers__find_best_backward_diagonal(int, int, array(int), int, int, int,
			int, int, int, int).
:- mode myers__find_best_backward_diagonal(in, in, array_ui, in, in, in,
			in, out, in, out) is det.

myers__find_best_backward_diagonal(D, Bmin, Bwd, Xlow, Ylow, DOffset,
			BXYBest0, BXYBest, BXBest0, BXBest) :-
	( D < Bmin ->
		BXYBest = BXYBest0,
		BXBest = BXBest0
	;
		array__lookup(Bwd, D + DOffset, X0),
		int__max(Xlow, X0, X1),
		Y0 = X1 - D,

		( Y0 < Ylow ->
			X = Ylow + D,
			Y = Ylow
		;
			X = X1,
			Y = Y0
		),

		NewBXY = X + Y,
		( NewBXY < BXYBest0 ->
			myers__find_best_backward_diagonal(D - 2, Bmin, Bwd,
				Xlow, Ylow, DOffset, NewBXY, BXYBest,
				X, BXBest)
		;
			myers__find_best_backward_diagonal(D - 2, Bmin, Bwd,
				Xlow, Ylow, DOffset, BXYBest0, BXYBest,
				BXBest0, BXBest)
		)
	).

%-----------------------------------------------------------------------------%

	% Travel forwards along a snake.
:- pred myers__scan_forward(array(int), array(int), int, int,
		int, int, int, int).
:- mode myers__scan_forward(in, in, in, in, in, out, in, out) is det.

myers__scan_forward(FileX, FileY, Xhigh, Yhigh, Xlow0, Xlow, Ylow0, Ylow) :-
	(
		Xlow0 < Xhigh,
		Ylow0 < Yhigh,
		array__lookup(FileX, Xlow0, Line),
		array__lookup(FileY, Ylow0, Line)
	->
		myers__scan_forward(FileX, FileY, Xhigh, Yhigh,
			Xlow0 + 1, Xlow, Ylow0 + 1, Ylow)
	;
		Xlow = Xlow0, Ylow = Ylow0
	).


	% Travel backwards along a snake.
:- pred myers__scan_backward(array(int), array(int), int, int,
		int, int, int, int).
:- mode myers__scan_backward(in, in, in, in, in, out, in, out) is det.

myers__scan_backward(FileX, FileY, Xlow, Ylow, Xhigh0, Xhigh, Yhigh0, Yhigh) :-
	(
		Xhigh0 > Xlow,
		Yhigh0 > Ylow,
		array__lookup(FileX, Xhigh0 - 1, Line),
		array__lookup(FileY, Yhigh0 - 1, Line)
	->
		myers__scan_backward(FileX, FileY, Xlow, Ylow,
			Xhigh0 - 1, Xhigh, Yhigh0 - 1, Yhigh)
	;
		Xhigh = Xhigh0, Yhigh = Yhigh0
	).

%-----------------------------------------------------------------------------%
%-----------------------------------------------------------------------------%




More information about the developers mailing list