[m-rev.] for review: discrete interval encoding tree

Peter Wang novalazy at gmail.com
Mon Mar 3 13:15:27 AEDT 2014


On Mon, 3 Mar 2014 00:48:39 +1100 (EST), Julien Fischer <jfischer at opturion.com> wrote:
> 
> Hi Peter,
> 
> On Fri, 28 Feb 2014, Peter Wang wrote:
> 
> > I have tested by bootchecking the compiler with diet and the test_bitset
> > module.  Any other ideas?
> 
> None spring to mind.
> 
> > Add discrete interval encoding tree module to the standard library.
> >
> > Discrete Interval Encoding Trees are a highly efficient set
> > implementation for fat sets.
> >
> > This implementation is largely a translation of Caml Diet.
> > I have permission from the authors to license this version either
> > under the LGPL, or LGPL/BSD dual license.  I choose the latter.
> 
> I would prefer it if the licensing of library modules were consistent
> across the entire library, so for now I suggest you pick again!
> (Also, (1) the Mercury system does not currently include the text of
> the BSD license and (2) which BSD license? -- there are several
> variants.)

If you insist, but the option is there for the future.  (The precise
variant is a problem but you can hassle the original authors if it
really matters.)

> > diff --git a/NEWS b/NEWS
> > index f35b152..4f1b43d 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -7,6 +7,8 @@ Changes to the Mercury standard library:
> >   io module.  These behave like the print and write predicates, but also
> >   write a terminating newline character.
> >
> > +* We have added a discrete interval encoding tree (diet) module.
> 
> A little more detail here would be welcome.
> 

* We have added a module for discrete interval encoding trees,
  which are a highly efficient set implementation for fat sets.

> > +:- pred foldl4(pred(T, Acc1, Acc1, Acc2, Acc2, Acc3, Acc3, Acc4, Acc4),
> > +    diet(T), Acc1, Acc1, Acc2, Acc2, Acc3, Acc3, Acc4, Acc4) <= enum(T).
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, in, out) is det,
> > +    in, in, out, in, out, in, out, in, out) is det.
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, mdi, muo) is det,
> > +    in, in, out, in, out, in, out, mdi, muo) is det.
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, di, uo) is det,
> > +    in, in, out, in, out, in, out, di, uo) is det.
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, in, out) is semidet,
> > +    in, in, out, in, out, in, out, in, out) is semidet.
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, mdi, muo) is semidet,
> > +    in, in, out, in, out, in, out, mdi, muo) is semidet.
> > +:- mode foldl4(pred(in, in, out, in, out, in, out, di, uo) is semidet,
> > +    in, in, out, in, out, in, out, di, uo) is semidet.
> > +
> > +:- pred foldl5(pred(T, A, A, B, B, C, C, D, D, E, E), diet(T),
> > +    A, A, B, B, C, C, D, D, E, E) <= enum(T).
> 
> Why have you suddenly changed the naming scheme you were using for
> accumulator arguments?

I was bouncing around between U/V/W, A/B/C and Acc1/2/3.
I will go with A/B/C for all for them.

> > +%-----------------------------------------------------------------------------%
> > +
> > +:- type diet(T)
> > +    --->    empty
> > +    ;       node(
> > +                interval    :: interval,
> > +                node_height :: int,
> > +                left        :: diet(T),
> > +                right       :: diet(T)
> > +            ).
> > +
> > +:- type interval == {int, int}. % inclusive
> 
> Rather than having a separate type for an interval, I think it would be
> better to have the two integer fields directly in node constructor.  The
> extra level of indirection doesn't appear to serve much purpose.

I considered it but the result for take_min, etc. becomes two arguments
so it's not completely trivial.  It can be done later, with benchmarking.

> > +%-----------------------------------------------------------------------------%
> > +
> > +:- func det_from_int(int) = T <= enum(T).
> > +
> > +det_from_int(I) = X :-
> > +    ( X0 = from_int(I) ->
> > +        X = X0
> > +    ;
> > +        unexpected($module, $pred, "from_int failed")
> > +    ).
> 
> Move this to the enum module.

Are you sure?  I think enum should stay minimal.

> > +%-----------------------------------------------------------------------------%
> > +% vim: ft=mercury ts=4 sts=4 sw=4 et
> > diff --git a/library/library.m b/library/library.m
> > index 7e1f0ab..baabd7e 100644
> > --- a/library/library.m
> > +++ b/library/library.m
> > @@ -63,6 +63,7 @@
> > :- import_module cord.
> > :- import_module counter.
> > :- import_module deconstruct.
> > +:- import_module diet.
> > :- import_module digraph.
> > :- import_module dir.
> > :- import_module enum.
> 
> You also need to update the mercury_std_library_module/1 predicate
> further down in this module.

Done.

Peter



More information about the reviews mailing list