[mercury-users] Style, Newbie decision question, "What? No X?!", Microbenchmark

Ralph Becket rafe at cs.mu.OZ.AU
Tue Nov 19 11:51:03 AEDT 2002


Julian Fondren, Monday, 18 November 2002:
> ------------------------------------------------------------------------
> -- Style
> 
> I'm a bit confused about the presence of 'func'
> 
>  :- func foo(int) = int.
> 
> and 'pred'
> 
>  :- pred foo(int, int).
> 
> in Mercury; when should you supposed to use one or the other?
> Is one preferred over the other?  I suppose that functions make
> for more terse code (foo(bar(Baz)) vs. bar(Baz,Quux), foo(Quux)),
> and that predicates more naturally extend to multiple modes...

In general it is better to use a func rather than a pred with
a single mode `(in, in, ..., in) = out is det'.

The reason for this is twofold:
1. you don't have to supply a mode declaration for funcs with this mode;
2. you can use funcs in expressions.

It is a matter of style, but only die-hard Prolog programmers would
argue that predicate style is preferable to functional style given the
choice :-)

> ------------------------------------------------------------------------
> -- Newbie decision question
> 
> given
> 
>  :- pred char_filter(character::in, character::out) is det.

So this would be better expressed as a func.

> the following does not work because of commutativity (the second
> clause can match even when the first would, making the determinism
> 'multi')
> 
>  char_filter('\t', '\n').
>  char_filter(C, C).

This is not det; it is equivalent to

char_filter(X, Y) :-
	(	X = '\t', Y = '\n'
	;	X = C,    Y = C
	).

where the non-determinism should be obvious.

> and neither does
> 
>  char_filter(In, Out) :-
>      (
>          In = '\t',
>          Out = '\n'
>      ;
>          In \= '\t',
>          Out = In
>      ).

Some work has been done on supporting exclusive/exhaustive promises, but
it is not finished.  This work, when and if it is committed, would allow
us to write something like

:- promise exclusive_exhaustive(X = Y; X \= Y).

and the compiler would then recognise the above as being det.

> work because, apparently, both of these clauses can fail at
> once (though I don't see how), making the determinism 'semidet'.

There's a limit to the amount of analysis one can expect of the
compiler.  Currently, as far as I know, the only whole-disjunction
analysis that is performed is switch detection.  What you have
written above is not a switch, so the compiler merely sees it as the
disjunction of two semidet subgoals.

> The following, I've discovered, *does* work:
> 
>  char_filter(In, Out) :-
>      (
>          In = '\t'
>      ->
>          Out = '\n'
>      ;
>          Out = In
>      ).

Please use ( if _ then _ else _ ) rather than ( _ -> _ ; _ ).

> but this doesn't scale well:
> 
>  is_whitespace(C, Bool) :-
>      (
>          C = ' ',
>      ->
>          Bool = yes
>      ;
>          (
>              C = '\t',
>          ->
>              Bool = yes
>          ;
>              (
>                  C = '\r',
>              ->
>                  Bool = yes
>              ;
>                  (
>                      C = '\n',
>                  ->
>                      Bool = yes
>                  ;
>                      Bool = no
>                  )
>              )
>          )
>     ).

You only need parentheses around the top-level if-then-else.  You can 
chain conditional goals like this:

is_whitespace(C, Bool) :-
	(      if C = ' '  then Bool = yes
	  else if C = '\t' then Bool = yes
	  else if C = '\r' then Bool = yes
	  else if C = '\n' then Bool = yes
	  else                  Bool = no
	).

[Again, this should probably be a func if you insist on having a bool
result.]

Of course, an easier way to write this would be as

is_whitespace(C, Bool) :-
	( if   ( C = ' ' ; C = '\t' ; C = '\r' ; C = '\n' )
	  then Bool = yes
	  else Bool = no
	).

or, using a conditional expression

is_whitespace(C, Bool) :-
	Bool = ( if   ( C = ' ' ; C = '\t' ; C = '\r' ; C = '\n' )
	         then yes
	         else no
	       ).

Of course, the "right" way to express this is as as semidet predicate:

:- pred is_whitespace(char).
:- mode is_whitespace(in) is semidet.

is_whitespace(' ').
is_whitespace('\t').
is_whitespace('\r').
is_whitespace('\n').

> ... a lot less clear than
> 
> is_whitespace(' ', yes).
> is_whitespace('\t', yes).
> is_whitespace('\r', yes).
> is_whitespace('\n', yes).
> is_whitespace(C, no).

Which is just syntactic sugar for an explicit disjunction.

> which, as I've said, doesn't work because any of the first
> four matches can also match the fifth.  So how can I consisely
> write predicates that switch over multiple values of some huge
> type (like integers, or characters)?  I have to cover all the
> posibilities, and I have to make each match unique, and I can't,
> so far as I know, say 'match if nothing else can'.

You can split them into two predicates: an inner one that tests for the
special cases and an outer one that provides the default case:

is_whitespace(C, Bool) :-
	Bool = ( if is_whitespace_2(C) then yes else no ).

is_whitespace_2(' ').
is_whitespace_2('\t').
is_whitespace_2('\r').
is_whitespace_2('\n').

> ------------------------------------------------------------------------
> -- "What?  No X?!"
> 
> No directory access, no concurrency (though I see allusions to an
> experimental multithreaded Mercury: does it use OS threads or something
> lighter?  How can I experiment with it?), no networking -- but the
> first and the last seem potentially well-covered by the foreign language
> access.

Check out the extras part of the distribution.  The posix modules offer
various kinds of standard file and directory access operations as well
as support for sockets etc.  The compiler does support concurrency, as
you can see from mercury/extras/concurrency.

> Does anyone have a directory-stream/networking library handy?  If not,
> how can I contribute mine when I finish them?  Is anyone thinking about
> Erlang/Concurrent-Haskell -style concurrency (with independent
> lightweight processes and message-passing, as opposed to evil Posix
> threads with mutexes and semaphores)?

See above.

The concurrency modules provide both semaphores and Haskell style mvars.
Providing an Erlang like facility is mainly a case of finding a need for
it!  No doubt we'll get someone doing some language/concurrency research
soon and something like that will materialise.

> Also, is there a 'remove line terminator' in the library that I don't
> see?  If not, how does anyone get io__read_line_as_string (which
> returns a string with a newline) to string__to_int ?  I've written

Here's a function you could use:

:- func chomp(string) = string.

chomp(S0) = ( if string__remove_suffix(S0, "\n", S) then S else S0 ).

which should do the trick.  I think we could probably add this one to
the library...  or at least something to strip trailing whitespace etc.
Have a look at the func string__words.

> ------------------------------------------------------------------------
> -- Microbenchmark
> 
> Yesterday I noticed that a text file had evil paragraphs (tab
> indenting, no interparagraph seperation, entire-paragraph-as-one-line)
> so I wrote a little Erlang program and had it simply take two arguments
> (input-file and output-file) and translate tabs to newlines... and then
> I watched in horror (until I got bored).  The Erlang program takes
> several minutes to convert a ~400k file.  I started writing the
> Mercury version before the Erlang version completed and made a program
> with identical interface that converted the file in a little under two
> seconds.  Yay Mercury!

Good stuff.

> (What's more: a little later I wanted to translate a unix-style text
> file to a DOS-style text file, and I'd converted the aforementioned
> Mercury program to a unix2dos and had it compiled in less than a
> minute.  Maybe I should turn this into a unix_filter module?)

To convert from Unix to DOS format:

main(!IO) :-
    io__read_char(Result, !IO),
    (
        Result = eof
    ;
        Result = error(_),
        throw(Result)
    ;
        Result = ok(Char),
        ( if Char = '\n' then io__write_string("\r\n", !IO)
                         else io__write_char(Char,     !IO) ),
        main(!IO)
    ).

- Ralph
--------------------------------------------------------------------------
mercury-users mailing list
post:  mercury-users at cs.mu.oz.au
administrative address: owner-mercury-users at cs.mu.oz.au
unsubscribe: Address: mercury-users-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-users-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------



More information about the users mailing list