[m-dev.] Proposal: parsing module for the library.

Julien Fischer juliensf at csse.unimelb.edu.au
Tue Jan 13 15:46:37 AEDT 2009


On Tue, 13 Jan 2009, Ralph Becket wrote:

> Below is a recursive descent parsing module I'd like to add to the
> library (I've used pretty much identical code half a dozen times over
> the last two or three years, so it's useful stuff).

In principle, I have no objections to adding a module to the standard
library that provides this sort of functionality.

Having modules named ``parser'' and ``parsing'' in the stdlib is likely
to be confusing - I suggest renaming this module to something like
``dcg_util'' or ``dcg_parser_util''.  (Renaming ``parser'' to
``term_parser'' would also help, but lots of existing code currently uses
the former name, so that isn't likely to happen soon.)

As a future extension, you may want to consider making it work with the
stream module's stream interfaces.

> %-----------------------------------------------------------------------------%
> % parsing.m
> % Ralph Becket <rafe at csse.unimelb.edu.au>
> % Tue Jan 13 11:32:49 EST 2009
> % vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
> %
> % Support for DCG style parsers.
> %
> %-----------------------------------------------------------------------------%
>
> :- module parsing.
>
> :- interface.
>
> :- import_module char.
> :- import_module float.
> :- import_module int.
> :- import_module list.
> :- import_module maybe.
> :- import_module string.
> :- import_module unit.
>
>
>
>    % The parser "state", passed around in DCG arguments.
>    %
> :- type ps.
>
>    % The parser source (input string).
>    %
> :- type src.
>
>    % Construct a new parser source and state from a string.
>    %
> :- pred new_src_and_ps(string::in, src::out, ps::out) is det.
>
> :- type parser(T) == pred(T, ps, ps).
> :- inst parser == ( pred(out, in, out) is semidet ).
>
>    % Read the next char.
>    %
> :- pred char(src::in)
>        : parser(char) `with_inst` parser.

I suggest ``next_char'' rather than ``char''.

>
>    % Match a char from the given string.
>    %
> :- pred char_in_class(src::in, string::in)
>        : parser(char) `with_inst` parser.
>
>    % Match a string exactly and any subsequent whitespace.
>    %
> :- pred punct(src::in, string::in)
>        : parser(unit) `with_inst` parser.
>
>    % keyword(Src, IdChars, Keyword, _) matches Keyword exactly (i.e., it must
>    % not be followed by any character in IdChars) and any subsequent
>    % whitespace.
>    %
> :- pred keyword(src::in, string::in, string::in)
>        : parser(unit) `with_inst` parser.
>
>    % identifier(Src, InitIdChars, IdChars, Identifier) matches the next
>    % identifier (result in Identifier) comprising a char from InitIdChars
>    % followed by zero or more chars from IdChars.
>    %
> :- pred identifier(src::in, string::in, string::in)
>        : parser(string) `with_inst` parser.
>
>    % Consume any whitespace.
>    %
> :- pred whitespace(src::in)
>        : parser(unit) `with_inst` parser.
>
>    % Consume any input up to, and including, the next newline character
>    % marking the end of the current line.
>    %
> :- pred skip_to_eol(src::in)
>        : parser(unit) `with_inst` parser.
>
>    % Succeed if we have reached the end of the input.
>    %
> :- pred eof(src::in)
>        : parser(unit) `with_inst` parser.
>
>    % Parse a float literal.
>    %
> :- pred float(src::in)
>        : parser(float) `with_inst` parser.

I would call that predicate ``float_literal'' rather than just
``float''.

>
>    % Parse an int literal.
>    %
> :- pred int(src::in)
>        : parser(int) `with_inst` parser.

Similarly, ``int_literal'' rather than ``int''.

>    % Parse a string literal.  The char argument is the quote character.
>    %
> :- pred string(src::in, char::in)
>        : parser(string) `with_inst` parser.

Similarly, ``string_literal'' rather than ``string''.

> :- pred optional(src::in, parser(T)::in(parser))
>        : parser(maybe(T)) `with_inst` parser.
>
> :- pred zero_or_more(src::in, parser(T)::in(parser))
>        : parser(list(T)) `with_inst` parser.
>
> :- pred one_or_more(src::in, parser(T)::in(parser))
>        : parser(list(T)) `with_inst` parser.
>
> :- pred brackets(src::in, string::in, string::in, parser(T)::in(parser))
>        : parser(T) `with_inst` parser.
>
> :- pred separated_list(src::in, string::in, parser(T)::in(parser))
>        : parser(list(T)) `with_inst` parser.
>
> :- pred comma_separated_list(src::in, parser(T)::in(parser))
>        : parser(list(T)) `with_inst` parser.


The combinators above (``optional'' through ``comma_separated_list'')
will need to be documented.
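
To illustrate the kind of behaviour such documentation would have to pin
down, here is a minimal, self-contained sketch of ``optional'' and
``zero_or_more''.  It is not the proposed implementation: the module name
``combinator_sketch'', the ``digit'' parser, and the use of a plain
list(char) as the parser state (with no separate src argument) are all
assumptions made only so that the example stands alone.  Both combinators
are declared det here because, as written, they cannot fail.

```mercury
:- module combinator_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char.
:- import_module list.
:- import_module maybe.
:- import_module string.

    % Hypothetical stand-in for the proposal's abstract ps type:
    % the parser state is simply the input that remains.
    %
:- type ps == list(char).

:- type parser(T) == pred(T, ps, ps).
:- inst parser == ( pred(out, in, out) is semidet ).

    % Hypothetical example parser: consume one decimal digit.
    %
:- pred digit(char::out, ps::in, ps::out) is semidet.

digit(C, [C | Cs], Cs) :-
    char.is_digit(C).

    % optional(P, Result, !PS): Result = yes(X) if P succeeds with X,
    % otherwise Result = no and the state is left unchanged.
    %
:- pred optional(parser(T)::in(parser), maybe(T)::out, ps::in, ps::out)
    is det.

optional(P, Result, !PS) :-
    ( if P(X, !PS) then
        Result = yes(X)
    else
        Result = no
    ).

    % zero_or_more(P, Xs, !PS): apply P repeatedly until it fails,
    % collecting the results in order.  Xs = [] if P fails immediately.
    %
:- pred zero_or_more(parser(T)::in(parser), list(T)::out, ps::in, ps::out)
    is det.

zero_or_more(P, Xs, !PS) :-
    ( if P(X, !PS) then
        zero_or_more(P, Xs0, !PS),
        Xs = [X | Xs0]
    else
        Xs = []
    ).

main(!IO) :-
    Input = string.to_char_list("42abc"),
    zero_or_more(digit, Digits, Input, Rest0),
    optional(digit, MaybeDigit, Rest0, Rest),
    io.write_string(string.from_char_list(Digits), !IO),
    io.nl(!IO),
    io.write(MaybeDigit, !IO),
    io.nl(!IO),
    io.write_string(string.from_char_list(Rest), !IO),
    io.nl(!IO).
```

On the input "42abc" this should collect the two leading digits, report
that no further digit is available, and leave "abc" unconsumed.  The
documentation for the real predicates would also need to say what happens
to the parser state when the argument parser fails part-way through.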

Julien.