[m-dev.] Proposal: parsing module for the library.
Julien Fischer
juliensf at csse.unimelb.edu.au
Tue Jan 13 15:46:37 AEDT 2009
On Tue, 13 Jan 2009, Ralph Becket wrote:
> Below is a recursive descent parsing module I'd like to add to the
> library (I've used pretty much identical code half a dozen times over
> the last two or three years, so it's useful stuff).
In principle, I have no objections to adding a module to the standard
library that provides this sort of functionality.
Having modules named ``parser'' and ``parsing'' in the stdlib is likely
to be confusing - I suggest renaming this module to something like
``dcg_util'' or ``dcg_parser_util''.
(Renaming ``parser'' to ``term_parser'' would also help, but lots of
existing code currently uses the former, so that isn't likely to happen
soon.)
As a future extension, you may want to consider making it work with the
stream module's stream interfaces.
> %-----------------------------------------------------------------------------%
> % parsing.m
> % Ralph Becket <rafe at csse.unimelb.edu.au>
> % Tue Jan 13 11:32:49 EST 2009
> % vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
> %
> % Support for DCG style parsers.
> %
> %-----------------------------------------------------------------------------%
>
> :- module parsing.
>
> :- interface.
>
> :- import_module char.
> :- import_module float.
> :- import_module int.
> :- import_module list.
> :- import_module maybe.
> :- import_module string.
> :- import_module unit.
>
>
>
> % The parser "state", passed around in DCG arguments.
> %
> :- type ps.
>
> % The parser source (input string).
> %
> :- type src.
>
> % Construct a new parser source and state from a string.
> %
> :- pred new_src_and_ps(string::in, src::out, ps::out) is det.
>
> :- type parser(T) == pred(T, ps, ps).
> :- inst parser == ( pred(out, in, out) is semidet ).
>
> % Read the next char.
> %
> :- pred char(src::in)
> : parser(char) `with_inst` parser.
I suggest ``next_char'' rather than ``char''.
>
> % Match a char from the given string.
> %
> :- pred char_in_class(src::in, string::in)
> : parser(char) `with_inst` parser.
>
> % Match a string exactly and any subsequent whitespace.
> %
> :- pred punct(src::in, string::in)
> : parser(unit) `with_inst` parser.
>
> % keyword(Src, IdChars, Keyword, _) matches Keyword exactly (i.e., it must
> % not be followed by any character in IdChars) and any subsequent
> % whitespace.
> %
> :- pred keyword(src::in, string::in, string::in)
> : parser(unit) `with_inst` parser.
>
> % identifier(Src, InitIdChars, IdChars, Identifier) matches the next
> % identifier (result in Identifier) comprising a char from InitIdChars
> % followed by zero or more chars from IdChars.
> %
> :- pred identifier(src::in, string::in, string::in)
> : parser(string) `with_inst` parser.
>
> % Consume any whitespace.
> %
> :- pred whitespace(src::in)
> : parser(unit) `with_inst` parser.
>
> % Consume any input up to, and including, the next newline character
> % marking the end of the current line.
> %
> :- pred skip_to_eol(src::in)
> : parser(unit) `with_inst` parser.
>
> % Succeed if we have reached the end of the input.
> %
> :- pred eof(src::in)
> : parser(unit) `with_inst` parser.
>
> % Parse a float literal
> %
> :- pred float(src::in)
> : parser(float) `with_inst` parser.
I would call that predicate ``float_literal'' rather than just
``float''.
>
> % Parse an int literal.
> %
> :- pred int(src::in)
> : parser(int) `with_inst` parser.
Similarly, ``int_literal''.
> % Parse a string literal. The string argument is the quote character.
> %
> :- pred string(src::in, char::in)
> : parser(string) `with_inst` parser.
Similarly, ``string_literal''.
> :- pred optional(src::in, parser(T)::in(parser))
> : parser(maybe(T)) `with_inst` parser.
>
> :- pred zero_or_more(src::in, parser(T)::in(parser))
> : parser(list(T)) `with_inst` parser.
>
> :- pred one_or_more(src::in, parser(T)::in(parser))
> : parser(list(T)) `with_inst` parser.
>
> :- pred brackets(src::in, string::in, string::in, parser(T)::in(parser))
> : parser(T) `with_inst` parser.
>
> :- pred separated_list(src::in, string::in, parser(T)::in(parser))
> : parser(list(T)) `with_inst` parser.
>
> :- pred comma_separated_list(src::in, parser(T)::in(parser))
> : parser(list(T)) `with_inst` parser.
The above will need to be documented.
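A short usage example in the documentation would help as well. The
following is only a sketch written against the interface as posted; it
has not been compiled. The module name ``parse_ints'' and the function
``parse_int_list'' are made up for illustration, and it assumes that
``brackets'' matches its two string arguments the way ``punct'' does
(skipping trailing whitespace) and that ``comma_separated_list'' returns
the items in source order.

```mercury
:- module parse_ints.       % hypothetical client module
:- interface.

:- import_module list.
:- import_module maybe.

    % parse_int_list(Str) = yes(Ints) if the whole of Str is a bracketed,
    % comma separated list of int literals, e.g. "[1, 2, 3]".
    % Otherwise the result is no.
    %
:- func parse_int_list(string) = maybe(list(int)).

:- implementation.

:- import_module parsing.   % the module proposed in this thread

parse_int_list(Str) = Result :-
    new_src_and_ps(Str, Src, PS0),
    ( if
        % int(Src) and comma_separated_list(Src, ...) are curried
        % closures; each has the inst ``parser''.
        brackets(Src, "[", "]", comma_separated_list(Src, int(Src)),
            Ints, PS0, PS1),
        % Insist that nothing follows the closing bracket.
        eof(Src, _, PS1, _)
    then
        Result = yes(Ints)
    else
        Result = no
    ).
```

If the predicates are renamed as suggested above, ``int(Src)'' would
become ``int_literal(Src)''.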
Julien.