[m-dev.] regular expressions

schachte at cs.mu.OZ.AU
Tue Dec 21 01:48:52 AEDT 1999


On 16 Dec, Fergus Henderson wrote:

> DCGs work great for regexp matching.
> But what about regexp substitution?

You need something a bit more general than DCGs.  I'll use my global
variable notation.  Let's suppose there is a variable $in for what
you're parsing, $out for the result of the substitution, and $match
for the part of the input that was matched (appended to whatever
$match was before).  Then define:

	[T|Ts] ::-
		append([T|Ts], $:=in, $in),
		append([T|Ts], $:=out, $out),
		append($match, [T|Ts], $:=match).

	P / R ::-
		[$out:=$out] scoped_to P,
		append(R, $:=out, $out).

	match(P, M) ::-
		Pre = $match,
		[$match:=[]] scoped_to ( P, M = $match),
		append(Pre, M, $:=match).

	end ::-
		$in = [],
		$out = [].

	global(P) ::-
		(   end ->
			true
		;   P ->
			global(P)
		;   [_],
		    global(P)
		).


I should say that this is completely untested, but the idea is that
$in and $out are consumed/built in parallel.  The (/)/2 predicate
invokes its first argument, but throws away any changes it makes to
$out, substituting R instead.  Recall that $:=x is short for a fresh
variable V, with the goal $x := V inserted immediately after the goal
containing $:=x.
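
For readers who don't know the global variable notation, here is a rough
Python sketch of the same threading idea.  It is not a translation of the
rules above.  The names State, lit, subst, global_ and rewrite are made up
for illustration.  It assumes deterministic matching (no backtracking), a
fixed replacement string, and output built by concatenation rather than as
a difference list.

```python
from typing import Callable, NamedTuple, Optional


class State(NamedTuple):
    inp: str    # plays the role of $in: input still to be consumed
    out: str    # plays the role of $out: output produced so far
    match: str  # plays the role of $match: text matched so far


Parser = Callable[[State], Optional[State]]


def lit(text: str) -> Parser:
    """Like the [T|Ts] rule: consume text, copy it to out, append to match."""
    def run(s: State) -> Optional[State]:
        if not s.inp.startswith(text):
            return None
        return State(s.inp[len(text):], s.out + text, s.match + text)
    return run


def subst(p: Parser, replacement: str) -> Parser:
    """Like P / R: run p, discard what it wrote to out, emit replacement."""
    def run(s: State) -> Optional[State]:
        s2 = p(s)
        if s2 is None:
            return None
        # Keep p's effect on inp and match, but restart out from s.out.
        return s2._replace(out=s.out + replacement)
    return run


def global_(p: Parser) -> Parser:
    """Like global(P): apply p wherever it matches, copy everything else."""
    def run(s: State) -> Optional[State]:
        while s.inp:
            s2 = p(s)
            if s2 is not None and len(s2.inp) < len(s.inp):
                s = s2
            else:
                # The [_] branch: pass one character through unchanged.
                # Also taken if p matched without consuming any input.
                s = State(s.inp[1:], s.out + s.inp[0], s.match + s.inp[0])
        return s
    return run


def rewrite(p: Parser, text: str) -> str:
    """Run p globally over text and return the rewritten output."""
    result = global_(p)(State(text, "", ""))
    assert result is not None
    return result.out
```

For example, rewrite(subst(lit("cat"), "dog"), "a cat and a cat") threads
the state through the whole string, replacing each "cat" and copying the
rest one character at a time.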

Then where in perl you might write:

	s/"([^"]*)"/(\1)/g;

in Mercury you could write:

	nonquote ::- [C], C \= '"'.

	nonquotestar ::- repeat0(nonquote).

	string_paren ::-
		( ['"'], match(nonquotestar, Str), ['"'] ) / Par,
		append(['(' | Str], [')'], Par).

	subst ::-
	      global(string_paren).
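
As a reference point for testing any implementation of the above, the
perl substitution behaves like the following Python snippet, which uses
the standard re module.  The function name paren_quoted is just for
illustration.

```python
import re


def paren_quoted(text: str) -> str:
    # Same pattern and replacement as the perl s/"([^"]*)"/(\1)/g:
    # every double-quoted run becomes the same text in parentheses.
    return re.sub(r'"([^"]*)"', r'(\1)', text)
```

An unterminated quote is left alone, since the pattern needs both the
opening and the closing quote in order to match.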


This could be a lot more compact with a few new operators, and if
there were no need to create a predicate for every higher-order call.
It is certainly not (currently) as efficient as what perl would do,
but I think this approach is reasonably powerful and comfortable.


-- 
Peter Schachte                | I have made this a rather long letter
mailto:pets at cs.mu.OZ.AU       | because I haven't had time to make it
http://www.cs.mu.oz.au/~pets/ | shorter.
PGP: finger pets at 128.250.37.3 |     -- Blaise Pascal 




