[m-rev.] comments: non-blocking lex

Michael Day mikeday at bigpond.net.au
Mon Oct 15 12:30:09 AEST 2001


Hi,

While lex provides "the holy grail of piecemeal lexing of stdin", this
patch allows it to do so without blocking when there is no input to read.
It should not break backwards compatibility in most cases (code that
switches exhaustively on read_result will need a case for blocked). Read
predicates may now return blocked if no input is available, and a new
read_no_block predicate returns yes(Result) if enough input is present to
create a token, or no if the source blocked first.
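
To make the calling convention concrete, here is a minimal sketch of a
caller that polls a lexer once. It assumes the enclosing module imports
lex, io and std_util (for maybe); poll_result and poll are hypothetical
names used only for illustration, not part of the patch.

```mercury
    % Hypothetical wrapper type: what one polling attempt produced.
:- type poll_result(Tok)
    --->    token(Tok)              % a complete token was lexed
    ;       no_input_yet            % the source blocked; try again later
    ;       finished                % end of input
    ;       failed(string, int).    % lexing error, as in io__read_result

:- pred poll(poll_result(Tok), lexer_state(Tok, Src), lexer_state(Tok, Src)).
:- mode poll(out, di, uo) is det.

poll(Poll, State0, State) :-
    lex__read_no_block(MaybeResult, State0, State),
    (
        % The read predicate reported blocked before a token was found.
        MaybeResult = no,
        Poll = no_input_yet
    ;
        MaybeResult = yes(Result),
        (
            Result = ok(Tok),
            Poll = token(Tok)
        ;
            Result = eof,
            Poll = finished
        ;
            Result = error(Msg, Line),
            Poll = failed(Msg, Line)
        )
    ).
```

A caller juggling several sources could call poll on each lexer state in
turn and move on whenever it gets no_input_yet.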

This is convenient if you wish to tokenise multiple input sources
simultaneously without blocking (sockets, for example). There is the
potential for an infinite loop, however: if read is called and the read
predicate persists in returning blocked, read will keep retrying forever.
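
For illustration, a read predicate that can return blocked might look
like the sketch below. The chunk_src type and chunk_read predicate are
hypothetical, the offset is assumed to be an int, and the enclosing
module is assumed to import lex, bool, char and string. How new data
gets into the source is left out.

```mercury
    % Hypothetical source: the text received so far, plus a flag that
    % becomes yes once the other end has closed the connection.
:- type chunk_src
    --->    chunk_src(string, bool).

:- pred chunk_read(int, read_result, chunk_src, chunk_src).
:- mode chunk_read(in, out, di, uo) is det.

chunk_read(Offset, Result, Src0, Src) :-
    Src0 = chunk_src(Received, Closed),
    ( string__index(Received, Offset, Char) ->
        % The requested char has already arrived.
        Result = ok(Char)
    ; Closed = yes ->
        % Nothing more will ever arrive.
        Result = eof
    ;
        % More may arrive later; tell the lexer not to wait.
        Result = blocked
    ),
    % The source is unchanged; promise uniqueness to satisfy the uo mode.
    Src = unsafe_promise_unique(Src0).
```

With a source like this, calling read rather than read_no_block would
spin whenever the text runs out before the connection is closed, because
nothing inside read can add more data to the source.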

Additionally, this patch adds a variant of manipulate_source that lets you
thread another accumulator through, for example when the source is *not*
the io state but you wish to manipulate it *using* the io state.
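
As a sketch of the intended use, assume a hypothetical source type my_src
and a hypothetical predicate refill that uses the io state to pull newly
arrived data into the source. Neither is part of the patch; only the
five-argument manipulate_source is.

```mercury
    % Hypothetical: fetch any pending input into the source, using the
    % io state. Its definition depends on the kind of source.
:- pred refill(my_src, my_src, io__state, io__state).
:- mode refill(di, uo, di, uo) is det.

    % Refill the source held inside a running lexer, threading the io
    % state through as the extra accumulator.
:- pred refill_lexer(lexer_state(Tok, my_src), lexer_state(Tok, my_src),
            io__state, io__state).
:- mode refill_lexer(di, uo, di, uo) is det.

refill_lexer(State0, State, IO0, IO) :-
    lex__manipulate_source(refill, State0, State, IO0, IO).
```

Alternating refill_lexer with read_no_block would give a simple polling
loop in which the lexer itself never waits on input.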

Any comments about the interface, or whether this could be achieved in a
better way?

Michael


Index: lex.m
===================================================================
RCS file: /home/mercury1/repository/mercury/extras/lex/lex.m,v
retrieving revision 1.2
diff -u -r1.2 lex.m
--- lex.m	4 Oct 2001 07:46:04 -0000	1.2
+++ lex.m	15 Oct 2001 03:25:06 -0000
@@ -48,7 +48,9 @@
     %
 :- type read_result
     --->    ok(char)
-    ;       eof.
+    ;       eof
+    ;       blocked
+    .

     % read_pred(Offset, Result, SrcIn, SrcOut) reads the char at
     % Offset from SrcIn and returns SrcOut.
@@ -173,6 +175,10 @@
             lexer_state(Tok, Src)).
 :- mode read(out, di, uo) is det.

+:- pred read_no_block(maybe(io__read_result(Tok)), lexer_state(Tok, Src),
+            lexer_state(Tok, Src)).
+:- mode read_no_block(out, di, uo) is det.
+
     % Stop a running instance of a lexer and retrieve the input source.
     %
 :- func stop(lexer_state(_Tok, Src)) = Src.
@@ -186,6 +192,10 @@
                 lexer_state(Tok, Src), lexer_state(Tok, Src)).
 :- mode manipulate_source(pred(di, uo) is det, di, uo) is det.

+:- pred manipulate_source(pred(Src, Src, Acc, Acc),
+                lexer_state(Tok, Src), lexer_state(Tok, Src), Acc, Acc).
+:- mode manipulate_source(pred(di, uo, di, uo) is det, di, uo, di, uo) is det.
+
 %------------------------------------------------------------------------------%
 %------------------------------------------------------------------------------%

@@ -294,6 +304,10 @@
     read_0(Result, Instance0, Instance, Buf0, Buf, Src0, Src),
     LexerState = args_lexer_state(Instance, Buf, Src).

+read_no_block(Result, LexerState0, LexerState) :-
+    lexer_state_args(LexerState0, Instance0, Buf0, Src0),
+    read_no_block_0(Result, Instance0, Instance, Buf0, Buf, Src0, Src),
+    LexerState = args_lexer_state(Instance, Buf, Src).


 :- pred read_0(io__read_result(Tok),
@@ -320,6 +334,44 @@
         Buf = Buf1,
         Src = Src1,
         process_eof(Result, Instance0, Instance, BufState1, Buf)
+    ;
+        BufReadResult = blocked,
+        % infinite loop if source always blocks
+        read_0(Result, Instance0, Instance, Buf1, Buf, Src1, Src)
+    ).
+
+:- pred read_no_block_0(maybe(io__read_result(Tok)),
+            lexer_instance(Tok, Src), lexer_instance(Tok, Src),
+            buf, buf, Src, Src).
+:- mode read_no_block_0(out,
+            in(lexer_instance), out(lexer_instance),
+            array_di, array_uo, di, uo) is det.
+
+    % Basically, just read chars from the buf and advance the live lexemes
+    % until we have a winner or hit an error (no parse) or block.
+    %
+read_no_block_0(Result, Instance0, Instance, Buf0, Buf, Src0, Src) :-
+
+    BufState0    = Instance0 ^ buf_state,
+
+    buf__read(BufReadResult, BufState0, BufState1, Buf0, Buf1, Src0, Src1),
+    (
+        BufReadResult = ok(Char),
+        process_char(Result0, Char,
+                Instance0, Instance, BufState1, Buf1, Buf, Src1, Src),
+        Result = yes(Result0)
+    ;
+        BufReadResult = eof,
+        Buf = Buf1,
+        Src = Src1,
+        process_eof(Result0, Instance0, Instance, BufState1, Buf),
+        Result = yes(Result0)
+    ;
+        BufReadResult = blocked,
+        Buf = Buf1,
+        Src = Src1,
+        Instance = Instance0,
+        Result = no
     ).

 %------------------------------------------------------------------------------%
@@ -518,6 +570,11 @@
 manipulate_source(P, State0, State) :-
     lexer_state_args(State0, Instance, Buf, Src0),
     P(Src0, Src),
+    State = args_lexer_state(Instance, Buf, Src).
+
+manipulate_source(P, State0, State, Acc0, Acc) :-
+    lexer_state_args(State0, Instance, Buf, Src0),
+    P(Src0, Src, Acc0, Acc),
     State = args_lexer_state(Instance, Buf, Src).

 % -----------------------------------------------------------------------------%
