[m-rev.] for review: speed up lexer.m
Zoltan Somogyi
zs at csse.unimelb.edu.au
Mon May 19 18:48:58 AEST 2008
On 16-May-2008, Zoltan Somogyi <zs at csse.unimelb.edu.au> wrote:
> I am working on another change that speeds up the same piece of code,
> by turning all those nested if-then-elses in get_token etc into switches.
Here it is, for review by anyone.
library/lexer.m:
This module has four versions of the get_token predicate, some
of which consume significant time during typical compilations.
This diff factors out some of the code common among them,
and restructures this common code from long sequences
of nested if-then-elses to switches.
This yields an overall speedup of 3.2% on tools/speedtest.
Don't disable vim's wrapmargin and textwidth functionality.
library/char.m:
Add notes to the predicates whose character lists are now duplicated
in lookup_token_action in lexer.m.
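
To illustrate the kind of restructuring described above, here is a minimal
standalone sketch. It is not code from lexer.m: the module name and the names
char_class, classify_ite, lookup_class and classify_switch are hypothetical,
and only three classes are shown. The first version tests the character with
a chain of if-then-elses; the second enumerates the characters in a semidet
disjunction that the compiler can recognise as a switch on Char.

```mercury
:- module classify_sketch.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char.
:- import_module list.

    % Hypothetical three-way classification, much smaller than
    % get_token_action in the diff.
:- type char_class
    --->    class_space
    ;       class_digit
    ;       class_other.

    % Before: nested if-then-elses; each condition is tested in turn.
:- func classify_ite(char) = char_class.

classify_ite(Char) =
    ( char.is_whitespace(Char) ->
        class_space
    ; char.is_digit(Char) ->
        class_digit
    ;
        class_other
    ).

    % After: every arm unifies Char with constants, so the disjunction
    % is a switch on Char. It fails for characters that are not listed.
:- pred lookup_class(char::in, char_class::out) is semidet.

lookup_class(Char, Class) :-
    (
        ( Char = ' ' ; Char = '\t' ; Char = '\n'
        ; Char = '\r' ; Char = '\f' ; Char = '\v'
        ),
        Class = class_space
    ;
        ( Char = '0' ; Char = '1' ; Char = '2' ; Char = '3' ; Char = '4'
        ; Char = '5' ; Char = '6' ; Char = '7' ; Char = '8' ; Char = '9'
        ),
        Class = class_digit
    ).

:- func classify_switch(char) = char_class.

classify_switch(Char) =
    ( lookup_class(Char, Class) -> Class ; class_other ).

    % Print the result of both versions for one character.
:- pred check(char::in, io::di, io::uo) is det.

check(Char, !IO) :-
    io.write(classify_ite(Char), !IO),
    io.write_string(" ", !IO),
    io.write(classify_switch(Char), !IO),
    io.nl(!IO).

main(!IO) :-
    list.foldl(check, [' ', '7', 'x'], !IO).
```

Both columns of the output should agree for every input character. As in the
diff, the price of the switch version is that the character lists duplicate
the ones in char.m and have to be kept in sync by hand.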
Zoltan.
Index: library/char.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/char.m,v
retrieving revision 1.59
diff -u -b -r1.59 char.m
--- library/char.m 14 Aug 2007 04:21:06 -0000 1.59
+++ library/char.m 19 May 2008 07:58:58 -0000
@@ -188,6 +188,8 @@
(from_int(X) = Y :- char.to_int(Y, X))
].
+% The information here is duplicated in lookup_token_action in lexer.m.
+% If you update this, you will also need to update that.
char.is_whitespace(' ').
char.is_whitespace('\t').
char.is_whitespace('\n').
@@ -221,7 +223,8 @@
).
% We explicitly enumerate here for efficiency.
- % (this predicate is part of the inner loop of the lexer.)
+ % (The information here and in some of the following predicates,
+ % e.g. char.lower_upper, is duplicated in lookup_token_action in lexer.m.)
char.is_alnum_or_underscore(Char) :-
( Char = '0'
; Char = '1'
Index: library/lexer.m
===================================================================
RCS file: /home/mercury/mercury1/repository/mercury/library/lexer.m,v
retrieving revision 1.57
diff -u -b -r1.57 lexer.m
--- library/lexer.m 15 May 2008 05:01:43 -0000 1.57
+++ library/lexer.m 19 May 2008 07:58:59 -0000
@@ -1,5 +1,5 @@
%-----------------------------------------------------------------------------%
-% vim: ft=mercury ts=4 sw=4 et wm=0 tw=0
+% vim: ft=mercury ts=4 sw=4 et
%-----------------------------------------------------------------------------%
% Copyright (C) 1993-2000, 2003-2007 The University of Melbourne.
% This file may only be copied under the terms of the GNU Library General
@@ -312,6 +312,26 @@
%-----------------------------------------------------------------------------%
+:- type get_token_action
+ ---> action_whitespace
+ ; action_alpha_lower
+ ; action_alpha_upper_uscore
+ ; action_zero
+ ; action_nonzero_digit
+ ; action_special_token
+ ; action_dot
+ ; action_percent
+ ; action_quote
+ ; action_slash
+ ; action_hash
+ ; action_backquote
+ ; action_dollar
+ ; action_graphic_token.
+
+:- type scanned_past_whitespace
+ ---> scanned_past_whitespace
+ ; not_scanned_past_whitespace.
+
:- pred get_token(io.input_stream::in, token::out, token_context::out,
io::di, io::uo) is det.
@@ -327,48 +347,36 @@
Token = eof
;
Result = ok,
- ( char.is_whitespace(Char) ->
- get_token_2(Stream, Token, Context, !IO)
- ; ( char.is_upper(Char) ; Char = '_' ) ->
- get_context(Stream, Context, !IO),
- get_variable(Stream, [Char], Token, !IO)
- ; char.is_lower(Char) ->
- get_context(Stream, Context, !IO),
- get_name(Stream, [Char], Token, !IO)
- ; Char = '0' ->
- get_context(Stream, Context, !IO),
- get_zero(Stream, Token, !IO)
- ; char.is_digit(Char) ->
- get_context(Stream, Context, !IO),
- get_number(Stream, [Char], Token, !IO)
- ; special_token(Char, SpecialToken) ->
- get_context(Stream, Context, !IO),
- ( SpecialToken = open ->
- Token = open_ct
+ ( lookup_token_action(Char, Action) ->
+ execute_get_token_action(Stream, Char, Action,
+ not_scanned_past_whitespace, Token, Context, !IO)
;
- Token = SpecialToken
- )
- ; Char = ('.') ->
- get_context(Stream, Context, !IO),
- get_dot(Stream, Token, !IO)
- ; Char = ('%') ->
- skip_to_eol(Stream, Token, Context, !IO)
- ; ( Char = '"' ; Char = '''' ) ->
- get_context(Stream, Context, !IO),
- start_quoted_name(Stream, Char, [], Token, !IO)
- ; Char = ('/') ->
- get_slash(Stream, Token, Context, !IO)
- ; Char = ('#') ->
- get_source_line_number(Stream, [], Token, Context, !IO)
- ; Char = ('`') ->
get_context(Stream, Context, !IO),
- Token = name("`")
- ; Char = '$' ->
+ Token = junk(Char)
+ )
+ ).
+
+ % This is just like get_token, except that we have already scanned past
+ % some whitespace, so '(' gets scanned as `open' rather than `open_ct'.
+ %
+:- pred get_token_2(io.input_stream::in, token::out, token_context::out,
+ io::di, io::uo) is det.
+
+get_token_2(Stream, Token, Context, !IO) :-
+ io.read_char_unboxed(Stream, Result, Char, !IO),
+ (
+ Result = error(Error),
get_context(Stream, Context, !IO),
- get_implementation_defined_literal_rest(Stream, Token, !IO)
- ; graphic_token_char(Char) ->
+ Token = io_error(Error)
+ ;
+ Result = eof,
get_context(Stream, Context, !IO),
- get_graphic(Stream, [Char], Token, !IO)
+ Token = eof
+ ;
+ Result = ok,
+ ( lookup_token_action(Char, Action) ->
+ execute_get_token_action(Stream, Char, Action,
+ scanned_past_whitespace, Token, Context, !IO)
;
get_context(Stream, Context, !IO),
Token = junk(Char)
@@ -381,43 +389,27 @@
string_get_token(String, Len, Token, Context, !Posn) :-
Posn0 = !.Posn,
( string_read_char(String, Len, Char, !Posn) ->
- ( char.is_whitespace(Char) ->
- string_get_token_2(String, Len, Token, Context, !Posn)
- ; ( char.is_upper(Char) ; Char = '_' ) ->
- string_get_variable(String, Len, Posn0, Token, Context, !Posn)
- ; char.is_lower(Char) ->
- string_get_name(String, Len, Posn0, Token, Context, !Posn)
- ; Char = '0' ->
- string_get_zero(String, Len, Posn0, Token, Context, !Posn)
- ; char.is_digit(Char) ->
- string_get_number(String, Len, Posn0, Token, Context, !Posn)
- ; special_token(Char, SpecialToken) ->
- string_get_context(Posn0, Context, !Posn),
- ( SpecialToken = open ->
- Token = open_ct
+ ( lookup_token_action(Char, Action) ->
+ execute_string_get_token_action(String, Len, Posn0, Char, Action,
+ not_scanned_past_whitespace, Token, Context, !Posn)
;
- Token = SpecialToken
+ string_get_context(Posn0, Context, !Posn),
+ Token = junk(Char)
)
- ; Char = ('.') ->
- string_get_dot(String, Len, Posn0, Token, Context, !Posn)
- ; Char = ('%') ->
- string_skip_to_eol(String, Len, Token, Context, !Posn)
- ; ( Char = '"' ; Char = '''' ) ->
- string_start_quoted_name(String, Len, Char, [], Posn0, Token,
- Context, !Posn)
- ; Char = ('/') ->
- string_get_slash(String, Len, Posn0, Token, Context, !Posn)
- ; Char = ('#') ->
- string_get_source_line_number(String, Len, !.Posn, Token, Context,
- !Posn)
- ; Char = ('`') ->
+ ;
string_get_context(Posn0, Context, !Posn),
- Token = name("`")
- ; Char = '$' ->
- string_get_implementation_defined_literal_rest(String, Len, Posn0,
- Token, Context, !Posn)
- ; graphic_token_char(Char) ->
- string_get_graphic(String, Len, Posn0, Token, Context, !Posn)
+ Token = eof
+ ).
+
+:- pred string_get_token_2(string::in, int::in, token::out,
+ token_context::out, posn::in, posn::out) is det.
+
+string_get_token_2(String, Len, Token, Context, !Posn) :-
+ Posn0 = !.Posn,
+ ( string_read_char(String, Len, Char, !Posn) ->
+ ( lookup_token_action(Char, Action) ->
+ execute_string_get_token_action(String, Len, Posn0, Char, Action,
+ scanned_past_whitespace, Token, Context, !Posn)
;
string_get_context(Posn0, Context, !Posn),
Token = junk(Char)
@@ -427,123 +419,281 @@
Token = eof
).
+ % Decide on how the given character should be treated. Note that
+ % performance suffers significantly if this predicate is not inlined.
+ %
+:- pragma inline(lookup_token_action/2).
+:- pred lookup_token_action(char::in, get_token_action::out) is semidet.
+
+lookup_token_action(Char, Action) :-
+ % The body of this predicate should be turned into a single table lookup
+ % by the compiler.
+ (
+ % This list of characters comes from the code of char.is_whitespace.
+ % Any update here will also require an update there.
+ ( Char = ' '
+ ; Char = '\t'
+ ; Char = '\n'
+ ; Char = '\r'
+ ; Char = '\f'
+ ; Char = '\v'
+ ),
+ Action = action_whitespace
+ ;
+ % This list of characters comes from char.is_alnum_or_underscore and
+ % char.lower_upper.
+ ( Char = 'a' ; Char = 'b' ; Char = 'c' ; Char = 'd'
+ ; Char = 'e' ; Char = 'f' ; Char = 'g' ; Char = 'h'
+ ; Char = 'i' ; Char = 'j' ; Char = 'k' ; Char = 'l'
+ ; Char = 'm' ; Char = 'n' ; Char = 'o' ; Char = 'p'
+ ; Char = 'q' ; Char = 'r' ; Char = 's' ; Char = 't'
+ ; Char = 'u' ; Char = 'v' ; Char = 'w' ; Char = 'x'
+ ; Char = 'y' ; Char = 'z'
+ ),
+ Action = action_alpha_lower
+ ;
+ % This list of characters comes from char.is_alnum_or_underscore and
+ % char.lower_upper.
+ ( Char = '_'
+ ; Char = 'A' ; Char = 'B' ; Char = 'C' ; Char = 'D'
+ ; Char = 'E' ; Char = 'F' ; Char = 'G' ; Char = 'H'
+ ; Char = 'I' ; Char = 'J' ; Char = 'K' ; Char = 'L'
+ ; Char = 'M' ; Char = 'N' ; Char = 'O' ; Char = 'P'
+ ; Char = 'Q' ; Char = 'R' ; Char = 'S' ; Char = 'T'
+ ; Char = 'U' ; Char = 'V' ; Char = 'W' ; Char = 'X'
+ ; Char = 'Y' ; Char = 'Z'
+ ),
+ Action = action_alpha_upper_uscore
+ ;
+ Char = '0',
+ Action = action_zero
+ ;
+ % This list of characters comes from char.is_alnum_or_underscore and
+ % char.is_digit.
+ ( Char = '1' ; Char = '2' ; Char = '3' ; Char = '4'
+ ; Char = '5' ; Char = '6' ; Char = '7' ; Char = '8'
+ ; Char = '9'
+ ),
+ Action = action_nonzero_digit
+ ;
+ % These are the characters for which special_token succeeds.
+ ( Char = ('(')
+ ; Char = (')')
+ ; Char = ('[')
+ ; Char = (']')
+ ; Char = ('{')
+ ; Char = ('}')
+ ; Char = ('|')
+ ; Char = (',')
+ ; Char = (';')
+ ),
+ Action = action_special_token
+ ;
+ Char = ('.'),
+ Action = action_dot
+ ;
+ Char = ('%'),
+ Action = action_percent
+ ;
+ ( Char = '"'
+ ; Char = ''''
+ ),
+ Action = action_quote
+ ;
+ Char = ('/'),
+ Action = action_slash
+ ;
+ Char = ('#'),
+ Action = action_hash
+ ;
+ Char = ('`'),
+ Action = action_backquote
+ ;
+ Char = ('$'),
+ Action = action_dollar
+ ;
+ % These are the characters for which graphic_token_char succeeds.
+ % The ones that are commented out have their own actions.
+ ( Char = ('!')
+ % ; Char = ('#') handled as action_hash
+ % ; Char = ('$') handled as action_dollar
+ ; Char = ('&')
+ ; Char = ('*')
+ ; Char = ('+')
+ ; Char = ('-')
+ % ; Char = ('.') handled as action_dot
+ % ; Char = ('/') handled as action_slash
+ ; Char = (':')
+ ; Char = ('<')
+ ; Char = ('=')
+ ; Char = ('>')
+ ; Char = ('?')
+ ; Char = ('@')
+ ; Char = ('^')
+ ; Char = ('~')
+ ; Char = ('\\')
+ ),
+ Action = action_graphic_token
+ ).
+
%-----------------------------------------------------------------------------%
- % This is just like get_token, except that we have already scanned past
- % some whitespace, so '(' gets scanned as `open' rather than `open_ct'.
+ % Handle the character we just read the way lookup_token_action decided
+ % it should be treated. Note that inlining this predicate does not
+ % significantly affect performance.
%
-:- pred get_token_2(io.input_stream::in, token::out, token_context::out,
- io::di, io::uo) is det.
+% :- pragma inline(execute_get_token_action/8).
+:- pred execute_get_token_action(io.input_stream::in, char::in,
+ get_token_action::in, scanned_past_whitespace::in, token::out,
+ token_context::out, io::di, io::uo) is det.
-get_token_2(Stream, Token, Context, !IO) :-
- io.read_char_unboxed(Stream, Result, Char, !IO),
+execute_get_token_action(Stream, Char, Action, ScannedPastWhiteSpace,
+ Token, Context, !IO) :-
(
- Result = error(Error),
- get_context(Stream, Context, !IO),
- Token = io_error(Error)
- ;
- Result = eof,
- get_context(Stream, Context, !IO),
- Token = eof
- ;
- Result = ok,
- ( char.is_whitespace(Char) ->
+ Action = action_whitespace,
get_token_2(Stream, Token, Context, !IO)
- ; ( char.is_upper(Char) ; Char = '_' ) ->
+ ;
+ Action = action_alpha_upper_uscore,
get_context(Stream, Context, !IO),
get_variable(Stream, [Char], Token, !IO)
- ; char.is_lower(Char) ->
+ ;
+ Action = action_alpha_lower,
get_context(Stream, Context, !IO),
get_name(Stream, [Char], Token, !IO)
- ; Char = '0' ->
+ ;
+ Action = action_zero,
get_context(Stream, Context, !IO),
get_zero(Stream, Token, !IO)
- ; char.is_digit(Char) ->
+ ;
+ Action = action_nonzero_digit,
get_context(Stream, Context, !IO),
get_number(Stream, [Char], Token, !IO)
- ; special_token(Char, SpecialToken) ->
+ ;
+ Action = action_special_token,
get_context(Stream, Context, !IO),
- Token = SpecialToken
- ; Char = ('.') ->
+ handle_special_token(Char, ScannedPastWhiteSpace, Token)
+ ;
+ Action = action_dot,
get_context(Stream, Context, !IO),
get_dot(Stream, Token, !IO)
- ; Char = ('%') ->
+ ;
+ Action = action_percent,
skip_to_eol(Stream, Token, Context, !IO)
- ; ( Char = '"' ; Char = '''' ) ->
+ ;
+ Action = action_quote,
get_context(Stream, Context, !IO),
start_quoted_name(Stream, Char, [], Token, !IO)
- ; Char = ('/') ->
+ ;
+ Action = action_slash,
get_slash(Stream, Token, Context, !IO)
- ; Char = ('#') ->
+ ;
+ Action = action_hash,
get_source_line_number(Stream, [], Token, Context, !IO)
- ; Char = ('`') ->
+ ;
+ Action = action_backquote,
get_context(Stream, Context, !IO),
Token = name("`")
- ; Char = '$' ->
+ ;
+ Action = action_dollar,
get_context(Stream, Context, !IO),
get_implementation_defined_literal_rest(Stream, Token, !IO)
- ; graphic_token_char(Char) ->
- get_context(Stream, Context, !IO),
- get_graphic(Stream, [Char], Token, !IO)
;
+ Action = action_graphic_token,
get_context(Stream, Context, !IO),
- Token = junk(Char)
- )
+ get_graphic(Stream, [Char], Token, !IO)
).
-:- pred string_get_token_2(string::in, int::in, token::out,
+ % The string version of execute_get_token_action.
+ %
+% :- pragma inline(execute_string_get_token_action/10).
+:- pred execute_string_get_token_action(string::in, int::in, posn::in,
+ char::in, get_token_action::in, scanned_past_whitespace::in, token::out,
token_context::out, posn::in, posn::out) is det.
-string_get_token_2(String, Len, Token, Context, !Posn) :-
- Posn0 = !.Posn,
- ( string_read_char(String, Len, Char, !Posn) ->
- ( char.is_whitespace(Char) ->
+execute_string_get_token_action(String, Len, Posn0, Char, Action,
+ ScannedPastWhiteSpace, Token, Context, !Posn) :-
+ (
+ Action = action_whitespace,
string_get_token_2(String, Len, Token, Context, !Posn)
- ; ( char.is_upper(Char) ; Char = '_' ) ->
+ ;
+ Action = action_alpha_upper_uscore,
string_get_variable(String, Len, Posn0, Token, Context, !Posn)
- ; char.is_lower(Char) ->
+ ;
+ Action = action_alpha_lower,
string_get_name(String, Len, Posn0, Token, Context, !Posn)
- ; Char = '0' ->
+ ;
+ Action = action_zero,
string_get_zero(String, Len, Posn0, Token, Context, !Posn)
- ; char.is_digit(Char) ->
+ ;
+ Action = action_nonzero_digit,
string_get_number(String, Len, Posn0, Token, Context, !Posn)
- ; special_token(Char, SpecialToken) ->
+ ;
+ Action = action_special_token,
string_get_context(Posn0, Context, !Posn),
- Token = SpecialToken
- ; Char = ('.') ->
+ handle_special_token(Char, ScannedPastWhiteSpace, Token)
+ ;
+ Action = action_dot,
string_get_dot(String, Len, Posn0, Token, Context, !Posn)
- ; Char = ('%') ->
+ ;
+ Action = action_percent,
string_skip_to_eol(String, Len, Token, Context, !Posn)
- ; ( Char = '"' ; Char = '''' ) ->
+ ;
+ Action = action_quote,
string_start_quoted_name(String, Len, Char, [], Posn0, Token,
Context, !Posn)
- ; Char = ('/') ->
+ ;
+ Action = action_slash,
string_get_slash(String, Len, Posn0, Token, Context, !Posn)
- ; Char = ('#') ->
+ ;
+ Action = action_hash,
string_get_source_line_number(String, Len, !.Posn, Token, Context,
!Posn)
- ; Char = ('`') ->
+ ;
+ Action = action_backquote,
string_get_context(Posn0, Context, !Posn),
Token = name("`")
- ; Char = '$' ->
+ ;
+ Action = action_dollar,
string_get_implementation_defined_literal_rest(String, Len, Posn0,
Token, Context, !Posn)
- ; graphic_token_char(Char) ->
+ ;
+ Action = action_graphic_token,
string_get_graphic(String, Len, Posn0, Token, Context, !Posn)
+ ).
+
+%-----------------------------------------------------------------------------%
+
+ % Decide what to do for a token which consists of a special character.
+ % The reason for inlining this predicate is that each caller has a
+ % specific value for ScannedPastWhiteSpace, and thus after inlining,
+ % the compiler should be able to eliminate the switch on
+ % ScannedPastWhiteSpace.
+ %
+:- pragma inline(handle_special_token/3).
+:- pred handle_special_token(char::in, scanned_past_whitespace::in, token::out)
+ is det.
+
+handle_special_token(Char, ScannedPastWhiteSpace, Token) :-
+ ( special_token(Char, SpecialToken) ->
+ (
+ ScannedPastWhiteSpace = not_scanned_past_whitespace,
+ ( SpecialToken = open ->
+ Token = open_ct
;
- string_get_context(Posn0, Context, !Posn),
- Token = junk(Char)
+ Token = SpecialToken
)
;
- string_get_context(Posn0, Context, !Posn),
- Token = eof
+ ScannedPastWhiteSpace = scanned_past_whitespace,
+ Token = SpecialToken
+ )
+ ;
+ error("lexer.m, handle_special_token: unknown special token")
).
-%-----------------------------------------------------------------------------%
-
:- pred special_token(char::in, token::out) is semidet.
-special_token('(', open). % May get converted to open_ct
+% The list of characters here is duplicated in lookup_token_action above.
+special_token('(', open). % May get converted to open_ct above.
special_token(')', close).
special_token('[', open_list).
special_token(']', close_list).
@@ -553,6 +703,7 @@
special_token(',', comma).
special_token(';', name(";")).
+% The list of characters here is duplicated in lookup_token_action above.
graphic_token_char('!').
graphic_token_char('#').
graphic_token_char('$').