[mercury-users] Sluggish XML parsing
Thomas Conway
conway at cs.mu.OZ.AU
Tue Dec 26 11:05:24 AEDT 2000
On Sun, Dec 24, 2000 at 05:33:38PM EST, Michael Day wrote:
> Nonetheless, it's nice that compiling it with -O6 takes minutes rather
> than hours - is this particular sluggishness due to all the introduced
> preds involved with higher order code? Compiler dudes?
The reason that compiling is so slow is that the production for `letter'
works on Unicode code points rather than just chars, so the production looks
something like:
    letter -->
        baseChar or ideographic.
    baseChar -->
        (0x0041-0x005A) or (0x0061-0x007A) or (0x00C0-0x00D6)
        or (0x00D8-0x00F6) or (0x00F8-0x00FF) or (0x0100-0x0131)
        or (0x0134-0x013E) or (0x0141-0x0148) or (0x014A-0x017E)
        or (0x0180-0x01C3) or (0x01CD-0x01F0) or (0x01F4-0x01F5)
        or (0x01FA-0x0217) or (0x0250-0x02A8) or (0x02BB-0x02C1) or lit1(0x0386)
        or (0x0388-0x038A) or lit1(0x038C) or (0x038E-0x03A1) or (0x03A3-0x03CE)
        ....
(where -/4 is a parser combinator that accepts a character from a range
in the obvious kind of way, and or/4 is a parser combinator that accepts
the alternation of two parsers, also in the obvious kind of way)
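For anyone unfamiliar with the style, here is a rough sketch of what the
two combinators do. It is written in Python rather than Mercury and is not
the code from the parser: the names char_range, alt and base_char_sample
are made up for illustration (only lit1 echoes the production above), and
the (text, pos) pair stands in for the two DCG state arguments that
presumably account for the /4 arity.

```python
from typing import Callable, Optional

# A parser takes (text, position) and returns the new position on
# success, or None on failure. This representation is an assumption
# made for the sketch, not the Mercury parser's actual type.
Parser = Callable[[str, int], Optional[int]]


def char_range(lo: int, hi: int) -> Parser:
    """Analogue of -/4: accept one character with code point in [lo, hi]."""
    def parse(text: str, pos: int) -> Optional[int]:
        if pos < len(text) and lo <= ord(text[pos]) <= hi:
            return pos + 1
        return None
    return parse


def lit1(code: int) -> Parser:
    """Accept exactly one character with the given code point."""
    return char_range(code, code)


def alt(p: Parser, q: Parser) -> Parser:
    """Analogue of or/4: try p, and fall back to q if p fails."""
    def parse(text: str, pos: int) -> Optional[int]:
        result = p(text, pos)
        return result if result is not None else q(text, pos)
    return parse


# A small selection of the baseChar alternatives, not the full set.
base_char_sample: Parser = alt(
    char_range(0x0041, 0x005A),
    alt(char_range(0x0061, 0x007A),
        alt(char_range(0x00C0, 0x00D6), lit1(0x0386))),
)
```

Note that every range and every alternation builds its own closure. In
the Mercury version the analogous pred expressions are what the compiler
has to construct and name, once per entry in the production.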
This is slow to compile for two reasons: a vast number of unifications
get introduced to construct all the pred expressions, and gobs and gobs
of introduced predicates get generated.
This is why I split baseChar and a couple of other productions into a
separate submodule, to prevent them from being recompiled all the time.
If you think the compile-time is a performance bug, then send in a bug
report, and maybe someone (not me!) might do something about it. ;-)
--
Thomas Conway Mercurian )O+
<conway at cs.mu.oz.au> Every sword has two edges.