[mercury-users] Sluggish XML parsing

Thomas Conway conway at cs.mu.OZ.AU
Tue Dec 26 11:05:24 AEDT 2000


On Sun, Dec 24, 2000 at 05:33:38PM EST, Michael Day wrote:
> Nonetheless, it's nice that compiling it with -O6 takes minutes rather
> than hours - is this particular sluggishness due to all the introduced
> preds involved with higher order code? Compiler dudes?

Compiling is slow because the production for `letter' works on Unicode
code points rather than just chars, so the production looks something
like:

letter -->
    baseChar or ideographic.

baseChar -->
    (0x0041-0x005A) or (0x0061-0x007A) or (0x00C0-0x00D6)
    or (0x00D8-0x00F6) or (0x00F8-0x00FF) or (0x0100-0x0131)
    or (0x0134-0x013E) or (0x0141-0x0148) or (0x014A-0x017E)
    or (0x0180-0x01C3) or (0x01CD-0x01F0) or (0x01F4-0x01F5)
    or (0x01FA-0x0217) or (0x0250-0x02A8) or (0x02BB-0x02C1) or lit1(0x0386)
    or (0x0388-0x038A) or lit1(0x038C) or (0x038E-0x03A1) or (0x03A3-0x03CE)
    ....

(where -/4 is a parser combinator that accepts a character from a range
in the obvious kind of way, and or/4 is a parser combinator that accepts
the alternation of two parsers, also in the obvious kind of way)
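
For readers who have not met these combinators, here is a minimal sketch
of how a range parser and an alternation parser can fit together. It is
written in Python rather than Mercury, so the names `char_range`, `lit1`
and `alt`, and the convention of returning `None` on failure, are
assumptions made for illustration; they are not the definitions used in
the XML parser. The two trailing arguments of -/4 and or/4 (the input
before and after) are modelled as a list of code points passed in and the
remaining list handed back.

```python
from typing import Callable, List, Optional

# A parser takes the remaining input (a list of code points) and returns
# the input left after a successful match, or None on failure.
Parser = Callable[[List[int]], Optional[List[int]]]


def char_range(lo: int, hi: int) -> Parser:
    """Accept one code point in the inclusive range lo..hi (stands in for -/4)."""
    def parse(inp: List[int]) -> Optional[List[int]]:
        if inp and lo <= inp[0] <= hi:
            return inp[1:]
        return None
    return parse


def lit1(code: int) -> Parser:
    """Accept exactly one given code point (hypothetical stand-in for lit1)."""
    return char_range(code, code)


def alt(p: Parser, q: Parser) -> Parser:
    """Try p; if it fails, try q on the same input (stands in for or/4)."""
    def parse(inp: List[int]) -> Optional[List[int]]:
        rest = p(inp)
        return rest if rest is not None else q(inp)
    return parse


# The first three alternatives of baseChar from the production above.
base_char_prefix: Parser = alt(
    char_range(0x0041, 0x005A),
    alt(char_range(0x0061, 0x007A), char_range(0x00C0, 0x00D6)),
)
```

With these definitions, `base_char_prefix([0x62])` succeeds and leaves an
empty list, while `base_char_prefix([0x31])` fails with `None`. Note that
every range and every alternation builds a fresh closure, which is the
shape of the problem described next.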

This is slow to compile for two reasons: a vast number of unifications
get introduced to construct all the pred expressions, and gobs and gobs
of introduced predicates get generated.
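
To get a feel for the scale, the following Python sketch counts the
combinator applications that a chain of N alternatives implies. It is only
a rough model: it assumes each range or literal, and each `or`, contributes
one constructed term, and it says nothing about how many unifications or
introduced predicates the Mercury compiler really generates for them. The
function name `count_terms` is made up for this illustration.

```python
def count_terms(num_alternatives: int) -> dict:
    """Count the terms in `a1 or a2 or ... or aN` (assumed model, see above)."""
    if num_alternatives < 1:
        raise ValueError("a production needs at least one alternative")
    leaves = num_alternatives      # one range or lit1 per alternative
    ors = num_alternatives - 1     # one `or` joining each adjacent pair
    return {"leaves": leaves, "ors": ors, "total": leaves + ors}


# The excerpt of baseChar quoted above lists 20 alternatives before the "....".
excerpt = count_terms(20)
```

Under this model the quoted excerpt alone accounts for 39 terms, and the
count grows linearly with every further range in the production.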

This is why I split baseChar and a couple of other productions into a
separate submodule, to prevent them from being recompiled all the time.
If you think the compile time is a performance bug, then send in a bug
report, and maybe someone (not me!) might do something about it. ;-)
-- 
 Thomas Conway              Mercurian )O+  
 <conway at cs.mu.oz.au>       Every sword has two edges.


