[patch] syntax:plain for .hgignore

Matt Mackall mpm at selenic.com
Wed Sep 12 10:27:45 CDT 2007


On Wed, Sep 12, 2007 at 05:59:14AM -0400, Jonathan S. Shapiro wrote:
> On Tue, 2007-09-11 at 15:26 -0500, Matt Mackall wrote:
> > On the other hand, if your regex is too large for your Python build
> > and it has to get broken into pieces, then regexes will probably lose.
> 
> I don't know how python implements regexes, but this statement surprises
> me. I would have expected the regex internal data structure to be
> dynamically allocated, and not to have much in the way of a size limit.
> Does python fragment the regex internally in some cases?

Some builds of Python (including Guido's) have a much smaller limit on
the size of regexps. I'm afraid I don't know much about it beyond that.

To work around this, we do this (util.py:477):

        try:
            pat = '(?:%s)' % '|'.join([regex(k, p, tail) for (k, p) in pats])
            return re.compile(pat).match
        except OverflowError:
            # We're using a Python with a tiny regex engine and we
            # made it explode, so we'll divide the pattern list in two
            # until it works
            l = len(pats)
            if l < 2:
                raise
            a, b = matchfn(pats[:l/2], tail), matchfn(pats[l/2:], tail)
            return lambda s: a(s) or b(s)

Each split means an extra regex test, so it won't take many splits to
be slower than string matching.
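
For illustration only, here is a minimal self-contained sketch of the same
split-on-overflow idea, written for Python 3. It is not the Mercurial code:
build_matcher, tiny_compile, and the 20-character limit are hypothetical
names and values chosen to simulate a Python build with a small regex
engine, and the patterns are already-translated regex strings rather than
(kind, pattern) pairs.

```python
import re
from functools import partial


def tiny_compile(pattern, limit=20):
    # Hypothetical stand-in for a regex engine with a small size limit:
    # refuse any pattern longer than `limit` characters.
    if len(pattern) > limit:
        raise OverflowError("pattern too large for this engine")
    return re.compile(pattern)


def build_matcher(patterns, compile_fn=re.compile):
    """Return a function that tests a string against any of `patterns`."""
    try:
        combined = '(?:%s)' % '|'.join(patterns)
        return compile_fn(combined).match
    except OverflowError:
        # The combined pattern was too big, so halve the list and
        # try each half separately, recursing until every piece fits.
        n = len(patterns)
        if n < 2:
            raise  # a single pattern is too large; nothing left to split
        a = build_matcher(patterns[:n // 2], compile_fn)
        b = build_matcher(patterns[n // 2:], compile_fn)
        # Each split adds one more regex test in the worst case.
        return lambda s: a(s) or b(s)


patterns = [r'.*\.pyc$', r'.*\.orig$', r'build/.*', r'.*~$']
matcher = build_matcher(patterns, partial(tiny_compile, limit=20))

for name in ['foo.pyc', 'build/out.o', 'notes.txt~', 'foo.py']:
    print(name, bool(matcher(name)))
```

With the simulated limit, the four patterns end up in three separately
compiled pieces, so a non-matching name costs three regex tests instead of
one. That is the overhead described above.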

-- 
Mathematics is the supreme nostalgia of our time.

