[patch] syntax:plain for .hgignore

Matt Mackall mpm at selenic.com
Wed Sep 12 12:52:23 CDT 2007


On Tue, Sep 11, 2007 at 11:27:58PM +0200, Guido Ostkamp wrote:
> Hello Matt,
> 
> On Tue, 11 Sep 2007, Matt Mackall wrote:
> >Something strikes me as odd here. I wrote a quick little test:
> >
> >$ time python2.5 retest.py re exit
> >
> >real    0m0.806s
> >user    0m0.796s
> >sys     0m0.008s
> 
> >On the other hand, if your regex is too large for your Python build and 
> >it has to get broken into pieces, then regexes will probably lose.
> 
> the test above bombs out for me with this:
> 
> $ ./matt.py re exit
> Traceback (most recent call last):
>     File "./matt.py", line 20, in <module>
>       m = rematch(pats)
>     File "./matt.py", line 12, in rematch
>       return re.compile(pat).match
>     File "/usr/local/lib/python2.5/re.py", line 180, in compile
>       return _compile(pattern, flags)
>     File "/usr/local/lib/python2.5/re.py", line 231, in _compile
>       p = sre_compile.compile(pattern, flags)
>     File "/usr/local/lib/python2.5/sre_compile.py", line 530, in compile
>       groupindex, indexgroup
> OverflowError: regular expression code size limit exceeded
> 
> This is why I requested a Mercurial bugfix as the same has happened with 
> our .hgignore before. You kindly provided one in changeset '0f6a1bdf89fb'. 
> Since then it used to work for us.
> 
> I'm willing to try it with a tweaked Python 2.5.1 build, however I don't 
> know what to change. The 'configure --help' of Python does not give any 
> hint.
> 
> Do you have any hints for me what I need to change to have the regex 
> module handle larger regular expressions?

No idea, nor have I had any luck googling for it.

However, there's something that might help without rebuilding Python.
Turns out that Python (and Perl) don't actually optimize expressions
like foobar|foobaz|fooquux into foo(bar|baz|quux). So if you have a
bunch of files with common prefixes, changing foo/a, foo/b, foo/c into
foo/(a|b|c) (or foo/[abc]!) will make them match faster. Similarly,
joining common tails like (bar.o|bar.s) -> bar.[so] will also improve
things.

There's a Perl script that will do this sort of thing here:

http://search.cpan.org/src/DANKOGAI/Regexp-Optimizer-0.15/lib/Regexp/List.pm
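
For a rough idea of what the prefix-factoring transformation looks like in Python, here is a minimal sketch. It is not the algorithm used by the Perl module above, and it only handles literal strings (not arbitrary regex fragments such as those an .hgignore may contain). The function name factor_prefixes is hypothetical, and the code assumes a modern Python 3 rather than the 2.5 discussed in this thread.

```python
import re

def factor_prefixes(words):
    """Return a regex source string that matches exactly the given
    literal strings, with common prefixes shared (hypothetical helper)."""
    # Build a character trie; the "" key marks the end of a word.
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}

    def emit(node):
        alts = []
        optional = False  # True if a word may end at this node
        for ch in sorted(node):
            if ch == "":
                optional = True
            else:
                alts.append(re.escape(ch) + emit(node[ch]))
        if not alts:
            return ""
        if len(alts) == 1 and not optional:
            return alts[0]
        group = "(?:" + "|".join(alts) + ")"
        return group + "?" if optional else group

    return emit(trie)

# foo/a, foo/b, foo/c share the prefix "foo/", so it is emitted once
# and followed by a single alternation over a, b and c.
pattern = factor_prefixes(["foo/a", "foo/b", "foo/c"])
matcher = re.compile(pattern)
print(pattern)
print(bool(matcher.fullmatch("foo/b")), bool(matcher.fullmatch("foo/d")))
```

This only merges common prefixes; joining common tails (the bar.[so] case) and collapsing single-character alternations into character classes would need additional passes.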

-- 
Mathematics is the supreme nostalgia of our time.

