[patch] syntax:plain for .hgignore
Matt Mackall
mpm at selenic.com
Wed Sep 12 13:42:47 CDT 2007
On Wed, Sep 12, 2007 at 08:32:13PM +0200, Guido Ostkamp wrote:
> On Wed, 12 Sep 2007, Matt Mackall wrote:
>
> >>I'm willing to try it with a tweaked Python 2.5.1 build, however I
> >>don't know what to change. The 'configure --help' of Python does not
> >>give any hint.
> >>
> >>Do you have any hints for me on what I need to change to have the regex
> >>module handle larger regular expressions?
> >
> >No idea, nor have I had any luck googling for it.
>
> I've checked out the Python 2.5.1 sources and found the following:
>
> The error is raised by the following code in .../Python-2.5.1/Modules/_sre.c:
>
>     for (i = 0; i < n; i++) {
>         PyObject *o = PyList_GET_ITEM(code, i);
>         unsigned long value = PyInt_Check(o)
>             ? (unsigned long)PyInt_AsLong(o)
>             : PyLong_AsUnsignedLong(o);
>         self->code[i] = (SRE_CODE) value;
>         if ((unsigned long) self->code[i] != value) {
> ***         PyErr_SetString(PyExc_OverflowError,
> ***             "regular expression code size limit exceeded");
>             break;
>         }
>     }
>
> It appears that the 'unsigned long' variable 'value' is stored in
> 'self->code[i]', which is of type 'unsigned short' because
> Python-2.5.1/Modules/sre.h defines SRE_CODE as:
>
> /* size of a code word (must be unsigned short or larger, and
> large enough to hold a Py_UNICODE character) */
> #ifdef Py_UNICODE_WIDE
> #define SRE_CODE Py_UCS4
> #else
> #define SRE_CODE unsigned short
> #endif
>
> I've changed that last SRE_CODE definition to 'unsigned long'. After
> rebuilding Python, I could run your test program successfully.
>
> However, it remains unclear to me what 'unicode' has to do with the
> general size of a regular expression stack. Maybe this is a general
> Python bug, I don't know.
>
> Interestingly, although your test program gave basically the same results
> that you mentioned (i.e. regex was faster), our 'plain' style patch is
> still faster, even with this Python version.

Can you mail your .hgignore file? Privately and/or obfuscated is fine.
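
For anyone following along, the round-trip check quoted above can be exercised in isolation. This is only a minimal sketch, not code from the Python sources: the type names narrow_code_t and wide_code_t and the helper functions are hypothetical stand-ins for the two SRE_CODE widths discussed, and it assumes a platform where 'unsigned long' is wider than 'unsigned short'.

```c
#include <limits.h>

/* Hypothetical stand-ins for the two SRE_CODE widths discussed above. */
typedef unsigned short narrow_code_t;  /* SRE_CODE without Py_UNICODE_WIDE */
typedef unsigned long  wide_code_t;    /* SRE_CODE after the local change */

/* Largest value a narrow code word can hold on this platform. */
unsigned long narrow_code_max(void)
{
    return (unsigned long) USHRT_MAX;
}

/* Same test as in the quoted loop: store the value in the code word,
   cast it back, and see whether anything was lost.
   Returns 1 if the value survives, 0 if it was truncated. */
int fits_narrow_code(unsigned long value)
{
    narrow_code_t stored = (narrow_code_t) value;
    return (unsigned long) stored == value;
}

/* With an 'unsigned long' code word the round trip cannot lose bits. */
int fits_wide_code(unsigned long value)
{
    wide_code_t stored = (wide_code_t) value;
    return (unsigned long) stored == value;
}
```

Under those assumptions, fits_narrow_code(narrow_code_max() + 1) returns 0, which corresponds to the case where the quoted loop sets the OverflowError, while fits_wide_code() returns 1 for the same value.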
--
Mathematics is the supreme nostalgia of our time.