[PATCH] performance: disable workaround for an old bug of Python

Sun Aug 21 12:49:45 EDT 2016

On Sun, Aug 21, 2016 at 5:19 PM, Yuya Nishihara <yuya at tcha.org> wrote:
> On Sat, 20 Aug 2016 23:04:06 +0200, Maciej Fijalkowski wrote:
>> On Sun, Aug 14, 2016 at 4:14 AM, Yuya Nishihara <yuya at tcha.org> wrote:
>> > On Fri, 12 Aug 2016 23:42:38 +0200, Maciej Fijalkowski wrote:
>> >> Well, seems the comment is out of date. I know the issue - the GC got
>> >> triggered every X objects in 2.6, which caused O(n^2) performance when
>> >> allocating lots of objects. These days (since 2.7) it adapts the
>> >> threshold, which means it always amortizes to O(n).
>> >
>> >> On Thu, Aug 11, 2016 at 12:36 AM, Matt Mackall <mpm at selenic.com> wrote:
>> >> > It's either fixed in 2.7 or it's not. The primary users of this code are
>> >> > dirstate and obsmarkers, so it should be pretty easy to test. This changeset has
>> >> > a benchmark:
>> >> >
>> >> > changeset:   25675:5817f71c2336
>> >> > user:        Pierre-Yves David <pierre-yves.david at fb.com>
>> >> > date:        Wed Nov 26 16:58:31 2014 -0800
>> >> > files:       mercurial/obsolete.py
>> >> > description:
>> >> > obsstore: disable garbage collection during initialization (issue4456)
>> >
>> > I can't say there's a measurable win on CPython 2.7, and we have a native
>> > parser now. So I'll take this patch, thanks.
>> >
>> >   $ hg debugobsolete | wc -l
>> >   106437
>> >   $ hg up 5817f71c2336  # use pure python parser
>> >   $ python -m timeit -r10 \
>> >   -s 'from mercurial import obsolete, scmutil; svfs = scmutil.vfs(".hg/store")' \
>> >   'obsolete.obsstore(svfs)'
>> >
>> >   (Python 2.6.9, GC disabled)
>> >   10 loops, best of 10: 714 msec per loop
>> >   (Python 2.6.9, GC enabled)
>> >   10 loops, best of 10: 746 msec per loop
>> >
>> >   (Python 2.7.12+, GC disabled)
>> >   10 loops, best of 10: 699 msec per loop
>> >   (Python 2.7.12+, GC enabled)
>> >   10 loops, best of 10: 703 msec per loop
>> >
>> > The result of timeit wasn't stable.
>>
>> Well... you're using TIMEIT to do GC benchmarks. That's a terrible
>> idea in general, but for GC especially. You're taking the minimum of
>> 10 runs, so of course it'll be random. You should take the average
>> instead and do more runs.
>>
>> That said, this is what I would expect, but I still strongly disagree
>> with the methodology.
>
> Good point. At the least, I should calculate the average of many runs.
>
> I thought 100k markers would be large enough to show a significant
> difference even with timeit on Python 2.6.9, as described in 5817f71c2336,
> but apparently not.

Never use timeit.
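
For reference, here is a minimal sketch of the kind of measurement suggested
above: many runs, reporting the mean and standard deviation rather than the
best run. Two things are worth knowing about timeit here: it reports the best
of the repeats, and by default it temporarily turns off garbage collection
while timing unless gc.enable() is put in the setup string. The workload
build_objects() and the helper bench() are hypothetical stand-ins, not
Mercurial code, and the sketch assumes Python 3 (time.perf_counter,
statistics).

```python
import gc
import statistics
import time


def build_objects(n):
    # Hypothetical stand-in workload: allocate many small container objects,
    # loosely similar to parsing n markers into tuples.
    return [(i, str(i), (i, i + 1)) for i in range(n)]


def bench(func, runs=30, gc_enabled=True):
    """Call func() `runs` times and return (mean, stdev) in seconds."""
    was_enabled = gc.isenabled()
    samples = []
    try:
        for _ in range(runs):
            gc.collect()  # start every run from a comparable heap state
            if gc_enabled:
                gc.enable()
            else:
                gc.disable()
            start = time.perf_counter()
            result = func()
            samples.append(time.perf_counter() - start)
            del result  # drop the objects before the next run
    finally:
        # Restore whatever GC state the caller had.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    for enabled in (False, True):
        mean, stdev = bench(lambda: build_objects(100000), gc_enabled=enabled)
        print("GC %s: mean %.1f ms, stdev %.1f ms"
              % ("enabled" if enabled else "disabled", mean * 1e3, stdev * 1e3))
```

To compare the real thing, func would be replaced by something like the
obsstore construction from the quoted benchmark. If the two means differ by
less than the standard deviations, the difference is noise.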