[PATCH] Add a lazy index file parser

Thu Mar 15 14:13:28 CDT 2012

On Wed, 2012-03-14 at 23:55 -0700, Bryan O'Sullivan wrote:
> # HG changeset patch
> # User Bryan O'Sullivan <bos at serpentine.com>
> # Date 1331794495 25200
> # Branch stable
> # Node ID 6b8704721e53d414d0906fff7b0c7ae236bc6f29
> # Parent  ca5cc2976574d820dad5774afd8c7b3c39ec11cd
> Add a lazy index file parser

We've introduced a strict format for summary lines in your absence; see
the checklist here:

http://mercurial.selenic.com/wiki/ContributingChanges

> We only parse entries in a revlog index file when they are actually needed.
> 
> This makes a huge difference to performance on large revlogs when
> only a few entries are used (a common case).  For instance, "hg
> tip" on a tree with 300,000 commits takes 0.3s before, 0.02 after.
> 
> For revlog-intensive operations (e.g. running "hg log" to completion),
> the lazy approach is about 1% slower than the eager parse_index2.

Low-level algorithms like revlog.headrevs() actually take a bigger hit
(more like 50%). You can see this with the perfheads change I just
pushed. I think we can get this and more back though with further
optimization.
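
For anyone following along, here is a rough sketch of the trade-off
under discussion. It is a toy model and not the code in the patch: the
record layout (two big-endian 32-bit parent revisions per entry) and the
names LazyIndex, eager_index and parsed are made up for illustration.
Each record is decoded only on first access, so a single lookup touches
one entry, while a full scan pays an extra cache check per entry.

```python
import struct

# Hypothetical toy record layout (NOT the real revlog index format):
# two big-endian signed 32-bit parent revisions per entry.
RECORD = struct.Struct(">ii")


class LazyIndex:
    """Decode fixed-size records from a byte buffer only on first access."""

    def __init__(self, data):
        if len(data) % RECORD.size:
            raise ValueError("truncated index data")
        self._data = data
        # One cache slot per record; None means "not decoded yet".
        self._cache = [None] * (len(data) // RECORD.size)
        self.parsed = 0  # number of records decoded so far

    def __len__(self):
        return len(self._cache)

    def __getitem__(self, rev):
        if rev < 0:
            rev += len(self._cache)
        if not 0 <= rev < len(self._cache):
            raise IndexError("revision out of range")
        entry = self._cache[rev]
        if entry is None:
            # First access: decode this one record and remember it.
            entry = RECORD.unpack_from(self._data, rev * RECORD.size)
            self._cache[rev] = entry
            self.parsed += 1
        return entry


def eager_index(data):
    """Decode every record up front, for comparison."""
    return list(RECORD.iter_unpack(data))


# Build a linear toy history of 1000 revisions, then look up only the tip.
data = b"".join(RECORD.pack(rev - 1, -1) for rev in range(1000))
index = LazyIndex(data)
tip = index[-1]
```

After the tip lookup, index.parsed is 1, whereas eager_index(data) has
to decode all 1000 records before it can return anything. A loop over
every revision goes through the "is it cached?" branch each time, which
is the kind of per-entry overhead that a scan like headrevs() would
notice.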

> +	if (entry)
> +		PyObject_GC_UnTrack(entry);

I guess this is safe still. Not sure it matters though.

> +static int
> +index_ass_subscript(indexObject* self, PyObject* item, PyObject* value)
> +{

-- 
Mathematics is the supreme nostalgia of our time.