[PATCH 2 of 5] adjustlinkrev: use C implementation to test ancestor if possible

Sun Nov 6 20:01:02 EST 2016

Excerpts from Pierre-Yves David's message of 2016-11-07 01:08:18 +0100:
> I don't doubt you have are getting a good understanding of the current 
> code, but message like this one 
> https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-November/089815.html  
> show that you do not have a full grasp on the best yet.

The first line of that message is:

  Not related to this series directly but I want to remove "_ancestrycontext".

So it's about an unrelated topic - removing "_ancestrycontext". It does not
affect this patch. I think this patch (changing isancestor to use C code
path) is relatively trivial.

> What about an intermediate version that use C code to fill a Python heap 
> object? A good way to get a good grasp on the boost I think having a 
> small Cython proof of concept would help.

That might be helpful. However, I don't have immediate plans for it. Because
we will use C code to do fast "isancestor" test for the "changelog.i" walk,
and if we pre-build the linkrev database, there is no need to do
"changelog.d" walk. So "lazyancestors" won't be actually used in
"adjustlinkrev", which is my main focus right now.

> > I'd note that I think the C code may have room to improve. And if we are
> > going to add some "state" to speed up testing ancestors, I think it should
> > be done globally, i.e. on the C index object. So the state is private to the
> > C code and the Python code just calls index.isancestor, without knowing
> > about the details. As the changelog is shared globally and it may be a waste
> > of memory to have separated states for each fctx.
> 
> I'm a bit afraid of the space complexity of such cache. However, one of 
> you colleague at the sprint some idea around this.

The memory usage of the global cache would be something like O(log(N) * N)
or O(N), depending on how useful we want it to be.

This is better than the "_ancestrycontext" approach - suppose every ctx has
an "_ancestrycontext", it's N^2 memory usage, where N is len(repo).