[PATCH 2 of 4] log: do not walk the entire revision window when it's bigger than required

Benoit Boissinot benoit.boissinot at ens-lyon.org
Wed Apr 28 02:59:46 CDT 2010


On Wed, Apr 28, 2010 at 12:43:12AM -0500, Matt Mackall wrote:
> On Wed, 2010-04-28 at 14:08 +0900, Nicolas Dumazet wrote:
> > I talked a bit about it with Benoit on IRC.
> > 
> > I admit that I still have difficulties to wrap my mind around
> > non-monotic linkrevs. I see the output of test-strip-cross and it
> > helps to see that it indeed happens. But the way the revisions are
> > created is a bit too cryptic to allow me to think about it.
> 
> If I recall correctly, it works like this:
> 
> - filerevs created with the same content and same parents have the same
>   hash
> - newly-created identical files are a common instance of such
> - pull will fix up linkrevs when doing a pull where the linked rev is
>   not included (ie on a parallel branch)
> - the file revisions come across in their original order, so an old file
>   revision can be fixed up to point to a new cset
> - the next linkrev will probably "cross" in by pointing to an earlier
>   revision

Yes that's it, we don't reorder filerevs on pull/push.
> 
> The real problem isn't the crossing, so much as the fact that we now
> have a pointer that points to the future rather than the past. Perhaps
> if we were smarter, we'd have made pull reorder things in flight. But
> strip can create a similar situation.

And anyway, we probably need to live with the fact that cross-linkrev
exists.
> 
> In the even bigger picture, the real problem is that filelog graphs bear
> only a rough relation to the real history of a file as represented by
> changelog+manifest. This makes log -f, annotate, etc., unhappy. I think
> there's probably a reasonably efficient way to reconstruct the full
> file-level history from interpolating in the filelog graph, but there
> are tricky bits..

I had a patch to start that (it was working on (file)contexts, which is
probably still too expensive).

I think the idea is that we only need to explore a small area of the
changelog graph when:
Given curr, the current revision (changelog id), for every p in
filectx.parents(), linkrev(p) is not a descendant of curr.
In that case we have a duplicate rev and need to explore the ancestors
of curr which are not ancestors of p.


> > I do think however, that this serie does not change a thing with this
> > regard. The current code indeed seems broken, now that I'm told about
> > "crossed" linkrevs.
> > But my code does not change much, it only moves a comparison earlier:
> > 
> > before:
> > 
> > def iterator1():
> >     stack = []
> >     for i in baseiterator():
> >         stack.append(i)
> >     for i in reverse(stack):
> >         yield i
> > def iterator2():
> >     for i in iterator1():
> >         if rev(i) <= maxrev:
> >             yield i
> > 
> > after:
> > 
> > def iterator1():
> >     stack = []
> >     for i in baseiterator():
> >         if rev(i) > maxrev:
> >             break
> 
> Perhaps changing that to a continue..?

I think we should drop the test with maxrev, yes.

And the increasing_window() stuff is probably not needed since in most
cases we will only touch the index (filelog data is only read for
renames, but renames only happens when p1 == nullid, which is very rare
I think).

Benoit

-- 
:wq


More information about the Mercurial-devel mailing list