Efficiently comparing manifests

Matt Mackall mpm at selenic.com
Thu May 13 14:27:56 CDT 2010


On Thu, 2010-05-13 at 15:11 -0400, Greg Ward wrote:
> On Thu, May 13, 2010 at 3:02 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Thu, 2010-05-13 at 14:44 -0400, Greg Ward wrote:
> >> So I peeled a layer off the onion and went straight to the manifest class:
> >>
> >>     cl = repo.changelog
> >>     ml = repo.manifest
> >>
> >>     node = cl.node(rev)
> >>     p1 = cl.parents(node)[0]
> >>
> >>     mymft = ml.revision(cl.read(node)[0])
> >>     p1mft = ml.revision(cl.read(p1)[0])
> >>     return mymft == p1mft
> >
> > How about simply:
> >
> > return cl.read(node)[0] == cl.read(p1)[0] # compare manifest SHA1s
> >
> > This is equivalent to:
> >
> > return repo[node].manifestnode() == repo[p1].manifestnode()
> 
> But dummy merges add an entry to the manifest log: new entry, new node
> ID, same content as first parent.

Ahh, right. That may or may not be a bug.

If you want to be clever, you can still get away with doing half the
work:

m = manifest.read(node)        # read only one of the two manifests
h = hash(m, otherp1, otherp2)  # calculate the hash assuming unchanged
return h == other              # other: node ID of the manifest not read
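
For reference, here is a minimal standalone sketch of that trick in Python.
It assumes a revlog node ID is the SHA-1 of the two parent IDs (in sorted
order) followed by the revision text. The names nodeid and same_text_as are
made up for illustration and are not Mercurial APIs.

```python
import hashlib

NULLID = b"\0" * 20  # the null parent: 20 zero bytes


def nodeid(text, p1, p2=NULLID):
    """Hash text together with its parents (assumed revlog scheme)."""
    a, b = sorted((p1, p2))  # parents are hashed in sorted order
    s = hashlib.sha1(a)
    s.update(b)
    s.update(text)
    return s.digest()


def same_text_as(text, other, otherp1, otherp2):
    """True if the entry whose node ID is `other` stores exactly `text`.

    Only `text` has to be reconstructed; the other entry contributes just
    its node ID and its two parent IDs, all of which sit in the index.
    """
    return nodeid(text, otherp1, otherp2) == other
```

Here `text` would be the raw text of the first parent's manifest, and
`other`, `otherp1`, `otherp2` the node ID and parents of the entry being
tested, so only one manifest is ever reconstructed.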

> The relevant extract from "hg debugindex" on the manifest is:
> 
>    rev    offset  length   base linkrev nodeid       p1           p2
> [...]
>  97124  25851928      81  92468  105380 d49683380e9d 2468ff6c5ee7 72a500b75e74
>  97125  25852009       0  92468  105411 fca2d351af12 d49683380e9d 8a6baebeeacf
> 
> (The manifest and changelog rev numbers are way off because this
> manifest log has been shrunk by shrink-revlog.  The interesting side
> effect of this is that the dummy merge is readily apparent here
> because it has length 0 in the manifest log.  But I can't use that,
> because I can't assume that every repo has been shrunk, or that they
> have been shrunk with an algorithm that picks parent deltas with
> perfect accuracy.)

Well, you can use it, but you'll need two cases: the delta is against the
parent revision, or it isn't. Also, make sure you always reconstruct the
(numerically) earlier version first; revlog's internal caches will work
much better that way.
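
A rough sketch of those two cases, using a toy model rather than the real
revlog classes. Entry is a hypothetical stand-in for one index row, with
fields named after the "hg debugindex" columns quoted above, and read_text
is a caller-supplied function returning the full text of a revision. It
assumes each delta is stored against the previous revision in the file, and
that an empty delta means identical text.

```python
from collections import namedtuple

# Hypothetical, simplified index row (not a Mercurial type).
Entry = namedtuple("Entry", "rev length base p1rev")


def unchanged_from_p1(entry, read_text):
    """Return True if this revision's text equals its first parent's."""
    # Case 1: the stored delta is against the first parent, i.e. p1 is
    # the previous revision and this entry is not a full snapshot.
    if entry.p1rev == entry.rev - 1 and entry.base != entry.rev:
        # Assumption: a zero-length delta means "no change", and a
        # non-empty delta means the text really differs.
        return entry.length == 0

    # Case 2: the delta is against something else, so compare full
    # texts, reconstructing the numerically earlier revision first.
    first, second = sorted((entry.p1rev, entry.rev))
    texts = {first: read_text(first)}
    texts[second] = read_text(second)
    return texts[entry.rev] == texts[entry.p1rev]
```

With the rows quoted above, Entry(97125, 0, 92468, 97124) falls into the
first case and is answered from the index alone.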

-- 
Mathematics is the supreme nostalgia of our time.



