Efficiently comparing manifests
Greg Ward
greg at gerg.ca
Thu May 13 13:44:40 CDT 2010
API question: how do I *efficiently* compare manifests from related
changesets? Specifically, I am writing a script to detect all dummy
merges in our repository, i.e. changesets where p2 != null and
p1.manifest == self.manifest. I have to compare the manifest
contents, not just the manifest node ID, because Mercurial creates a
new manifest entry even when there are no diffs relative to the first
parent.
My first attempt was this:
    def isdummymerge(ui, repo, rev):
        """Return true if rev is a dummy merge, i.e. has no delta relative to
        its first parent."""
        cctx = repo[rev]
        parent = cctx.p1().manifest()
        manifest = cctx.manifest()
        return parent == manifest
(I've already established that rev refers to a merge.) But that
doesn't work because changectx.manifest() only includes file names and
node IDs; if two changesets differ only in file permissions, then
this will incorrectly conclude that the manifests are equal. It's
also very slow; each entry in our manifest log is about 1.8 MB (18,000
files).
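The flag problem can be reproduced outside Mercurial with a small sketch. It assumes the raw manifest text is a sequence of lines of the form `path\0<40 hex digits><flags>`, where the flags are empty, `x` or `l`. The helper names (`parse_manifest`, `nodes_only`) and the sample data are made up for illustration and are not part of Mercurial's API.

```python
def parse_manifest(text):
    """Parse raw manifest text into {path: (hexnode, flags)}.

    Assumes each line looks like 'path\\0<40 hex digits><flags>'.
    """
    entries = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip the empty string after the final newline
        path, rest = line.split("\0", 1)
        entries[path] = (rest[:40], rest[40:])
    return entries


def nodes_only(entries):
    """Drop the flags, keeping only path -> hexnode."""
    return dict((path, node) for path, (node, flags) in entries.items())


NODE_A = "a" * 40
NODE_B = "b" * 40

# Hypothetical sample data: the child only makes run.sh executable.
parent_text = "README\0%s\nrun.sh\0%s\n" % (NODE_A, NODE_B)
child_text = "README\0%s\nrun.sh\0%sx\n" % (NODE_A, NODE_B)

parent = parse_manifest(parent_text)
child = parse_manifest(child_text)

same_nodes = nodes_only(parent) == nodes_only(child)  # flags ignored
same_entries = parent == child                        # flags included
same_text = parent_text == child_text                 # raw text comparison
```

In this toy case the name-and-node comparison reports the two manifests as equal, while comparing the full entries or the raw text detects the permission change.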
So I peeled a layer off the onion and went straight to the manifest class:
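    def isdummymerge(ui, repo, rev):
        cl = repo.changelog
        ml = repo.manifest
        node = cl.node(rev)
        p1 = cl.parents(node)[0]
        # cl.read(n)[0] is the manifest node recorded in changeset n
        mymft = ml.revision(cl.read(node)[0])
        p1mft = ml.revision(cl.read(p1)[0])
        # compare the raw manifest text, flags included
        return mymft == p1mft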
This seems to work, but it's still awfully slow. (It should be a bit
faster since I'm not parsing that 1.8 MB string into an 18,000-entry
dict, but I don't think that's the bottleneck.) ISTR someone
(Benoit?) advising use of changectx.manifestdelta() (or
manifest.readdelta()) for stuff like this, but I don't see how that
can work: it appears to me that manifest.readdelta() just returns the
diff between rev-1 and rev. That's no good: I need the diff between
parent1(node) and node.
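The concern about deltas can be made concrete with a toy model. This is not Mercurial code: it simply assumes, as described above, that a stored delta is taken against the previous revision in storage order (rev-1), and it uses plain dicts as stand-ins for manifests. All names and data here are hypothetical.

```python
def dictdiff(old, new):
    """Return {path: (oldvalue, newvalue)} for every path that differs."""
    changed = {}
    for path in set(old) | set(new):
        if old.get(path) != new.get(path):
            changed[path] = (old.get(path), new.get(path))
    return changed


# Toy manifest log in storage order: list index = manifest revision number.
# Each entry is (p1 revision, {path: node}); -1 means "no parent".
manifests = [
    (-1, {"a": "n1"}),            # rev 0
    (0, {"a": "n1", "b": "n2"}),  # rev 1: one branch adds b
    (0, {"a": "n1", "c": "n3"}),  # rev 2: another branch adds c
    (1, {"a": "n1", "b": "n2"}),  # rev 3: dummy merge on top of rev 1
]


def delta_vs_previous(rev):
    """Diff against rev - 1, the neighbour in storage order."""
    return dictdiff(manifests[rev - 1][1], manifests[rev][1])


def delta_vs_p1(rev):
    """Diff against the first parent, which is what the check needs."""
    p1 = manifests[rev][0]
    return dictdiff(manifests[p1][1], manifests[rev][1])


storage_delta = delta_vs_previous(3)  # non-empty: rev 2 is unrelated
parent_delta = delta_vs_p1(3)         # empty: nothing changed since p1
```

Under these assumptions the two deltas agree only when the first parent happens to be stored at rev-1 (as for rev 1 here); for rev 3 the storage-order delta shows changes even though the manifest is identical to its first parent.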
Suggestions?
Greg