Problems extracting renames

Matt Mackall mpm at
Tue Feb 12 14:01:29 CST 2008

On Tue, 2008-02-12 at 18:28 +0000, Till Varoquaux wrote:
> According to my very cursory understanding of hg, the move information
> is just metadata added to make people like me happy. However, the copy
> information is at the core of the system, and under the hood a move is
> just a copy followed by a delete. I am a bit surprised that hg does
> not fall back to showing this copy information. What am I missing here?
> Also, suppose someone was hellbent on extracting the rename
> information from a repository: would you have pointers on where to
> start?

File revisions are stored as "revlogs".

Renames are stored via "metadata" in file revlogs. They are currently
the only user of this sort of file-level metadata. Metadata is indicated
by the revision stored in the revlog starting with "\1\n" when
reconstituted. Something like this:

copy: original-file.c
copyrev: <hash of copied version>
<rest of file data>
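
To make the layout concrete, here is a minimal sketch of splitting such
a reconstituted revision into its metadata and the file data. It is
plain Python and does not touch the Mercurial API. The function name
parse_filelog_meta and the sample hash are made up for illustration,
and the sketch assumes the metadata block is closed by a second "\1\n"
marker, which the description above does not spell out.

```python
META_MARKER = b"\x01\n"  # the "\1\n" marker described above


def parse_filelog_meta(text):
    """Split reconstituted revision text into (metadata dict, file data).

    Hypothetical helper, not part of Mercurial.
    """
    if not text.startswith(META_MARKER):
        # No metadata header: the whole text is file data.
        return {}, text
    # Assumption: the header ends at the next "\1\n" marker.
    end = text.index(META_MARKER, len(META_MARKER))
    header = text[len(META_MARKER):end]
    meta = {}
    for line in header.splitlines():
        key, value = line.split(b": ", 1)
        meta[key] = value
    return meta, text[end + len(META_MARKER):]


# Made-up sample; the copyrev value is a placeholder, not a real hash.
sample = (b"\x01\n"
          b"copy: original-file.c\n"
          b"copyrev: 0123456789abcdef0123456789abcdef01234567\n"
          b"\x01\n"
          b"int main(void) { return 0; }\n")
meta, data = parse_filelog_meta(sample)
```

With the sample above, meta holds the "copy" and "copyrev" entries and
data holds only the C source line.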

Here's a quick script that will find all the copies in a repo:

from mercurial import ui, hg, node

u = ui.ui()
r = hg.repository(u, ".")

# collect the name of every file touched by any changeset
files = {}
for n in xrange(r.changelog.count()):
    for f in r.changectx(n).files():
        files[f] = 1

for f in files:
    fr = r.file(f)  # the filelog (revlog) for this file
    for rev in xrange(fr.count()):
        n = fr.node(rev)
        # (source path, source file node) if this revision is a copy
        ri = fr.renamed(n)
        # linkrev maps a file revision to the changeset that added it
        cn = r.changelog.node(fr.linkrev(n))
        if ri:
            fr2 = r.file(ri[0])
            cn2 = r.changelog.node(fr2.linkrev(ri[1]))
            print "%s@%s from %s@%s" % (f, node.short(cn),
                                        ri[0], node.short(cn2))

Basically, we look at each changeset to build a list of files. Then we
look at each revision of each file to check for rename info.

The only tricky bit is we've got to translate from the hidden revision
ids at the filelog level to the publicly visible ones at the changelog
level.
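
As a rough illustration of that translation, here is a toy model in
plain Python. Nothing in it is Mercurial's real data structure or API:
FileRev, changelog, filelogs, changeset_for and find_copies are all
hypothetical names, and the ids are made up. Each file revision carries
a linkrev, which is the index of the changeset that introduced it.

```python
from collections import namedtuple

# Toy stand-ins, not Mercurial classes.
# copy is None, or a (source path, source file node) pair.
FileRev = namedtuple("FileRev", "node linkrev copy")

# The list index plays the role of the changelog revision number.
changelog = ["c0aaaa", "c1bbbb", "c2cccc"]

filelogs = {
    "old.c": [FileRev("f0", 0, None)],
    "new.c": [FileRev("g0", 2, ("old.c", "f0"))],
}


def changeset_for(path, filenode):
    """Translate a file-level node into the visible changeset id."""
    for entry in filelogs[path]:
        if entry.node == filenode:
            return changelog[entry.linkrev]
    raise KeyError((path, filenode))


def find_copies():
    """List every copy as 'dest@changeset from source@changeset'."""
    results = []
    for path in sorted(filelogs):
        for entry in filelogs[path]:
            if entry.copy:
                src_path, src_node = entry.copy
                results.append("%s@%s from %s@%s" % (
                    path, changelog[entry.linkrev],
                    src_path, changeset_for(src_path, src_node)))
    return results


copies = find_copies()
```

Both sides of a copy go through the same lookup: the file-level id is
only meaningful inside its own filelog, so it has to be mapped to a
changeset before it is shown to the user.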

Mathematics is the supreme nostalgia of our time.
