[Bug 4205] New: Populating tags cache slow for large, multi-headed repositories

mercurial-bugs at selenic.com mercurial-bugs at selenic.com
Mon Mar 24 23:06:16 CDT 2014


http://bz.selenic.com/show_bug.cgi?id=4205

          Priority: normal
            Bug ID: 4205
                CC: mercurial-devel at selenic.com
          Assignee: bugzilla at selenic.com
           Summary: Populating tags cache slow for large, multi-headed
                    repositories
          Severity: bug
    Classification: Unclassified
                OS: All
          Reporter: gregory.szorc at gmail.com
          Hardware: All
            Status: UNCONFIRMED
           Version: unspecified
         Component: Mercurial
           Product: Mercurial

In issue 4201, tags cache creation was identified as a performance pain point
when dealing with large (in terms of changesets and number of heads)
repositories.

Currently, populating the tags cache requires resolving the .hgtags filelog
node for each head changeset. That means resolving each head changeset's
manifest and looking up the corresponding .hgtags filelog node in that
manifest.
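
To illustrate why this is costly, here is a minimal sketch of the per-head
lookup. It is a toy model, not Mercurial code: plain dicts stand in for the
manifest revlog, changeset revisions double as manifest revisions, and the names
resolve_manifest and hgtags_nodes_for_heads are hypothetical. The point is only
that every head pays for replaying a full manifest delta chain.

```python
# Toy model only: plain dicts stand in for Mercurial's manifest revlog.
# None of these names are Mercurial APIs.

def resolve_manifest(manifests, rev, stats):
    """Rebuild the full manifest for `rev` by replaying its delta chain."""
    chain = []
    while rev is not None:
        chain.append(manifests[rev])
        rev = manifests[rev]["base"]
    full = {}
    # Apply deltas from the chain base forward to the requested revision.
    for entry in reversed(chain):
        full.update(entry["delta"])
        stats["deltas_applied"] += 1
    return full


def hgtags_nodes_for_heads(heads, manifests, stats):
    """Map each head to its .hgtags filelog node (None if the file is absent)."""
    result = {}
    for head in heads:
        manifest = resolve_manifest(manifests, head, stats)
        result[head] = manifest.get(".hgtags")
    return result


# Hypothetical repository: revisions 2 and 3 are both heads on top of 1.
manifests = {
    0: {"base": None, "delta": {"a.txt": "a0", ".hgtags": "t0"}},
    1: {"base": 0, "delta": {"a.txt": "a1"}},
    2: {"base": 1, "delta": {".hgtags": "t1"}},
    3: {"base": 1, "delta": {"b.txt": "b0"}},
}
stats = {"deltas_applied": 0}
nodes = hgtags_nodes_for_heads([2, 3], manifests, stats)
```

In this model the work grows with the number of heads times the delta chain
length, even when most heads share almost all of their history.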

On large repositories (in terms of files and number of heads), manifest
resolution can be very slow, at least with the default revlog settings (that
is, without generaldelta, lz4 revlog, etc.). Manifest resolution dominates the
initial tags cache population time.

In issue 4201, I wrote a quick patch to take a hybrid approach to reading
.hgtags filelog node values. It iterated over the filelog and saved nodes for
head changesets by consulting the filelog revision's linkrev. This cut down on
the number of manifest lookups significantly. However, the patch failed to
produce any real-world wins. The likely causes are long delta chains and the
fact that many heads didn't change the .hgtags file. The lack of aggressive
manifest caching likely doesn't help matters.
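
The hybrid approach described above can be sketched roughly as follows. This is
not the patch from issue 4201, only an illustration under assumed data
structures: the filelog is a list of dicts with "node" and "linkrev" keys, and
manifest_lookup is a hypothetical stand-in for the slow per-head manifest path.

```python
# Toy model only: the filelog is a list of dicts, and `manifest_lookup` is a
# hypothetical callable standing in for the slow manifest-based path.

def hgtags_nodes_hybrid(heads, filelog, manifest_lookup):
    """Resolve .hgtags nodes via linkrevs first, then fall back to manifests."""
    head_set = set(heads)
    found = {}
    # Fast path: a filelog revision whose linkrev is a head gives that head's
    # .hgtags node without touching the manifest.
    for entry in filelog:
        if entry["linkrev"] in head_set:
            found[entry["linkrev"]] = entry["node"]
    # Slow path: heads that did not change .hgtags still need a manifest lookup.
    fallbacks = 0
    for head in heads:
        if head not in found:
            found[head] = manifest_lookup(head)
            fallbacks += 1
    return found, fallbacks


# Hypothetical data: head 2 changed .hgtags, head 3 did not.
filelog = [
    {"node": "t0", "linkrev": 0},
    {"node": "t1", "linkrev": 2},
]
slow_path = {3: "t0"}.get  # pretend result of resolving head 3's manifest
nodes, fallbacks = hgtags_nodes_hybrid([2, 3], filelog, slow_path)
```

Only heads that themselves changed .hgtags take the fast path here, which is
consistent with the observation that the patch helps little when many heads
did not change the file.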
