[PATCH 5 of 7 upgraderepo V2] repair: migrate revlogs during upgrade

Gregory Szorc gregory.szorc at gmail.com
Thu Nov 24 23:18:27 EST 2016


On Thu, Nov 24, 2016 at 7:28 PM, Gregory Szorc <gregory.szorc at gmail.com>
wrote:

> # HG changeset patch
> # User Gregory Szorc <gregory.szorc at gmail.com>
> # Date 1480040072 28800
> #      Thu Nov 24 18:14:32 2016 -0800
> # Node ID a0a73b7b8699a3b831d8274dbfb0d7918b36be54
> # Parent  17c9861cd0084c99bac7071e07fc910373568df3
> repair: migrate revlogs during upgrade
>
> Our next step for in-place upgrade is to migrate store data. Revlogs
> are the biggest source of data within the store and a store is useless
> without them, so we implement their migration first.
>
> Our strategy for migrating revlogs is to walk the store and call
> `revlog.clone()` on each revlog. There are some minor complications.
>
> Because revlogs have different storage options (e.g. changelog has
> generaldelta and delta chains disabled), we need to obtain the
> correct class of revlog so inserted data is encoded properly for its
> type.
>
> Various attempts at implementing progress indicators that didn't lead
> to frustration from false "it's almost done" indicators were made.
>
> I initially used a single progress bar based on number of revlogs.
> However, this quickly churned through all filelogs, got to 99% then
> effectively froze at 99.99% when it got to the manifest.
>
> So I converted the progress bar to total revision count. This was a
> little bit better. But the manifest was still significantly slower
> than filelogs and it took forever to process the last few percent.
>
> I then tried both revision/chunk bytes and raw bytes as the
> denominator. This had the opposite effect: because so much data is in
> manifests, it would churn through filelogs without showing much
> progress. When it got to manifests, it would fill in 90+% of the
> progress bar.
>
> I finally gave up having a unified progress bar and instead implemented
> 3 progress bars: 1 for filelog revisions, 1 for manifest revisions, and
> 1 for changelog revisions. I added extra messages indicating the total
> number of revisions of each so users know there are more progress bars
> coming.
>
> I also added extra messages before and after each stage to give extra
> details about what is happening. Strictly speaking, this isn't
> necessary. But the numbers are impressive. For example, when converting
> a non-generaldelta mozilla-central repository, the messages you see are:
>
>    migrating 2475593 total revisions (1833043 in filelogs, 321156 in
> manifests, 321394 in changelog)
>    migrating 1.67 GB in store; 2508 GB tracked data
>    migrating 267868 filelogs containing 1833043 revisions (1.09 GB in
> store; 57.3 GB tracked data)
>    finished migrating 1833043 filelog revisions across 267868 filelogs;
> change in size: -415776 bytes
>    migrating 1 manifests containing 321156 revisions (518 MB in store;
> 2451 GB tracked data)
>
> That "2508 GB" figure really blew me away. I had no clue that the raw
> tracked data in mozilla-central was that large. Granted, 2451 GB is in
> the manifest and "only" 57.3 GB is in filelogs. But still.
>

While we're talking about numbers, revlog.clone() calls self.revision() for
every rev. And as part of that it needs to validate the SHA-1 of the
fulltext.

Since we have 2508 GB of data, the overhead of SHA-1 is not insignificant!

It took ~50m of CPU time to perform a `hg debugupgraderepo` on
mozilla-central on my i7-6700K. That's *without* any delta recomputation.

Assuming SHA-1 can hash at 1 GB/s on my machine (which seems to be the
ballbark I've benchmarked it at), that means 2508 of the ~3000s (83.6%) in
`hg debugupgraderepo` were spent doing SHA-1 verification!

On first glance, that percentage seems a bit high (perhaps I'm
underestimating SHA-1's speed).

The farthest I've let a conversion with statprof run before I kill things
is ~10m. But that statprof said we were spending ~32% of samples in
hashing. Considering 97.7% of the repo data is in the manifest, that
percentage can only go up since at least 40m of the ~50m wall time of an
upgrade was in the hash. So it is certainly plausible that SHA-1 hashing is
responsible for ~80% of CPU time during execution. Yikes.

I guess what I'm trying to say is a) really large manifests suck
performance via hashing (go treemanifests!) b) I'll probably do a follow-up
to mitigate (perhaps optionally) the impact of SHA-1 hashing when doing
repo upgrades.


>
> It's worth noting that gratuitous loading of source revlogs in order
> to display numbers and progress bars does serve a purpose: it ensures
> we can open all source revlogs. We don't want to spend several minutes
> copying revlogs only to encounter a permissions error or similar later.
>
> As part of this commit, we also add swapping of the store directory
> to the upgrade function. After revlogs are converted, we move the
> old store into the backup directory then move the temporary repo's
> store into the old store's location. On well-behaved systems, this
> should be 2 atomic operations and the window of inconsistency show be
> very narrow.
>
> There are still a few improvements to be made to store copying and
> upgrading. But this commit gets the bulk of the work out of the way.
>
> diff --git a/mercurial/repair.py b/mercurial/repair.py
> --- a/mercurial/repair.py
> +++ b/mercurial/repair.py
> @@ -11,15 +11,19 @@ from __future__ import absolute_import
>  import errno
>  import hashlib
>  import tempfile
> +import time
>
>  from .i18n import _
>  from .node import short
>  from . import (
>      bundle2,
>      changegroup,
> +    changelog,
>      error,
>      exchange,
> +    manifest,
>      obsolete,
> +    revlog,
>      scmutil,
>      util,
>  )
> @@ -638,6 +642,162 @@ def upgradedetermineactions(repo, improv
>
>      return newactions
>
> +def _revlogfrompath(repo, path):
> +    """Obtain a revlog from a repo path.
> +
> +    An instance of the appropriate class is returned.
> +    """
> +    if path == '00changelog.i':
> +        return changelog.changelog(repo.svfs)
> +    elif path.endswith('00manifest.i'):
> +        mandir = path[:-len('00manifest.i')]
> +        return manifest.manifestrevlog(repo.svfs, dir=mandir)
> +    else:
> +        # Filelogs don't do anything special with settings. So we can use
> a
> +        # vanilla revlog.
> +        return revlog.revlog(repo.svfs, path)
> +
> +def _copyrevlogs(ui, srcrepo, dstrepo, tr, deltareuse,
> aggressivemergedeltas):
> +    """Copy revlogs between 2 repos."""
> +    revcount = 0
> +    srcsize = 0
> +    srcrawsize = 0
> +    dstsize = 0
> +    fcount = 0
> +    frevcount = 0
> +    fsrcsize = 0
> +    frawsize = 0
> +    fdstsize = 0
> +    mcount = 0
> +    mrevcount = 0
> +    msrcsize = 0
> +    mrawsize = 0
> +    mdstsize = 0
> +    crevcount = 0
> +    csrcsize = 0
> +    crawsize = 0
> +    cdstsize = 0
> +
> +    # Perform a pass to collect metadata. This validates we can open all
> +    # source files and allows a unified progress bar to be displayed.
> +    for unencoded, encoded, size in srcrepo.store.walk():
> +        if unencoded.endswith('.d'):
> +            continue
> +
> +        rl = _revlogfrompath(srcrepo, unencoded)
> +        revcount += len(rl)
> +
> +        datasize = 0
> +        rawsize = 0
> +        idx = rl.index
> +        for rev in rl:
> +            e = idx[rev]
> +            datasize += e[1]
> +            rawsize += e[2]
> +
> +        srcsize += datasize
> +        srcrawsize += rawsize
> +
> +        # This is for the separate progress bars.
> +        if isinstance(rl, changelog.changelog):
> +            crevcount += len(rl)
> +            csrcsize += datasize
> +            crawsize += rawsize
> +        elif isinstance(rl, manifest.manifestrevlog):
> +            mcount += 1
> +            mrevcount += len(rl)
> +            msrcsize += datasize
> +            mrawsize += rawsize
> +        elif isinstance(rl, revlog.revlog):
> +            fcount += 1
> +            frevcount += len(rl)
> +            fsrcsize += datasize
> +            frawsize += rawsize
> +
> +    if not revcount:
> +        return
> +
> +    ui.write(_('migrating %d total revisions (%d in filelogs, %d in
> manifests, '
> +               '%d in changelog)\n') %
> +             (revcount, frevcount, mrevcount, crevcount))
> +    ui.write(_('migrating %s in store; %s tracked data\n') % (
> +             (util.bytecount(srcsize), util.bytecount(srcrawsize))))
> +
> +    # Used to keep track of progress.
> +    progress = []
> +    def oncopiedrevision(rl, rev, node):
> +        progress[1] += 1
> +        srcrepo.ui.progress(progress[0], progress[1], total=progress[2])
> +
> +    # Do the actual copying.
> +    # FUTURE this operation can be farmed off to worker processes.
> +    seen = set()
> +    for unencoded, encoded, size in srcrepo.store.walk():
> +        if unencoded.endswith('.d'):
> +            continue
> +
> +        oldrl = _revlogfrompath(srcrepo, unencoded)
> +        newrl = _revlogfrompath(dstrepo, unencoded)
> +
> +        if isinstance(oldrl, changelog.changelog) and 'c' not in seen:
> +            ui.write(_('finished migrating %d manifest revisions across
> %d '
> +                       'manifests; change in size: %s\n') %
> +                     (mrevcount, mcount, util.bytecount(mdstsize -
> msrcsize)))
> +
> +            ui.write(_('migrating changelog containing %d revisions '
> +                       '(%s in store; %s tracked data)\n') %
> +                     (crevcount, util.bytecount(csrcsize),
> +                      util.bytecount(crawsize)))
> +            seen.add('c')
> +            progress[:] = [_('changelog revisions'), 0, crevcount]
> +        elif isinstance(oldrl, manifest.manifestrevlog) and 'm' not in
> seen:
> +            ui.write(_('finished migrating %d filelog revisions across %d
> '
> +                       'filelogs; change in size: %s\n') %
> +                     (frevcount, fcount, util.bytecount(fdstsize -
> fsrcsize)))
> +
> +            ui.write(_('migrating %d manifests containing %d revisions '
> +                       '(%s in store; %s tracked data)\n') %
> +                     (mcount, mrevcount, util.bytecount(msrcsize),
> +                      util.bytecount(mrawsize)))
> +            seen.add('m')
> +            progress[:] = [_('manifest revisions'), 0, mrevcount]
> +        elif 'f' not in seen:
> +            ui.write(_('migrating %d filelogs containing %d revisions '
> +                       '(%s in store; %s tracked data)\n') %
> +                     (fcount, frevcount, util.bytecount(fsrcsize),
> +                      util.bytecount(frawsize)))
> +            seen.add('f')
> +            progress[:] = [_('file revisions'), 0, frevcount]
> +
> +        ui.progress(progress[0], progress[1], total=progress[2])
> +
> +        ui.note(_('cloning %d revisions from %s\n') % (len(oldrl),
> unencoded))
> +        oldrl.clone(tr, newrl, addrevisioncb=oncopiedrevision,
> +                    deltareuse=deltareuse,
> +                    aggressivemergedeltas=aggressivemergedeltas)
> +
> +        datasize = 0
> +        idx = newrl.index
> +        for rev in newrl:
> +            datasize += idx[rev][1]
> +
> +        dstsize += datasize
> +
> +        if isinstance(newrl, changelog.changelog):
> +            cdstsize += datasize
> +        elif isinstance(newrl, manifest.manifestrevlog):
> +            mdstsize += datasize
> +        else:
> +            fdstsize += datasize
> +
> +    ui.progress(progress[0], None)
> +
> +    ui.write(_('finished migrating %d changelog revisions; change in
> size: '
> +               '%s\n') % (crevcount, util.bytecount(cdstsize - csrcsize)))
> +
> +    ui.write(_('finished migrating %d total revisions; total change in
> store '
> +               'size: %s\n') % (revcount, util.bytecount(dstsize -
> srcsize)))
> +
>  def _upgraderepo(ui, srcrepo, dstrepo, requirements, actions):
>      """Do the low-level work of upgrading a repository.
>
> @@ -651,7 +811,25 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
>      assert srcrepo.currentwlock()
>      assert dstrepo.currentwlock()
>
> -    # TODO copy store
> +    ui.write(_('(it is safe to interrupt this process any time before '
> +               'data migration completes)\n'))
> +
> +    if 'redeltaall' in actions:
> +        deltareuse = revlog.revlog.DELTAREUSENEVER
> +    elif 'redeltaparent' in actions:
> +        deltareuse = revlog.revlog.DELTAREUSESAMEREVS
> +    elif 'redeltamultibase' in actions:
> +        deltareuse = revlog.revlog.DELTAREUSESAMEREVS
> +    else:
> +        deltareuse = revlog.revlog.DELTAREUSEALWAYS
> +
> +    with dstrepo.transaction('upgrade') as tr:
> +        _copyrevlogs(ui, srcrepo, dstrepo, tr, deltareuse,
> +                     'redeltamultibase' in actions)
> +
> +    # TODO copy non-revlog store files
> +
> +    ui.write(_('data fully migrated to temporary repository\n'))
>
>      backuppath = tempfile.mkdtemp(prefix='upgradebackup.',
> dir=srcrepo.path)
>      backupvfs = scmutil.vfs(backuppath)
> @@ -672,7 +850,16 @@ def _upgraderepo(ui, srcrepo, dstrepo, r
>      ui.write(_('replaced files will be backed up at %s\n') %
>               backuppath)
>
> -    # TODO do the store swap here.
> +    # Now swap in the new store directory. Doing it as a rename should
> make
> +    # the operation nearly instantaneous and atomic (at least in
> well-behaved
> +    # environments).
> +    ui.write(_('replacing store...\n'))
> +    tstart = time.time()
> +    util.rename(srcrepo.spath, backupvfs.join('store'))
> +    util.rename(dstrepo.spath, srcrepo.spath)
> +    elapsed = time.time() - tstart
> +    ui.write(_('store replacement complete; repository was inconsistent
> for '
> +               '%0.1fs\n') % elapsed)
>
>      # We first write the requirements file. Any new requirements will lock
>      # out legacy clients.
> diff --git a/tests/test-upgrade-repo.t b/tests/test-upgrade-repo.t
> --- a/tests/test-upgrade-repo.t
> +++ b/tests/test-upgrade-repo.t
> @@ -194,9 +194,13 @@ Upgrading a repository that is already m
>    beginning upgrade...
>    repository locked and read-only
>    creating temporary repository to stage migrated data:
> $TESTTMP/modern/.hg/upgrade.* (glob)
> +  (it is safe to interrupt this process any time before data migration
> completes)
> +  data fully migrated to temporary repository
>    marking source repository as being upgraded; clients will be unable to
> read from repository
>    starting in-place swap of repository data
>    replaced files will be backed up at $TESTTMP/modern/.hg/upgradebackup.*
> (glob)
> +  replacing store...
> +  store replacement complete; repository was inconsistent for *s (glob)
>    finalizing requirements file and making repository readable again
>    removing temporary repository $TESTTMP/modern/.hg/upgrade.* (glob)
>    copy of old repository backed up at $TESTTMP/modern/.hg/upgradebackup.*
> (glob)
> @@ -227,9 +231,22 @@ Upgrading a repository to generaldelta w
>    beginning upgrade...
>    repository locked and read-only
>    creating temporary repository to stage migrated data:
> $TESTTMP/upgradegd/.hg/upgrade.* (glob)
> +  (it is safe to interrupt this process any time before data migration
> completes)
> +  migrating 9 total revisions (3 in filelogs, 3 in manifests, 3 in
> changelog)
> +  migrating 341 bytes in store; 401 bytes tracked data
> +  migrating 3 filelogs containing 3 revisions (0 bytes in store; 0 bytes
> tracked data)
> +  finished migrating 3 filelog revisions across 3 filelogs; change in
> size: 0 bytes
> +  migrating 1 manifests containing 3 revisions (157 bytes in store; 220
> bytes tracked data)
> +  finished migrating 3 manifest revisions across 1 manifests; change in
> size: 0 bytes
> +  migrating changelog containing 3 revisions (184 bytes in store; 181
> bytes tracked data)
> +  finished migrating 3 changelog revisions; change in size: 0 bytes
> +  finished migrating 9 total revisions; total change in store size: 0
> bytes
> +  data fully migrated to temporary repository
>    marking source repository as being upgraded; clients will be unable to
> read from repository
>    starting in-place swap of repository data
>    replaced files will be backed up at $TESTTMP/upgradegd/.hg/upgradebackup.*
> (glob)
> +  replacing store...
> +  store replacement complete; repository was inconsistent for *s (glob)
>    finalizing requirements file and making repository readable again
>    removing temporary repository $TESTTMP/upgradegd/.hg/upgrade.* (glob)
>    copy of old repository backed up at $TESTTMP/upgradegd/.hg/upgradebackup.*
> (glob)
> @@ -252,4 +269,43 @@ generaldelta added to original requireme
>    revlogv1
>    store
>
> +store directory has files we expect
> +
> +  $ ls .hg/store
> +  00changelog.i
> +  00manifest.i
> +  data
> +  fncache
> +  undo
> +  undo.backupfiles
> +  undo.phaseroots
> +
> +manifest should be generaldelta
> +
> +  $ hg debugrevlog -m | grep flags
> +  flags  : inline, generaldelta
> +
> +verify should be happy
> +
> +  $ hg verify
> +  checking changesets
> +  checking manifests
> +  crosschecking files in changesets and manifests
> +  checking files
> +  3 files, 3 changesets, 3 total revisions
> +
> +old store should be backed up
> +
> +  $ ls .hg/upgradebackup.*/store
> +  00changelog.i
> +  00manifest.i
> +  data
> +  fncache
> +  lock
> +  phaseroots
> +  undo
> +  undo.backup.fncache
> +  undo.backupfiles
> +  undo.phaseroots
> +
>    $ cd ..
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20161124/5a9cb318/attachment.html>


More information about the Mercurial-devel mailing list