[PATCH V2] hg: support for auto sharing stores when cloning

Gregory Szorc gregory.szorc at gmail.com
Wed Jul 8 23:15:45 UTC 2015


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1436397326 25200
#      Wed Jul 08 16:15:26 2015 -0700
# Node ID a8951662fd90366b090c5b33ee68c54a10414a2a
# Parent  3948cb4d0ae70e7257e47e2fd9f657c0c1af7c2b
hg: support for auto sharing stores when cloning

Many 3rd party consumers of Mercurial have created wrappers to
essentially perform clone+share as a single operation. This is
especially popular in automated processes like continuous integration
systems. The Jenkins CI software and Mozilla's Firefox release
automation infrastructure have both implemented custom code that
effectively performs clone+share. The common use case here is that
clients want to obtain N>1 checkouts while minimizing disk space and
network requirements. Furthermore, they often don't care that a clone
is an exact mirror of a remote: they are simply looking to obtain
checkouts of specific revisions.

When multiple third parties implement a similar feature, it's a good
sign that the feature is worth adding to the core product. This patch
adds support for an easy-to-use clone+share feature.

The internal "clone" function now accepts options to control auto
sharing during clone. When the auto share mode is active, a store will
be created/updated under the base directory specified and a new
repository pointing to the shared store will be created at the path
specified by the user.

The share extension has grown the ability to pass these options into
the clone command/function.

No command line options for this feature are added because we don't
feel the feature will be popular enough to warrant their existence.
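
Because there are no command line options, the feature is driven entirely by
configuration. As an illustration, an hgrc along these lines should enable it.
The pool path ~/hg-pool is a made-up example; the option names are the ones
documented in the share extension's docstring in this patch.

```ini
[extensions]
share =

[share]
# Directory that holds the pooled stores (hypothetical location).
pool = ~/hg-pool
# "identity" (default) or "remote".
poolnaming = identity
```

The same settings can be passed for a single invocation with --config, e.g.
--config share.pool=share, which is how the tests in this patch exercise it.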

Auto sharing has two naming modes. In the default mode, the name of the
shared repo is derived from the first changeset (rev 0) in the remote
repository. This enables related repositories existing at different URLs
to automatically use the same storage. In environments that operate
several repositories (separate repo for branch/head/bookmark or separate
repo per user), this has the potential to drastically reduce storage
and network requirements. In the other mode, the name is derived from the
remote's path/URL.
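
The naming rules above can be sketched as a standalone function. This is only
an approximation of the logic in this patch, not the Mercurial code itself:
pool_path and NULL_HEX are hypothetical names, the rev 0 node is passed in as
a hex string instead of being looked up on a peer, and the source is encoded
as UTF-8 before hashing (an assumption made so the sketch runs on Python 3).

```python
import hashlib
import os

# Hex form of the null node, which a lookup of rev 0 yields for an empty repo.
NULL_HEX = "0" * 40


def pool_path(pool, mode, source, root_hex=None):
    """Return the directory of the pooled store, or None if not sharing.

    pool     -- base directory for pooled stores (share.pool)
    mode     -- "identity" or "remote" (share.poolnaming)
    source   -- path or URL being cloned
    root_hex -- hex node of rev 0 in the remote, or None if unresolvable
    """
    if not pool:
        return None
    if mode == "identity":
        # Named after the first changeset. An empty remote or an
        # unresolvable rev 0 means pooled storage is not used.
        if root_hex is None or root_hex == NULL_HEX:
            return None
        return os.path.join(pool, root_hex)
    if mode == "remote":
        # Named after a SHA-1 digest of the requested path or URL.
        digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
        return os.path.join(pool, digest)
    raise ValueError("unknown share naming mode: %s" % mode)
```

With "identity", two remotes that report the same rev 0 node map to the same
directory regardless of URL. With "remote", only an identical source string
maps to the same directory.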

diff --git a/hgext/share.py b/hgext/share.py
--- a/hgext/share.py
+++ b/hgext/share.py
@@ -2,12 +2,44 @@
 #
 # This software may be used and distributed according to the terms of the
 # GNU General Public License version 2 or any later version.
 
-'''share a common history between several working directories'''
+'''share a common history between several working directories
+
+Automatic Pooled Storage for Clones
+===================================
+
+When this extension is active, :hg:`clone` can be configured to
+automatically share/pool storage across multiple clones. This
+mode effectively converts :hg:`clone` to :hg:`clone` + :hg:`share`.
+The benefit of using this mode is the automatic management of
+store paths and intelligent pooling of related repositories.
+
+The following ``share.`` config options influence this feature:
+
+``pool``
+    Filesystem path where shared repository data will be stored. When
+    defined, :hg:`clone` will automatically use shared repository
+    storage instead of creating a store inside each clone.
+
+``poolnaming``
+    How directory names in ``share.pool`` are constructed.
+
+    "identity" means the name is derived from the first changeset in the
+    repository. In this mode, different remotes share storage if their
+    root/initial changeset is identical. In this mode, the local shared
+    repository is an aggregate of all encountered remote repositories.
+
+    "remote" means the name is derived from the source repository's
+    path or URL. In this mode, storage is only shared if the path or URL
+    requested in the :hg:`clone` command matches exactly to a repository
+    that was cloned before.
+
+    The default naming mode is "identity".
+'''
 
 from mercurial.i18n import _
-from mercurial import cmdutil, hg, util, extensions, bookmarks
+from mercurial import cmdutil, commands, hg, util, extensions, bookmarks
 from mercurial.hg import repository, parseurl
 import errno
 
 cmdtable = {}
@@ -74,12 +106,26 @@ def unshare(ui, repo):
 
     # update store, spath, svfs and sjoin of repo
     repo.unfiltered().__init__(repo.baseui, repo.root)
 
+# Wrap clone command to pass auto share options.
+def clone(orig, ui, source, *args, **opts):
+    pool = ui.config('share', 'pool', None)
+    if pool:
+        pool = util.expandpath(pool)
+
+    opts['shareopts'] = dict(
+        pool=pool,
+        mode=ui.config('share', 'poolnaming', 'identity'),
+    )
+
+    return orig(ui, source, *args, **opts)
+
 def extsetup(ui):
     extensions.wrapfunction(bookmarks.bmstore, 'getbkfile', getbkfile)
     extensions.wrapfunction(bookmarks.bmstore, 'recordchange', recordchange)
     extensions.wrapfunction(bookmarks.bmstore, 'write', write)
+    extensions.wrapcommand(commands.table, 'clone', clone)
 
 def _hassharedbookmarks(repo):
     """Returns whether this repo has shared bookmarks"""
     try:
diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -1417,9 +1417,10 @@ def clone(ui, source, dest=None, **opts)
                  pull=opts.get('pull'),
                  stream=opts.get('uncompressed'),
                  rev=opts.get('rev'),
                  update=opts.get('updaterev') or not opts.get('noupdate'),
-                 branch=opts.get('branch'))
+                 branch=opts.get('branch'),
+                 shareopts=opts.get('shareopts'))
 
     return r is None
 
 @command('^commit|ci',
diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -283,10 +283,51 @@ def copystore(ui, srcrepo, destpath):
     except: # re-raises
         release(destlock)
         raise
 
+def clonewithshare(ui, peeropts, sharepath, source, srcpeer, dest, pull=False,
+                   rev=None, update=True, stream=False):
+    """Perform a clone using a shared repo.
+
+    The store for the repository will be located at <sharepath>/.hg. The
+    specified revisions will be cloned or pulled from "source". A shared repo
+    will be created at "dest" and a working copy will be created if "update" is
+    True.
+    """
+    revs = None
+    if rev:
+        if not srcpeer.capable('lookup'):
+            raise util.Abort(_("src repository does not support "
+                               "revision lookup and so doesn't "
+                               "support clone by revision"))
+        revs = [srcpeer.lookup(r) for r in rev]
+
+    basename = os.path.basename(sharepath)
+
+    if os.path.exists(sharepath):
+        ui.status(_('(sharing from existing pooled repository %s)\n') % basename)
+    else:
+        ui.status(_('(sharing from new pooled repository %s)\n') % basename)
+        # Always use pull mode because hardlinks in share mode don't work well.
+        # Never update because working copies aren't necessary in share mode.
+        clone(ui, peeropts, source, dest=sharepath, pull=True,
+              rev=rev, update=False, stream=stream)
+
+    sharerepo = repository(ui, path=sharepath)
+    share(ui, sharerepo, dest=dest, update=update, bookmarks=False)
+
+    # We need to perform a pull against the dest repo to fetch bookmarks
+    # and other non-store data that isn't shared by default. In the case of
+    # non-existing shared repo, this means we pull from the remote twice. This
+    # is a bit weird. But at the time it was implemented, there wasn't an easy
+    # way to pull just non-changegroup data.
+    destrepo = repository(ui, path=dest)
+    exchange.pull(destrepo, srcpeer, heads=revs)
+
+    return srcpeer, peer(ui, peeropts, dest)
+
 def clone(ui, peeropts, source, dest=None, pull=False, rev=None,
-          update=True, stream=False, branch=None):
+          update=True, stream=False, branch=None, shareopts=None):
     """Make a copy of an existing repository.
 
     Create a copy of an existing repository in a new directory.  The
     source and destination are URLs, as passed to the repository
@@ -319,8 +360,15 @@ def clone(ui, peeropts, source, dest=Non
     destination is local repository (True means update to default rev,
     anything else is treated as a revision)
 
     branch: branches to clone
+
+    shareopts: dict of options to control auto sharing behavior. The "pool" key
+    activates auto sharing mode and defines the directory for stores. The
+    "mode" key determines how to construct the directory name of the shared
+    repository. "identity" means the name is derived from the node of the first
+    changeset in the repository. "remote" means the name is derived from the
+    remote's path/URL. Defaults to "identity".
     """
 
     if isinstance(source, str):
         origsource = ui.expandpath(source)
@@ -351,8 +399,38 @@ def clone(ui, peeropts, source, dest=Non
             raise util.Abort(_("destination '%s' already exists") % dest)
         elif destvfs.listdir():
             raise util.Abort(_("destination '%s' is not empty") % dest)
 
+    shareopts = shareopts or {}
+    sharepool = shareopts.get('pool')
+    sharenamemode = shareopts.get('mode')
+    if sharepool:
+        sharepath = None
+        if sharenamemode == 'identity':
+            # Resolve the name from the initial changeset in the remote
+            # repository. This returns nullid when the remote is empty. It
+            # raises RepoLookupError if revision 0 is filtered or otherwise
+            # not available. If we fail to resolve, sharing is not enabled.
+            try:
+                rootnode = srcpeer.lookup('0')
+                if rootnode != node.nullid:
+                    sharepath = os.path.join(sharepool, node.hex(rootnode))
+                else:
+                    ui.status(_('(not using pooled storage: '
+                                'remote appears to be empty)\n'))
+            except error.RepoLookupError:
+                ui.status(_('(not using pooled storage: '
+                            'unable to resolve identity of remote)\n'))
+        elif sharenamemode == 'remote':
+            sharepath = os.path.join(sharepool, util.sha1(source).hexdigest())
+        else:
+            raise util.Abort('unknown share naming mode: %s' % sharenamemode)
+
+        if sharepath:
+            return clonewithshare(ui, peeropts, sharepath, source, srcpeer,
+                                  dest, pull=pull, rev=rev, update=update,
+                                  stream=stream)
+
     srclock = destlock = cleandir = None
     srcrepo = srcpeer.local()
     try:
         abspath = origsource
diff --git a/tests/test-clone.t b/tests/test-clone.t
--- a/tests/test-clone.t
+++ b/tests/test-clone.t
@@ -673,5 +673,343 @@ Test clone from the repository in (emula
   0:e1bab28bca43
   $ hg clone -U -q src dst
   $ hg -R dst log -q
   0:e1bab28bca43
+
+Create repositories to test auto sharing functionality
+
+  $ cat >> $HGRCPATH << EOF
+  > [extensions]
+  > share=
+  > EOF
+
+  $ hg init empty
+  $ hg init source1a
+  $ cd source1a
+  $ echo initial1 > foo
+  $ hg -q commit -A -m initial
+  $ echo second > foo
+  $ hg commit -m second
   $ cd ..
+
+  $ hg init filteredrev0
+  $ cd filteredrev0
+  $ cat >> .hg/hgrc << EOF
+  > [experimental]
+  > evolution=createmarkers
+  > EOF
+  $ echo initial1 > foo
+  $ hg -q commit -A -m initial0
+  $ hg -q up -r null
+  $ echo initial2 > foo
+  $ hg -q commit -A -m initial1
+  $ hg debugobsolete c05d5c47a5cf81401869999f3d05f7d699d2b29a e082c1832e09a7d1e78b7fd49a592d372de854c8
+  $ cd ..
+
+  $ hg -q clone --pull source1a source1b
+  $ cd source1a
+  $ hg bookmark bookA
+  $ echo 1a > foo
+  $ hg commit -m 1a
+  $ cd ../source1b
+  $ hg -q up -r 0
+  $ echo head1 > foo
+  $ hg commit -m head1
+  created new head
+  $ hg bookmark head1
+  $ hg -q up -r 0
+  $ echo head2 > foo
+  $ hg commit -m head2
+  created new head
+  $ hg bookmark head2
+  $ hg -q up -r 0
+  $ hg branch branch1
+  marked working directory as branch branch1
+  (branches are permanent and global, did you want a bookmark?)
+  $ echo branch1 > foo
+  $ hg commit -m branch1
+  $ hg -q up -r 0
+  $ hg branch branch2
+  marked working directory as branch branch2
+  $ echo branch2 > foo
+  $ hg commit -m branch2
+  $ cd ..
+  $ hg init source2
+  $ cd source2
+  $ echo initial2 > foo
+  $ hg -q commit -A -m initial2
+  $ echo second > foo
+  $ hg commit -m second
+  $ cd ..
+
+Clone with auto share from an empty repo should not result in share
+
+  $ mkdir share
+  $ hg --config share.pool=share clone empty share-empty
+  (not using pooled storage: remote appears to be empty)
+  updating to branch default
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ ls share
+  $ test -d share-empty/.hg/store
+  $ test -f share-empty/.hg/sharedpath
+  [1]
+
+Clone with auto share from a repo with filtered revision 0 should not result in share
+
+  $ hg --config share.pool=share clone filteredrev0 share-filtered
+  (not using pooled storage: unable to resolve identity of remote)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files
+  updating to branch default
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+Clone from repo with content should result in shared store being created
+
+  $ hg --config share.pool=share clone source1a share-dest1a
+  (sharing from new pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  no changes found
+  adding remote bookmark bookA
+
+The shared repo should have been created
+
+  $ ls share
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+The destination should point to it
+
+  $ cat share-dest1a/.hg/sharedpath; echo
+  $TESTTMP/share/b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1/.hg
+
+The destination should have bookmarks
+
+  $ hg -R share-dest1a bookmarks
+     bookA                     2:e5bfe23c0b47
+
+The default path should be the remote, not the share
+
+  $ hg -R share-dest1a config paths.default
+  $TESTTMP/source1a
+
+Clone with existing share dir should result in pull + share
+
+  $ hg --config share.pool=share clone source1b share-dest1b
+  (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 4 changes to 1 files (+4 heads)
+  adding remote bookmark head1
+  adding remote bookmark head2
+
+  $ ls share
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+  $ cat share-dest1b/.hg/sharedpath; echo
+  $TESTTMP/share/b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1/.hg
+
+We only get bookmarks from the remote, not everything in the share
+
+  $ hg -R share-dest1b bookmarks
+     head1                     3:4a8dc1ab4c13
+     head2                     4:99f71071f117
+
+Default path should be source, not share.
+
+  $ hg -R share-dest1b config paths.default
+  $TESTTMP/source1a
+
+Clone from unrelated repo should result in new share
+
+  $ hg --config share.pool=share clone source2 share-dest2
+  (sharing from new pooled repository 22aeff664783fd44c6d9b435618173c118c3448e)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  no changes found
+
+  $ ls share
+  22aeff664783fd44c6d9b435618173c118c3448e
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+remote naming mode works as advertised
+
+  $ hg --config share.pool=shareremote --config share.poolnaming=remote clone source1a share-remote1a
+  (sharing from new pooled repository 195bb1fcdb595c14a6c13e0269129ed78f6debde)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  no changes found
+  adding remote bookmark bookA
+
+  $ ls shareremote
+  195bb1fcdb595c14a6c13e0269129ed78f6debde
+
+  $ hg --config share.pool=shareremote --config share.poolnaming=remote clone source1b share-remote1b
+  (sharing from new pooled repository c0d4f83847ca2a873741feb7048a45085fd47c46)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 6 changesets with 6 changes to 1 files (+4 heads)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  no changes found
+  adding remote bookmark head1
+  adding remote bookmark head2
+
+  $ ls shareremote
+  195bb1fcdb595c14a6c13e0269129ed78f6debde
+  c0d4f83847ca2a873741feb7048a45085fd47c46
+
+request to clone a single revision is respected in sharing mode
+
+  $ hg --config share.pool=sharerevs clone -r 4a8dc1ab4c13 source1b share-1arev
+  (sharing from new pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  no changes found
+  adding remote bookmark head1
+
+  $ hg -R share-1arev log -G
+  @  changeset:   1:4a8dc1ab4c13
+  |  bookmark:    head1
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     head1
+  |
+  o  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+making another clone should only pull down requested rev
+
+  $ hg --config share.pool=sharerevs clone -r 99f71071f117 source1b share-1brev
+  (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files (+1 heads)
+  adding remote bookmark head1
+  adding remote bookmark head2
+
+  $ hg -R share-1brev log -G
+  o  changeset:   2:99f71071f117
+  |  bookmark:    head2
+  |  tag:         tip
+  |  parent:      0:b5f04eac9d8f
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     head2
+  |
+  | @  changeset:   1:4a8dc1ab4c13
+  |/   bookmark:    head1
+  |    user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     head1
+  |
+  o  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+Request to clone a single branch is respected in sharing mode
+
+  $ hg --config share.pool=sharebranch clone -b branch1 source1b share-1bbranch1
+  (sharing from new pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  no changes found
+
+  $ hg -R share-1bbranch1 log -G
+  o  changeset:   1:5f92a6c1a1b1
+  |  branch:      branch1
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     branch1
+  |
+  @  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+  $ hg --config share.pool=sharebranch clone -b branch2 source1b share-1bbranch2
+  (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files (+1 heads)
+
+  $ hg -R share-1bbranch2 log -G
+  o  changeset:   2:6bacf4683960
+  |  branch:      branch2
+  |  tag:         tip
+  |  parent:      0:b5f04eac9d8f
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     branch2
+  |
+  | o  changeset:   1:5f92a6c1a1b1
+  |/   branch:      branch1
+  |    user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     branch1
+  |
+  @  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+-U is respected in share clone mode
+
+  $ hg --config share.pool=share clone -U source1a share-1anowc
+  (sharing from existing pooled repository b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1)
+  searching for changes
+  no changes found
+  adding remote bookmark bookA
+
+  $ ls share-1anowc

