[PATCH] hg: support for auto sharing stores when cloning

Gregory Szorc gregory.szorc at gmail.com
Fri Jul 3 21:19:17 UTC 2015


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1435958308 25200
#      Fri Jul 03 14:18:28 2015 -0700
# Node ID 17d2b4b8eaaf8bab93d75cfded8042e2d82a3bff
# Parent  84518051bc3b851f736872df045d662de548b3c9
hg: support for auto sharing stores when cloning

Many third-party consumers of Mercurial have created wrappers that
essentially perform clone+share as a single operation. This is
especially popular in automated processes like continuous integration
systems. The Jenkins CI software and Mozilla's Firefox release
automation infrastructure have both implemented custom code that
effectively performs clone+share. The common use case here is that
clients want to obtain N>1 checkouts while minimizing disk space and
network requirements. Furthermore, they often don't care that a clone
is an exact mirror of a remote: they are simply looking to obtain
checkouts of specific revisions.

When multiple third parties implement a similar feature, it's a good
sign that the feature is worth adding to the core product. This patch
adds support for an easy-to-use clone+share feature.

The internal "clone" function now accepts keyword arguments
("sharebasepath" and "sharenamemode") to control auto sharing during
clone. When auto share mode is active, a store is created or updated
under the specified base directory, and a new repository pointing to
that shared store is created at the path specified by the user.
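
The create-or-update behavior described above can be modeled roughly as
follows. This is only an illustrative sketch: plan_auto_share_clone is a
hypothetical helper that does not exist in Mercurial, and it returns the
names of the steps instead of performing them.

```python
import os

def plan_auto_share_clone(sharepath, dest):
    """Return the ordered steps of an auto share clone (model only).

    Hypothetical helper for illustration; it performs no Mercurial
    operations and only inspects whether the shared store path exists.
    """
    if os.path.exists(sharepath):
        # An existing shared store is brought up to date with a pull.
        steps = [("pull", sharepath)]
    else:
        # Otherwise the shared store is first created by a regular clone.
        steps = [("clone", sharepath)]
    # In both cases a repository sharing that store is created at dest.
    steps.append(("share", dest))
    return steps
```

The first step differs depending on whether the shared store already
exists; the final "share" step is the same in both cases.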

Auto share mode has two naming modes. In the default mode ("root"), the
name of the shared store is derived from the first changeset (rev 0) in
the remote repository. This enables related repositories existing at
different URLs to automatically use the same storage. In environments
that operate several repositories (separate repo for
branch/head/bookmark or separate repo per user), this has the potential
to drastically reduce storage and network requirements. In the other
mode ("remote"), the name is derived from the remote's path/URL.
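
As a rough standalone illustration of the two naming modes, the sketch
below mirrors the path derivation in this patch. derive_share_path is a
hypothetical name, root_hex stands in for the hex node of rev 0 that the
patch obtains from the remote, and ValueError replaces the abort raised
by the real code; all three are assumptions made for this example.

```python
import os

NULL_HEX = "0" * 40  # hex form of the null node ID (empty repository)

def derive_share_path(basepath, source, root_hex, namemode="root"):
    """Return the shared store directory, or None when sharing is skipped.

    Hypothetical helper: root_hex is assumed to be the 40-character hex
    node of revision 0 as reported by the source repository.
    """
    if namemode == "root":
        # Sharing an empty repository makes no sense, so skip it.
        if root_hex == NULL_HEX:
            return None
        return os.path.join(basepath, root_hex)
    if namemode == "remote":
        # Flatten the path or URL into a single lowercase directory name.
        normpath = source
        for old, new in (("\\", "/"), ("://", "_"), (":", "_"), ("/", "_")):
            normpath = normpath.replace(old, new)
        return os.path.join(basepath, normpath.lower())
    raise ValueError("unknown sharenamemode: %s" % namemode)
```

In "root" mode two sources with the same first changeset map to the same
directory, while in "remote" mode each distinct path or URL gets its own
directory.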

diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -1412,14 +1412,21 @@ def clone(ui, source, dest=None, **opts)
     """
     if opts.get('noupdate') and opts.get('updaterev'):
         raise util.Abort(_("cannot specify both --noupdate and --updaterev"))
 
+    sharebasepath = ui.config('share', 'basepath', None)
+    if sharebasepath:
+        sharebasepath = util.expandpath(sharebasepath)
+    sharenamemode = ui.config('share', 'namemode', 'root')
+
     r = hg.clone(ui, opts, source, dest,
                  pull=opts.get('pull'),
                  stream=opts.get('uncompressed'),
                  rev=opts.get('rev'),
                  update=opts.get('updaterev') or not opts.get('noupdate'),
-                 branch=opts.get('branch'))
+                 branch=opts.get('branch'),
+                 sharebasepath=sharebasepath,
+                 sharenamemode=sharenamemode)
 
     return r is None
 
 @command('^commit|ci',
diff --git a/mercurial/help/config.txt b/mercurial/help/config.txt
--- a/mercurial/help/config.txt
+++ b/mercurial/help/config.txt
@@ -1294,8 +1294,30 @@ Controls generic server settings.
 ``maxhttpheaderlen``
     Instruct HTTP clients not to send request headers longer than this
     many bytes. Default is 1024.
 
+``share``
+---------
+
+Configuration for repository sharing.
+
+:hg:`clone` can automatically share storage for multiple clones. This
+reduces the storage requirements from 1 copy per local clone to 1 copy
+per remote repository. It also reduces network transfer requirements,
+since each changeset and its corresponding data will only be transferred
+once.
+
+``basepath``
+    Filesystem path where shared repository data will be stored. When
+    defined, :hg:`clone` will automatically use shared repository
+    storage instead of creating a unique store for each clone.
+
+``namemode``
+    How directory names in ``basepath`` are constructed. "root" means
+    the name is derived from the first changeset in the repository.
+    "remote" means the name is derived from the source repository's
+    path or URL. The default is "root."
+
 ``smtp``
 --------
 
 Configuration for extensions that need to send email messages.
diff --git a/mercurial/hg.py b/mercurial/hg.py
--- a/mercurial/hg.py
+++ b/mercurial/hg.py
@@ -283,10 +283,43 @@ def copystore(ui, srcrepo, destpath):
     except: # re-raises
         release(destlock)
         raise
 
+def clonewithshare(ui, peeropts, sharepath, source, srcpeer, dest, pull=False,
+                   rev=None, update=True, stream=False):
+    """Perform a clone using a shared repo.
+
+    The store for the repository will be located at <sharepath>/.hg. The
+    specified revisions will be cloned or pulled from "source". A shared repo
+    will be created at "dest" and a working copy will be created if "update" is
+    True.
+    """
+    revs = None
+    if rev:
+        if not srcpeer.capable('lookup'):
+            raise util.Abort(_("src repository does not support "
+                               "revision lookup and so doesn't "
+                               "support clone by revision"))
+        revs = [srcpeer.lookup(r) for r in rev]
+
+    if os.path.exists(sharepath):
+        ui.status(_('(using existing shared repository)\n'))
+        sharerepo = repository(ui, path=sharepath)
+        exchange.pull(sharerepo, srcpeer, heads=revs)
+    else:
+        ui.status(_('(creating new shared repository)\n'))
+        # Always use pull mode because hardlinks in share mode are wonky.
+        # Never update because working copies aren't necessary in share mode.
+        clone(ui, peeropts, source, dest=sharepath, pull=True,
+              rev=rev, update=False, stream=stream)
+        sharerepo = repository(ui, path=sharepath)
+
+    share(ui, sharerepo, dest=dest, update=update, bookmarks=False)
+    return srcpeer, peer(ui, peeropts, dest)
+
 def clone(ui, peeropts, source, dest=None, pull=False, rev=None,
-          update=True, stream=False, branch=None):
+          update=True, stream=False, branch=None, sharebasepath=None,
+          sharenamemode='root'):
     """Make a copy of an existing repository.
 
     Create a copy of an existing repository in a new directory.  The
     source and destination are URLs, as passed to the repository
@@ -319,8 +352,16 @@ def clone(ui, peeropts, source, dest=Non
     destination is local repository (True means update to default rev,
     anything else is treated as a revision)
 
     branch: branches to clone
+
+    sharebasepath: if defined, activates auto sharing mode with stores
+    placed at the directory specified.
+
+    sharenamemode: determine how to construct the directory name of the shared
+    repository. "root" means the name is derived from the node of the first
+    changeset in the repository. "remote" means the name is derived from the
+    remote's path/URL. Defaults to "root."
     """
 
     if isinstance(source, str):
         origsource = ui.expandpath(source)
@@ -351,8 +392,42 @@ def clone(ui, peeropts, source, dest=Non
             raise util.Abort(_("destination '%s' already exists") % dest)
         elif destvfs.listdir():
             raise util.Abort(_("destination '%s' is not empty") % dest)
 
+    # In auto share mode, the store / actual source repo is in
+    # the <autosharebasedir>/<rev0> directory. Clone consists of ensuring
+    # that directory exists and is up to date followed by the creation of
+    # a shared repo from that source.
+    if sharebasepath:
+        sharepath = None
+        if sharenamemode == 'root':
+            # Repositories can have multiple roots. But using revision 0 should
+            # be good enough for almost all use cases. Worst case there is no
+            # cache hit.
+            rootnode = srcpeer.lookup('0')
+            # Sharing empty repositories doesn't make sense.
+            if rootnode != node.nullid:
+                sharepath = os.path.join(sharebasepath, node.hex(rootnode))
+        elif sharenamemode == 'remote':
+            normpath = source
+            replacements = (
+                ('\\', '/'),
+                ('://', '_'),
+                (':', '_'),
+                ('/', '_'),
+            )
+            for s, r in replacements:
+                normpath = normpath.replace(s, r)
+            normpath = normpath.lower()
+            sharepath = os.path.join(sharebasepath, normpath)
+        else:
+            raise util.Abort('unknown sharenamemode: %s' % sharenamemode)
+
+        if sharepath:
+            return clonewithshare(ui, peeropts, sharepath, source, srcpeer,
+                                  dest, pull=pull, rev=rev, update=update,
+                                  stream=stream)
+
     srclock = destlock = cleandir = None
     srcrepo = srcpeer.local()
     try:
         abspath = origsource
diff --git a/tests/test-clone.t b/tests/test-clone.t
--- a/tests/test-clone.t
+++ b/tests/test-clone.t
@@ -674,4 +674,259 @@ Test clone from the repository in (emula
   $ hg clone -U -q src dst
   $ hg -R dst log -q
   0:e1bab28bca43
   $ cd ..
+
+Create repositories to test auto sharing functionality
+
+  $ hg init empty
+  $ hg init source1a
+  $ cd source1a
+  $ echo initial1 > foo
+  $ hg -q commit -A -m initial
+  $ echo second > foo
+  $ hg commit -m second
+  $ cd ..
+  $ hg -q clone --pull source1a source1b
+  $ cd source1a
+  $ echo 1a > foo
+  $ hg commit -m 1a
+  $ cd ../source1b
+  $ hg -q up -r 0
+  $ echo head1 > foo
+  $ hg commit -m head1
+  created new head
+  $ hg -q up -r 0
+  $ echo head2 > foo
+  $ hg commit -m head2
+  created new head
+  $ hg -q up -r 0
+  $ hg branch branch1
+  marked working directory as branch branch1
+  (branches are permanent and global, did you want a bookmark?)
+  $ echo branch1 > foo
+  $ hg commit -m branch1
+  $ hg -q up -r 0
+  $ hg branch branch2
+  marked working directory as branch branch2
+  $ echo branch2 > foo
+  $ hg commit -m branch2
+  $ cd ..
+  $ hg init source2
+  $ cd source2
+  $ echo initial2 > foo
+  $ hg -q commit -A -m initial2
+  $ echo second > foo
+  $ hg commit -m second
+  $ cd ..
+
+Clone with auto share from an empty repo should not result in share
+
+  $ mkdir share
+  $ hg --config share.basepath=share clone empty share-empty
+  updating to branch default
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ ls share
+  $ test -d share-empty/.hg/store
+  $ test -f share-empty/.hg/sharedpath
+  [1]
+
+Clone from repo with content should result in shared store being created
+
+  $ hg --config share.basepath=share clone source1a share-dest1a
+  (creating new shared repository)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ ls share
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+  $ cat share-dest1a/.hg/sharedpath; echo
+  */share/b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1/.hg (glob)
+
+Clone with existing share dir should result in pull + share
+
+  $ hg --config share.basepath=share clone source1b share-dest1b
+  (using existing shared repository)
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 4 changes to 1 files (+4 heads)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ ls share
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+  $ cat share-dest1b/.hg/sharedpath; echo
+  */share/b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1/.hg (glob)
+
+Clone from unrelated repo should result in new share
+
+  $ hg --config share.basepath=share clone source2 share-dest2
+  (creating new shared repository)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ ls share
+  22aeff664783fd44c6d9b435618173c118c3448e
+  b5f04eac9d8f7a6a9fcb070243cccea7dc5ea0c1
+
+remote naming mode works as advertised
+
+  $ hg --config share.basepath=shareremote --config share.namemode=remote clone source1a share-remote1a
+  (creating new shared repository)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 3 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ ls shareremote
+  source1a
+
+  $ hg --config share.basepath=shareremote --config share.namemode=remote clone source1b share-remote1b
+  (creating new shared repository)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 6 changesets with 6 changes to 1 files (+4 heads)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ ls shareremote
+  source1a
+  source1b
+
+request to clone a single revision is respected in sharing mode
+
+  $ hg --config share.basepath=sharerevs clone -r 4a8dc1ab4c13 source1b share-1arev
+  (creating new shared repository)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ hg -R share-1arev log -G
+  @  changeset:   1:4a8dc1ab4c13
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     head1
+  |
+  o  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+making another clone should only pull down requested rev
+
+  $ hg --config share.basepath=sharerevs clone -r 99f71071f117 source1b share-1brev
+  (using existing shared repository)
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files (+1 heads)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ hg -R share-1brev log -G
+  @  changeset:   2:99f71071f117
+  |  tag:         tip
+  |  parent:      0:b5f04eac9d8f
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     head2
+  |
+  | o  changeset:   1:4a8dc1ab4c13
+  |/   user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     head1
+  |
+  o  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+Request to clone a single branch is respected in sharing mode
+
+  $ hg --config share.basepath=sharebranch clone -b branch1 source1b share-1bbranch1
+  (creating new shared repository)
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 1 files
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ hg -R share-1bbranch1 log -G
+  o  changeset:   1:5f92a6c1a1b1
+  |  branch:      branch1
+  |  tag:         tip
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     branch1
+  |
+  @  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+  $ hg --config share.basepath=sharebranch clone -b branch2 source1b share-1bbranch2
+  (using existing shared repository)
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 1 changes to 1 files (+1 heads)
+  updating working directory
+  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
+
+  $ hg -R share-1bbranch2 log -G
+  o  changeset:   2:6bacf4683960
+  |  branch:      branch2
+  |  tag:         tip
+  |  parent:      0:b5f04eac9d8f
+  |  user:        test
+  |  date:        Thu Jan 01 00:00:00 1970 +0000
+  |  summary:     branch2
+  |
+  | o  changeset:   1:5f92a6c1a1b1
+  |/   branch:      branch1
+  |    user:        test
+  |    date:        Thu Jan 01 00:00:00 1970 +0000
+  |    summary:     branch1
+  |
+  @  changeset:   0:b5f04eac9d8f
+     user:        test
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     initial
+  
+
+-U is respected in share clone mode
+
+  $ hg --config share.basepath=share clone -U source1a share-1anowc
+  (using existing shared repository)
+  searching for changes
+  no changes found
+
+  $ ls share-1anowc

