[PATCH 2 of 5 clonebundles V2] clonebundles: filter on bundle type

Gregory Szorc gregory.szorc at gmail.com
Fri Oct 9 14:33:51 CDT 2015


# HG changeset patch
# User Gregory Szorc <gregory.szorc at gmail.com>
# Date 1444417708 25200
#      Fri Oct 09 12:08:28 2015 -0700
# Node ID 1f416cf58a7fb31d94ddee6671637ac51c03be9f
# Parent  d117d532ed0b3d03c8e4fe85b819964669d1be1a
clonebundles: filter on bundle type

Not all clients are capable of reading every bundle. This issue of
format compatibility is addressed by the existing wire protocol bundle
retrieval workflow. Either the client requests a known format or the
client sends its capabilities to the server and the server sends down
a compatible response.

Clone bundles are statically generated before the request (as opposed
to being dynamically generated such as with the "getbundle" wire
protocol command). This means that a modern server could produce
bundles that a legacy client isn't capable of reading. Without
information in the clone bundles manifest to identify the format of the
bundle, a client may attempt to download an incompatible bundle, only to
encounter an unknown format. This could even occur after successfully
reading most of the bundle (imagine consuming a 1 GB changegroup bundle2
part only to discover the part following is incompatible). This would
waste time and resources. And it isn't very user friendly.

Clone bundle manifests thus need to advertise the *exact* format of the
hosted bundles so clients may filter out entries that they don't know
how to read. This patch starts to introduce that filtering mechanism.

We introduce the TYPE attribute to declare the entry as a specific type.
If a client doesn't know how to handle this type, the entry is silently
ignored.

This patch intentionally does *not* support bundle2. The reason is that
there is not yet a well-defined way to declare the full type of a
bundle2 instance. You can't just say "bundle2" or "HG20" because future
additions to bundle2 may introduce new bundle2 parts or parameters that
legacy clients aren't able to read. A full "compatibility string"
(similar to the bundle2 client capabilities advertised during the
"getbundle" wire protocol command) identifying all the bundle2 parts
and even their parameters will be required for clients to determine if
a statically generated bundle2 is compatible with their feature set.
Defining such a "compatibility string" is scope bloat. It is much easier
to just not support bundle2 for the time being.

The TYPE attribute is, however, compatible with bundle2 in the future.
We just need to prefix bundle2 entries with "HG20" and put the
"compatibility string" afterwards and future clients able to read the
"HG20" prefix will be able to process bundle2 entries.

diff --git a/hgext/clonebundles.py b/hgext/clonebundles.py
--- a/hgext/clonebundles.py
+++ b/hgext/clonebundles.py
@@ -31,9 +31,26 @@ can be used by site installations.
 The server operator is responsible for generating the bundle manifest file.
 
 Metadata Attributes:
 
-TBD
+TYPE
+   The type of bundle.
+
+   The first 4 bytes identify the main class of bundle. e.g. "HG10". Following
+   bytes declare class-specific features. For "HG10" changegroup files,
+   there will be 2 bytes following denoting the compression type. e.g.
+   "HG10GZ" or "HG10UN."
+
+   For changegroup files, this value is equivalent to the first 6 bytes of
+   the file.
+
+   Clients will automatically filter out types they do not know how to
+   consume.
+
+   Bundle2 bundles ("HG20") are currently not supported.
+
+   The actual value doesn't impact client behavior beyond filtering:
+   clients will still sniff the bundle type from downloaded content.
 """
 
 from mercurial import (
     extensions,
diff --git a/mercurial/exchange.py b/mercurial/exchange.py
--- a/mercurial/exchange.py
+++ b/mercurial/exchange.py
@@ -1520,17 +1520,28 @@ def _maybeapplyclonebundle(pullop):
         return
 
     res = remote._call('clonebundles')
     entries = parseclonebundlesmanifest(res)
-
-    # TODO filter entries by supported features.
-    # TODO sort entries by user preferences.
-
     if not entries:
         repo.ui.note(_('no clone bundles available on remote; '
                        'falling back to regular clone\n'))
         return
 
+    entries = filterclonebundleentries(repo.ui, entries)
+    if not entries:
+        # There is a thundering herd concern here. However, if a server
+        # operator doesn't advertise bundles appropriate for its clients,
+        # they deserve what's coming. Furthermore, from a client's
+        # perspective, no automatic fallback would mean not being able to
+        # clone!
+        repo.ui.warn(_('no compatible clone bundles available on server; '
+                       'falling back to regular clone\n'))
+        repo.ui.warn(_('(you may want to report this to the server '
+                       'operator)\n'))
+        return
+
+    # TODO sort entries by user preferences.
+
     url = entries[0]['URL']
     repo.ui.status(_('applying clone bundle from %s\n') % url)
     if trypullbundlefromurl(repo.ui, repo, url):
         repo.ui.status(_('finished applying clone bundle\n'))
@@ -1565,8 +1576,27 @@ def parseclonebundlesmanifest(s):
         m.append(attrs)
 
     return m
 
+def filterclonebundleentries(ui, entries):
+    newentries = []
+    for e in entries:
+        typ = e.get('TYPE')
+        if typ:
+            if typ not in changegroup.bundletypes:
+                ui.debug('filtering %s because unknown bundle type %s\n' %
+                         (e['URL'], typ))
+                continue
+
+            if typ.startswith('HG20'):
+                ui.debug('filtering %s because bundle2 is not supported\n' %
+                         e['URL'])
+                continue
+
+        newentries.append(e)
+
+    return newentries
+
 def trypullbundlefromurl(ui, repo, url):
     """Attempt to apply a bundle from a URL."""
     lock = repo.lock()
     try:
diff --git a/tests/test-clonebundles.t b/tests/test-clonebundles.t
--- a/tests/test-clonebundles.t
+++ b/tests/test-clonebundles.t
@@ -140,4 +140,52 @@ Bundle with full content works
   added 2 changesets with 2 changes to 2 files
   finished applying clone bundle
   searching for changes
   no changes found
+
+Entry with unknown TYPE is filtered and not used
+
+  $ cat > server/.hg/clonebundles.manifest << EOF
+  > http://bad.entry TYPE=UNKNOWN
+  > http://localhost:$HGPORT1/full.hg TYPE=HG10GZ
+  > EOF
+
+  $ hg clone -U http://localhost:$HGPORT filter-unknown-type
+  applying clone bundle from http://localhost:$HGPORT1/full.hg
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+  finished applying clone bundle
+  searching for changes
+  no changes found
+
+Automatic fallback when all entries are filtered
+
+  $ cat > server/.hg/clonebundles.manifest << EOF
+  > http://bad.entry TYPE=UNKNOWN
+  > EOF
+
+  $ hg clone -U http://localhost:$HGPORT filter-all
+  no compatible clone bundles available on server; falling back to regular clone
+  (you may want to report this to the server operator)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files
+
+Bundle2 is not yet supported
+
+  $ cat > server/.hg/clonebundles.manifest << EOF
+  > http://bad.entry TYPE=HG20
+  > http://bad.entry TYPE=HG20+foo
+  > EOF
+
+  $ hg clone -U http://localhost:$HGPORT filter-bundle2
+  no compatible clone bundles available on server; falling back to regular clone
+  (you may want to report this to the server operator)
+  requesting all changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 2 changes to 2 files


More information about the Mercurial-devel mailing list