Problem Statement

Cloning and pulling (large) repositories can consume significant CPU on servers. In the face of high client volume, this can lead to resource exhaustion and service unavailability.

These operations can consume large amounts of CPU because every clone or pull that transfers changeset data results in the server creating a changegroup bundle of the data to be transferred. This operation is expensive because the server has to read revlogs and construct new delta chains from the content. It is essentially re-encoding the revlog on the fly. For revlogs with large entries (such as manifests with 100,000 files) or large diffs, this can take substantial CPU (and even I/O).

Solution: Pre-Generated Bundles

The inherent problem is that servers are "rebundling" repository data for every clone or pull operation. What if, instead of generating bundles at request time, the server pre-generated the bundles and saved them somewhere? When a client connects, it could obtain the contents of such a bundle, apply it, then pull the changes made since the bundle was created.

This solution works because repository data is generally append-only and immutable. This means that clones and subsequent pulls can effectively be modeled as replays of a linear log of data. Data is strictly additive, so bundling a snapshot of the repository and then transferring the delta since that bundle is effectively equivalent to hg unbundle + hg pull.
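The replay equivalence can be illustrated with a toy model (pure Python, no real Mercurial APIs; names and data are purely illustrative):

```python
# Toy model of an append-only changelog. Each string stands in for a
# changeset; real repositories store revlogs, but because data is
# strictly additive, a clone is just a replay of a linear log.
def clone_via_bundle(server_log, bundle_snapshot):
    """Seed a clone from a pre-generated bundle, then pull the rest."""
    local = list(bundle_snapshot)       # hg unbundle: apply the snapshot
    missing = server_log[len(local):]   # server computes what we lack
    local.extend(missing)               # hg pull: fetch only the delta
    return local

server = ["rev0", "rev1", "rev2", "rev3", "rev4"]
snapshot = server[:3]  # bundle generated after rev2

# Snapshot + incremental pull yields the same result as a full clone.
assert clone_via_bundle(server, snapshot) == server
```

The key property is that the server never re-encodes the snapshot portion; it only has to produce the (typically small) tail.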

This solution saves a significant amount of CPU on the server because reading a static file off disk (or redirecting elsewhere) is almost certainly much cheaper than rebundling.

Methods of Serving Pre-Generated Bundles

Inline bundle2 Part

In this solution, when a clone or pull is requested, the server takes inventory of what bundles are available. If an appropriate one is present, it reads its data and inserts it directly into the bundle2 reply. This is simply streaming bits off disk or elsewhere. Then, the server calculates what changesets aren't in the bundle and constructs a new bundle2 part containing those. From the client's perspective, it (likely) receives multiple changegroup bundles.
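The server-side flow can be sketched as follows (a simplified model, not actual Mercurial internals; the byte formats and helper names are invented for illustration):

```python
def build_changegroup_part(have, want):
    # Stand-in for real changegroup generation: encode only missing revs.
    missing = [r for r in want if r not in have]
    return ("cg:" + ",".join(missing)).encode()

def assemble_reply(prebundled, repo_revs):
    """Sketch of an inline bundle2 reply: stream a pre-generated part
    as-is, then append a freshly built part for whatever it lacks.

    prebundled: (raw_bytes, revs_covered) tuple, or None if no suitable
    bundle is available."""
    parts = []
    covered = []
    if prebundled is not None:
        raw, covered = prebundled
        parts.append(raw)   # streamed straight off disk, no re-encoding
    # Only data newer than the bundle is bundled on the fly.
    parts.append(build_changegroup_part(covered, repo_revs))
    return b"".join(parts)

reply = assemble_reply((b"bundle:rev0,rev1;", ["rev0", "rev1"]),
                       ["rev0", "rev1", "rev2"])
assert reply == b"bundle:rev0,rev1;cg:rev2"
```

The client sees one bundle2 reply containing multiple changegroup parts, which is why this approach is transparent at the protocol level.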

Pros:

Cons / Oddities:

External Bundle Download

In this solution, instead of the server sending the pre-generated bundle data inline with the bundle2 reply, it instead advertises a URL (and likely metadata) of a bundle to fetch. The client sees the URL, fetches and applies it, then gets the incremental data from the server.

There are a few variants of this, which will be explained shortly. However, there are some common concerns with this approach:

Inline Followup Variant

The server sends the bundle URL in a bundle2 part and then sends a changegroup part with the data produced since the bundle was generated.

Support for this exists in Mercurial today via the remote-changegroup bundle2 part. However, server-side code for generating these parts and the subsequent changegroup parts is not implemented in core.

Pros:

Cons:

Disconnect and Return Variant

Server sends URL. Client detaches, fetches and applies bundle. Then, the client reconnects to the server and does the equivalent of an hg pull (if necessary).

This could be implemented in a few different ways:

  1. Client issues getbundle with capabilities saying it can apply remote hosted bundles. Bundle URL part received. Client disconnects. Applies bundle. Starts over.
  2. Server advertises that it hosts bundles. Client requests a bundle, disconnects, applies bundle, and then reconnects for the pull.

These are very similar. However, in #1 the bundle advertisement is integrated into the "getbundle" wire protocol command, while in #2 there is likely a separate wire protocol command or "listkeys" namespace advertising bundles that the client can query directly for bundle info.
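Variant #1 can be sketched as a client retry loop (pure Python; the server and the download step are stubbed, and all names are invented for illustration):

```python
class StubServer:
    """Minimal stand-in for a server hosting one pre-generated bundle."""
    def __init__(self, revs, bundled_count):
        self.revs = revs
        self.bundle = revs[:bundled_count]   # the hosted snapshot

    def getbundle(self, have, can_apply_remote):
        # First contact from an empty clone: point at the hosted bundle.
        if can_apply_remote and not have and self.bundle:
            return {"kind": "url", "url": self.bundle}
        # Otherwise answer with an ordinary incremental changegroup.
        return {"kind": "changesets", "changesets": self.revs[len(have):]}

def fetch_and_apply(url):
    # Stand-in for downloading and unbundling the file behind the URL.
    return list(url)

def clone_with_restart(server):
    local = []
    while True:
        reply = server.getbundle(have=local, can_apply_remote=True)
        if reply["kind"] == "url":
            local = fetch_and_apply(reply["url"])  # disconnect, apply, start over
            continue
        local.extend(reply["changesets"])          # normal incremental pull
        return local

srv = StubServer(["r0", "r1", "r2", "r3"], bundled_count=2)
assert clone_with_restart(srv) == srv.revs
```

Variant #2 differs mainly in that the bundle manifest is fetched up front via a dedicated command, before "getbundle" is ever issued.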

Pros:

Cons:

Clone Bundle Proposal

Serving bundles with partial repository content is more complicated than serving a snapshot of an entire repository. So, the initial proposal for serving from static, pre-generated bundles will focus on bootstrapping clones (not subsequent pulls).

Mozilla has implemented support for static bundle serving using this strategy and typically serves >1TB/day using this model, saving hundreds of hours of CPU time on its servers. It is implemented as a Mercurial extension called bundleclone that is installed on both the client and server. The proposal that follows is inspired by and very similar to Mozilla's solution.

The server advertises a "clonebundles" capability indicating that it has the potential to serve snapshots of entire repository data suitable for bootstrapping clones.

When clients call the "clonebundles" wire protocol command, they receive a manifest of available "clone bundles." Each entry contains a URL and optional key-value metadata. Manifests look something like this:

https://hg.cdn.mozilla.net/mozilla-central/d6ea652c579992daa9041cc9718bb7c6abefbc91.gzip.hg REQUIRESNI=true TYPE=HG10GZ
https://hg.cdn.mozilla.net/mozilla-central/d6ea652c579992daa9041cc9718bb7c6abefbc91.bzip2.hg REQUIRESNI=true TYPE=HG10BZ
https://s3-us-west-2.amazonaws.com/moz-hg-bundles-us-west-2/mozilla-central/d6ea652c579992daa9041cc9718bb7c6abefbc91.gzip.hg TYPE=HG10GZ ec2region=us-west-2
https://s3-external-1.amazonaws.com/moz-hg-bundles-us-east-1/mozilla-central/d6ea652c579992daa9041cc9718bb7c6abefbc91.gzip.hg TYPE=HG10GZ ec2region=us-east-1

Metadata with UPPERCASE keys is reserved for Mercurial's usage. These keys describe officially supported attributes, such as bundle type, compression, required bundle2 part support, etc. The example above uses REQUIRESNI, which tells clients that Server Name Indication (SNI) is required to fetch the URL (SNI isn't supported on Python < 2.7.9).

Lowercase keys can be used by site deployments for custom operation. In the above example, our site operator has indicated which EC2 region a file is hosted in.
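The manifest format is line-oriented: a URL followed by whitespace-separated KEY=VALUE attributes. A minimal parser sketch (not the actual clonebundles implementation):

```python
def parse_clonebundles_manifest(text):
    """Parse clone bundles manifest text into (url, attrs) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split()
        url, attrs = fields[0], {}
        for field in fields[1:]:
            # Attributes are KEY=VALUE; keys without '=' get an empty value.
            key, _, value = field.partition("=")
            attrs[key] = value
        entries.append((url, attrs))
    return entries

manifest = (
    "https://hg.cdn.mozilla.net/mozilla-central/"
    "d6ea652c579992daa9041cc9718bb7c6abefbc91.gzip.hg "
    "REQUIRESNI=true TYPE=HG10GZ\n"
)
entries = parse_clonebundles_manifest(manifest)
assert entries[0][1] == {"REQUIRESNI": "true", "TYPE": "HG10GZ"}
```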

Cloning from pre-generated bundles is transparently supported as part of hg clone operations (actually as part of the generic exchange.pull() code path). If a server advertises bundles, the client automatically fetches the manifest and chooses an appropriate bundle to seed the clone from. If one is available, the client fetches the URL, applies its content, then performs an incremental pull to retrieve content missing from the bundle (bookmarks, phases, etc. in the non-bundle2 case) and data produced since the bundle was created (new pushes may not yet be in the advertised bundles).

The aforementioned key-value metadata is used not only for client-side filtering of compatible entries, but also as a primitive form of content negotiation. Clients can express preferences for which attributes and values to favor over others. For example, a client that knows it is on a fast network could request a "streaming clone" bundle instead of a gzipped one, trading more network utilization for lower CPU (and presumably yielding a faster clone in the process).
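That negotiation can be modeled as sorting entries so those matching the client's ordered preferences come first (a sketch only; the TYPE values and the exact ranking rule here are illustrative, not the real client logic):

```python
def sort_by_preference(entries, prefers):
    """entries: list of (url, attrs). prefers: ordered (key, value)
    pairs, most preferred first. Stable sort keeps manifest order
    among equally ranked entries."""
    def score(entry):
        _, attrs = entry
        # Lower score sorts first; matching an earlier preference wins.
        for rank, (key, value) in enumerate(prefers):
            if attrs.get(key) == value:
                return rank
        return len(prefers)
    return sorted(entries, key=score)

entries = [
    ("https://example.com/full.gzip.hg", {"TYPE": "HG10GZ"}),
    ("https://example.com/full.stream.hg", {"TYPE": "STREAM"}),
]
# A client on a fast network prefers the (hypothetical) streaming type.
ranked = sort_by_preference(entries, [("TYPE", "STREAM")])
assert ranked[0][0].endswith("stream.hg")
```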

The default implementation of the server-side extension will simply serve the manifest from a file on disk. However, more advanced implementations could be more dynamic. In the example above, clients performed the content negotiation, but nothing stops a server operator from emitting a manifest dynamically based on the client. For example, IP detection could be used to advertise the URL in closest physical proximity to the client.
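A server doing such per-client rewriting might reorder entries by the region inferred from the client's IP (a sketch; the IP-to-region lookup is not shown, and the URLs and ec2region values are illustrative):

```python
def manifest_for_client(entries, client_region):
    """Put entries hosted in the client's region first, leaving other
    entries in manifest order as fallbacks."""
    def key(entry):
        _, attrs = entry
        return 0 if attrs.get("ec2region") == client_region else 1
    return sorted(entries, key=key)  # stable sort preserves fallbacks

entries = [
    ("https://us-west-2.example.com/bundle.hg", {"ec2region": "us-west-2"}),
    ("https://us-east-1.example.com/bundle.hg", {"ec2region": "us-east-1"}),
]
ranked = manifest_for_client(entries, "us-east-1")
assert ranked[0][1]["ec2region"] == "us-east-1"
```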

Non-Experimental Blockers

clonebundles is currently marked as experimental and needs to be enabled in order for clients to use it. The following need to be addressed before it is enabled by default.



StaticBundlePlan (last edited 2016-01-07 00:35:18 by GregorySzorc)