Wire protocol futures

Tue Sep 18 04:20:44 EDT 2018

I liked this post very much, it gave me a lot of information about
Mercurial.
Thanks for writing it up :)
I wish if I could help in achieving this goal of a new wire protocol.

On Sat, Sep 1, 2018 at 4:17 AM Gregory Szorc <gregory.szorc at gmail.com>
wrote:

> This is a long post and I apologize in advance for that. I've been
> spending a lot of company-sponsored time on the wire protocol and storage
> this year in order to get partial clones in a place where Mozilla can start
> using them heavily. I realized that I haven't done a good job articulating
> the overall vision for that work and I wanted to write up a
> semi-comprehensive brain dump of things that are on my mind and changes
> that I plan to send out for review in the next few weeks. Despite the
> length of this post, it isn't a comprehensive brain dump: I'm excluding
> details about storage refactorings for example. This post focuses mostly on
> the wire protocol.
>
> Back in the 4.6 cycle, I started work on a ground-up rewrite of the wire
> protocol. The overarching goal behind this work was implementing partial
> clone in an as-optimal-as-possible manner. (Partial clone is the ability
> for clients to have a subset of files and/or a subset of history, where
> history applies to file, manifest, and even changesets).
>
> I looked at existing implementations of partial clone (namely
> remotefilelog and narrow extensions) and saw what I perceived to be
> sub-optimal decisions on account of limitations in the existing wire
> protocol. I also saw limitations in the existing wire protocol that made
> building out scalable and future-proof servers difficult. Here is a partial
> list of problems with the existing wire protocol and command set (in no
> particular order):
>
> * The HTTP and SSH transports vary drastically and require substantially
> different client and server code paths to handle.
> * The SSH transport has no outer "framing" protocol and this makes it
> nearly impossible to have nice things (like compression of arbitrary
> payloads).
> * Lack of binary safety in command arguments (requires transferring e.g.
> hex nodes instead of binary, which adds overhead).
> * Usage of bespoke encoding formats for data. We needed to roll our own
> encoders and decoders for every data structure. This adds complexity and
> limits future changes since you can't easily add fields without breaking
> backwards compatibility. (See bundle2.)
> * Server-side command processing limited to a single thread (tolerable for
> Python due to GIL, not so much for custom servers that can implement
> multithreading more easily).
> * Lack of side-channels for representing progress events on server.
> * Server-emitted messages were translated before being sent to the client
> (bad UX experience for non-English speakers).
> * Clone bundles and pull bundles are added features that clients need to
> know about rather than being supported by the command protocol itself (more
> on this later).
> * Not possible to batch issue any set of commands (some commands could be
> batched, others couldn't).
> * Not obvious how to transition away from SHA-1.
> * Not obvious how to transition an existing repo from flat manifests to
> tree manifests.
> * Auth / access control outside of Mercurial difficult to implement.
> * Clients could not issue thousands of commands efficiently.
> * Difficult to implement server-side caching.
> * linkrev/linknode adjustment is performed on the server and not very
> extensible.
> * Data access model geared heavily towards full clones. i.e. no low-level
> APIs for accessing specific data, like just a single revision of a single
> file or just the index shape of the changelog.
> * General lack of typing and strong specification of semantics, data
> formats, etc.
>
> I postulate that many of the design decisions around the current set of
> wire protocol commands stem from limitations in the existing wire protocol
> transport format. And that transport format and general repository
> cloning/pull strategy has changed little since ~2006.
>
> For example, because we don't have a unified transport format, things like
> compression are inconsistent between SSH and HTTP and need to be dealt with
> in arcane ways. Performance and code maintainability suffers.
>
> For example, because we can't issue thousands of commands efficiently, we
> build monolithic commands (like "getbundle") that transfer many pieces of
> data. And a side-effect of monolithic commands is increased server-side
> complexity. And that makes implementing alternate servers more difficult.
> And it undermines caching potential. And it makes it more difficult to
> implement things like resumable clone.
>
> I wanted to build wire protocol transport and command layers that would
> give us the flexibility to start from first principles and implement data
> exchange on our own terms, using the knowledge that we've accrued in the
> 10+ years of the project and the cumulative decades of version control
> experience that various contributors have accrued. This means designing a
> wire protocol transport and command layer that facilitates server scaling,
> fast data access, and future changes (both from extensions and core
> changes). In my mind, this translates to the following set of requirements:
>
> * Keeping commands simple. This will make the server simple and make it
> easier to implement alternate servers.
> * Making commands deterministic and idempotent (to facilitate aggressive
> caching).
> * Supporting parallel serving and consumption with minimal overhead (to
> enable clients/servers not restricted by the GIL to go as fast as possible).
> * Extensible compression formats and ability to have fine-grained control
> over compression.
> * Providing granular access to data to facilitate multiple clone /
> checkout modes (e.g. a `svn co` style model for CI where the "clone"
> contains files for a single revision and not much more).
> * Support for out-of-band response serving built into the protocol itself
> (basically clonebundles but for any command).
> * And more.
>
> Later in this post, I'll go into details of what I've built so far and
> what is yet to come. But first, some history.
>
> In Mercurial today, the "getbundle" wire protocol command is used to
> transfer most repository data from server to client.
>
> In Mercurial 1.0, "getbundle" transferred a changegroup. A changegroup is
> a data structure containing segments of data corresponding to the
> changelog, manifestlog, and filelogs. This data contains "index" data
> (describes the DAG shape and linknodes) and "revision" data (describes
> fulltexts - usually as deltas). Essentially, a changegroup encapsulates
> revlog data. The initial version of bundle files were essentially
> changegroups.
>
> In Mercurial 1.6, the "listkeys" command was added. This command was used
> to transfer data not in revlogs/changegroups, such as bookmarks and phases.
>
> There were problems with this approach. Notably:
>
> * Server state could mutate between command requests, causing clients to
> have inconsistent or invalid data for bookmarks, phases, or anything else
> not transferred by changegroup data.
> * `hg bundle` didn't record all data necessary to express repository state.
>
> Mercurial 3.4 introduced bundle2 to solve these problems and more. Bundle2
> is a generic container format and therefore allows extensible storage via
> part names. When new data types are introduced, we invent a new bundle2
> part for them. The payload of each bundle2 part is defined by that part.
> i.e. we need to invent encoders and decoders for each part.
>
> At the wire protocol level, bundle2 shoehorned itself into the "getbundle"
> wire protocol command. If the client passed certain arguments into the
> command, the server would emit a bundle2 bundle instead of changegroup data.
>
> Over time, bundle2 kept growing. The wire protocol exchange and
> capabilities negotiation kept getting more complicated. (And that is
> arguably OK: that's the nature of an ever-changing system with backwards
> and future compatibility constraints.)
>
> At this time, all meaningful repository data can be transferred from
> server to client via "getbundle" with a bundle2 payload. From an end-user
> perspective, things are great because all data is retrieved atomically and
> standalone bundle files can hold all repository data.
>
> But on a technical level, things are not so great.
>
> In terms of data retrieval, there is effectively a single, monolithic
> server-side command: "getbundle." It's a "god RPC." And on the push side,
> the "unbundle" command is in a very similar boat as "getbundle." And
> limitations in the existing wire protocol transports makes it more
> difficult than it should be to introduce new commands.
>
> Various parties want to implement partial clone in Mercurial. The
> remotefilelog (RFL) and narrow extensions have both done this to some
> degree. But they did so building on top of the existing wire protocol
> transports. And in the case of narrow, it is built on top of the existing
> command set - namely "getbundle" and bundle2 (RFL introduces new wire
> protocol commands for transferring just file data).
>
> Let's talk about these in more detail.
>
> On the server, narrow burrows itself into the bowels of "getbundle" and
> bundle/changegroup generation. It introduces command arguments to allow
> clients to specify what files they are interested in, which nodes have been
> retrieved, etc. The server then takes all of this into account and adjusts
> the set of returned data accordingly. And there is a lot of code and
> complexity involved. And a lot of it is on the server. This makes servers
> more difficult to implement and harder to scale.
>
> Remotefilelog takes a different approach. RFL introduces new wire protocol
> commands for retrieving just file data. There is a command for retrieving
> the fulltext of just a single file. There is a command for bulk retrieval
> of file data (you essentially give it an iterable of paths and nodes and it
> spits out a changegroup-like data structure containing "index" and
> "revision" data for all of them). And RFL changes how clone/pull works.
> Instead of a single call to "getbundle" to retrieve all of the data, it
> requests just the changeset and manifest data first then follows up with
> calls to the RFL commands for file data retrieval.
>
> When I think about ways to implement partial clone, one theme that keeps
> worrying me is scalability. We already have problems scaling Mercurial
> servers. Clone and pull bundles are terrific solutions (as I wrote at [1],
> clone bundles are offloading ~97% of bytes served from hg.mozilla.org).
> But, these solutions work best with full clones, when the set of retrieved
> data is known ahead of time and can be pre-generated. Partial clones
> invalidate this world: it is no longer possible ahead of time to know
> exactly what data will be requested. And even if you did, for high velocity
> (commit rate) repos, the set of data being retrieved will be highly
> dynamic, making pre-generated bundle files prohibitively difficult to
> implement.
>
> This means that partial clone necessitates more traditional caching. (e.g.
> transparent caching of any wire protocol command response backed by an LRU
> store). But because "getbundle" is a monolithic, complicated, and an
> ever-evolving command, I have my doubts that caching of this command is
> feasible. Yes, it is certainly doable, but at high
> implementation/maintenance expense and high chance of introducing caching
> bugs. In the existing "getbundle" world, your best bet to caching is
> probably caching of the data that is inserted into the generated payload
> (e.g. caching of revision fulltexts and deltas). Unfortunately, this means
> the Mercurial server is still incurring a lot of load to assemble data and
> send it out over the wire (this includes compression). Even with partial
> clones potentially reducing server load dramatically due to having to
> transfer less data, at certain scales, even this reduced load is highly
> problematic. So, I think it is imperative to consider server scaling when
> talking about partial clone and I think wholesale caching of entire command
> responses is necessary in order to achieve it.
>
> With this mindset, I started exploring a data retrieval command set
> starting from first principles.
>
> At its core, Mercurial is a content-indexed store. It isn't as generic as
> Git (where every object is inserted into the same namespace). But it is
> close. Mercurial segments content-indexed data by changesets,
> manifest-trees, and files. (And if I had my way we would store metadata
> like bookmarks and phases in a similar manner and then have a "repolog"
> pointing to the list of head revisions and content-indexed bookmarks,
> phases, data etc so we could view the state of a repo at any past point in
> time.)
>
> Instead of a monolithic "getbundle" command that retrieved data for all of
> these things (plus metadata associated with changesets), what if we took a
> remotefilelog approach and provided APIs for accessing individual pieces of
> data? What if we had a command for accessing changeset data, a command for
> accessing manifest data, and a command for accessing file data? e.g. what
> if we had commands that accepted a list of explicit nodes or lists of base
> and head nodes and returned data about the corresponding revisions. How
> would that change things?
>
> For starters, having such a set of commands is substantially more flexible
> than where we are today with "getbundle." By giving clients granular access
> to data, you empower clients to devise new ways of consuming that data. For
> example, one could build a Subversion-like checkout feature (only fetch
> data for a specific revision of the repository) without any new features on
> the server! Given a changeset hash, you could fetch that changeset revision
> using a "get changeset data" command, find its manifest revision, fetch
> that manifest using a "get manifest data" command, then fetch corresponding
> file revisions using a "get file data" command. Other tools may also wish
> to leverage such APIs. For example, git-cinnabar (a Git extension that
> allows Git to push and pull against Mercurial repositories by speaking the
> Mercurial wire protocol) could have direct access to data (instead of going
> through "getbundle"/bundle2) and this would make it easier to import
> Mercurial data. (And probably more robust too because of issues like [2].)
>
> Another benefit would be a simpler server. Having granular and
> well-defined commands for accessing repository data would make it
> drastically simpler to implement a server, including custom servers (like
> Mononoke). You wouldn't need to implement the full spectrum of bundle2 and
> all its semantics via "getbundle." You would essentially have a pile of
> data retrieval commands. And, it would probably be relatively easy to plug
> non-revlog storage into the server at that point.
>
> And if done correctly, simple data retrieval commands with well-defined
> semantics would lend themselves to aggressive caching. For example, a "get
> revision data for file P at revisions [X, Y, Z]" can be cached almost
> effortlessly, since file revision state is immutable (modulo censoring). It
> would be possible to build pass-through caching of the entire command
> response. This would eliminate a ton of server load and make servers vastly
> easier to scale.
>
> If we limit ourselves to simple data retrieval commands on the server, it
> changes the "architecture" of clone/pull substantially. In the "getbundle"
> world, the server is doing a lot of work. The first thing it does is figure
> out what changeset revisions to send. Then it finds new manifests
> associated with those changesets. Then it finds new file revisions
> associated with those manifests. It accumulates all this state and streams
> all that data. If all you have is simple data retrieval commands, most of
> this work shifts to the client. This does have its advantages.
>
> Again, an advantage is complexity is moved from server to client. This
> keeps servers simple and easier to implement, debug, and scale.
>
> One component that shifts to clients is link nodes. Link nodes (or
> linkrevs since they are stored as an integer in revlogs) are pointers to
> the first changeset that introduced a revision. They allow you to go from
> e.g. an arbitrary file node to a changeset very efficiently. Because we
> index each file separately and because each file revision has a pointer to
> a changeset, we can look at the history of an individual file and map that
> history back to changesets without having to scan all changesets or open
> manifests. Link nodes have their own problems in the presence of hidden
> changesets. But in the context of the wire protocol today, the server is
> computing link nodes as part of emitting revision data. This model kind of
> falls apart in a partial clone world because the server doesn't know what
> changesets the client has. The client is in the best position to determine
> what changeset a file revision should be linked to. Anyway, if you are only
> using simple data retrieval commands, this problem of file node mapping
> (and the corresponding problem of adjustment that arises when hidden
> changesets are in play) can be fully shifted to the client: the client can
> keep track of which changeset/manifest introduced a file revision as part
> of its file node discovery process and set the link node accordingly.
>
> Another component that largely shifts to clients is "narrow" logic. In the
> narrow extension today, the client tells "getbundle" what file patterns are
> relevant and what nodes it already has and the server has to do a lot of
> work around determining what revisions to send. If all you have is
> primitive data retrieval APIs, you would probably add a "path filter"
> argument to the "get changeset data" command, retrieve the relevant
> changesets, then incrementally retrieve manifests and files revisions until
> you have all the data you need. This drastically reduces the server-side
> complexity and cost of narrow.
>
> Another problem that seemingly becomes simpler is large file storage. I
> argue that largefiles and LFS today is effectively a hack to facilitate
> non-partial clones despite the presence of large files. We store and
> transfer flagged large files specially. But if your method of accessing
> files data is through a dedicated "get file data" command, when you squint
> hard enough you realize that this is logically very similar to "all files
> are using largefiles/LFS." This leads to questions like "if we have a
> dedicated 'get file data' API, why do we need a special store / endpoint
> for large files?" And if we communicate the sizes of files before file data
> is retrieved or don't transfer revision data over a size threshold unless
> the client asks, this puts clients in the driver's seat about whether to
> fetch large files revisions. We could implement all the benefits of
> largefiles / LFS without it having to be a feature that repositories and
> servers opt in to! i.e. clients could dynamically apply special storage
> settings on large file revisions as they see fit.
>
> But this architectural shift would have its disadvantages.
>
> Assuming you only have primitive data retrieval commands, you are now
> issuing a lot more commands. This introduces the potential for receiving
> non-atomic state - a regression from "getbundle"/bundle2. This introduces
> more round trips to the server, which could add significant overhead. If
> sending thousands of command requests, this could contribute significant
> overhead for both client and server. Your Mercurial server could be
> processing >10,000x more commands than before relatively easily!
>
> If clients must specify the nodes of all requested data, this requires
> clients to transfer nodes up to the server. Many network connections have
> limited upload bandwidth and such uploads could make data retrieval
> extremely slow.
>
> If clients need to scan manifests to find new file revisions so they can
> be retrieved explicitly, this will add considerable client-side overhead.
> (Today, changegroup generation cheats by using linkrevs to determine what
> file revisions to send and this is considerably faster than reading and
> walking manifests.)
>
> While there are disadvantages to a completely primitive set of data
> retrieval commands, having this set of commands (fetch changeset, manifest,
> and file data) offers a host of benefits. If nothing else, merely having
> the commands will foster client-side experimentation because pretty much
> any data retrieval strategy can be derived from this set of primitives.
>
> So, I will soon be sending patches that implement the new commands:
> "changesetdata," "manifestdata," "filedata." These commands allow the
> retrieval of data for individual changeset, manifest, and file revisions.
> And "data" here is a very loose term. The commands are all designed such
> that the client specifies exactly what "fields" to retrieve. Example fields
> include "parents" and "revision" to fetch the parent nodes and revision
> fulltext, respectively. This allows a client to request just the DAG/index
> data or just the revision data. And on "changesetdata," the fields
> "bookmarks" and "phases" are also recognized and result in the
> corresponding data being attached to relevant changeset revisions. Allowing
> the set of retrieved data to be dynamic introduces flexibility in clients.
> Clients could e.g. retrieve and store index data for everything while
> lazily fetching revision data on demand. We could also do things like
> expose new data primitives easily. For example, "changesetdata" could grow
> a "filechanges" field that returned a list of manifest mutations/diff in
> that changeset. This could allow bypassing the need to transfer and store
> manifest revisions explicitly. I believe this design to be similar to and
> compatible with Mononoke's concept of "bonsai changesets."
>
> I will also be sending patches that implement clone/pull using these new
> commands.
>
> While my initial experimentation with a totally overhauled set of commands
> for facilitating clone/pull is very promising, it's only a start. The
> simple commands as implemented are too simple and there's too much
> overhead. Full clones are substantially slower and the client has to do a
> lot of work and transfer a lot of data to the server. It is obvious we will
> need to supplement these basic commands with either specialized commands or
> special query modes. e.g. we likely want a way to request file revision
> data for multiple files in a given changeset or manifest rather than having
> to request the revision data for each file separately. At the end of the
> day, the wire protocol command set will be driven by practical needs, not
> by ivory tower architecting. We'll see what shortcuts we need to employ in
> the name of performance and we'll implement them.
>
> Let's talk a bit about performance.
>
> In the 4.6 release cycle, I started implementing a new wire protocol
> transport format. The overarching goal here was to devise an RPC protocol
> that was consistent across transports (namely SSH and HTTP) and had
> desirable scaling characteristics. The protocol is far from finished and
> will likely change substantially before it is marked as non-experimental.
> But it is already delivering on some of its promises with the new data
> access commands I described above. For example, instead of issuing N HTTP
> requests to invoke the "filedata" command N times, we can send 10,000 file
> data requests in a single HTTP request. This drastically cuts down on
> overhead. Any command using this wire protocol can be batched. Whereas the
> existing wire protocol pushes us towards monolithic commands due to wire
> protocol overhead, the new wire protocol allows us to have more, smaller
> commands with minimal overhead.
>
> One of the aces up my sleeve in the new wire protocol is support for
> "content redirects" for any command. Essentially, it will be clone/pull
> bundles built into the RPC protocol itself. The server will advertise a
> list of potential redirect targets. When the client makes a request, it
> will tell the server which redirect targets are appropriate. Then in the
> course of processing a request, the server can send a response that
> redirects the client to another location. For example, client A could make
> a request for "all revision data for all files in changeset X." The server
> will generate the response data for that request and simultaneously stream
> it to both the client and to a blob store, say Amazon S3. A CDN is
> configured to access that S3 bucket and the Mercurial server advertises the
> CDN as a "redirect target." Client B comes along and makes the same request
> for file data, advertising that the CDN is an appropriate "redirect
> target." The Mercurial server sees that there is a cached response to this
> command in S3 and it tells the client "fetch the response from this
> CDN-hosted URL."
>
> I plan to make aggressive caching and content redirects 1st class citizens
> in the new RPC protocol and server implementation. I want it to be possible
> to cache the results of commands by adding a one-liner to the Python
> decorator declaring the wire protocol command. I want there to be a simple
> caching interface so that extensions can implement their own caching
> providers. I want server operators to be able to add "CDN acceleration" to
> their Mercurial servers by activating an extension and adding <10 lines to
> an hgrc file. Put another way, I want to make it as easy as possible to
> scale Mercurial servers. I don't want to hear stories about companies
> complaining how resource intensive running their Mercurial server is. If
> the ideas I have are implemented, I'm pretty certain we'll be able to
> deliver on that promise. (And, yes, I'm considering the needs of private
> organizations who will want things like access control on their
> cache/content store.)
>
> The new wire protocol and proposed command set represents a massive
> change. There is absolutely no backwards compatibility. I believe Kevin
> said something like "the wire protocol defines the interchange format of a
> VCS and therefore it *is* the VCS: so any new wire protocol is tantamount
> to inventing a new VCS." There is truth to that statement. And I fully
> recognize that this work could be characterized as inventing a new VCS. It
> will be the first new VCS that Mercurial invented since bundle2 :) But,
> having spent a lot of time thinking about the wire protocol, it is obvious
> to me that the existing wire protocol is a liability to the future of the
> project. I postulate that if we had a well-designed wire protocol with
> flexible data retrieval commands, partial clone would have shipped years
> ago. As it stands, I think we've incurred years of people time devising
> partial and somewhat hacky solutions that work around limitations in the
> existing wire protocol and command set and the architecture it forces us to
> have. I believe a new wire protocol and command set will alleviate most of
> these road blocks and allow us to have much nicer things.
>
> Since we are effectively talking about a new VCS at the wire protocol
> level, let's talk about other crazy ideas. As Augie likes to say, once we
> decide to incur a backwards compatibility break, we can drive a truck
> through it.
>
> Let's talk about hashes.
>
> Mercurial uses SHA-1 for content indexing. We know we want to transition
> off of SHA-1 eventually due to security weaknesses. One of the areas
> affected by that is the wire protocol. Changegroups use a fixed-width 20
> byte field to hold node values. That means we need to incur some kind of BC
> break in order to not use SHA-1 over the wire protocol. That's either
> truncating a longer hashing algorithm output to 20 bytes or expanding the
> fixed-width field to accommodate a different hash (likely 32 bytes). Either
> way, it requires a BC break because old clients would barf if they saw data
> with the new format.
>
> In addition, Mercurial has 2 ways to store manifests: flat and tree.
> Unfortunately, any given repository can only use a single manifest type at
> a time. If you switch manifest formats, you change the manifest node
> referenced in the changeset and that changes the changeset hash.
>
> The traditional way we've thought about this problem is incurring some
> kind of flag day. A server/repo operator makes the decision to one day
> transition to a new format that hashes differently. Clients start pulling
> the new data for all new revisions. Every time we talk about this, we get
> uncomfortable because it is a painful transition to inflict.
>
> I think we can do better.
>
> One of the ideas I'm exploring in the new wire protocol is the idea of
> "hash namespaces." Essentially, the server's capabilities will advertise
> which hash flavors are supported. Example hash flavors could be
> "hg-sha1-flat" for flat manifests using SHA-1 and "hg-blake2b-tree" for
> tree manifests using blake2b. When a client makes a request, that request
> will be associated with a "hash namespace" such that any nodes referenced
> by that command are in the requested "hash namespace."
>
> This feature, if implemented, would allow a server/repository to index and
> serve data under multiple hashing methodologies simultaneously. For
> example, pushes to the repository would be indexed under SHA-1 flat, SHA-1
> tree, blake2b flat, and blake2b tree. Assuming the server operator opts
> into this feature, new clones would use whatever format is
> supported/recommended at that time. Existing clones would continue to
> receive SHA-1 flat manifests. New clones would receive blake2b tree
> manifests. No forced transition flag day would be required. Server
> operators could choose to keep around support for legacy formats for as
> long as they deemed necessary. And the "changesetdata" command I'm
> proposing could allow querying the hashes for other namespaces, allowing
> clients to map between hashes.
>
> I think "hash namespaces" are important because they provide future
> compatibility against any format changes. We already have an example of a
> hash algorithm change (SHA-1) and a data format change (flat versus tree
> manifests). But there are other future changes we may not know of. For
> example, we may decide to change how files are hashed so copy metadata
> isn't part of the hash. Or we may choose to express manifest diffs as part
> of the changeset object and do away with manifests as a content-indexed
> primitive. These would all necessitate a new "hash namespace" and I think
> having the flexibility to experiment with new formats and hashing
> techniques will ultimately be good for the long-term health of Mercurial.
>
> There's also a potentially killer feature that could be derived from "hash
> namespaces:" Git integration. We know that it is possible to perform
> bi-directional conversions between Mercurial and Git. One could envision a
> "hash namespace" that stores Git hashes. When a push comes in, we could
> compute the Git hashes for its files (blobs), manifests (trees), and
> changesets (commits). Using the low-level "changesetdata," "manifestdata,"
> and "filedata" commands, you could request revision data by Git hash. Or
> you could request the Git hash from a Mercurial hash or vice-versa. From
> here, you could build a Git client that speaks the Mercurial wire protocol
> to access the Git-indexed data. (I imagine git-cinnabar would do this so it
> doesn't have to perform expensive hash conversion and tracking on the
> client.) And because Mercurial's wire protocol will have things like
> "content redirects" built-in, you will get scaling out-of-the-box. In other
> words, we can make the Mercurial server a pseudo-Git server by exposing the
> Git-indexed data via Mercurial's wire protocol commands. Of course, if you
> have Git hashes for revision data, it should be possible to run the actual
> Git wire protocol server. Either of these features would go a long way
> towards ending the Mercurial vs Git holy war for server operators: we tell
> people to run a Mercurial server that maintains a Git index of the data and
> call it a day.
>
> So where are we today and where is this going?
>
> We have the basis of a new wire protocol transport in core Mercurial. It
> still needs a lot of love and will undergo several BC breaks before it
> ships as non-experimental. But that's fine for an experimental feature. The
> editor for the HTTP/2 specification has offered to provide a spec review
> when the time comes and I fully intend on taking him up on that before we
> promote the protocol to non-experimental.
>
> The client/peer interface is in a pretty good state and we can issue
> commands and handle responses for the new protocol over HTTP. It may not do
> things optimally under the hood. But it works and is usable enough that we
> can start calling into wireproto v2-only commands.
>
> I have a handful of patches queued up to remove a bunch of warts/bugs with
> the existing wire protocol version 2 code. I'll start sending those soon.
>
> I also have a handful of patches queued up to implement new wire protocol
> commands "changesetdata," "manifestdata," and "filedata." These commands
> aren't complete. But they are enough to implement clone/pull without
> "getbundle"/bundle2. Regardless of the final set of commands we need in
> order to support efficient clones (we may even port "getbundle" to wire
> protocol version 2), I'd like to get these primitive commands landed
> because all clone/pull strategies should be implementable in terms of them
> and they will make very useful arrows in our quiver.
>
> I have designs and some preliminary code for robust caching and content
> redirection on the server. I'm pretty confident in stating that it will
> work. And I'm committed to making it work, as Mozilla will want to leverage
> this feature.
>
> I have ongoing work around formalizing everything related to repository
> storage. I want to formalize interfaces for accessing the storage
> primitives. The goal here is to make it possible to implement non-revlog
> repository storage. There are benefits to both clients and servers for this
> work. On servers, I'd like it to be possible to use e.g. generic key-value
> stores for storage so we don't rely on local filesystems. On clients, I'd
> like to experiment with alternate storage that doesn't require writing so
> many files. This will help with clone times, especially on Windows. I think
> SQLite is a good place to start. But I'm open to alternatives.
>
> My goal is for 4.8 to ship a version of partial clone that we can use on
> hg.mozilla.org on our existing infrastructure. This means no substantial
> increase in server load. Since we currently offload ~97% of bytes via clone
> bundles, I'm guessing this is going to be difficult to impossible without
> transparent command caching. And I don't think we can have that with
> "getbundle" because that command is too complicated. So I really want to
> land the new commands for data access and have a mechanism in core for
> doing a partial clone with them. I would also like to land an experimental
> client storage backend that doesn't require per-file revlogs for file
> storage. All of these things can be experimental and not subject to BC: I'm
> willing to deal with that pain on Mozilla's end until things stabilize
> upstream. I don't expect any of this work will stabilize before a release
> or 2 into 2019 anyway.
>
> If you want to help, there are tons of ways to do that.
>
> Foremost, if you have feedback about this post, say something! I'm
> proposing some radical things. People should question changes that are this
> radical! I think I've demonstrated or will demonstrate some significant
> value to this work. But just because you can do a thing doesn't mean you
> must do a thing.
>
> There is no shortage of work around adding interfaces to storage and
> refactoring storage APIs so they aren't revlog specific. There are entire
> features like bundlerepo, unionrepo, repair, and repo upgrading that make
> heavy assumptions about the existence of revlogs and current file formats.
> Auditing the existing interfaces in repository.py and removing things that
> don't belong would also be a good use of time. While I've been focused on
> the revlog primitives so far, we will also need to add interfaces for
> everything that writes to .hg/. e.g. bookmarks, phases, locks, and
> transactions. We need to figure out a way to make these things code to an
> interface so implementation details of the existing .hg/ storage format
> don't bleed out into random callers. The tests/simplestorerepo.py extension
> implements things with custom storage and running the tests with that
> extension flushes out places in code that make assumptions about how
> storage works.
>
> Once the new wire protocol commands and exchange code lands, I'll need
> help adding features to support partial clone. There are still some parts I
> don't fully grok, such as ellipsis nodes and widening/narrowing. I /think/
> my radical shift of pull logic from server to client makes these problems
> more tractable. But I don't understand the space as well as others.
>
> If you have a wish list for other features to add to the wire protocol,
> now would be the time to say something.
>
> When the time comes, I think it would be rad to experiment with the
> multiple hash storage ideas I outlined above. I'd be particularly
> interested in multi-storage of flat and tree manifests as well as Git
> indexing of revisions. Both features would be very useful for Mozilla.
>
> Whew. That was a long email. And I didn't even say everything I could have
> said on the subject. If you made it here, congratulations. Hopefully you
> now have a better understanding of the work I'm doing and where I hope this
> all leads. If you want to help, ping me here or on IRC (I'm indygreg) and
> we'll figure something out.
>
> Gregory
>
> [1]
> https://gregoryszorc.com/blog/2018/07/27/benefits-of-clone-offload-on-version-control-hosting/
> [2] https://github.com/glandium/git-cinnabar/issues/192
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20180918/4e26dc97/attachment.html>