Wire protocol futures

Augie Fackler raf at durin42.com
Thu Sep 13 17:59:45 EDT 2018


I have one specific question (search for "QUESTION" below) that might result in a short-term action item for me; the rest is mostly commentary I think you could anticipate from me. :)

> On Aug 31, 2018, at 18:47, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> 

[...]

> The new wire protocol and proposed command set represents a massive change. There is absolutely no backwards compatibility. I believe Kevin said something like "the wire protocol defines the interchange format of a VCS and therefore it *is* the VCS: so any new wire protocol is tantamount to inventing a new VCS." There is truth to that statement. And I fully recognize that this work could be characterized as inventing a new VCS. It will be the first new VCS that Mercurial invented since bundle2 :) But, having spent a lot of time thinking about the wire protocol, it is obvious to me that the existing wire protocol is a liability to the future of the project. I postulate that if we had a well-designed wire protocol with flexible data retrieval commands, partial clone would have shipped years ago. As it stands, I think we've incurred years of people time devising partial and somewhat hacky solutions that work around limitations in the existing wire protocol and command set and the architecture it forces us to have. I believe a new wire protocol and command set will alleviate most of these road blocks and allow us to have much nicer things.
> 
> Since we are effectively talking about a new VCS at the wire protocol level, let's talk about other crazy ideas. As Augie likes to say, once we decide to incur a backwards compatibility break, we can drive a truck through it.

+1

> Let's talk about hashes.
> 
> Mercurial uses SHA-1 for content indexing. We know we want to transition off of SHA-1 eventually due to security weaknesses. One of the areas affected by that is the wire protocol. Changegroups use a fixed-width 20 byte field to hold node values. That means we need to incur some kind of BC break in order to not use SHA-1 over the wire protocol. That's either truncating a longer hashing algorithm output to 20 bytes or expanding the fixed-width field to accommodate a different hash (likely 32 bytes). Either way, it requires a BC break because old clients would barf if they saw data with the new format.
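
(Aside: to make the truncate-vs-expand options concrete, here's a quick sketch using Python's hashlib, whose blake2b supports native digest sizes. The payload is made up; the point is just the field widths.)

```python
import hashlib

data = b"some revision content"

# Option 1: produce a modern hash at the legacy 20-byte width so it
# fits the existing fixed-width node field. blake2b supports variable
# digest sizes natively, so no manual truncation is needed.
node_20 = hashlib.blake2b(data, digest_size=20).digest()
assert len(node_20) == 20  # width-compatible with SHA-1 nodes

# Option 2: widen the field to a full 32-byte digest. Old clients
# expecting exactly 20 bytes would barf on this, hence the BC break.
node_32 = hashlib.blake2b(data, digest_size=32).digest()
assert len(node_32) == 32
```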
> 
> In addition, Mercurial has 2 ways to store manifests: flat and tree. Unfortunately, any given repository can only use a single manifest type at a time. If you switch manifest formats, you change the manifest node referenced in the changeset and that changes the changeset hash.
> 
> The traditional way we've thought about this problem is incurring some kind of flag day. A server/repo operator makes the decision to one day transition to a new format that hashes differently. Clients start pulling the new data for all new revisions. Every time we talk about this, we get uncomfortable because it is a painful transition to inflict.
> 
> I think we can do better.
> 
> One of the ideas I'm exploring in the new wire protocol is the idea of "hash namespaces." Essentially, the server's capabilities will advertise which hash flavors are supported. Example hash flavors could be "hg-sha1-flat" for flat manifests using SHA-1 and "hg-blake2b-tree" for tree manifests using blake2b. When a client makes a request, that request will be associated with a "hash namespace" such that any nodes referenced by that command are in the requested "hash namespace."
> 
> This feature, if implemented, would allow a server/repository to index and serve data under multiple hashing methodologies simultaneously. For example, pushes to the repository would be indexed under SHA-1 flat, SHA-1 tree, blake2b flat, and blake2b tree. Assuming the server operator opts into this feature, new clones would use whatever format is supported/recommended at that time. Existing clones would continue to receive SHA-1 flat manifests. New clones would receive blake2b tree manifests. No forced transition flag day would be required. Server operators could choose to keep around support for legacy formats for as long as they deemed necessary. And the "changesetdata" command I'm proposing could allow querying the hashes for other namespaces, allowing clients to map between hashes.
> 
> I think "hash namespaces" are important because they provide future compatibility against any format changes. We already have an example of a hash algorithm change (SHA-1) and a data format change (flat versus tree manifests). But there are other future changes we may not know of. For example, we may decide to change how files are hashed so copy metadata isn't part of the hash. Or we may choose to express manifest diffs as part of the changeset object and do away with manifests as a content-indexed primitive. These would all necessitate a new "hash namespace" and I think having the flexibility to experiment with new formats and hashing techniques will ultimately be good for the long-term health of Mercurial.
> 
> There's also a potentially killer feature that could be derived from "hash namespaces:" Git integration. We know that it is possible to perform bi-directional conversions between Mercurial and Git. One could envision a "hash namespace" that stores Git hashes. When a push comes in, we could compute the Git hashes for its files (blobs), manifests (trees), and changesets (commits). Using the low-level "changesetdata," "manifestdata," and "filedata" commands, you could request revision data by Git hash. Or you could request the Git hash from a Mercurial hash or vice-versa. From here, you could build a Git client that speaks the Mercurial wire protocol to access the Git-indexed data. (I imagine git-cinnabar would do this so it doesn't have to perform expensive hash conversion and tracking on the client.) And because Mercurial's wire protocol will have things like "content redirects" built-in, you will get scaling out-of-the-box. In other words, we can make the Mercurial server a pseudo-Git server by exposing the Git-indexed data via Mercurial's wire protocol commands. Of course, if you have Git hashes for revision data, it should be possible to run the actual Git wire protocol server. Either of these features would go a long way towards ending the Mercurial vs Git holy war for server operators: we tell people to run a Mercurial server that maintains a Git index of the data and call it a day.
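
(Worth noting how cheap the Git side of this is: a Git blob ID is just SHA-1 over a small header plus the content bytes. A sketch, ignoring the harder changeset/tree mapping questions and Mercurial's filelog copy metadata:)

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Compute the Git object ID for a blob, as `git hash-object` would."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

# Matches `echo hello | git hash-object --stdin`:
assert git_blob_id(b"hello\n") == "ce013625030ba8dba906f756967f9e9ca394464a"
```

A server indexing pushes under a Git namespace would compute IDs like this at push time, so clients could later query by either hash.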

Hash namespaces are very similar (at least at a high level) to an approach taken in Veracity that allowed multiple hash algorithms to live side-by-side. I don't remember the details there, but it sounded painful. I'm not saying we shouldn't do this, just that it's likely to be rough. Bonsai changesets do seem like they help, to an extent.
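
As a rough sketch of what "index the same push under several flavors" might look like (the namespace names, registry shape, and function names here are all hypothetical, not the proposed protocol):

```python
import hashlib

# Hypothetical registry of hash flavors a server could advertise in
# its capabilities. Real flavors would also encode manifest type etc.
HASHERS = {
    "hg-sha1-flat": lambda data: hashlib.sha1(data).digest(),
    "hg-blake2b-flat": lambda data: hashlib.blake2b(data, digest_size=32).digest(),
}

def index_revision(data):
    """Index one blob of revision data under every supported namespace."""
    return {ns: hasher(data) for ns, hasher in HASHERS.items()}

nodes = index_revision(b"revision payload")
# A client that requested namespace "hg-sha1-flat" is answered with
# nodes["hg-sha1-flat"]; a blake2b client gets the other entry. A
# "changesetdata"-style query could return the whole mapping so
# clients can translate between namespaces.
```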

I agree that we should at least reserve space for new hash(es) in the new format.

[...]

> I have ongoing work around formalizing everything related to repository storage. I want to formalize interfaces for accessing the storage primitives. The goal here is to make it possible to implement non-revlog repository storage. There are benefits to both clients and servers for this work. On servers, I'd like it to be possible to use e.g. generic key-value stores for storage so we don't rely on local filesystems. On clients, I'd like to experiment with alternate storage that doesn't require writing so many files. This will help with clone times, especially on Windows. I think SQLite is a good place to start. But I'm open to alternatives.
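
(For what "SQLite instead of per-file revlogs" could mean in the small, here's a sketch; the table layout is entirely invented for illustration, and a real backend would store deltas and metadata rather than fulltexts:)

```python
import sqlite3

# Hypothetical schema: one row per (path, node) holding the fulltext.
# The point is the key-value shape, which frees the client from
# writing one revlog file per tracked file.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE filedata (path TEXT, node BLOB, data BLOB, "
    "PRIMARY KEY (path, node))"
)

def put(path, node, data):
    con.execute("INSERT OR REPLACE INTO filedata VALUES (?, ?, ?)",
                (path, node, data))

def get(path, node):
    row = con.execute(
        "SELECT data FROM filedata WHERE path = ? AND node = ?",
        (path, node)).fetchone()
    return row[0] if row else None

put("foo.txt", b"\x01" * 20, b"contents of foo")
assert get("foo.txt", b"\x01" * 20) == b"contents of foo"
```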
> 
> My goal is for 4.8 to ship a version of partial clone that we can use on hg.mozilla.org <http://hg.mozilla.org/> on our existing infrastructure. This means no substantial increase in server load. Since we currently offload ~97% of bytes via clone bundles, I'm guessing this is going to be difficult to impossible without transparent command caching. And I don't think we can have that with "getbundle" because that command is too complicated. So I really want to land the new commands for data access and have a mechanism in core for doing a partial clone with them. I would also like to land an experimental client storage backend that doesn't require per-file revlogs for file storage. All of these things can be experimental and not subject to BC: I'm willing to deal with that pain on Mozilla's end until things stabilize upstream. I don't expect any of this work will stabilize before a release or 2 into 2019 anyway.
> 
> If you want to help, there are tons of ways to do that.

I've mentioned this privately, but want to state it on the list: I'm now feeling enough pain on remotefilelog upgrades that I want to figure out what the minimal viable remotefilelog looks like for us. The hard constraints I know about for us:

1) lazy-fetch file contents
2) periodically build an efficient pack of loose files (but not too often, because it'll make some of our storage layers upset: I asked whether we could just dump blobs in a SQLite database, and that would be really bad for us, so some bespoke-ish data-packing mechanism is going to be a must, sadly)
3) push works and includes everything in the bundle
4) viable migration path from existing remotefilelog (doesn't have to be in core, but has to be _doable_)

My strongly preferred approach to this would be to essentially fork and rewrite-in-place the existing remotefilelog codebase, with an eye towards being able to land it as extra-experimental. Much of what I've seen in there can be made less invasive these days, either through more targeted extensions.wrapfunction() use or through minor tweaks to core. A lot of the confusing bits appear to be layered cooperative hacks for FB's treemanifest migration code or fastannotate, or cruft that's hanging around to support older versions of hg.

QUESTION: I know you're favoring more of a "big bang" approach to a partial clone tool. Do you envision lazy-fetching files as something that's of use to you, and if so, would it be plausibly productive for me to try to produce the "cleaned up remotefilelog pseudo-fork" I describe, at least in part? I could timebox it at a day (or half a day) so it's not a ton of time investment, but I think that'd be enough to give you an idea of what the result might look like and maybe convince you the incremental approach can arrive at your desired goal while simultaneously making my life easier. ;)

> Foremost, if you have feedback about this post, say something! I'm proposing some radical things. People should question changes that are this radical! I think I've demonstrated or will demonstrate some significant value to this work. But just because you can do a thing doesn't mean you must do a thing.
> 
> There is no shortage of work around adding interfaces to storage and refactoring storage APIs so they aren't revlog specific. There are entire features like bundlerepo, unionrepo, repair, and repo upgrading that make heavy assumptions about the existence of revlogs and current file formats. Auditing the existing interfaces in repository.py and removing things that don't belong would also be a good use of time. While I've been focused on the revlog primitives so far, we will also need to add interfaces for everything that writes to .hg/. e.g. bookmarks, phases, locks, and transactions. We need to figure out a way to make these things code to an interface so implementation details of the existing .hg/ storage format don't bleed out into random callers. The tests/simplestorerepo.py extension implements things with custom storage and running the tests with that extension flushes out places in code that make assumptions about how storage works.

One thing I've noticed as I've been doing this remotefilelog upgrade is that revnums appear in too many places. Architecturally I think we can't avoid having them in the top-level changelog, but for the lower layers I think we can probably hide them without much loss of efficiency.
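
To make that concrete, the lower layers could code to a purely node-keyed interface along these lines (names invented for the sketch), leaving the changelog as the one revnum-aware layer:

```python
import abc

class filestorage(abc.ABC):
    """Hypothetical node-only interface for lower storage layers.

    Nothing here takes or returns a revnum; callers that need
    ordering go through the changelog instead.
    """

    @abc.abstractmethod
    def revision(self, node):
        """Return the fulltext for the revision identified by node."""

    @abc.abstractmethod
    def parents(self, node):
        """Return the (p1, p2) node pair for node."""

class dictstorage(filestorage):
    """Trivial in-memory backend showing the interface is revlog-free."""
    def __init__(self):
        self._revs = {}

    def add(self, node, parents, text):
        self._revs[node] = (parents, text)

    def revision(self, node):
        return self._revs[node][1]

    def parents(self, node):
        return self._revs[node][0]

store = dictstorage()
store.add(b"\x01" * 20, (b"\x00" * 20, b"\x00" * 20), b"v1")
assert store.revision(b"\x01" * 20) == b"v1"
```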

> Once the new wire protocol commands and exchange code lands, I'll need help adding features to support partial clone. There are still some parts I don't fully grok, such as ellipsis nodes and widening/narrowing. I /think/ my radical shift of pull logic from server to client makes these problems more tractable. But I don't understand the space as well as others.
> 
> If you have a wish list for other features to add to the wire protocol, now would be the time to say something.
> 
> When the time comes, I think it would be rad to experiment with the multiple hash storage ideas I outlined above. I'd be particularly interested in multi-storage of flat and tree manifests as well as Git indexing of revisions. Both features would be very useful for Mozilla.
> 
> Whew. That was a long email. And I didn't even say everything I could have said on the subject. If you made it here, congratulations. Hopefully you now have a better understanding of the work I'm doing and where I hope this all leads. If you want to help, ping me here or on IRC (I'm indygreg) and we'll figure something out.




> 
> Gregory
> 
> [1] https://gregoryszorc.com/blog/2018/07/27/benefits-of-clone-offload-on-version-control-hosting/ <https://gregoryszorc.com/blog/2018/07/27/benefits-of-clone-offload-on-version-control-hosting/>
> [2] https://github.com/glandium/git-cinnabar/issues/192 <https://github.com/glandium/git-cinnabar/issues/192>
> 
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


