Review of bundle2: other stuff

Pierre-Yves David pierre-yves.david at ens-lyon.org
Mon Dec 22 15:58:24 CST 2014



On 12/22/2014 08:17 AM, Gregory Szorc wrote:
> Here are my thoughts on some other parts of bundle2.
>
> text vs binary nodes
> --------------------
>
> Things like b2x:listkeys and b2x:pushkey are using their existing
> text-based encoding in bundle2. I'd *really* like to see these switched
> to binary. This will result in smaller payload exchanges between peers.
> ~20 bytes for hex vs bin node representation adds up across hundreds or
> thousands of heads. On slow connections, transfer overhead for hex
> encoding can be non-trivial. Transfer of binary encodings is essentially
> a free performance win. I hope we take the opportunity with bundle2 to
> realize it.
>
> I guess we can introduce binary versions of these parts later. But I
> would think it would be a low-hanging fruit to just do it from day 0 of
> bundle2.

Pushkey itself is mostly agnostic about encoding, but every existing
pushkey function uses hex. This cannot be "just" changed. The
pushkey/listkeys parts are just a simple way to include the existing
interfaces in a bundle2. The way to go is to introduce a new, more
modern part for each of these.
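
To put a number on the size difference discussed above: a Mercurial node
is a 20-byte SHA-1 hash, which takes 40 characters in hex. A rough sketch
of the arithmetic in Python follows. The head count and the helper name
node_sizes are made up for illustration, and the sketch ignores keys,
separators and any compression applied on the wire.

```python
import hashlib


def node_sizes(num_heads):
    """Return (binary_bytes, hex_bytes) needed to send num_heads nodes."""
    # Fake nodes for illustration: SHA-1 digests of arbitrary input,
    # 20 bytes each, the same size as a real node.
    nodes = [hashlib.sha1(str(i).encode("ascii")).digest()
             for i in range(num_heads)]
    binary_total = sum(len(n) for n in nodes)
    hex_total = sum(len(n.hex()) for n in nodes)
    return binary_total, hex_total


binary_total, hex_total = node_sizes(1000)
# Prints: 20000 40000 20000
print(binary_total, hex_total, hex_total - binary_total)
```

So for a thousand heads the hex encoding costs about 20 KB extra before
compression.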

> multiple purposes of bundle2
> ----------------------------
>
> As is currently implemented, bundle2 is both a repository data
> representation format and a communications protocol. We have parts like
> b2x:changegroup existing alongside parts like b2x:output.
>
> The current implementation works. But I'm not convinced it is the best
> approach. Everything seems so parts-centric and assumes parts are these
> large, self-contained, non-interruptable units of data/work. But we also
> have "side-channels" (as special parts) for communicating out-of-band
> data such as user output.

The main use case for the side-channel is exception/interruption
handling. The ability to use it for output or progress is (1) not done
yet and (2) a nice extra.
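
As a rough illustration of why a side-channel matters for interruption
handling, here is a toy framing sketch in Python. It is not the actual
bundle2 wire format: the frame layout, the INTERRUPT marker and all
function names are assumptions made up for this example. Payload chunks
are length-prefixed, and a reserved negative length lets the sender break
into the middle of a part's payload to report an error.

```python
import io
import struct

INTERRUPT = -1  # hypothetical marker: the next frame is out-of-band


def write_chunk(out, data):
    """Write one non-empty, length-prefixed payload chunk."""
    out.write(struct.pack(">i", len(data)))
    out.write(data)


def write_interrupt(out, message):
    """Write an out-of-band message in the middle of a payload."""
    out.write(struct.pack(">i", INTERRUPT))
    write_chunk(out, message)


def end_payload(out):
    """A zero-length chunk marks the end of the payload."""
    out.write(struct.pack(">i", 0))


def read_payload(inp):
    """Return (payload, out_of_band_messages) for one payload."""
    payload, oob = [], []
    while True:
        (size,) = struct.unpack(">i", inp.read(4))
        if size == 0:
            return b"".join(payload), oob
        if size == INTERRUPT:
            # Out-of-band frame: read it without touching the payload.
            (msize,) = struct.unpack(">i", inp.read(4))
            oob.append(inp.read(msize))
            continue
        payload.append(inp.read(size))


buf = io.BytesIO()
write_chunk(buf, b"chunk-1 ")
write_interrupt(buf, b"abort: disk full")
write_chunk(buf, b"chunk-2")
end_payload(buf)
buf.seek(0)
payload, oob = read_payload(buf)
print(payload, oob)
```

The receiver sees the error as soon as it arrives instead of waiting for
the end of a large part.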

> I think the concept of a stream with parts is great. However, the
> implementation of the parts is a bit worrying. I'd like to see more
> smaller, chunked parts and fewer large, monolithic parts.

Anyone is free to roll up their sleeves and implement such smaller
parts. My initial intent was to spend the minimal amount of effort
needed to get a new framework for exchanging data, then move back to my
main goal: getting changeset evolution done. The main reason we use
changegroup and pushkey is that they already existed.

> I think that if you view bundle2 as a protocol with lightweight frames
> and the ability to have multiple "channels" per connections (think
> HTTP/2), a lot more possibilities open up.

I'm not sure what we would do with such channels. In any case, the
format can be improved later. This is why we have stream-level
parameters.

> If you start going down this road, you start asking questions like "why
> is bundle2 only implemented for getbundle/unbundle - shouldn't it be a
> general framing protocol for all wire protocol commands?"

One half of the answer is: because nobody has migrated the other
commands. The other half is: because some HTTP servers will reject any
unauthenticated POST request.

-- 
Pierre-Yves David

