Review of bundle2: other stuff

Tue Dec 23 16:06:45 CST 2014

On 12/22/14 1:58 PM, Pierre-Yves David wrote:
>
>
> On 12/22/2014 08:17 AM, Gregory Szorc wrote:
>> Here are my thoughts on some other parts of bundle2.

>> multiple purposes of bundle2
>> ----------------------------
>>
>> As is currently implemented, bundle2 is both a repository data
>> representation format and a communications protocol. We have parts like
>> b2x:changegroup existing alongside parts like b2x:output.
>>
>> The current implementation works. But I'm not convinced it is the best
>> approach. Everything seems so parts-centric and assumes parts are these
>> large, self-contained, non-interruptible units of data/work. But we also
>> have "side-channels" (as special parts) for communicating out-of-band
>> data such as user output.
>
> The main use case for side-channels is exception/interruption handling.
> The ability to use them for output or progress is (1) not done yet and
> (2) a nice extra candy.
>
>> I think the concept of a stream with parts is great. However, the
>> implementation of the parts is a bit worrying. I'd like to see more
>> smaller, chunked parts and fewer large, monolithic parts.
>
> Anyone is free to roll up their sleeves and implement such smaller
> parts. My initial intent was to spend the minimal amount of effort to get
> a new framework to exchange data and move back to my main goal: getting
> changeset evolution done. The main reason we use changegroup and pushkey
> is that they pre-existed.
>
>> I think that if you view bundle2 as a protocol with lightweight frames
>> and the ability to have multiple "channels" per connection (think
>> HTTP/2), a lot more possibilities open up.
>
> I'm not sure what we would do with such channels. In any case, we could
> improve the format later. This is why we have stream-level parameters.
>
>> If you start going down this road, you start asking questions like "why
>> is bundle2 only implemented for getbundle/unbundle - shouldn't it be a
>> general framing protocol for all wire protocol commands?"
>
> One half of the answer is: Because nobody migrated the other ones.
> The other half is: Because some HTTP servers will reject any
> unauthenticated POST request.

First, which HTTP servers reject unauthenticated POSTs? How do these
servers handle HTML's <form>? Must Mercurial support obviously broken 
HTTP servers? While I'm here, have we considered using the Upgrade 
functionality in HTTP to have HTTP servers talk a more SSH-like 
protocol? Once the connection is upgraded, all bets are off and "HTTP" 
means nothing.

OK, so I'm going to challenge why output and error handling are bundle2 
parts at all. I don't think these are parts: these are special frames or 
side-channels in the protocol. I *think* these parts have 0 meaning for 
offline usages of bundle2 (only relevant for wire protocol / online 
usage). Am I wrong?

I think we should split "bundle2" into a general framing protocol (for 
online usage) and a parts protocol (for offline usage). The parts 
protocol is essentially what we have today. However, we would move 
things like output and error handling and other side-channel foo into 
the framing protocol.

Splitting things this way makes a lot of problems go away. First, 
"continuation events" are just special frames. You don't need 
cooperation in the parts protocol to e.g. interleave changegroup data 
with output. Instead, you just send an extra frame when possible. This 
frame can be injected (and read) any time. No need to disrupt producers 
or consumers. I think this is much more flexible.
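
To make the split concrete, here is a rough sketch of what a framing layer
could look like. Everything in it is an assumption for illustration: the
frame types, the 5-byte header layout, and the function names are
hypothetical and are not part of bundle2 or Mercurial today.

```python
import io
import struct

# Hypothetical frame types; none of these exist in bundle2 today.
FRAME_PART = 0    # carries a slice of parts-protocol data (e.g. changegroup)
FRAME_OUTPUT = 1  # side-channel: output meant for the user
FRAME_ERROR = 2   # side-channel: error / interruption

# Assumed header layout: 1-byte frame type, 4-byte big-endian payload length.
HEADER = struct.Struct('>BI')


def write_frame(stream, frame_type, payload):
    """Write one frame (header followed by payload) to the stream."""
    stream.write(HEADER.pack(frame_type, len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield (frame_type, payload) pairs until the stream is exhausted."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return
        if len(header) < HEADER.size:
            raise ValueError('truncated frame header')
        frame_type, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError('truncated frame payload')
        yield frame_type, payload


def demux(stream):
    """Separate parts data from side-channel frames.

    Returns the reassembled parts bytes and a list of output messages.
    The parts consumer never sees the output frames.
    """
    parts = []
    output = []
    for frame_type, payload in read_frames(stream):
        if frame_type == FRAME_PART:
            parts.append(payload)
        elif frame_type == FRAME_OUTPUT:
            output.append(payload)
        elif frame_type == FRAME_ERROR:
            raise RuntimeError(payload.decode('utf-8', 'replace'))
        else:
            raise ValueError('unknown frame type: %d' % frame_type)
    return b''.join(parts), output
```

A sender could then emit a FRAME_OUTPUT frame between any two FRAME_PART
frames without the changegroup producer knowing about it, and the receiver
would hand the reassembled bytes to the existing parts code unchanged.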