Review of bundle2: capabilities exchange

Fri Dec 19 19:29:01 CST 2014

On 12/18/2014 11:43 PM, Gregory Szorc wrote:
> Here are my thoughts on bundle2 capabilities exchange.
>
> b2x: prefix
> -----------
>
> It wasn't entirely obvious to me that the "b2x:" prefix really means
> "available part handler." (Or at least that's what I think it means.)

It does not means what you think it means. the 'b2x:' prefix is here 
because the protocol is experimental and we want to avoid collision in 
case we break backward compatibility.

The content of the capability dict is not strictly aimed at part handler 
either. The goal is to informs about available feature. And it is 
currently maintly used to says "hey I've this new part handler 
available" but there will be case where a capability is just about some 
stream level thing, or a single capability informs about multiple part.

The whole capability dict will probably be cleaned when we drop the 
experimental flag on bundle2.

> I'd really like to see some documentation around what goes in the b2
> capabilities dict. Some more formalization also couldn't hurt. e.g. why
> isn't there a prefix for stream level parameters?

I hope the previous paragraph address this question.

> undefined behavior with unknown part parameters
> -----------------------------------------------
>
> I love how the part type/handlers declare and advertise (via
> capabilities) their parameter names. This ensures peers are in a good
> position to make intelligent choices.

(note that is not strictly what happen, but the result is the same)

> What isn't really well defined is how we're going to version things
> going forward. When a new parameter name is invented, are we going to
> guarantee backwards compatibility for all of eternity like we do now?

Yes, we'll maintain backward compatibility (or at least feature potency 
capability) for ever.

> Or, are we going to freeze each part type and invent a new part with the
> new metadata? Both have advantages and disadvantages. I'm a fan of
> creating new types to represent the different behaviors rather than
> trying to keep tacking things on forever.

Both approach are not incompatible. We can alter existing part when it 
make sense and introduce new one deprecated the other when it is smarter.

> It does result in protocol
> bloat. But it keeps processing semantics well-defined and helps avoid
> ambiguities. I think this would also help with the mandatory/advisory foo.

I do not get this last part.

> mandatory and advisory parameters
> ---------------------------------
>
> Remotes advertise the set of supported parameters on each part type.
> This means that any reasonably behaving client will *never* send a
> payload to the server that the server is incapable of processing. This
> means that mandatory and advisory flags have little to no meaning from
> client -> server.
 >
 > This relegates mandatory and advisory flags to server -> client and "at
 > rest" scenarios.

(so again, the advertising is not strict. This confusion will probably 
keep bluring things for the rest of this email)

Advisory parameters make sens in a lot of case, and handling the 
discovery and conditional handling for all of them would be tedious and 
bloating. Here are quick example of advisory parameters that could be 
blindly added to a part:

  - verbose: "please produce verbose output if possible"
  - lenght of "data": Can be used to produce progress information, but 
non critical if not supported.

> For server -> client, we don't currently advertise which parameters are
> mandatory vs advisory.

The client send its capability when requesting a bundle.

> At least not as far as I can tell: I couldn't
> find a single @parthandler with an upper case parameter name.
Upercase is not involved in the mandatoryness of a parameter.

> For
> extensibility purposes (and this goes back to the previous section), I
> would think that we should have a more well-defined advertisement of
> what exactly the server is expecting with each part. Is a parameter
> mandatory? What's its default value if not?

I think the features + options approach to capability is good enough 
here. And have been mostly good enough for mercurial itself.

> I think it is important for servers to advertise this because it results
> in a better user experience. For example, a client may assume that a
> parameter is advisory and it may leave it out of the generated bundle2
> payload.

We needs to have non-ambiguous discovery and non-buggy client. But his 
can be achieve with the current infrastructure.

> However, the server may logically require the parameter (this
> is slightly different from how mandatory/advisory works today - we would
> throw an exception inside the @parthandler rather than inside the bit
> that parses the part headers).

If the parameter is logically needed, it should be mandatory. If the 
client generate buggy bundle, the user will have a bad experience anyway.

> From the user's perspective, they just
> spent 5 minutes transferring a large payload to the server only to find
> out that a missing parameter was mandatory. That's not cool. Servers
> should advertise exactly what parameters they are expecting.
> Alternatively, we should have well-defined and frozen behaviors for each
> part type and we should version part types when new behavior is wanted.
> Again, see the above section.
>
> lack of forward definition of parts and parameters
> --------------------------------------------------
>
> For at-rest bundles (bundles that are saved and read later - as opposed
> to bundles that are dynamically generated when capabilities are known
> exactly), we don't have the luxury of knowing exactly what the repo
> supports and thus can't avoid the potential of the repo encountering
> "unknown" data.
>
> In this scenario, we could very well have a very large bundle and
> encounter an unknown part at the tail end. This is a bad user experience.
>
> I'd like to see the ability for readers to avoid having to process an
> entire bundle before discovering they are incapable of fully processing
> it. Maybe we need the ability to forward declare parts (and possibly
> their parameters). Maybe we just need a reading mode that does an
> initial pass over the part types and parameters without processing.
>
> This feels like a low priority to me. But I'd like to see it addressed
> because waiting minutes to discover an error is extremely painful.

This could be easily implemented with a part at the beginning of the 
bundle. Such part would list all the mandatory part and parameters in 
the bundle to allow early failure.

> server-required part types
> --------------------------
>
> Similar in vein to my request for servers to advertise which parameters
> are mandatory, I think servers should advertise which part types are
> mandatory. This opens the door to servers requiring clients to meet
> certain minimal requirements before pushing. e.g. servers may want to
> require a "GPG part" with signatures encapsulating the changes they are
> about to make. Or something more realistic, when obsolescence becomes
> more mainstream, we want to try to enforce the requirement that only
> obsolescence aware clients send us data. Having part type presence
> requirements is a way of establishing minimum client requirements very
> early in the push process, before the client has potentially wasted
> minutes sending data to the server.

forward compatibility with future required is hard. However it is easy 
to arrange for the series to issue a special capability saying "I 
request client to send part X" and ensure it is the case at the 
beginning of the stream. Old client would then fail early.

> I know this feature is somewhat against the strong backwards compatible
> nature of Mercurial. But as someone who consistently deals with problems
> reported by people who aren't using a modern and ideal Mercurial
> installation or configuration, I'd really like more levers to enforce
> best practices. (I still want to polish off my "best practices"
> extension so servers can inform clients on how to "comply" with "best
> practices.)

Mercurial is commited to backward compatibility -by-default-. I find it 
perfectly reasonable for an organization to enforce people to upgrade.

-- 
Pierre-Yves David