Review of bundle2: capabilities exchange
Gregory Szorc
gregory.szorc at gmail.com
Fri Dec 19 07:43:47 UTC 2014
Here are my thoughts on bundle2 capabilities exchange.
b2x: prefix
-----------
It wasn't entirely obvious to me that the "b2x:" prefix really means
"available part handler." (Or at least that's what I think it means.)
I'd really like to see some documentation around what goes in the b2
capabilities dict. Some more formalization also couldn't hurt. e.g. why
isn't there a prefix for stream level parameters?
undefined behavior with unknown part parameters
-----------------------------------------------
I love how the part type/handlers declare and advertise (via
capabilities) their parameter names. This ensures peers are in a good
position to make intelligent choices.
What isn't really well defined is how we're going to version things
going forward. When a new parameter name is invented, are we going to
guarantee backwards compatibility for all of eternity like we do now?
Or, are we going to freeze each part type and invent a new part with the
new metadata? Both have advantages and disadvantages. I'm a fan of
creating new types to represent the different behaviors rather than
trying to keep tacking things on forever. It does result in protocol
bloat. But it keeps processing semantics well-defined and helps avoid
ambiguities. I think this would also help with the mandatory/advisory foo.
mandatory and advisory parameters
---------------------------------
Remotes advertise the set of supported parameters on each part type.
This means that any reasonably behaving client will *never* send a
payload to the server that the server is incapable of processing. This
means that mandatory and advisory flags have little to no meaning from
client -> server.
This relegates mandatory and advisory flags to server -> client and "at
rest" scenarios.
For server -> client, we don't currently advertise which parameters are
mandatory vs advisory. At least not as far as I can tell: I couldn't
find a single @parthandler with an upper case parameter name. For
extensibility purposes (and this goes back to the previous section), I
would think that we should have a more well-defined advertisement of
what exactly the server is expecting with each part. Is a parameter
mandatory? What's its default value if not?
I think it is important for servers to advertise this because it results
in a better user experience. For example, a client may assume that a
parameter is advisory and it may leave it out of the generated bundle2
payload. However, the server may logically require the parameter (this
is slightly different from how mandatory/advisory works today - we would
throw an exception inside the @parthandler rather than inside the bit
that parses the part headers). From the user's perspective, they just
spent 5 minutes transferring a large payload to the server only to find
out that a missing parameter was mandatory. That's not cool. Servers
should advertise exactly what parameters they are expecting.
Alternatively, we should have well-defined and frozen behaviors for each
part type and we should version part types when new behavior is wanted.
Again, see the above section.
lack of forward definition of parts and parameters
--------------------------------------------------
For at-rest bundles (bundles that are saved and read later - as opposed
to bundles that are dynamically generated when capabilities are known
exactly), we don't have the luxury of knowing exactly what the repo
supports and thus can't avoid the potential of the repo encountering
"unknown" data.
In this scenario, we could very well have a very large bundle and
encounter an unknown part at the tail end. This is a bad user experience.
I'd like to see the ability for readers to avoid having to process an
entire bundle before discovering they are incapable of fully processing
it. Maybe we need the ability to forward declare parts (and possibly
their parameters). Maybe we just need a reading mode that does an
initial pass over the part types and parameters without processing.
This feels like a low priority to me. But I'd like to see it addressed
because waiting minutes to discover an error is extremely painful.
server-required part types
--------------------------
Similar in vein to my request for servers to advertise which parameters
are mandatory, I think servers should advertise which part types are
mandatory. This opens the door to servers requiring clients to meet
certain minimal requirements before pushing. e.g. servers may want to
require a "GPG part" with signatures encapsulating the changes they are
about to make. Or something more realistic, when obsolescence becomes
more mainstream, we want to try to enforce the requirement that only
obsolescence aware clients send us data. Having part type presence
requirements is a way of establishing minimum client requirements very
early in the push process, before the client has potentially wasted
minutes sending data to the server.
I know this feature is somewhat against the strong backwards compatible
nature of Mercurial. But as someone who consistently deals with problems
reported by people who aren't using a modern and ideal Mercurial
installation or configuration, I'd really like more levers to enforce
best practices. (I still want to polish off my "best practices"
extension so servers can inform clients on how to "comply" with "best
practices.)
More information about the Mercurial-devel
mailing list