This page is primarily intended for Mercurial's developers.
This page is no longer relevant but is kept for historical purposes.
Wire Protocol Unification Plan
This page is mostly of historical interest, as mpm used a slightly different approach to finish this project.
Overview and motivation
This page discusses the low-level parts of the two wire protocols: ssh and http. The high-level parts (command set, arguments etc.) are outside the scope of the unification.
Currently, the two wire protocols are handled almost entirely separately in the code. Each has an adaptation layer consisting of one function per available command, which interacts with the common backend in the appropriate way. The two adaptation layers are vaguely similar, but far from identical. The feature sets of the two underlying protocols also differ. In particular, the ssh protocol is less flexible when it comes to argument handling. To sum up some observations about the current situation (not all of these are necessarily issues we should address):
- The ssh protocol is harder to extend than the http protocol. On the other hand, ssh can provide out-of-band messages (via stderr) and return codes.
- The transfer formats are dissimilar.
- The two adaptation layers carry out a lot of the same work, but share little code.
In particular, the first point above means that we can't easily add optional arguments to the ssh protocol to enable feature extensions (such as light-weight copies), client capabilities and similar, whereas this is not a problem for http.
What we should do about it
We could proceed in a few phases as follows:
- Discuss a common feature set we want (and need) the low-level protocols to support, in order to ensure future flexibility and easy backwards compatibility.
- Discuss how this information should be transferred, and whether we can at this point (without breakage) make the two transfer formats more similar.
- Change the protocols so the common feature set is properly supported.
- Consider how the common feature set can be unified in the code, write the common layer for this and change the current adaptors to use it.
1. Transfer should be reasonably efficient
2. Server should be stateless
3. Variable number of (preferably named) arguments per command
4. Transfer of both text (protocol) and binary (stream/bundle) data
5. A way of transferring server capabilities
6. A way of transferring client capabilities (for lwcopies?)
7. A way of transferring errors out-of-band for the http protocol
We already have features 1, 2, 4 and 5. We still need 3 and 6, and preferably 7.
1. Add varargs bits to ssh
2. Make ssh more like HTTP (with MIME headers)
3. Use a JSON representation for non-binary responses/requests (may or may not be combined with option 2)
The capability/protocol version etc. cycle:
- Client queries server for capabilities, i.e. the server says to the client: you MAY use this feature.
- When executing relevant commands, the client sends options (similar to capabilities) using the normal argument transfer feature. So the client says to the server: you MUST use this feature. Of course the client is only allowed to request features the server is capable of.
The reason for using the regular argument transfer mechanism is to avoid introducing yet another element in the common feature set.
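The cycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the capability names (`lwcopies`, `fancycompress`) and the argument values are hypothetical and are not taken from Mercurial's code. The point is that the client filters the options it would like to use against what the server advertised, then sends the survivors as ordinary named arguments.

```python
def choose_options(server_caps, wanted):
    """Return only the options the server said the client MAY use.

    server_caps: set of capability names advertised by the server.
    wanted: dict mapping option name -> value the client would like to send.
    """
    return {name: value for name, value in wanted.items() if name in server_caps}


def build_call(command, args, server_caps, wanted):
    """Merge the permitted options into the command's regular named arguments."""
    call_args = dict(args)
    call_args.update(choose_options(server_caps, wanted))
    return command, call_args


# Hypothetical capability names, for illustration only.
server_caps = {"lookup", "lwcopies"}
cmd, call_args = build_call(
    "changegroup",
    {"roots": "0" * 40},
    server_caps,
    {"lwcopies": "1", "fancycompress": "1"},
)
# "fancycompress" is dropped because the server never advertised it.
```

Because the options travel as normal arguments, no separate negotiation channel is needed, which is the design choice the paragraph above argues for.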
7: Out-of-band error stream for HTTP
One way to do this is to packetize the data stream, at least the server-to-client one. It seems natural to use HTTP chunked transfer, combined with a small header (one or a few characters, say) for each chunk denoting its kind and purpose (regular output, errors), and some mechanism to hide this mess from the client.
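A minimal sketch of such a packetized stream follows. It is an assumption-laden illustration, not a proposed wire format: the kind markers `o` and `e` are hypothetical, and the sketch carries its own decimal length field so that it is self-contained, whereas with HTTP chunked transfer the chunk length would come from the transport itself. The `demux` function plays the role of the mechanism that hides the framing from the client.

```python
import io

OUTPUT = b"o"  # hypothetical kind marker for regular output
ERROR = b"e"   # hypothetical kind marker for error text


def write_frame(out, kind, payload):
    """Write one frame: kind byte, decimal payload length, newline, payload."""
    out.write(kind + str(len(payload)).encode("ascii") + b"\n" + payload)


def read_frames(inp):
    """Yield (kind, payload) pairs until the stream is exhausted."""
    while True:
        kind = inp.read(1)
        if not kind:
            return
        length = int(inp.readline().decode("ascii"))
        yield kind, inp.read(length)


def demux(inp):
    """Split a framed stream back into regular output and error text."""
    output, errors = [], []
    for kind, payload in read_frames(inp):
        (errors if kind == ERROR else output).append(payload)
    return b"".join(output), b"".join(errors)


# Server side: interleave data and an error message on one stream.
buf = io.BytesIO()
write_frame(buf, OUTPUT, b"abc")
write_frame(buf, ERROR, b"oops")
write_frame(buf, OUTPUT, b"def")

# Client side: recover the two logical streams.
buf.seek(0)
data, err = demux(buf)
```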
A new command format is introduced, signaled by a capability (tentatively called protocol=2). The server can easily tell the new and the old formats apart. The new format works like this:
<command name> <arg count> [flags...] <arg 1 name> <arg 1 length> <arg 1 data.... no newline> <arg 2 name> etc.
This is quite similar to the old format. The flags (space-separated boolean flags) are currently unused for SSH. All commands can use the new format, and the code is structured so that old commands that do not expect variable arguments keep working unchanged, while new optional arguments can easily be added to them. As usual, capabilities are needed to signal the presence of new features, whether these are new commands or new optional arguments.
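To make the format concrete, here is a hedged Python sketch of an encoder and decoder for it. The exact line breaks are an assumption: the sketch puts the command header on one newline-terminated line and each `<name> <length>` pair on its own line, followed by exactly `<length>` bytes of data with no trailing newline. The function names are hypothetical and do not correspond to Mercurial's implementation.

```python
import io


def encode_command(name, args, flags=()):
    """Serialize a command with named arguments.

    args maps argument name (str) to argument data (bytes).
    """
    head = " ".join([name, str(len(args)), *flags])
    parts = [head.encode("ascii") + b"\n"]
    for key, data in args.items():
        parts.append(f"{key} {len(data)}\n".encode("ascii"))
        parts.append(data)  # raw data, no trailing newline
    return b"".join(parts)


def decode_command(stream):
    """Parse one command from a binary stream; returns (name, flags, args)."""
    fields = stream.readline().decode("ascii").split()
    name, count, flags = fields[0], int(fields[1]), fields[2:]
    args = {}
    for _ in range(count):
        key, length = stream.readline().decode("ascii").split()
        args[key] = stream.read(int(length))
    return name, flags, args


wire = encode_command("lookup", {"key": b"tip"})
parsed = decode_command(io.BytesIO(wire))
```

Because every argument carries its own name and length, a server can accept optional arguments it was not originally written for, and argument data may contain newlines or arbitrary binary content.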