Diff for "WireProtocolUnificationPlan"

Differences between revisions 4 and 6 (spanning 2 versions)

Overview and motivation

This page discusses the low-level parts of the two wire protocols: ssh and http. The high-level parts (command set, arguments etc.) are outside the scope of the unification.

Currently, handling the two wire protocols is almost entirely separate in the code; each has an adaption layer consisting of a function for each available command, which interacts with the common backend in the appropriate way. The two adaption layers are vaguely similar, but far from identical. Also, the feature set is different in the two underlying protocols. In particular, the ssh protocol is less flexible when it comes to argument handling. To sum up some observations about the current situation (not all of these are necessarily issues we should address):

The ssh protocol is less flexible to extension than the http protocol. But ssh can provide out-of-band messages (via stderr) and return codes.
The transfer formats are dissimilar.
The two adaption layers carry out a lot of the same work, but have little shared code.

In particular, the first point above means that we can't easily add optional arguments to the ssh protocol to enable feature extensions (such as light-weight copies), client capabilities and similar, whereas this is not a problem for http.

What we should do about it

We could proceed in a few phases as follows:

Discuss a common feature set we want (and need) the low-level protocols to support, in order to ensure future flexibility and easy backwards compatibility.
Discuss how this information should be transferred, and whether we can at this point (without breakage) make the two transfer formats more similar.
Change the protocols so the common feature set is properly supported.
Consider how the common feature set can be unified in the code, write the common layer for this and change the current adaptors to use it.

Requirements

1. Transfer should be reasonably efficient
2. Server should be stateless
3. Variable number of (preferably named) arguments per command
4. Transfer of both text (protocol) and binary (stream/bundle) data
5. A way of transferring server capabilities
6. A way of transferring client capabilities (for lwcopies?)
7. A way of transferring error out-of-band for the http protocol

Current features

We already have 1, 2, 4 and 5. Need 3 and 6, and preferably 7.

Possibilities

1. Add varargs bits to ssh
2. Make ssh more like HTTP (with MIME-headers)
3. Use JSON representation for non-binary responses/requests (may or may not be combined with 2)
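
As a rough illustration of the JSON possibility, the sketch below encodes a command and a variable number of named arguments as a single JSON line. The message layout (the `cmd` and `args` keys, the newline terminator) and the function names are assumptions made for this example and are not part of either existing protocol. Binary payloads such as bundles would still have to travel outside the JSON text.

```python
import json


def encode_command(name, **args):
    """Serialize a command name and its named arguments as one JSON line.

    The {"cmd": ..., "args": {...}} layout is hypothetical.
    """
    message = {"cmd": name, "args": args}
    return json.dumps(message, sort_keys=True).encode("utf-8") + b"\n"


def decode_command(line):
    """Parse one JSON line back into (command name, argument dict)."""
    message = json.loads(line.decode("utf-8"))
    return message["cmd"], message.get("args", {})
```

Because the arguments form a JSON object, the server can discover their number and names without a fixed per-command signature, which is what requirement 3 asks for.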

5-6: Capabilities

The capability/protocol version etc. cycle:

1. Client queries server for capabilities, i.e. the server says to the client: you MAY use this feature.
2. When executing relevant commands, the client sends options (similar to capabilities) using the normal argument transfer feature. So the client says to the server: you MUST use this feature. Of course the client is only allowed to request features the server is capable of.

The reason for using the regular argument transfer mechanism is to avoid introducing yet another element in the common feature set.
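
The cycle above could be sketched roughly as follows. Everything named here is hypothetical: the capability names, the `options` argument, and the helper functions are placeholders chosen for illustration, assuming options travel as one ordinary named argument holding a space-separated list.

```python
# Hypothetical capability names; a real server would define its own.
SERVER_CAPABILITIES = frozenset({"lookup", "unbundle", "lwcopy"})


class UnsupportedOption(Exception):
    """Raised when a client requests an option the server never advertised."""


def advertise():
    """Step 1 (server): list the features the client MAY use."""
    return " ".join(sorted(SERVER_CAPABILITIES))


def build_request(command, wanted, advertised, **args):
    """Step 2 (client): attach options as a regular named argument.

    Only options present in the advertised list are requested.
    """
    available = set(advertised.split())
    chosen = sorted(name for name in wanted if name in available)
    request = dict(args)
    if chosen:
        request["options"] = " ".join(chosen)
    return command, request


def check_request(args):
    """Step 2 (server): requested options are mandatory, so reject unknown ones."""
    requested = set(args.get("options", "").split())
    unsupported = requested - SERVER_CAPABILITIES
    if unsupported:
        raise UnsupportedOption(", ".join(sorted(unsupported)))
    return requested
```

The server keeps no per-client state in this sketch: each request carries its own options, which fits the requirement that the server be stateless.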

7: Out-of-band error stream for HTTP

One way to do this is to packetize the data stream, at least the server-to-client one. It seems natural to use HTTP chunked transfer, combined with a small header (one or a few characters, say) for each chunk denoting its kind and purpose (regular output, errors), and some mechanism to hide this mess from the client.

- Revision 4 as of 2010-02-09 15:50:44
  Size: 2784
  Editor: tonfa
  Comment: ssh advantages
+ Revision 6 as of 2010-02-10 09:24:45
  Size: 3589
  Editor: cyanite
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 20:
-== Feature set ==
+== Requirements ==
 Line 22:
-=== Current common feature set ===
Commands are sent from the client (stateful) to the server (stateless), and consist of a command name and a number of arguments in a key-value format. The number of arguments for each command is fixed due to limitations in the ssh protocol.
+ 1. Transfer should be reasonably efficient
 2. Server should be stateless
 3. Variable number of (preferably named) arguments per command
 4. Transfer of both text (protocol) and binary (stream/bundle) data
 5. A way of transferring server capabilities
 6. A way of transferring client capabilities (for lwcopies?)
 7. A way of transferring error out-of-band for the http protocol

== Current features ==

We already have 1, 2, 4 and 5. Need 3 and 6, and preferably 7.
-Line 26:
+Line 35:
-The easiest is to define the feature set as: We need to be able to send commands (text string) with a dynamic number of named arguments containing primarily text, but potentially arbitrary, data. The number and names of the arguments should be discoverable at the server side.
-Line 28:
+Line 36:
-This only requires a few changes, mainly to ssh. Alternatively, we could move to a JSON (or JSON-based) approach: Each command has a single argument (possibly empty), which consists of a JSON expression.
+ 1. Add varargs bits to ssh
 2. Make ssh more like HTTP (with MIME-headers)
 3. Use JSON representation for non-binary responses/requests (may or may not be combined with 2)
-Line 30:
+Line 40:
-Both alternatives are equally flexible as I see it.
+=== 5-6: Capabilities ===
The capability/protocol version etc. cycle:
 1. Client queries server for capabilities, i.e. the server says to the client: you MAY use this feature.
 2. When executing relevant commands, the client sends options (similar to capabilities) using the normal argument transfer feature. So the client says to the server: you MUST use this feature. Of course the client is only allowed to request features the server is capable of.

The reason for using the regular argument transfer mechanism is to avoid introducing yet another element in the common feature set.

=== 7: Out-of-band error stream for HTTP ===
One way to do this is to packetize the data stream, at least the server-to-client one. It seems natural to use HTTP chunked transfer, combined with a small header (one or a few characters, say) for each chunk denoting its kind and purpose (regular output, errors), and some mechanism to hide this mess from the client.