RFC: Command server protocol

Matt Mackall mpm at selenic.com
Mon Jun 13 15:36:16 CDT 2011


On Mon, 2011-06-13 at 23:05 +0300, Idan Kamara wrote:
> On Mon, Jun 13, 2011 at 7:03 PM, Matt Mackall <mpm at selenic.com> wrote:
> > On Mon, 2011-06-13 at 18:15 +0300, Idan Kamara wrote:
> >> On Mon, Jun 13, 2011 at 1:50 AM, Matt Mackall <mpm at selenic.com> wrote:
> >> > On Sun, 2011-06-12 at 19:15 +0300, Idan Kamara wrote:
> >> >> Here's an overview of the current protocol used by the command server (also
> >> >> available here http://mercurial.selenic.com/wiki/CommandServer). Feedback is
> >> >> appreciated.
> >> >>
> >> >> All communication with the server is done on stdin/stdout. The byte order
> >> >> used by the server is big-endian.
> >> >
> >> > When is big-endian used?
> >>
> >> In the channel header. And all other length fields which are unsigned ints.
> >
> >> >
> >> >> Data sent from the server is channel based, meaning a (channel [character],
> >> >> length [unsigned int]) pair is sent before the actual data. For example:
> >> >>
> >> >> o
> >> >> 1234
> >> >
> >> > Is this '1234' in text or in binary? If it's binary, how many bytes is
> >> > it?
> >>
> >> It's binary, 4 bytes according to
> >> http://docs.python.org/library/struct.html#format-characters
> >
> > Did you clarify this on the wiki?
> 
> Sort of: Data sent from the server is channel based, meaning a
> (channel [character], length [unsigned int]) pair is sent before the
> actual data.
> 
> I'll make sure it's more clear by linking to the python docs.
> 
> >
> >> >
> >> >> <data: 1234 bytes>
> >> >>
> >> >> that is 1234 bytes sent on channel 'o', with the data following.
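
[A minimal Python sketch of reading one framed message, following the
header layout quoted above ('>cI' per the linked struct docs; the
stream name is illustrative):]

```python
import struct

def read_message(stream):
    """Read one (channel, data) pair from the server.

    The header is a channel character followed by a big-endian
    unsigned int length: struct format '>cI', 5 bytes total.
    """
    header = stream.read(5)
    if len(header) < 5:
        raise EOFError('server closed the connection')
    channel, length = struct.unpack('>cI', header)
    return channel, stream.read(length)
```

[For the input channels described below, the length is a request for
input rather than a payload size, so a real client would special-case
those channels instead of reading data here.]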
> >> >>
> >> >> When starting the server, it will send a new-line separated list of
> >> >> capabilities (on the 'o' channel), in this format:
> >> >>
> >> >> capabilities:\n
> >> >> capability1\n
> >> >> capability2\n
> >> >> ...
> >> >
> >> > There should probably be a blank line or something indicating that
> >> > there's no more data arriving?
> >>
> >> It's one string with all the capabilities being sent on the output channel.
> >> So the client sees this as one chunk.
> >
> > Ok.
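
[Since the greeting arrives as one chunk on the 'o' channel, a client
can simply split it on newlines; a sketch assuming the
newline-separated format quoted above (capability names illustrative):]

```python
def parse_capabilities(chunk):
    """Split the server's greeting chunk into capability names.

    `chunk` is the first message received on the 'o' channel,
    e.g. the bytes 'capabilities:', 'getencoding', 'runcommand',
    each followed by a newline.
    """
    lines = chunk.decode('ascii').splitlines()
    if not lines or lines[0] != 'capabilities:':
        raise ValueError('unexpected greeting: %r' % chunk)
    return lines[1:]
```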
> >
> >> >> Channels
> >> >> --------------
> >> >> There are currently 5 channels:
> >> >>
> >> >> * o - Output channel. Most of the communication happens on this channel.
> >> >> When running commands, output Mercurial writes to stdout is written to this
> >> >> channel.
> >> >> * e - Error channel. When running commands, this correlates to stderr.
> >> >> * i - Input channel. The length field here can either be 0, telling the
> >> >> client to send all input, or some positive number telling the client to send
> >> >> at most <length> bytes.
> >> >> * l - Line based input channel. The client should send a single line of
> >> >> input (trimmed if length is not 0). This channel is used when Mercurial
> >> >> interacts with the user or when iterating over stdin.
> >> >
> >> > What should a client do with unexpected channel responses?
> >> >
> >> > For instance, what happens when a progress channel is added? What
> >> > happens if a client gets an unexpected prompt?
> >>
> >> Since progress is considered output, the client needs to consume it
> >> and ignore it if it's of no interest.
> >
> > If a client written today encounters a progress channel tomorrow, how
> > does it know not to abort? It wasn't written to expect that.
> 
> The client can choose what to do when it gets data on an unexpected channel.
> Unless we mess up the initial design, I don't see why ignoring
> unexpected data shouldn't be fine.
> (by ignoring I mean just reading the data and doing nothing with it)

Let's say a client library calls a command that wants input, but it
didn't expect it to. Simply discarding the input request from the server
and waiting for the command to complete won't work, as the command will
never complete. So the above is clearly not always correct. Some
channels are already known to be non-ignorable.
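
[To make that concrete, here is a hypothetical client loop using the
channel names discussed in this thread; the zero-length reply meaning
"no input available" is an assumption, not settled protocol. Unknown
output-style channels are drained and discarded, but a request on an
input channel must be answered, since discarding it leaves the server
blocked forever:]

```python
import struct

def drain_until_result(server_out, server_in):
    """Read messages until the command's result arrives (sketch).

    Output channels we don't care about are read and thrown away,
    but an input request cannot be ignored: the server is waiting,
    so we answer with a zero-length chunk meaning "no input".
    """
    while True:
        channel, length = struct.unpack('>cI', server_out.read(5))
        if channel == b'r':                 # command finished
            return server_out.read(length)
        elif channel in (b'i', b'l'):       # input request: must answer
            server_in.write(struct.pack('>I', 0))
        else:                               # 'o', 'e', future channels
            server_out.read(length)         # consume and ignore
```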

> (side note: I'm not sure progress deserves its own channel; since it's
> written to stderr, it will end up on the 'e'rror channel.)

I've mentioned this channel about 30 times now, and you finally bring
this up?

Can you really not imagine any reasons why a client library would want
progress in its own well-defined channel?

> >
> >> >> Input should be sent on stdin in the following format:
> >> >>
> >> >> length
> >> >> data
> >> >
> >> > The input model is interesting: it basically has the server prompting
> >> > the client for input. That probably makes sense, but we should probably
> >> > be explicit about what's required to avoid deadlock.
> >> >
> >> > For instance, if the server is both consuming input and producing
> >> > output, and the client is simply spooling input (ie a big patch), it
> >> > will eventually write enough data to the client that its write blocks.
> >> >
> >>
> >> Right. But technically if the server writes output while asking for input,
> >> for the client to know it needs to send more input, it will have to
> >> read the output first.
> >
> > You wrote:
> >
> >> >> * i - Input channel. The length field here can either be 0, telling the
> >> >> client to send all input
> >
> > So how does this happen? Does the client simply start writing and write
> > until it's finished? What happens if the client wants to send a 50MB
> > chunk?
> 
> At the moment it will have to send it in one chunk, which is probably bad.
> I think in this case ('i'nput channel, length=0) we might want the server
> to read from the client in a loop until the client says it's finished.
> That way the client can feed it data without the pipe exploding.

Huh? Not sure what you mean by pipes exploding. Pipes don't explode,
they simply block.

If the server sends "please send me all your input" and the client sends
50MB in one chunk, then the server either has to buffer all of it or
risk deadlock.

Deadlock occurs when you have this situation:

- process A is writing a large amount of data to process B
- process B is processing that data and writing results back to A
- pipe from B to A fills up because A is not reading it
- B blocks waiting for room in the B->A pipe
- A keeps writing to A->B pipe, which fills up because B stopped reading
- A blocks waiting for room in A->B pipe

Now we're stuck forever.

This situation is a risk whenever you have two processes communicating
via a pair of pipes. The usual solution is to design one of them to use
either threads or select to make sure that there's always a reader.

But I think the right answer is to simply outlaw the "0" size and
specify a maximum-sized read to buffer (eg 4k). Then the server can
guarantee that it has finished reading the input before it generates
more output, thus freeing up the client to switch back to reading
output.
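
[Under that proposal the client side becomes simple: answer each
bounded input request with at most the requested number of bytes,
length-prefixed, with zero signalling end of input. A sketch; the
exact framing of the reply is an assumption:]

```python
import struct

def answer_input_request(source, server_in, requested):
    """Answer one 'i' channel request of at most `requested` bytes.

    The reply is a big-endian unsigned int length followed by that
    many bytes; length 0 means input is exhausted. Never sending
    more than the server asked for bounds its buffering, so it can
    finish reading before producing more output and the client can
    switch back to reading.
    """
    data = source.read(requested)
    server_in.write(struct.pack('>I', len(data)))
    server_in.write(data)
    return len(data)
```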


> >> > The wiki page has a piece about error codes but it's not quite clear how
> >> > a client distinguishes those from the output stream.
> >>
> >> Yeah. This is a problem if the server sends a \0 as part of its 'regular'
> >> output. The client will be misled into thinking it's the end.
> >
> > And it definitely can.
> >
> >> Maybe we could use another channel here ('a'dmin?) for the server
> >> to tell the client that a command finished and to send its return code.
> >
> > How about 'r'esult. We can use this generically for client command
> > results.
> 
> Sounds good. I've updated the wiki:
> - result channel: The server uses this channel to tell the client that
> a command finished by writing its return code (a signed integer).

Well this should probably be more generic than that. This should be
where the 'result' of a server-level command such as 'runcommand' is
returned, with a format defined per command. So 'runcommand' returns an
exit code, but some other command like 'getencoding' might return a
string.
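
[So decoding the 'r' payload would be per server command; a sketch,
with 'getencoding' standing in as the hypothetical string-returning
command from above:]

```python
import struct

def decode_result(command, payload):
    """Decode an 'r' channel payload according to the server command.

    'runcommand' yields a big-endian signed int exit code; a
    string-returning command would yield text.
    """
    if command == 'runcommand':
        return struct.unpack('>i', payload)[0]
    return payload.decode()
```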

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list