RFC: Command server protocol

Matt Mackall mpm at selenic.com
Mon Jun 13 11:03:51 CDT 2011


On Mon, 2011-06-13 at 18:15 +0300, Idan Kamara wrote:
> On Mon, Jun 13, 2011 at 1:50 AM, Matt Mackall <mpm at selenic.com> wrote:
> > On Sun, 2011-06-12 at 19:15 +0300, Idan Kamara wrote:
> >> Here's an overview of the current protocol used by the command server (also
> >> available here http://mercurial.selenic.com/wiki/CommandServer). Feedback is
> >> appreciated.
> >>
> >> All communication with the server is done on stdin/stdout. The byte order
> >> used by the server is big-endian.
> >
> > When is big-endian used?
> 
> In the channel header. And all other length fields which are unsigned ints.

> >
> >> Data sent from the server is channel based, meaning a (channel [character],
> >> length [unsigned int]) pair is sent before the actual data. For example:
> >>
> >> o
> >> 1234
> >
> > Is this '1234' in text or in binary? If it's binary, how many bytes is
> > it?
> 
> It's binary, 4 bytes according to
> http://docs.python.org/library/struct.html#format-characters

Did you clarify this on the wiki?
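
For concreteness, here is a minimal Python sketch of the framing as described so far. It assumes the header is a 1-byte channel followed by a 4-byte big-endian unsigned int (struct format ">cI"), and that on output channels the length counts the data bytes that follow. The names read_exact and read_frame are illustrative only and are not part of Mercurial. On the input channels the length is a request size rather than a payload size, so this helper only applies to output-style channels.

```python
import io
import struct

# Assumed header layout: 1-byte channel + 4-byte big-endian unsigned length.
HEADER = struct.Struct(">cI")


def read_exact(stream, n):
    """Read exactly n bytes from stream, or raise EOFError."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        buf += chunk
    return buf


def read_frame(stream):
    """Read one (channel, data) pair from an output-style channel."""
    channel, length = HEADER.unpack(read_exact(stream, HEADER.size))
    return channel, read_exact(stream, length)


# Usage: a 5-byte payload on channel 'o'.
frame = HEADER.pack(b"o", 5) + b"hello"
channel, data = read_frame(io.BytesIO(frame))
```

With this layout the header is always 5 bytes, so the '1234' in the example above would appear on the wire as the four bytes 00 00 04 d2.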

> >
> >> <data: 1234 bytes>
> >>
> >> that is 1234 bytes sent on channel 'o', with the data following.
> >>
> >> When starting the server, it will send a new-line separated list of
> >> capabilities (on the 'o' channel), in this format:
> >>
> >> capabilities:\n
> >> capability1\n
> >> capability2\n
> >> ...
> >
> > There should probably be a blank line or something indicating that
> > there's no more data arriving?
> 
> It's one string with all the capabilities being sent on the output channel.
> So the client sees this as one chunk.

Ok.

> >> Channels
> >> --------------
> >> There are currently 5 channels:
> >>
> >> * o - Output channel. Most of the communication happens on this channel.
> >> When running commands, output Mercurial writes to stdout is written to this
> >> channel.
> >> * e - Error channel. When running commands, this correlates to stderr.
> >> * i - Input channel. The length field here can either be 0, telling the
> >> client to send all input, or some positive number telling the client to send
> >> at most <length> bytes.
> >> * l - Line based input channel. The client should send a single line of
> >> input (trimmed if length is not 0). This channel is used when Mercurial
> >> interacts with the user or when iterating over stdin.
> >
> > What should a client do with unexpected channel responses?
> >
> > For instance, what happens when a progress channel is added? What
> > happens if a client gets an unexpected prompt?
> 
> Since progress is considered output, it needs to consume it and ignore it
> if it's of no interest to him.

If a client written today encounters a progress channel tomorrow, how
does it know not to abort? It wasn't written to expect that.
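
To make the question concrete, here is a sketch of one possible rule a client could follow. The rule itself is purely an assumption for illustration and has not been agreed anywhere in this thread: unknown lowercase channels are treated as optional output whose data can be skipped, and anything else unknown is treated as fatal. The function name collect_output and the 'p' channel are hypothetical.

```python
import io
import struct

HEADER = struct.Struct(">cI")  # assumed: channel byte + big-endian length
KNOWN_OUTPUT = {b"o", b"e"}
KNOWN_INPUT = {b"i", b"l"}


def collect_output(stream):
    """Gather data from known output channels until the stream ends."""
    out = []
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return b"".join(out)
        channel, length = HEADER.unpack(header)
        if channel in KNOWN_OUTPUT:
            out.append(stream.read(length))
        elif channel in KNOWN_INPUT:
            # Answering prompts is out of scope for this sketch.
            raise NotImplementedError("input request on %r" % channel)
        elif channel.islower():
            # Hypothetical rule: unknown lowercase channel is optional,
            # so consume its data and carry on.
            stream.read(length)
        else:
            # Hypothetical rule: any other unknown channel is required.
            raise ValueError("unexpected required channel %r" % channel)


def frame(channel, data):
    return HEADER.pack(channel, len(data)) + data


# A made-up 'p' (progress) channel appears between two output frames.
wire = frame(b"o", b"ab") + frame(b"p", b"50%") + frame(b"o", b"cd")
result = collect_output(io.BytesIO(wire))
```

Without some rule of this kind written into the spec, a client has no way to tell whether an unknown channel carries data it may skip or a prompt it must answer.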

> >> Input should be sent on stdin in the following format:
> >>
> >> length
> >> data
> >
> > The input model is interesting: it basically has the server prompting
> > the client for input. That probably makes sense, but we should probably
> > be explicit about what's required to avoid deadlock.
> >
> > For instance, if the server is both consuming input and producing
> > output, and the client is simply spooling input (ie a big patch), it
> > will eventually write enough data to the client that its write blocks.
> >
> 
> Right. But technically if the server writes output while asking for input,
> for the client to know it needs to send more input, it will have to
> read the output first.

You wrote:

> >> * i - Input channel. The length field here can either be 0, telling the
> >> client to send all input

So how does this happen? Does the client simply start writing and write
until it's finished? What happens if the client wants to send a 50MB
chunk?
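
For comparison, here is a sketch of the bounded case, where the server asks for at most <length> bytes per request and the client answers each request with one length-prefixed block. It assumes the reply is a 4-byte big-endian unsigned length followed by the data, and (as a further assumption not stated in the proposal) that a zero-length reply signals end of input. The name answer_input_request is illustrative. The length-0 "send all input" case is deliberately not covered, since its behavior is exactly what is unclear.

```python
import io
import struct


def answer_input_request(source, out, max_len):
    """Reply to one 'i' request with at most max_len bytes from source.

    Writes a big-endian unsigned length followed by the data and
    returns the number of data bytes sent (0 once source is exhausted).
    """
    chunk = source.read(max_len)
    out.write(struct.pack(">I", len(chunk)) + chunk)
    return len(chunk)


# Usage: a 10-byte input served in response to four requests of 4 bytes.
source = io.BytesIO(b"0123456789")
wire = io.BytesIO()
sizes = [answer_input_request(source, wire, 4) for _ in range(4)]
```

With bounded requests a 50MB input goes out in as many blocks as the server asks for, and the client never writes more than the server has said it is ready to read.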

> > The wiki page has a piece about error codes but it's not quite clear how
> > a client distinguishes those from the output stream.
> 
> Yeah. This is a problem if the server sends a \0 as part of its 'regular'
> output. The client will be misled thinking it's the end.

And it definitely can.

> Maybe we could use another channel here ('a'dmin?) for the server
> to tell the client that a command finished and to send its return code.

How about 'r'esult? We can use this generically for client command
results.
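
A sketch of how a client might consume such a channel. The payload encoding is an assumption made for illustration: a single 4-byte big-endian signed int holding the return code. The name parse_result is hypothetical.

```python
import struct


def parse_result(channel, data):
    """Return the command's return code for an 'r' frame, else None."""
    if channel != b"r":
        return None
    # Assumed payload: one big-endian signed 32-bit integer.
    (code,) = struct.unpack(">i", data)
    return code


# A NUL byte in regular output is just data on 'o', not an end marker.
not_finished = parse_result(b"o", b"\x00")
finished = parse_result(b"r", struct.pack(">i", 1))
```

Because the return code travels on its own channel, a \0 in the 'o' stream can no longer be mistaken for the end of the command.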

-- 
Mathematics is the supreme nostalgia of our time.



