cmdserver protocol questions

Thu Jun 30 11:17:05 CDT 2011

2011/6/30 Martin Geisler <mg at aragost.com>
>
> Hi guys,
>
> Jan is writing a Java library for the cmdserver and he had a number of
> questions for me about the protocol. I've tried to sum them up below:
>
> * Right now, you must always run the cmdserver in an existing repository
>  even if you want to run a commands.norepo command like 'hg init'.
>
>  Perhaps we should have both 'hg serve' that does what it used to do
>  and a new 'hg cmdserver' command -- the latter should be in
>  commands.optionalrepo whereas the first cannot be there.

I'm not sure it's worth the extra command.

>
> * Parsing of server's hello block: it would have been nicer if this had
>  been a standard format such as JSON -- now everyone has to implement
>  the parsing by themselves.

I don't know if JSON or anything else is really needed here.
The format is pretty simple, and I don't expect it to really go through
major changes in the future either.

>
>  This is of course not difficult in itself, but now implementors must
>  decide if whitespace should be trimmed from each line?

What whitespace?

>  It is also not clearly specified what a "field" is -- is it something
>  that matches "[a-z]+:" or can there be other characters in a field? The
>  safe choice
>  must be to split (once!) on ':', but will all implementations do this?
>

I guess "[a-z]+:" can be the field name pattern (maybe allowing digits
too), with a space after the ':'.

But for clients I think the best approach would be to parse the hello
message according to the protocol docs, and ignore any fields they don't
know.
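
A minimal sketch of that approach in Python, assuming the hello payload
is a newline-separated list of "name: value" lines in ASCII. The names
parse_hello and KNOWN_FIELDS are hypothetical, and the "future-field"
line is made up purely to show an unknown field being ignored:

```python
# Hypothetical client-side helper, not part of Mercurial itself.
KNOWN_FIELDS = ("capabilities", "encoding")

def parse_hello(payload):
    """Parse a hello payload (bytes) into a dict of known fields."""
    fields = {}
    for line in payload.decode("ascii").split("\n"):
        # Split once on the first ': ' so values may contain colons.
        name, sep, value = line.partition(": ")
        if not sep:
            continue  # blank or malformed line, skip it
        if name in KNOWN_FIELDS:
            fields[name] = value
        # Unknown fields are silently ignored.
    return fields

hello = parse_hello(
    b"capabilities: getencoding runcommand\n"
    b"encoding: UTF-8\n"
    b"future-field: a: b"
)
capabilities = hello["capabilities"].split()
```

Splitting only once keeps the parser safe if a value ever contains a
':' itself.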

>  Field names will presumably always be in ASCII?

Yes, everything related to the protocol should be.

>
> * Termination of server's hello block: how many 'o' channel writes can a
>  client expect? The example shows several writes on the 'o' channel,
>  but in practice it seems that there is only one and I think this was
>  also mentioned somewhere on this mailinglist?

It's a single chunk on the 'o' channel.
It says so at the top of the wiki page, but I forgot to update the
example, thanks.
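
So a client can read the hello message with one chunk read. A rough
sketch in Python, assuming the framing described on the wiki page (a
one-byte channel identifier followed by a big-endian unsigned 32-bit
length, then that many bytes of data). read_chunk is a hypothetical
helper name, and the BytesIO object stands in for the server's stdout:

```python
import io
import struct

def read_chunk(stream):
    """Read one output chunk: 1-byte channel, 4-byte length, data."""
    header = stream.read(5)
    if len(header) != 5:
        raise EOFError("server closed the connection")
    # '>cI' is a single byte plus a big-endian 32-bit unsigned int.
    channel, length = struct.unpack(">cI", header)
    return channel, stream.read(length)

# Simulated server output, for illustration only.
payload = b"capabilities: runcommand\nencoding: UTF-8"
fake_stdout = io.BytesIO(struct.pack(">cI", b"o", len(payload)) + payload)

channel, data = read_chunk(fake_stdout)
```

This helper only covers output-style channels; on the 'I' and 'L'
channels the length is a request size and no data follows it.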

>
> * Input/line channels: what is the precise difference between the two
>  and why do we need both?

When a client sees 'I' it should return raw data (up to length), and
when it sees 'L' it should return a single line of data (also up to length).
They're identical in behavior to Python's read/readline on file-like
objects.

The 'L' channel lets the server ask the client for a prompt reply; it's
used in places such as ui.prompt(), 'for line in ui.fin', etc.

Also, using the 'I' channel alone would introduce some complexity on the
server side: it would need to split on newlines and buffer the remainder
for subsequent reads.
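
On the client side the two channels map directly onto read/readline. A
small sketch in Python, assuming the client replies with a big-endian
32-bit length followed by the data (as the protocol page describes);
answer_input_request is a hypothetical name and the BytesIO object
stands in for whatever the client reads its input from:

```python
import io
import struct

def answer_input_request(channel, length, source):
    """Build the reply to an 'I' or 'L' request of at most `length` bytes."""
    if channel == b"I":
        data = source.read(length)       # raw data, up to length
    elif channel == b"L":
        data = source.readline(length)   # one line, up to length
    else:
        raise ValueError("not an input channel: %r" % (channel,))
    # Reply framing (assumed): length prefix, then the data itself.
    return struct.pack(">I", len(data)) + data

source = io.BytesIO(b"yes\nremaining raw bytes")
line_reply = answer_input_request(b"L", 4096, source)
raw_reply = answer_input_request(b"I", 4096, source)
```

With only 'I', the server would get "yes\nremaining raw bytes" in one
go and have to do the line splitting and buffering itself.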

>
>  It seems to me that the 'I' channel would be enough, also for reading
>  patches from stdin. Since the 'I' channel is not line-oriented, it
>  does not have the 4096 byte line length maximum.

The 4096 length is just a way for the server to limit the amount of data
the client sends in one go, giving it a chance to read from the server to
avoid a deadlock (see the protocol RFC discussion where Matt explained why
this is needed).
So it's not specific to the line channel, and there can be lines longer
than 4096 bytes.
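
To illustrate with plain Python file objects (a sketch, not the server
code): a line longer than the limit simply comes back across several
readline calls. answer_line_requests is a hypothetical name, and 4096
is the per-request limit mentioned above:

```python
import io

def answer_line_requests(source, limit=4096):
    """Yield what successive 'L' requests of `limit` bytes would return."""
    while True:
        chunk = source.readline(limit)
        if not chunk:
            break  # end of input
        yield chunk

# One 5001-byte line (5000 'x' characters plus the newline).
long_line = b"x" * 5000 + b"\n"
chunks = list(answer_line_requests(io.BytesIO(long_line)))
sizes = [len(c) for c in chunks]
```

The first reply carries 4096 bytes without a newline, and the next
request picks up the rest of the same line.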

>
> --
> Martin Geisler
>
> aragost Trifork
> Professional Mercurial support
> http://mercurial.aragost.com/kick-start/