Command Server Facts: `hg log` might write csets over multiple writes on 'o' channel

Wed Aug 21 08:02:26 CDT 2013

Hi Iulian,
hello dear mercurial-devel,

I am writing this message to put a bunch of things together in one place,
so that I can easily refer to it in the forthcoming GSoC meetings w/ Iulian.
Apologies for using your mailbox as my personal cloud storage.

We recently established the following fact:

--------------------------------------------------------------------
When issuing the `hg log` command over the Mercurial command server,
it has to be assumed that:

(1) The data describing a single revision COULD be split
    into several 'o' channel writes.
(2) A single 'o' channel write COULD contain data referring
    to more than one revision.

These assumptions have to be maintained even if all evidence shows
a stronger relationship between log revisions and 'o' channel writes
(e.g. a one-to-one correspondence).
The bottom line is that, in the absence of a specification that requires
the command server to bind log revisions to 'o' channel writes,
(1) and (2) are the best we can say.
--------------------------------------------------------------------
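
For a client, (1) and (2) mean that revisions have to be reassembled from
the byte stream rather than read off one write at a time. Below is a minimal
sketch, assuming the client picks a log template that ends every revision
with a NUL byte. The delimiter and the name `iter_revisions` are my own
choices for illustration, not something the command server provides.

```python
def iter_revisions(chunks, sep=b"\0"):
    """Yield complete records from an iterable of 'o' channel payloads.

    `chunks` is any iterable of bytes objects, one per 'o' channel write.
    `sep` is a hypothetical terminator that the log template is assumed
    to emit after every revision.
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        # One write may complete zero, one, or several revisions.
        *complete, buf = buf.split(sep)
        for record in complete:
            yield record
    if buf:
        # Trailing data without a terminator: surface it rather than drop it.
        yield buf


# Three writes carrying three revisions, with boundaries that do not line up.
writes = [b"rev0\0re", b"v1\0rev2", b"\0"]
revisions = list(iter_revisions(writes))
```

The first write ends in the middle of a revision (case 1) and the second
one touches two revisions (case 2). The parser gives the same result
however the bytes are cut into writes.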

I will now provide links to the various contributions that led
to the above statement.

ggherdov on Aug 19th[1]:
::::
:::: You are assuming that a log revision
:::: cannot be split across two iteration
:::: of output on the 'o' channel.
::::
:::: Is this assumption safe?

idank on Aug 19th[2]:
::::
:::: No.

iulians on Aug 19th[3]:
::::
:::: You want to say that the server will give a revision in more "shots"?
:::: From my test I saw that the cmdserver will send a log entry in a single call
:::: (when I am using a template) even if that revision have 20 kB.
:::: The header from cmdserver is something like
::::
:::: channel = 'o'
:::: length = 20000 data = [...]

idank on Aug 19th[4]:
::::
:::: The command server is oblivious to the data it writes.
:::: It simply forwards whatever hg normally writes to stdout,
:::: but in the command protocol.
:::: So nothing guarantees what you're seeing,
:::: even though that's what happens in practice.
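
To make the header Iulian mentions concrete, here is a small sketch of
reading such frames. It assumes the framing is a one-byte channel identifier
followed by a four-byte big-endian unsigned length and then the payload; the
helper names `read_frames` and `frame` are mine, and the sketch only covers
output channels such as 'o'. The length field describes one write by the
server and says nothing about where a revision starts or ends.

```python
import io
import struct

# Assumed framing: 1-byte channel, 4-byte big-endian unsigned length.
HEADER = struct.Struct(">cI")


def read_frames(stream):
    """Yield (channel, payload) pairs until the stream is exhausted."""
    while True:
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            return
        channel, length = HEADER.unpack(header)
        yield channel, stream.read(length)


def frame(channel, data):
    """Build one frame; used here only to fake server output."""
    return HEADER.pack(channel, len(data)) + data


# Fake server output: one log entry cut across two 'o' writes.
fake = io.BytesIO(
    frame(b"o", b"changeset: 1\nsum") + frame(b"o", b"mary: x\n\n")
)
frames = list(read_frames(fake))
output = b"".join(payload for channel, payload in frames if channel == b"o")
```

Only the concatenated `output` is meaningful to a log parser; the two
payloads taken separately are not.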

mg on Aug 19th[5]:
::::
:::: That is, 'hg log' output for a single revision
:::: may be broken up into several writes on the 'o' channel.
::::
:::: [...]
:::: The point is that this is not guaranteed by the command server.
::::
:::: Think of the command server as you would think of a
:::: normal child process that you've started using, say,
:::: the subprocess in Python. When you run
::::
::::      proc = subprocess.Popen(['hg', 'log'], stdout=subprocess.PIPE)
::::
:::: then you cannot expect that
::::
::::      proc.stdout.read(1024)
::::
:::: will give you data back in chunks that correspond to the
:::: stdout.write calls done in 'hg log' -- the boundaries between
:::: the write calls disappear.
:::: This is the same situation as when working with TCP sockets:
:::: a client cannot "see" the chunks the server used when writing
:::: the reply to the socket. All the client sees is a stream of data.

mg on Aug 20th[6]:
::::
:::: Right now I agree that the output is written to stdout
:::: on a per changeset basis.
:::: But a future change to Mercurial could make that stop
:::: (imagine that it is faster for Mercurial to write the
:::: data for 10 changesets at a time).

[1] http://markmail.org/message/wkpl2vvwrpaniab4
[2] http://markmail.org/message/au2tfw2okw2znnub
[3] http://markmail.org/message/7s3dptaulr2js57e
[4] http://markmail.org/message/2mpqsvg7idmq6j24
[5] http://markmail.org/message/ckrhparroxvpibym
[6] http://markmail.org/message/mlilyf7urn6cw7cu

Cheers,
GGhh