[PATCH 2 of 3 side-word (+4)] sshpeer: use a 'bufferedinputpipe' for standard output of the ssh process

Pierre-Yves David pierre-yves.david at ens-lyon.org
Sun May 31 14:51:32 CDT 2015



On 05/31/2015 07:10 AM, Yuya Nishihara wrote:
> On Sun, 31 May 2015 00:22:33 -0700, Pierre-Yves David wrote:
>> # HG changeset patch
>> # User Pierre-Yves David <pierre-yves.david at fb.com>
>> # Date 1433055636 25200
>> #      Sun May 31 00:00:36 2015 -0700
>> # Node ID 72344e96acd26c68121d747aeb77fb8e1f9a75e8
>> # Parent  1a883412355af1e057324b4ab2de885473f54432
>> sshpeer: use a 'bufferedinputpipe' for standard output of the ssh process
>>
>> We need this pipe to still be buffered when will switch to unbuffered pipe.
>> (switch motivated by the need of using polling to restore real time output from
>> ssh server). This is the only pipe that needs to be wrapped because this is the
>> one who do extensive usage of 'readline'. The stderr pipe of the process is
>> alway read in non blocking raw chunk, so it won't benefit from the
>> buffering.
>>
>> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
>> --- a/mercurial/sshpeer.py
>> +++ b/mercurial/sshpeer.py
>> @@ -87,10 +87,12 @@ class sshpeer(wireproto.wirepeer):
>>
>>           # while self.subprocess isn't used, having it allows the subprocess to
>>           # to clean up correctly later
>>           self.pipeo, self.pipei, self.pipee, self.subprocess = util.popen4(cmd)
>>
>> +        self.pipei = util.bufferedinputpipe(self.pipei)
>
> Can't we do dup() + fdopen() or PyFile_SetBufSize() to make only pipei buffered?
>
> I don't understand what's going to happen with select(), so maybe I miss the
> point of new "hasbuffer" property.

Not we cannot, we need to have access to the buffer state to do our 
polling IO.

The polling IO are used to wait on both stdout and stderr. This allow to 
detect and process server output (stderr) while waiting on protocol data 
(stdout)

The issue with the buffer+select is that we want to be able to perform 
such operation:

  0) ssh: write "datadata"
  1) hg:  select() -> hey! data to read
  2) hg:  read(4) -> "data"
  3) hg:  select() -> hey! data to read
  4) hg:  read(4) -> "data"

If you add the buffering, the 2) will actually read all available data 
in the buffer the second half. (3) will not see any data to read and 
block for ever (4) will never run (5) the internet kitten will never 
have his bowl filled.

You can see the select usage here: 
http://42.netv6.net/marmoute-wip/mercurial/rev/b9213af62475

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list