[PATCH 2 of 3 side-word (+4)] sshpeer: use a 'bufferedinputpipe' for standard output of the ssh process

Sun May 31 18:01:55 CDT 2015

On Sun, 31 May 2015 12:51:32 -0700, Pierre-Yves David wrote:
> On 05/31/2015 07:10 AM, Yuya Nishihara wrote:
> > On Sun, 31 May 2015 00:22:33 -0700, Pierre-Yves David wrote:
> >> # HG changeset patch
> >> # User Pierre-Yves David <pierre-yves.david at fb.com>
> >> # Date 1433055636 25200
> >> #      Sun May 31 00:00:36 2015 -0700
> >> # Node ID 72344e96acd26c68121d747aeb77fb8e1f9a75e8
> >> # Parent  1a883412355af1e057324b4ab2de885473f54432
> >> sshpeer: use a 'bufferedinputpipe' for standard output of the ssh process
> >>
> >> We need this pipe to still be buffered when will switch to unbuffered pipe.
> >> (switch motivated by the need of using polling to restore real time output from
> >> ssh server). This is the only pipe that needs to be wrapped because this is the
> >> one who do extensive usage of 'readline'. The stderr pipe of the process is
> >> alway read in non blocking raw chunk, so it won't benefit from the
> >> buffering.
> >>
> >> diff --git a/mercurial/sshpeer.py b/mercurial/sshpeer.py
> >> --- a/mercurial/sshpeer.py
> >> +++ b/mercurial/sshpeer.py
> >> @@ -87,10 +87,12 @@ class sshpeer(wireproto.wirepeer):
> >>
> >>           # while self.subprocess isn't used, having it allows the subprocess to
> >>           # to clean up correctly later
> >>           self.pipeo, self.pipei, self.pipee, self.subprocess = util.popen4(cmd)
> >>
> >> +        self.pipei = util.bufferedinputpipe(self.pipei)
> >
> > Can't we do dup() + fdopen() or PyFile_SetBufSize() to make only pipei buffered?
> >
> > I don't understand what's going to happen with select(), so maybe I miss the
> > point of new "hasbuffer" property.
> 
> Not we cannot, we need to have access to the buffer state to do our 
> polling IO.
> 
> The polling IO are used to wait on both stdout and stderr. This allow to 
> detect and process server output (stderr) while waiting on protocol data 
> (stdout)
> 
> The issue with the buffer+select is that we want to be able to perform 
> such operation:
> 
>   0) ssh: write "datadata"
>   1) hg:  select() -> hey! data to read
>   2) hg:  read(4) -> "data"
>   3) hg:  select() -> hey! data to read
>   4) hg:  read(4) -> "data"
> 
> If you add the buffering, the 2) will actually read all available data 
> in the buffer the second half. (3) will not see any data to read and 
> block for ever (4) will never run (5) the internet kitten will never 
> have his bowl filled.
> 
> You can see the select usage here: 
> http://42.netv6.net/marmoute-wip/mercurial/rev/b9213af62475

Thanks, I got it. We want to provide synchronous file-like interface for the
main stream, but still want to do asynchronous read for err stream as possible.