[PATCH STABLE] changegroup: retry read() when partial data returned

Mon Mar 27 11:59:27 EDT 2017

On Mon, 27 Mar 2017 08:25:41 -0700, Gregory Szorc wrote:
> > On Mar 27, 2017, at 08:13, Yuya Nishihara <yuya at tcha.org> wrote:
> >> On Sun, 26 Mar 2017 11:56:59 -0700, Gregory Szorc wrote:
> >> # HG changeset patch
> >> # User Gregory Szorc <gregory.szorc at gmail.com>
> >> # Date 1490554582 25200
> >> #      Sun Mar 26 11:56:22 2017 -0700
> >> # Branch stable
> >> # Node ID 414128ddd876eb33f70b0dd95d110e29a9308c93
> >> # Parent  ed5b25874d998ababb181a939dd37a16ea644435
> >> changegroup: retry read() when partial data returned

[...]

> >> def readexactly(stream, n):
> >>     '''read n bytes from stream.read and abort if less was available'''
> >> -    s = stream.read(n)
> >> -    if len(s) < n:
> >> +    # Most of the time, stream.read() returns exactly the number of bytes
> >> +    # requested. Various Python APIs will perform multiple system calls
> >> +    # to retrieve more bytes if the first call did not return enough. This
> >> +    # includes cases where the system call is interrupted.
> >> +    #
> >> +    # Even with this retry logic in the Python APIs, we still see failure
> >> +    # to retrieve the requested number of bytes. So, we build in our own
> >> +    # retry layer here.
> >> +    left = n
> >> +    chunk = stream.read(n)
> >> +    left -= len(chunk)
> >> +    assert left >= 0
> >> +
> >> +    # Common case is it just works.
> >> +    if not left:
> >> +        return chunk
> >> +
> >> +    chunks = [chunk]
> >> +    # Retry to get remaining data. In cases where the stream has errored or
> >> +    # is at EOF, this will do a bit of redundant work. But it helps prevent
> >> +    # intermittent failures and isn't common. So the overhead is acceptable.
> >> +    for i in range(3):
> >> +        chunk = stream.read(left)
> >> +        chunks.append(chunk)
> >> +        left -= len(chunk)
> >> +        assert left >= 0
> >> +
> >> +        if not left:
> >> +            break
> > 
> > Do you have some number that supports this change actually mitigates the
> > problem?
> 
> No. I could try to get that this week.

Many thanks.

> > Suppose underlying functions have bugs, I don't think we can always get
> > contiguous chunks with no data loss. If a few bytes lost for example, we
> > could read arbitrary part as a length field, and call readexactly() with
> > that invalid length value.
> 
> Hmm. My assumption is that any error that would result in data loss effectively poisons the stream and prevents future reads.

I was thinking e.g. some intermediate layer might not handle EINTR
appropriately and EINTR might be caught by some upper layer. If it had
a temporary buffer, its content would be lost.