Adding HTTP headers to HTTP protocol responses

Mike Hommey mh at glandium.org
Fri May 6 19:11:33 EDT 2016


On Fri, May 06, 2016 at 10:46:52AM -0700, Gregory Szorc wrote:
> On Fri, May 6, 2016 at 5:46 AM, Mike Hommey <mh at glandium.org> wrote:
> 
> > Hi,
> >
> > I've been looking at the wire protocol recently and found two things,
> > one that is wrong and one that is suboptimal, in the HTTP protocol.
> >
> > The first is that the HTTP client sends a Vary header when it uses HTTP
> > headers to convey the protocol parameters. But Vary is not an HTTP
> > request header. It is a *response* header.
> > [ https://tools.ietf.org/html/rfc2616#section-14.44 ]
> >
> 
> Wow. Even I missed that one when I was looking at Mercurial's HTTP protocol
> a few months ago.
> 
> 
> >
> > Whether it's worth setting is questionable. Arguably, it should be
> > Vary: *. Or even Cache-control: no-cache.
> > [ https://tools.ietf.org/html/rfc2616#section-14.9 ]
> >
> > The second is that the response to a getbundle request is a chunked (per
> > https://tools.ietf.org/html/rfc2616#section-3.6.1 ) raw zlib stream. But
> > it's not marked as such with a Content-Encoding header (with the value
> > "deflate"). The difference that it makes is that a conforming user agent
> > can then see it can decompress the stream on its own after dechunking
> > it. So for instance, opening https://repo/path/?cmd=getbundle with a
> > browser would let you save a raw changegroup1 or bundle2, instead of
> > a raw zlib stream. Or curl --compressed could be used. That would make
> > some kinds of debugging easier.
> >
> 
> I've noticed this before. There are additional problems with the current
> implementation:
> 
> * zlib is happening in Python instead of the underlying HTTP server. This
> almost certainly introduces more overhead and contributes to performance
> loss
> * We can't efficiently send streaming/uncompressed bundles with that wire
> protocol command
> * Repositories using alternate storage compression (like lz4) use zlib over
> the wire, offsetting performance wins
> 
> 
> > Now, while I'd be interested in fixing those issues, the way the wire
> > protocol interacts with the http server doesn't have much leeway to give
> > HTTP response headers.
> >
> > Ideas?
> >
> 
> I have vague recollections of convincing some subset of {mpm, durin42,
> marmoute} that we need to introduce a new wire protocol command for
> retrieving bundles - one that does Content-Encoding and Transfer-Encoding
> more sanely. In theory you could do this by introducing a server capability
> and having the client pass parameters to the existing getbundle command.
> However, IMO we've shoehorned enough features into getbundle (like bundle2
> and POST parameters): it's time to start with a clean slate. So +1 to
> improving the HTTP protocol.

I'm not convinced. The problem here is entirely a transport one. The
HTTP client is expecting things to be zlib-deflated for getbundle,
changegroup and changegroupsubset. The SSH client expects things
uncompressed, and leaves it to SSH to do any compression. Handling an
HTTP-specific thing with a capability seems wrong. It seems to me this
should be handled entirely in the HTTP layer, not the wire protocol.
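
To make "handled entirely in the HTTP layer" a bit more concrete, here
is a rough sketch of a WSGI wrapper that labels the zlib-compressed
responses without touching the wire protocol commands. It is only an
illustration: mark_deflate and COMPRESSED_CMDS are names made up for
this example, and it assumes the wrapped application already emits a
zlib stream for those commands, which is not necessarily how hgweb is
structured internally.

```python
from urllib.parse import parse_qs

# Hypothetical list: commands whose HTTP response body is a zlib stream.
COMPRESSED_CMDS = {"getbundle", "changegroup", "changegroupsubset"}


def mark_deflate(app):
    """Wrap a WSGI app so zlib-compressed responses are labeled as such."""

    def wrapper(environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        cmd = query.get("cmd", [""])[0]

        def patched_start_response(status, headers, exc_info=None):
            # Only successful responses to the compressed commands are
            # assumed to carry a zlib stream.
            if cmd in COMPRESSED_CMDS and status.startswith("200"):
                names = {name.lower() for name, _ in headers}
                if "content-encoding" not in names:
                    # HTTP's "deflate" content-coding is the zlib format
                    # (RFC 1950), which is what the body already is.
                    headers = list(headers) + [("Content-Encoding", "deflate")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapper
```

With something like this in front of the server, a conforming user
agent could dechunk and then inflate the body on its own, while the
body bytes and the wire protocol commands stay exactly as they are.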

Mike

