Adding HTTP headers to HTTP protocol responses

Gregory Szorc gregory.szorc at gmail.com
Fri May 6 20:14:07 EDT 2016


On Fri, May 6, 2016 at 4:11 PM, Mike Hommey <mh at glandium.org> wrote:

> On Fri, May 06, 2016 at 10:46:52AM -0700, Gregory Szorc wrote:
> > On Fri, May 6, 2016 at 5:46 AM, Mike Hommey <mh at glandium.org> wrote:
> >
> > > Hi,
> > >
> > > I've been looking at the wire protocol recently and found two things,
> > > one that is wrong and one that is suboptimal, in the HTTP protocol.
> > >
> > > The first is that the HTTP client sends a Vary header when it uses HTTP
> > > headers to convey the protocol parameters. But Vary is not an HTTP
> > > request header. It is a *response* header.
> > > [ https://tools.ietf.org/html/rfc2616#section-14.44 ]
> > >
> >
> > Wow. Even I missed that one when I was looking at Mercurial's HTTP
> > protocol a few months ago.
> >
> >
> > >
> > > Whether it's worth setting is questionable. Arguably, it should be
> > > Vary: *. Or even Cache-control: no-cache.
> > > [ https://tools.ietf.org/html/rfc2616#section-14.9 ]
> > >
> > > The second is that the response to a getbundle request is a chunked
> > > (per https://tools.ietf.org/html/rfc2616#section-3.6.1 ) raw zlib
> > > stream. But it's not marked as such with a Content-Encoding header
> > > (with the value "deflate"). The difference that it makes is that a
> > > conforming user agent can then see it can decompress the stream on
> > > its own after dechunking it. So for instance, opening
> > > https://repo/path/?cmd=getbundle with a browser would allow to save
> > > a raw changegroup1 or bundle2, instead of a raw zlib stream. Or
> > > curl --compressed could be used. That would make some kinds of
> > > debugging easier.
> > >
> >
> > I've noticed this before. There are additional problems with the current
> > implementation:
> >
> > * zlib is happening in Python instead of the underlying HTTP server. This
> > almost certainly introduces more overhead and contributes to performance
> > loss
> > * We can't efficiently send streaming/uncompressed bundles with that wire
> > protocol command
> > * Repositories using alternate storage compression (like lz4) use zlib
> > over the wire, offsetting performance wins
> >
> >
> > > Now, while I'd be interested in fixing those issues, the way the wire
> > > protocol interacts with the http server doesn't have much leeway to
> > > give HTTP response headers.
> > >
> > > Ideas?
> > >
> >
> > I have vague recollections of convincing some subset of {mpm, durin42,
> > marmoute} that we need to introduce a new wire protocol command for
> > retrieving bundles - one that does Content-Encoding and Transfer-Encoding
> > more sanely. In theory you could do this by introducing a server
> > capability and having the client pass parameters to the existing
> > getbundle command. However, IMO we've shoehorned enough features into
> > getbundle (like bundle2 and POST parameters): it's time to start with a
> > clean slate. So +1 to improving the HTTP protocol.
>
> I'm not convinced. The problem here is entirely a transport one. The
> HTTP client is expecting things to be zlib-deflated for getbundle,
> changegroup and changegroupsubset. The SSH client expects things
> uncompressed, and leaves it to SSH to do any compression. Handling a
> HTTP-specific thing with a capability seems wrong. It seems to me this
> should be handled entirely in the HTTP layer, not the wire protocol.


That was the point I was trying to make.

That said, we need to consider server operators and clients who may find
themselves behind misbehaving proxies or other network devices that do
silly things like strip or change Content-Encoding. I don't think Mercurial
can completely bury its head in the sand. But I like relying on HTTP for
the compression by default.
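
To make the Content-Encoding point concrete, here is a minimal Python
sketch. It is not Mercurial's code: encode_body and decode_body are
hypothetical helpers, and the headers are modelled as a plain dict. It only
assumes that HTTP's "deflate" content coding denotes the zlib format, which
is what zlib.compress() produces.

```python
import zlib


def encode_body(payload):
    """Hypothetical server side: compress the body and label it."""
    headers = {"Content-Encoding": "deflate"}
    return headers, zlib.compress(payload)


def decode_body(headers, body):
    """Hypothetical client side: decode only what the headers declare."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "deflate":
        return zlib.decompress(body)
    # No declared coding: a generic user agent passes the bytes through.
    return body


payload = b"changegroup data " * 4
headers, body = encode_body(payload)

# With the header, a generic client recovers the original payload.
labelled = decode_body(headers, body)

# Without it (today's behaviour, or a proxy that stripped the header),
# the same client is left holding the raw zlib stream.
unlabelled = decode_body({}, body)
```

The unlabelled case is the one a client would still need a fallback for if
a network device strips the header.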