[PATCH] httprepo: long arguments support (issue2126)
Laurens Holst
laurens.nospam at grauw.nl
Tue Mar 29 08:12:55 CDT 2011
On 29-03-11 13:47, Steven Brown wrote:
> On 29 March 2011 01:59, Augie Fackler<durin42 at gmail.com> wrote:
>> On Mon, Mar 28, 2011 at 12:58 PM, Laurens Holst<laurens.nospam at grauw.nl> wrote:
>>> On 28-3-2011 19:38, Augie Fackler wrote:
>>>> +1, I've suggested this in the past and it sounds reasonable. We
>>>> should also make sure that any response to a request that used headers
>>>> sets appropriate cache-control headers to avoid potential GET caching
>>>> issues.
>>> Actually I think you should use the Vary header for that:
>>>
>>> Vary: X-Hg-Changesets
>>>
>>> Should do the trick.
>> Ah yes. Always forget about that one.
> I don't understand why we want caching. In general, Mercurial could return
> a different response for the same request. For example:
> 1) Client requests heads.
> 2) New head is created on the server.
> 3) Client requests heads again.
>
> What am I missing?
Short answer:
A reason to want caching is to reduce server load and improve response
time. HTTP caching is flexible enough to deal with the above scenario.
Also, the message above does not suggest adding caching, just that
a Vary header should be added to make sure that if any caching is
configured by the server admin, it will happen correctly.
Long answer:
HTTP caches don’t just blindly return the last result for the same URL.
First, a cache only returns a cached copy if one of three caching headers
was set by the server (so no unsolicited caching is done by HTTP, see
section 13.4 of RFC 2616), and if the method and certain headers match. Which
headers must match is indicated by the Vary header on the cached response,
and iirc some headers are also included by default.
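To make the matching rule concrete, here is a minimal sketch of a Vary-aware cache lookup. It is not taken from any real proxy; `cache_key` and `TinyCache` are hypothetical names, and freshness and validation are left out entirely. It only shows that the headers named by Vary become part of the lookup key.

```python
# Hypothetical sketch of Vary-aware matching; not code from any real cache.

def cache_key(method, url, request_headers, vary):
    """Build a lookup key from the method, the URL and the headers named by Vary."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    selected = tuple((n, lowered.get(n, "")) for n in sorted(names))
    return (method.upper(), url, selected)


class TinyCache:
    def __init__(self):
        self._vary = {}     # (method, url) -> Vary value of the stored response
        self._entries = {}  # cache_key(...) -> stored response body

    def store(self, method, url, request_headers, vary, body):
        self._vary[(method.upper(), url)] = vary
        self._entries[cache_key(method, url, request_headers, vary)] = body

    def lookup(self, method, url, request_headers):
        vary = self._vary.get((method.upper(), url))
        if vary is None:
            return None  # nothing stored for this method and URL
        return self._entries.get(cache_key(method, url, request_headers, vary))
```

With `Vary: X-Hg-Changesets` stored alongside a response, a request for the same URL with a different X-Hg-Changesets value misses the cache. With an empty Vary value it would hit and return the wrong response, which is the failure the header is meant to prevent.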
Now Mercurial itself does not currently cache responses to its HTTP
requests at all. However, in between Mercurial and the repository there
may be caching proxies. These may be provided by, say, the ISP,
but more typically they are installed on the server to reduce load. For
highly trafficked Mercurial repository servers this is useful functionality.
There are three caching mechanisms in HTTP:
1. An expiry-time-based one, where you say ‘for the next 5 minutes use a
cached copy’. This one you definitely don’t want to use for the wire
protocol, but for hgweb browsing or feeds it is useful.
2. A last-modified-time-based one, where the cache asks the server if it has a
newer copy before using the cached one. This one is less efficient than
the first, as it does not prevent the request entirely, but it does
avoid generating and transferring the payload. The Mercurial wire
protocol could use the mtime of the .hg directory for this.
3. A mechanism called ‘ETags’ which is pretty similar to 2, except that
it is more general-purpose and more powerful, e.g. it can encode server
version or configuration information.
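As a rough illustration, the three mechanisms correspond to the response headers Cache-Control (or Expires), Last-Modified and ETag. The helper names below are hypothetical, and the ETag recipe (a SHA-1 over a server version string plus the body) is just one assumed way to fold version information into the tag.

```python
import hashlib
from email.utils import formatdate


def expiry_headers(max_age_seconds):
    """Mechanism 1: the response may be reused without asking for this long."""
    return {"Cache-Control": "max-age=%d" % max_age_seconds}


def last_modified_headers(mtime):
    """Mechanism 2: the cache revalidates later with If-Modified-Since."""
    return {"Last-Modified": formatdate(mtime, usegmt=True)}


def etag_headers(body, server_version):
    """Mechanism 3: the cache revalidates later with If-None-Match."""
    digest = hashlib.sha1(server_version.encode("utf-8") + b"\0" + body).hexdigest()
    return {"ETag": '"%s"' % digest}
```

Because the tag in the third helper covers the version string as well as the body, upgrading the server changes the tag even when the payload is byte-for-byte identical.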
Using the 2nd caching mechanism, Mercurial server load could be greatly
reduced, because for many requests all it has to do is check a timestamp
instead of going into the repository storage and comparing heads etc.
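A minimal sketch of that timestamp check, assuming the caller passes in the modification time (for instance from `os.path.getmtime` on the .hg directory, as suggested above) and a callable that produces the expensive response. `conditional_response` is a hypothetical helper, not part of Mercurial.

```python
import calendar
from email.utils import formatdate, parsedate


def conditional_response(if_modified_since, mtime, build_body):
    """Return (status, headers, body), skipping build_body() when unchanged."""
    headers = {"Last-Modified": formatdate(int(mtime), usegmt=True)}
    # parsedate() returns None for a header value it cannot parse.
    parsed = parsedate(if_modified_since) if if_modified_since else None
    if parsed is not None and int(mtime) <= calendar.timegm(parsed):
        return 304, headers, b""  # the cached copy is still valid
    return 200, headers, build_body()  # do the real work
```

The first request gets a 200 with a Last-Modified header. A later request that echoes this value back in If-Modified-Since gets a 304 without touching the repository, until the mtime moves forward.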
If a server admin wants to configure this kind of caching and Mercurial
does not set a Vary header correctly, this will fail. Of course the
admin could also manually add that header, and there may be other
reasons that make it fail, but I think it’s good practice to send this
Vary header to make it easier to add caching.
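For what it’s worth, adding the header by hand could look something like the WSGI wrapper below. This is only a sketch under assumptions: `add_vary` is a hypothetical name, X-Hg-Changesets is the header name proposed earlier in this thread, and nothing here reflects how hgweb is actually structured.

```python
def add_vary(app, header_name="X-Hg-Changesets"):
    """Wrap a WSGI app so every response lists header_name in Vary."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Collect the header names already present in any Vary header.
            existing = [v for k, v in headers if k.lower() == "vary"]
            tokens = [t.strip() for v in existing for t in v.split(",") if t.strip()]
            listed = [t.lower() for t in tokens]
            if "*" not in tokens and header_name.lower() not in listed:
                tokens.append(header_name)
            # Replace any existing Vary headers with one merged header.
            merged = [(k, v) for k, v in headers if k.lower() != "vary"]
            merged.append(("Vary", ", ".join(tokens)))
            if exc_info is not None:
                return start_response(status, merged, exc_info)
            return start_response(status, merged)
        return app(environ, patched_start)
    return wrapped
```

An existing Vary value such as Accept-Encoding is kept and merged rather than overwritten, and `Vary: *` is left alone.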
~Laurens