[RFC] clonecache proof of concept

Augie Fackler durin42 at gmail.com
Sat Nov 26 12:38:35 CST 2011


On Nov 26, 2011, at 12:04 PM, Matt Mackall wrote:
> 
> Here's a little extension I wrote last night that caches the last clone
> bundle to reduce load on a server.
> 
> In my testing between two unloaded machines, there's little to no effect
> as the client has to use just about as much time unpacking the bundle as
> the server does packing it. But in theory, this should allow a machine
> with lots of bandwidth to service many more clones before running out of
> CPU.
> 
> Caveats:
> 
> - this currently caches the uncompressed bundle stream, which can be
> much bigger than the repo
> - it should use atomic file I/O
> 
> I'm hoping someone who actually has a use for this trick (i.e. because
> they're hosting lots of repos) will take this ball and run with it.

I do something similar with hg-on-bigtable, but instead of caching whole bundles (which is something I want to tinker with to try to fix the next bottleneck), I cache deltas. I found we were burning tons of CPU time computing deltas, and now we're bound by throughput from bigtable. I should see if I can extract what we're doing into something that could usefully poke deltas into memcached or something similar.
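
Very roughly, the memcached version might look something like the untested sketch below; computedelta() and the key scheme are made-up stand-ins for whatever actually builds and identifies a delta, and python-memcached is assumed just for illustration:

import hashlib

import memcache  # python-memcached, assumed for illustration

MC = memcache.Client(['127.0.0.1:11211'])
MAXITEM = 1000000  # memcached's default item size limit is about 1MB


def deltakey(hexnode, hexbase):
    # memcached keys must be short and whitespace-free, so hash the pair
    # of hex node ids into a fixed-size key
    return 'hgdelta:' + hashlib.sha1(
        (hexnode + ':' + hexbase).encode('ascii')).hexdigest()


def cacheddelta(hexnode, hexbase, computedelta):
    # return the delta from hexbase to hexnode, consulting memcached first;
    # computedelta is only called on a cache miss
    key = deltakey(hexnode, hexbase)
    delta = MC.get(key)
    if delta is None:
        delta = computedelta(hexnode, hexbase)
        if len(delta) < MAXITEM:
            MC.set(key, delta)
    return delta

The win is the same thing we see on bigtable: repeated requests for the same delta stop costing CPU and become a lookup.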

> diff -r ad686c818e1c hgext/clonecache.py
> --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
> +++ b/hgext/clonecache.py	Sat Nov 26 11:50:55 2011 -0600
> @@ -0,0 +1,38 @@
> +from mercurial import extensions, node, changegroup
> +import os, errno
> +
> +def cachebundle(orig, self, commonrevs, csets, heads, source):
> +    if commonrevs == set([-1]):
> +        headstr = " ".join(node.hex(h) for h in sorted(heads))
> +        try:
> +            cacheheads = self.opener.read("cache/clonecache.heads").strip()
> +        except EnvironmentError:
> +            cacheheads = ""
> +        if headstr == cacheheads:
> +            # read mode
> +            stream = self.opener("cache/clonecache.hg")
> +            cg = changegroup.unbundle10(stream, 'UN')
> +        else:
> +            cg = orig(self, commonrevs, csets, heads, source)
> +            stream = cg._stream
> +            streamread = stream.read
> +            try:
> +                os.unlink(self.join("cache/clonecache.heads"))
> +            except OSError, inst:
> +                if inst.errno != errno.ENOENT:
> +                    raise
> +            cf = self.opener("cache/clonecache.hg", "w")
> +            def saveread(l):
> +                r = streamread(l)
> +                cf.write(r)
> +                if not r:
> +                    self.opener.write("cache/clonecache.heads", headstr + "\n")
> +                return r
> +            stream.read = saveread
> +    else:
> +        cg = orig(self, commonrevs, csets, heads, source)
> +
> +    return cg
> +
> +def reposetup(ui, repo):
> +    extensions.wrapfunction(repo.__class__, '_changegroupsubset', cachebundle)
