<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Wed, Sep 26, 2018 at 11:13 AM Boris FELD <<a href="mailto:boris.feld@octobus.net">boris.feld@octobus.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div id="gmail-m_-2337673632773901044magicdomid3883" class="gmail-m_-2337673632773901044ace-line"><span>Hi
everyone,</span></div>
<div id="gmail-m_-2337673632773901044magicdomid7185" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7189" class="gmail-m_-2337673632773901044ace-line"><span>Pulling
from a server involves expensive server</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">-side
computation that we wish to cache. However, since the client can
pull any arbitrary set of revisions, grouping and dispatching the
data to be cached is a</span><span> hard problem.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid7187" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7188" class="gmail-m_-2337673632773901044ace-line"><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">When
we implemented the new discovery for obsolescence markers, we
developed a "stablerange" method to build an efficient way to
slice the changesets graph into ranges. In addition to solving
the obsolescence markers discovery problem, this "stablerange"
principle seemed useful for other purposes, in particular</span><span> the caching of pulls.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid4888" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7190" class="gmail-m_-2337673632773901044ace-line"><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">Right
now, with the current pull bundle implementation, here is how it
works: you manually create and manually declare bundles
containing either all changesets (that could also be used for
clone bundles) or more specific ones. When the client requests
some changesets, the server searches for a bundle containing the
needed range and sends it. This often involves more than the
requested data. The client needs to filter out the extraneous
data. Then the client does a discovery to catch any missing
changesets from the bundle. If the server doesn't find a valid
pull bundle, a normal discovery is done.</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
Manual bundle management is suboptimal, the search for
appropriate bundles has poor complexity, and the extra roundtrip
and discovery add latency.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid10"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7191" class="gmail-m_-2337673632773901044ace-line"><span>This week</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">end,
we built a "simple" prototype that uses "stablerange" to slice
changegroup requests in "getbundle" into multiple bundles</span><span> that can be reused from one pull to another. That
slicing happens as part of a normal pull, during the getbundle
call and after the normal discovery happens. There is no need
for an extra discovery and getbundle call after it.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid5430" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7192" class="gmail-m_-2337673632773901044ace-line"><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">With
th</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">is</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">
"stablerange"</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
based strategy</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">,</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
we start from the set of requested changesets to generate </span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">a set
of "standard" ranges covering all of them. This slicing has a
good algorithmic complexity that depends on the size of the
selected "missing" set of changesets. So the associated cost
will scale well with the size of the associated pull. In
addition, we no longer have to do an expensive search into a
list of existing bundles. This helps to scale small pulls and
increase the number of bundles we can cache, as the time we
spend selecting bundles no longer depends on the number of
cached ones. Since we can exactly cover the client request, we
also no longer need to issue an</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
extra pull roundtrip after the cache retrieval.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3772" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3887" class="gmail-m_-2337673632773901044ace-line"><span>That
slicing focuses on producing ranges that:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid7197" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">Have
a high chance to be reusable in </span><span>a
pull selecting similar changesets,</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7198" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Gather most of the changesets in large
bundles.</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid6459" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7209" class="gmail-m_-2337673632773901044ace-line"><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">This
caching strategy inherits the nice "stablerange" properties
regarding repository growth:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid7210" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">When
a few changesets are appended to a repository, only a few
ranges have</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
to be added.</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7211" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">The
overall number of ranges (and associated bundles) to create
to represent all possible ranges has an</span><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
O(N log(N)) complexity.</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3239" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3890" class="gmail-m_-2337673632773901044ace-line"><span>For
example, here are the 15 ranges selected for a full clone of
mozilla-central:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3276" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3891" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>262114 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3892" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>30 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3893" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>65536 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3894" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>32741 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3895" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>20 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3896" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>7 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3897" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>8192 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3898" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>243 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3899" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>13 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3900" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>114 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3901" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>14 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3902" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>32 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3903" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>16 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3904" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>8 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3905" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>1 changesets</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7212" class="gmail-m_-2337673632773901044ace-line"><span>If we only
clone a subset of the repository, the larger ranges get reused
(hg clone --rev -5000):</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3907" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>262114 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3908" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>30 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3909" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>65536 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3910" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>32741 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3911" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>20 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3912" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>7 changesets found in caches</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3913" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>2048 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3914" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>1024 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3915" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>482 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3916" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>30 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3917" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>32 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3918" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>1 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3919" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>7 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3920" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>4 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3921" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>2 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7111" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-indent1">
<li><code><span>1 changesets found</span></code></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7213" class="gmail-m_-2337673632773901044ace-line"><span class="gmail-m_-2337673632773901044author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">As
you can see, the larger ranges of this second pull are common
with the previous pull, which allows cached bundles to be reused.</span></div>
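<div><br>
</div>
<div><span>To make the reuse concrete, here is a toy sketch in Python. It is not the stablerange algorithm: it assumes a purely linear history with revisions numbered from 0, and the helper name aligned_slices is made up for this illustration. It cuts a span of revisions into blocks whose size is a power of two and whose first revision is aligned on that size, which is enough to show why two overlapping pulls end up sharing their largest blocks.</span></div>
<pre>
```python
def aligned_slices(start, stop):
    """Cut the revisions [start, stop) of a linear history into blocks.

    Each block is returned as (first_revision, size). Sizes are powers of
    two and first_revision is always a multiple of size, so a given block
    is identified by its position alone, whatever pull produced it.
    """
    slices = []
    while stop > start:
        size = 1
        # Grow the block while it stays aligned and still fits.
        while start % (size * 2) == 0 and stop - start >= size * 2:
            size *= 2
        slices.append((start, size))
        start += size
    return slices


full = aligned_slices(0, 13)     # pull everything: 13 revisions
partial = aligned_slices(0, 11)  # pull only the first 11 revisions
shared = sorted(set(full).intersection(partial))

print(full)     # [(0, 8), (8, 4), (12, 1)]
print(partial)  # [(0, 8), (8, 2), (10, 1)]
print(shared)   # [(0, 8)]
```
</pre>
<div><span>The real slicing has to cope with merges and works on a stable ordering of the changeset graph, so the ranges it produces need not be powers of two, as the lists above show. The sharing of the large leading blocks is the part this sketch is meant to convey.</span></div>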
<div id="gmail-m_-2337673632773901044magicdomid14"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7214" class="gmail-m_-2337673632773901044ace-line"><span>The
prototype is available in a small "pullbundle" extension. It
focus</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">es on
the slicing itself and we did not implement anything fancy for
the cache storage and delivery. We simply store each generated bundle
on disk and we read it from disk when it is needed again.
Others, like Joerg Sonnenberger or Gre</span><span>gory
Szorc, are already working on the "cache delivery" problem.</span></div>
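<div><br>
</div>
<div><span>A minimal sketch of that kind of on-disk storage, in Python, might look like the following. This is an assumption about the general shape, not the code of the "pullbundle" extension: the class name RangeBundleCache, the file naming scheme and the build callback are all hypothetical. The point it illustrates is that a bundle is looked up directly from the identity of its range, so the lookup cost does not grow with the number of bundles already cached.</span></div>
<pre>
```python
import hashlib
import os
import tempfile


class RangeBundleCache:
    """Store one file per range, named after a hash of the range identity."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, range_id):
        # range_id is assumed to be a small tuple of integers.
        digest = hashlib.sha1(repr(range_id).encode("ascii")).hexdigest()
        return os.path.join(self.directory, digest + ".bundle")

    def get_or_build(self, range_id, build):
        """Return (data, was_cached) for the given range."""
        path = self._path(range_id)
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return fh.read(), True
        data = build(range_id)
        # Write to a temporary file first, then rename, so that a reader
        # never sees a partially written bundle.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp_path, path)
        return data, False
```
</pre>
<div><span>Here build stands for whatever generates the changegroup for a range. The first request for a range pays for the generation, and later requests for the same range only read a file.</span></div>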
<div id="gmail-m_-2337673632773901044magicdomid16"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7215" class="gmail-m_-2337673632773901044ace-line"><span>We are
getting good results out of that prototype when testing it on
clones of mozilla-central and netbsd-src. See the "Example results"
section for details.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid18"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7216" class="gmail-m_-2337673632773901044ace-line"><span>The
prototype is up and running on our hgweb "mirror" instance </span><span class="gmail-m_-2337673632773901044url"><a href="https://mirror.octobus.net/" target="_blank">https://mirror.octobus.net/</a></span><span>.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid20"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7218" class="gmail-m_-2337673632773901044ace-line"><span>The
extension comes with a small debug command that produce</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s
statistics about the ranges that multiple random pulls would us</span><span>e.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid23"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7219" class="gmail-m_-2337673632773901044ace-line"><span>The
"stablerange" implementation currently still[1] lives in the
evolve extension, so we put the "pullbundle" extension in the same
repository for simplicity. This is not ideal but
was a simple solution given the time we could dedicate. This
extension is not part of any official release yet. To test it
you have to install it from the repository for now: </span><span class="gmail-m_-2337673632773901044url"><a href="https://www.mercurial-scm.org/repo/evolve/#default" target="_blank">https://www.mercurial-scm.org/repo/evolve/#default</a></span></div>
<div id="gmail-m_-2337673632773901044magicdomid25"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7220" class="gmail-m_-2337673632773901044ace-line"><span>The
extension</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">'</span><span>s code is here: </span><span class="gmail-m_-2337673632773901044url"><a href="https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py" target="_blank">https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py</a></span></div>
<div id="gmail-m_-2337673632773901044magicdomid27"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7221" class="gmail-m_-2337673632773901044ace-line"><span>The
prototype performance </span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">is not
stellar, but good enough to give useful results in a reasonable
amount of time. A production-grade implementation of the stablerange
algorithm and storage will fix that. There is also room for
improvement in the algorithms themselves: multiple
sub-problems can be improved. We have started holding regular meetings
with university researchers working on graph theory, who are
interested in the problem space</span><span>.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid29"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7222" class="gmail-m_-2337673632773901044ace-line"><span>[1] The
stablerange principle has been validated in the field and is
ready to get upstreamed.</span></div></div></blockquote><div><br></div><div>This experimentation is very useful and shows great promise!</div><div><br></div><div>Intelligent "slicing" of data in order to facilitate high cache hit rates and lower server work and client overhead is definitely worth pursuing.</div><div><br></div><div>FWIW I submitted my caching series for wire protocol version 2 a few hours ago. The meaningful patches are at <a href="https://phab.mercurial-scm.org/D4773">https://phab.mercurial-scm.org/D4773</a> and <a href="https://phab.mercurial-scm.org/D4774">https://phab.mercurial-scm.org/D4774</a>. This work focuses on transparent whole-response caching, however, and does not (yet) implement a caching layer for e.g. storage access. I will note that I *really* want to enable Mercurial servers to offload as much data transfer to static file servers as possible.<br></div><div><br></div><div>An open item to solve is how we're going to facilitate bulk data transfer using fewer roundtrips in wire protocol version 2. Right now, the client has to send a *lot* of commands and data to the server (all the manifest and file nodes). We know we must do something better. I think this slicing idea could definitely be employed to facilitate higher cache hit rates. How, exactly, I'm not sure.</div><div><br></div><div>Something else to consider is that right now, an `hg clone` will preserve changelog order from server to client. This is trivial to change. And it is arguably an implementation detail of revlogs, which preserve insertion order. (And until <a href="https://www.mercurial-scm.org/repo/hg-committed/rev/db5501d93bcf">https://www.mercurial-scm.org/repo/hg-committed/rev/db5501d93bcf</a> it was the behavior for manifests and filelogs as well.) I'm not sure how important it is to preserve this property. 
But if we start slicing changesets, we could easily break this behavior.<br></div><div><br></div><div>This email caused me to consider the possibility that the "content redirects" feature proposed in D4774 should (eventually) support multiple locations. Then, a server could slice a large response into multiple payloads and have the client retrieve them separately.</div><div><br></div><div>Alternatively, since wire protocol version 2 currently performs a lot of granular requests, perhaps the server could inform the client of preferred slicing for data fetching and the client could follow up accordingly. How this would interact with bulk data transfer commands, I'm not sure.</div><div><br></div><div>There's definitely a lot to think about and I hope this slicing experimentation finds a home in core someday!<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#FFFFFF">
<div id="gmail-m_-2337673632773901044magicdomid31"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3931" class="gmail-m_-2337673632773901044ace-line">
<h2><span>Example results.</span></h2>
</div>
<div id="gmail-m_-2337673632773901044magicdomid33"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7223" class="gmail-m_-2337673632773901044ace-line"><span>The
extension comes</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z"> with
a command to simulate multiple pulls</span><span> of a
random set of revisions (from a larger set of revisions we
define). This starts with a cold cache for simplicity.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid35"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3933" class="gmail-m_-2337673632773901044ace-line">
<h3><span>Mozilla Central</span></h3>
</div>
<div id="gmail-m_-2337673632773901044magicdomid37"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3934" class="gmail-m_-2337673632773901044ace-line"><span>We
gathered 100 sample pulls within 20443 revisions:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3935" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 18 338</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3936" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 132
changegroups.</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3937" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 88</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3938" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 99.5%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid44"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7224" class="gmail-m_-2337673632773901044ace-line"><span>The number
of different ranges stay</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s</span><span> under control as expected:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3940" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 1 817 955</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3941" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 96% (from a
cold cache)</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3942" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 12 983, most of them
very small (90% ≤ 32 changesets)</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3943" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>A small number of (larger) ranges get most of
the cache hits.</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid49"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7225" class="gmail-m_-2337673632773901044ace-line"><span>Providing
the smaller ranges from the cache might not be a good tradeoff. If we
skip using the cache for smaller range</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s we
still get interesting results</span><span>:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid51"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3945" class="gmail-m_-2337673632773901044ace-line"><span>Only
caching ranges containing 256 changesets or more:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2029" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3946" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 18 940</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3947" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 12</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3948" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 1 949</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3949" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 90%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3950" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 1 850 243</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3951" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 87%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3952" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 1 150</span></li>
</ul>
</div>
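<div><br>
</div>
<div><span>The effect of such a size threshold can be sketched with a small cold-cache simulation in Python. How the debug command computes its statistics is not shown here, so this is only an assumption about the bookkeeping: the function name simulate_pulls and the toy input are hypothetical, and each pull is simply given as a list of (range identifier, number of changesets) pairs.</span></div>
<pre>
```python
def simulate_pulls(pulls, min_size=1):
    """Replay pulls against an initially empty cache.

    A range is served from the cache if an earlier pull stored it.
    Otherwise it is generated, and stored only if it holds at least
    min_size changesets.
    """
    cached = set()
    served = 0
    hits = 0
    for ranges in pulls:
        for range_id, size in ranges:
            served += size
            if range_id in cached:
                hits += size
            elif size >= min_size:
                cached.add(range_id)
    return {
        "served": served,
        "hits": hits,
        "hit_ratio": hits / served if served else 0.0,
        "distinct_cached": len(cached),
    }


pulls = [
    [("a", 8), ("b", 4), ("c", 1)],
    [("a", 8), ("d", 2), ("e", 1)],
    [("a", 8), ("b", 4), ("c", 1)],
]
everything = simulate_pulls(pulls)
large_only = simulate_pulls(pulls, min_size=4)

print(everything["hits"], everything["distinct_cached"])  # 21 5
print(large_only["hits"], large_only["distinct_cached"])  # 20 2
```
</pre>
<div><span>In this toy input, refusing to cache ranges smaller than 4 changesets drops the number of cached ranges from 5 to 2 while losing a single changeset worth of cache hits out of 37 served. The same kind of tradeoff is visible in the figures above.</span></div>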
<div id="gmail-m_-2337673632773901044magicdomid3179" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7226" class="gmail-m_-2337673632773901044ace-line"><span>Another way
to reduce the number of server bundles would be </span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">to d</span><span>o some "over serving": using bundles containing some
common changesets (ones the client already has).</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2795" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3954" class="gmail-m_-2337673632773901044ace-line"><span>(See the
end of the email for full details)</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2037" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3955" class="gmail-m_-2337673632773901044ace-line">
<h3><span>netbsd</span></h3>
</div>
<div id="gmail-m_-2337673632773901044magicdomid2460" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3956" class="gmail-m_-2337673632773901044ace-line"><span>This time,
we issued more random pulls (1000) within a somewhat smaller set
of 10 000 changesets.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2622" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3957" class="gmail-m_-2337673632773901044ace-line"><span>This
resulted in smaller pulls, which also show good results:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2297" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3958" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 1673</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3959" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 51</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3960" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 50</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3961" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 99%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3962" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 1601087</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3963" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 96%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3964" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 51751</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid2509" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3965" class="gmail-m_-2337673632773901044ace-line"><span>Trying the
same 256+ changeset limit on caching, we see a stronger impact,
probably because of the smaller pulls:</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2287" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3966" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 1592</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3967" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 2</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3968" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 663</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3969" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 46%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3970" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 1 554 227</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3971" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 56%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3972" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 1 914</span></li>
</ul>
</div>
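<div><span>To make the trade-off between the two netbsd-src runs concrete, here is a minimal Python sketch that uses only the hit ratios and distinct range counts listed above. The dictionary layout and the <code>summarize</code> helper are illustrative assumptions, not part of Mercurial.</span></div>

```python
# Figures copied from the two netbsd-src summaries above.
# The structure and the names used here are illustrative only.
RUNS = {
    "no limit": {"hit_ratio_pct": 96, "distinct_ranges": 51751},
    "min 256": {"hit_ratio_pct": 56, "distinct_ranges": 1914},
}


def summarize(runs):
    """Return (cache shrink factor, hit-ratio points lost) for the limited run."""
    base = runs["no limit"]
    limited = runs["min 256"]
    shrink = base["distinct_ranges"] / limited["distinct_ranges"]
    lost_points = base["hit_ratio_pct"] - limited["hit_ratio_pct"]
    return shrink, lost_points


shrink, lost_points = summarize(RUNS)
print(f"{shrink:.1f}x fewer cached ranges, {lost_points} points lower hit ratio")
```

<div><span>With these figures, the limit cuts the number of cached ranges by a factor of about 27 and lowers the total hit ratio by 40 points.</span></div>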
<div id="gmail-m_-2337673632773901044magicdomid2820" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3973" class="gmail-m_-2337673632773901044ace-line"><span>(See the
end of the email for full details)</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2247" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3974" class="gmail-m_-2337673632773901044ace-line">
<h3><span>pypy</span></h3>
</div>
<div id="gmail-m_-2337673632773901044magicdomid2848" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3975" class="gmail-m_-2337673632773901044ace-line"><span>pypy
testing was done using 1 000 pulls within 16687 changesets.</span></div>
<div id="gmail-m_-2337673632773901044magicdomid3083" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid2138" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3976" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 11375</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3977" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 1206</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3978" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 12</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3979" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 100%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3980" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 11 167 863</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3981" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 99%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3982" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 1 139 537</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3084" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid7227" class="gmail-m_-2337673632773901044ace-line"><span>Applying
the 256+ changeset limit gives worse result</span><span class="gmail-m_-2337673632773901044author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s.
This is probably the result of a shallower pull space and the
number of merges in the pypy repository. The pypy repository is
significantly more branchy than the other ones; there are</span><span> some known ways we could improve stablerange
partitioning in this case (to produce larger ranges).</span></div>
<div id="gmail-m_-2337673632773901044magicdomid2314" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3984" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median pull size: 11457</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3985" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median number of changegroup used: 9</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3986" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median changeset not cached: 7276</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3987" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Median ratio of changeset already in the
cache: 37%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3988" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset served: 11 211 093</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3989" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Total changeset cache hit ratio: 36%</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3990" class="gmail-m_-2337673632773901044ace-line">
<ul class="gmail-m_-2337673632773901044list-bullet1">
<li><span>Distinct range cached: 8 964</span></li>
</ul>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3082" class="gmail-m_-2337673632773901044ace-line"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3991" class="gmail-m_-2337673632773901044ace-line">
<h2><span>Full details:</span></h2>
</div>
<div id="gmail-m_-2337673632773901044magicdomid71"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid3992" class="gmail-m_-2337673632773901044ace-line">
<h3><span>mozilla central, 100 pulls in 20443 revisions,
no limit</span></h3>
<pre> mozilla-central > hg debugpullbundlecacheoverlap --count 100 -- 'tip~10000:'
gathering 100 sample pulls within 20443 revisions
pull size:
min: 13176
10%: 16425
25%: 17344
50%: 18338
75%: 19192
90%: 19719
95%: 19902
max: 20272
non-cached changesets:
min: 4
10%: 10
25%: 24
50%: 88
75%: 237
90%: 956
95%: 3152
max: 17440
ratio of cached changesets:
min: 0.0
10%: 0.947941624918
25%: 0.987343800064
50%: 0.995036297078
75%: 0.998774313882
90%: 0.999421798208
95%: 0.999634750848
max: 0.999795354548
bundle count:
min: 74
10%: 99
25%: 113
50%: 132
75%: 146
90%: 158
95%: 169
max: 186
ratio of cached bundles:
min: 0.0
10%: 0.685082872928
25%: 0.810810810811
50%: 0.911392405063
75%: 0.953020134228
90%: 0.974683544304
95%: 0.98125
max: 0.993377483444
changesets served:
total: 1817955
from cache: 1752642 (96%)
bundle: 12983
size of cached bundles:
min: 1
10%: 1
25%: 2
50%: 4
75%: 9
90%: 32
95%: 64
max: 8165
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 4
90%: 14
95%: 20
max: 100
</pre>
</div>
<div id="gmail-m_-2337673632773901044magicdomid4062" class="gmail-m_-2337673632773901044ace-line">
<h3><span>mozilla central, 100 pulls in 20443 revisions,
only caching ranges of 256 changesets and above</span></h3>
</div>
<pre> mozilla-central > hg debugpullbundlecacheoverlap --count 100 --min-cache=256 -- 'tip~10000:'
gathering 100 sample pulls within 20443 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 14060
10%: 16910
25%: 17923
50%: 18940
75%: 19471
90%: 19884
95%: 20029
max: 20309
non-cached changesets:
min: 973
10%: 1398
25%: 1707
50%: 1949
75%: 2246
90%: 2590
95%: 3448
max: 19551
ratio of cached changesets:
min: 0.0
10%: 0.839908649729
25%: 0.884512085944
50%: 0.897293365889
75%: 0.91018907563
90%: 0.926944971537
95%: 0.935139218649
max: 0.946839315959
bundle count:
min: 4
10%: 10
25%: 11
50%: 12
75%: 12
90%: 13
95%: 15
max: 16
ratio of cached bundles:
min: 0.0
10%: 0.909090909091
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1850243
from cache: 1617379 (87%)
bundle: 1150
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 512
75%: 1024
90%: 1024
95%: 1024
max: 8165
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 7
75%: 44
90%: 44
95%: 59
max: 98
</pre>
<div id="gmail-m_-2337673632773901044magicdomid216"><br>
</div>
<div id="gmail-m_-2337673632773901044magicdomid4133" class="gmail-m_-2337673632773901044ace-line">
<h3><span>netbsd-src, 1000 pulls within 10000 revisions:</span></h3>
</div>
<pre> netbsd-src > hg debugpullbundlecacheoverlap --count 1000 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
pull size:
min: 10
10%: 339
25%: 865
50%: 1673
75%: 2330
90%: 2752
95%: 2893
max: 3466
non-cached changesets:
min: 0
10%: 3
25%: 7
50%: 16
75%: 50
90%: 137
95%: 239
max: 2787
ratio of cached changesets:
min: 0.0
10%: 0.781553398058
25%: 0.940663176265
50%: 0.987631184408
75%: 0.996178830722
90%: 0.99843688941
95%: 0.998939554613
max: 1.0
bundle count:
min: 10
10%: 28
25%: 37
50%: 51
75%: 65
90%: 78
95%: 88
max: 121
ratio of cached bundles:
min: 0.0
10%: 0.446808510638
25%: 0.673076923077
50%: 0.826086956522
75%: 0.901639344262
90%: 0.942857142857
95%: 0.96
max: 1.0
changesets served:
total: 1601087
from cache: 1539491 (96%)
bundle: 51751
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 1
75%: 4
90%: 8
95%: 16
max: 2048
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 3
90%: 8
95%: 13
max: 291
</pre>
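<div><span>The summary bullets earlier in this email are read off the 50% rows of reports like the one above. As a rough sketch, the report can be parsed into nested dictionaries with a few lines of Python. The <code>parse_overlap_report</code> helper and the assumed "section header, then key: value rows" layout are illustrative and not part of Mercurial; the sample is a short excerpt of the netbsd-src report.</span></div>

```python
import re


def parse_overlap_report(text):
    """Parse 'section:' headers followed by 'key: number' rows into dicts.

    Assumes a section header is a line ending in ':' and that each row is
    'key: number', optionally followed by a parenthesised percentage.
    """
    report = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            section = report.setdefault(line[:-1], {})
            continue
        match = re.match(r"^(\S[^:]*):\s+([0-9.]+)", line)
        if match and section is not None:
            key, value = match.groups()
            section[key] = float(value) if "." in value else int(value)
    return report


# Excerpt of the netbsd-src report above (not the full output).
SAMPLE = """\
pull size:
min: 10
50%: 1673
max: 3466
changesets served:
total: 1601087
from cache: 1539491 (96%)
bundle: 51751
"""

report = parse_overlap_report(SAMPLE)
served = report["changesets served"]
print("median pull size:", report["pull size"]["50%"])
print("hit ratio:", round(100 * served["from cache"] / served["total"]), "%")
```

<div><span>Lines such as the command prompt or "gathering ... sample pulls" are skipped because they do not match either pattern.</span></div>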
<div id="gmail-m_-2337673632773901044magicdomid4203" class="gmail-m_-2337673632773901044ace-line">
<h3><span>netbsd-src, 1000 pulls within 10000 revisions,
not caching ranges smaller than 256:</span></h3>
</div>
<pre> netbsd-src > hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 10
10%: 329
25%: 813
50%: 1592
75%: 2271
90%: 2745
95%: 2922
max: 3719
non-cached changesets:
min: 10
10%: 265
25%: 440
50%: 663
75%: 911
90%: 1111
95%: 1229
max: 2852
ratio of cached changesets:
min: 0.0
10%: 0.0
25%: 0.136752136752
50%: 0.461261261261
75%: 0.686327077748
90%: 0.792263056093
95%: 0.829552819183
max: 0.959700093721
bundle count:
min: 0
10%: 0
25%: 1
50%: 2
75%: 3
90%: 4
95%: 4
max: 6
ratio of cached bundles:
min: 0.0
10%: 1.0
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1554227
from cache: 871680 (56%)
bundle: 1914
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 256
75%: 512
90%: 512
95%: 512
max: 2048
hit on cached bundles:
min: 2
10%: 3
25%: 7
50%: 61
75%: 113
90%: 113
95%: 117
max: 267
</pre>
<div id="gmail-m_-2337673632773901044magicdomid4274" class="gmail-m_-2337673632773901044ace-line">
<h3><span>pypy 1000 pulls within 16687 changesets:</span></h3>
</div>
<pre> pypy > time hg debugpullbundlecacheoverlap --count 1000 -- 'tip~2000:'
gathering 1000 sample pulls within 16687 revisions
pull size:
min: 5835
10%: 9165
25%: 10323
50%: 11375
75%: 12248
90%: 12904
95%: 13181
max: 14221
non-cached changesets:
min: 0
10%: 1
25%: 4
50%: 12
75%: 39
90%: 142
95%: 453
max: 12046
ratio of cached changesets:
min: 0.0
10%: 0.986539780521
25%: 0.99640167364
50%: 0.99889963059
75%: 0.99964598637
90%: 0.999911496593
95%: 1.0
max: 1.0
bundle count:
min: 183
10%: 762
25%: 1045
50%: 1206
75%: 1308
90%: 1387
95%: 1427
max: 1631
ratio of cached bundles:
min: 0.0
10%: 0.972310126582
25%: 0.990171990172
50%: 0.995529061103
75%: 0.997882851094
90%: 0.999203821656
95%: 1.0
max: 1.0
changesets served:
total: 11167863
from cache: 11060195 (99%)
bundle: 1139537
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 4
90%: 15
95%: 30
max: 2041
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 3
75%: 20
90%: 245
95%: 848
max: 999
</pre>
<div id="gmail-m_-2337673632773901044magicdomid4344" class="gmail-m_-2337673632773901044ace-line">
<h3><span>pypy 1000 pulls within 16687 changesets,
caching above 256 changesets only:</span></h3>
</div>
<pre> time hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- 'tip~2000:'
gathering 1000 sample pulls within 16687 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 3629
10%: 9075
25%: 10278
50%: 11457
75%: 12325
90%: 12961
95%: 13245
max: 14330
non-cached changesets:
min: 2605
10%: 5885
25%: 6619
50%: 7276
75%: 7813
90%: 8319
95%: 8577
max: 11815
ratio of cached changesets:
min: 0.0
10%: 0.296190172981
25%: 0.344310129221
50%: 0.368110984417
75%: 0.391840607211
90%: 0.417344173442
95%: 0.445362934971
max: 0.544580009385
bundle count:
min: 1
10%: 7
25%: 8
50%: 9
75%: 10
90%: 11
95%: 11
max: 12
ratio of cached bundles:
min: 0.0
10%: 1.0
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 11211093
from cache: 4066703 (36%)
bundle: 8964
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 468
75%: 510
90%: 1019
95%: 512
max: 1916
hit on cached bundles:
min: 1
10%: 1
25%: 3
50%: 13
75%: 63
90%: 823
95%: 153
max: 958
</pre>
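<div><span>As a cross-check, the "from cache" percentages in the six reports above can be recomputed from the raw "changesets served" counts. The Python sketch below does that. The <code>hit_ratio_percent</code> helper is hypothetical, and rounding down to a whole percent is an assumption that happens to reproduce the printed values.</span></div>

```python
# Raw "changesets served" figures copied from the six reports above:
# (label, total served, served from cache, distinct cached bundles).
RUNS = [
    ("mozilla-central", 1817955, 1752642, 12983),
    ("mozilla-central, min 256", 1850243, 1617379, 1150),
    ("netbsd-src", 1601087, 1539491, 51751),
    ("netbsd-src, min 256", 1554227, 871680, 1914),
    ("pypy", 11167863, 11060195, 1139537),
    ("pypy, min 256", 11211093, 4066703, 8964),
]


def hit_ratio_percent(total, from_cache):
    """Whole-number percentage of changesets served from the cache.

    Flooring is an assumption; it reproduces the percentages printed
    in the reports.
    """
    return 100 * from_cache // total


ratios = [hit_ratio_percent(total, cached) for _, total, cached, _ in RUNS]

for (label, total, cached, bundles), ratio in zip(RUNS, ratios):
    print(f"{label:26} {ratio:3d}% from cache, {bundles:8d} cached bundles")
```

<div><span>Laid out this way, each pair of rows shows the cost of the 256-changeset limit next to the reduction in the number of cached bundles.</span></div>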
</div>
</blockquote></div></div></div></div></div>