<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div id="magicdomid3883" class="ace-line"><span class="">Hi
everyone,</span></div>
<div id="magicdomid7185" class="ace-line"><br>
</div>
<div id="magicdomid7189" class="ace-line"><span class="">Pulling
from a server involves expensive server</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">-side
computation that we wish to cache. However, since the client can
pull any arbitrary set of revisions, grouping and dispatching the
data to be cached is a</span><span class=""> hard problem.</span></div>
<div id="magicdomid7187" class="ace-line"><br>
</div>
<div id="magicdomid7188" class="ace-line"><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">When
we implemented the new discovery for obsolescence markers, we
developed the "stablerange" method, an efficient way to
slice the changeset graph into ranges. In addition to solving
the obsolescence markers discovery problem, this "stablerange"
principle seems useful for other purposes, in particular</span><span
class=""> the caching of pulls.</span></div>
<div id="magicdomid4888" class="ace-line"><br>
</div>
<div id="magicdomid7190" class="ace-line"><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">Right
now, the current pull bundle implementation works as follows:
you manually create and manually declare bundles
containing either all changesets (that could also be used for
clone bundles) or more specific ones. When the client requests
some changesets, the server searches for a bundle containing the
needed range and sends it. This often involves more than the
requested data, so the client needs to filter out the extraneous
data. Then the client does a discovery to catch any changesets
missing from the bundle. If the server doesn't find a valid
pull bundle, a normal discovery is done.</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
Manual bundle management is suboptimal, the search for
appropriate bundles has poor complexity, and the extra roundtrip
and discovery add latency.</span></div>
<div id="magicdomid10" class=""><br>
</div>
<div id="magicdomid7191" class="ace-line"><span class="">This week</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">end,
we built a "simple" prototype that uses "stablerange" to slice
changegroup requests in "getbundle" into multiple bundles</span><span
class=""> that can be reused from one pull to another. That
slicing happens as part of a normal pull, during the getbundle
call and after the normal discovery happens. There is no need
for an extra discovery and getbundle call after it.</span></div>
<div id="magicdomid5430" class="ace-line"><br>
</div>
<div id="magicdomid7192" class="ace-line"><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">With
th</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">is</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">
"stablerange"</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
based strategy</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">,</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
we start from the set of requested changesets to generate </span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">a set
of "standard" ranges covering all of them. This slicing has a
good algorithmic complexity that depends on the size of the
selected "missing" set of changesets, so the associated cost
will scale well with the size of the associated pull. In
addition, we no longer have to do an expensive search through a
list of existing bundles. This helps small pulls scale and
increases the number of bundles we can cache, as the time we
spend selecting bundles no longer depends on the number of
cached ones. Since we can exactly cover the client request, we
also no longer need to issue an</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
extra pull roundtrip after the cache retrieval.</span></div>
<div id="magicdomid3772" class="ace-line"><br>
</div>
<div id="magicdomid3887" class="ace-line"><span class="">That
slicing focuses on producing ranges that:</span></div>
<div id="magicdomid7197" class="ace-line">
<ul class="list-bullet1">
<li><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">Have
a high chance to be reusable in </span><span class="">a
pull selecting similar changesets,</span></li>
</ul>
</div>
<div id="magicdomid7198" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Gather most of the changesets in large
bundles.</span></li>
</ul>
</div>
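<div class="ace-line"><span class="">To make the idea concrete, here
is a minimal sketch in Python. It is not the actual "stablerange"
algorithm, which works on the changeset graph. It assumes a purely
linear history where revisions are numbered from 0, and the
hypothetical helper slice_range cuts a half-open interval into
power-of-two blocks aligned on multiples of their own size. Under
that assumption, two overlapping pulls end up sharing their largest
blocks:</span></div>

```python
def slice_range(start, stop):
    """Cut [start, stop) into aligned power-of-two blocks (linear history only).

    Hypothetical helper for illustration; the real stablerange code
    handles arbitrary changeset graphs and is not reproduced here.
    """
    ranges = []
    while stop > start:
        size = 1
        # Grow the block while it stays aligned on its own size and still fits.
        while start % (size * 2) == 0 and stop >= start + size * 2:
            size *= 2
        ranges.append((start, start + size))
        start += size
    return ranges


full = slice_range(0, 13)     # a "clone" of 13 revisions
partial = slice_range(0, 10)  # a "clone" of the first 10 revisions only
shared = sorted(set(full).intersection(partial))
print(full)     # [(0, 8), (8, 12), (12, 13)]
print(partial)  # [(0, 8), (8, 10)]
print(shared)   # [(0, 8)]
```

<div class="ace-line"><span class="">In this toy model the largest
block is the one both pulls have in common, which mirrors the two
goals listed above: most changesets live in a few large ranges, and
those ranges are the ones most likely to be reused.</span></div>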
<div id="magicdomid6459" class="ace-line"><br>
</div>
<div id="magicdomid7209" class="ace-line"><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">This
caching strategy inherits the nice "stablerange" properties
regarding repository growth:</span></div>
<div id="magicdomid7210" class="ace-line">
<ul class="list-bullet1">
<li><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">When
a few changesets are appended to a repository, only a few
ranges have</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
to be added.</span></li>
</ul>
</div>
<div id="magicdomid7211" class="ace-line">
<ul class="list-bullet1">
<li><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">The
overall number of ranges (and associated bundles) to create
to represent all possible ranges has an</span><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">
O(N log(N)) complexity.</span></li>
</ul>
</div>
<div id="magicdomid3239" class="ace-line"><br>
</div>
<div id="magicdomid3890" class="ace-line"><span class="">For
example, here are the 15 ranges selected for a full clone of
mozilla-central:</span></div>
<div id="magicdomid3276" class="ace-line"><br>
</div>
<div id="magicdomid3891" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">262114 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3892" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">30 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3893" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">65536 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3894" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">32741 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3895" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">20 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3896" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">7 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3897" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">8192 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3898" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">243 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3899" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">13 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3900" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">114 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3901" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">14 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3902" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">32 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3903" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">16 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3904" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">8 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3905" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">1 changesets</span></code></li>
</ul>
</div>
<div id="magicdomid3366" class="ace-line">
<br>
</div>
<div id="magicdomid7212" class="ace-line"><span class="">If we only
clone a subset of the repository, the larger ranges get reused
(hg clone --rev -5000):</span></div>
<div id="magicdomid3907" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">262114 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3908" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">30 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3909" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">65536 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3910" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">32741 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3911" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">20 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3912" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">7 changesets found in caches</span></code></li>
</ul>
</div>
<div id="magicdomid3913" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">2048 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3914" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">1024 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3915" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">482 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3916" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">30 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3917" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">32 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3918" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">1 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3919" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">7 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3920" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">4 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid3921" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">2 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid7111" class="ace-line">
<ul class="list-indent1">
<li><code><span class="">1 changesets found</span></code></li>
</ul>
</div>
<div id="magicdomid7213" class="ace-line"><span
class="author-a-vz70zz69zkz67zz89zz83zz83ztpz122zbhz70zdz88z">As
you can see, the larger ranges of this second pull are shared
with the previous pull, so the cached bundles can be reused.</span></div>
<div id="magicdomid14" class=""><br>
</div>
<div id="magicdomid7214" class="ace-line"><span class="">The
prototype is available in a small "pullbundle" extension. It
focus</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">es on
the slicing itself and we did not implement anything fancy for
the cache storage and delivery. We simply store each generated bundle
on disk and read it back from disk when it is needed again.
Others, like Joerg Sonnenberger or Gre</span><span class="">gory
Szorc, are already working on the "cache delivery" problem.</span></div>
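<div class="ace-line"><span class="">As an illustration of that
"store on disk, read back later" approach, here is a minimal sketch
in Python. It is not the code of the extension: the class name
DiskBundleCache, the generate callback and the choice of a SHA-1 of
the changeset ids as file name are all assumptions made for the
example.</span></div>

```python
import hashlib
import os
import tempfile


class DiskBundleCache:
    """Hypothetical on-disk cache: one file per range, keyed by its content."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, nodes):
        # The key is derived from the ordered changeset ids of the range.
        digest = hashlib.sha1("\n".join(nodes).encode("ascii")).hexdigest()
        return os.path.join(self.root, digest + ".bundle")

    def get(self, nodes, generate):
        """Return (data, was_cached); call generate(nodes) on a miss."""
        path = self._path(nodes)
        if os.path.exists(path):
            with open(path, "rb") as fh:
                return fh.read(), True
        data = generate(nodes)
        # Write to a temporary file first so readers never see a partial bundle.
        fd, tmp = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp, path)
        return data, False


def fake_bundle(nodes):
    # Stand-in for the expensive server-side bundle generation.
    return ("bundle:" + ",".join(nodes)).encode("ascii")


with tempfile.TemporaryDirectory() as root:
    cache = DiskBundleCache(root)
    first = cache.get(["a1", "b2"], fake_bundle)
    second = cache.get(["a1", "b2"], fake_bundle)
    print(first[1], second[1])  # False True
```

<div class="ace-line"><span class="">Because a range always
designates the same changesets, such a cache never needs
invalidation in this sketch; eviction and delivery are left
out entirely.</span></div>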
<div id="magicdomid16" class=""><br>
</div>
<div id="magicdomid7215" class="ace-line"><span class="">We are
getting good results out of that prototype when testing it on
clones of mozilla-central and netbsd-src. See the "Example results"
section for details.</span></div>
<div id="magicdomid18" class=""><br>
</div>
<div id="magicdomid7216" class="ace-line"><span class="">The
prototype is up and running on our hgweb "mirror" instance </span><span
class=" url"><a href="https://mirror.octobus.net/">https://mirror.octobus.net/</a></span><span
class="">.</span></div>
<div id="magicdomid20" class=""><br>
</div>
<div id="magicdomid7218" class="ace-line"><span class="">The
extension comes with a small debug command that produce</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s
statistics about the ranges that multiple random pulls would us</span><span
class="">e.</span></div>
<div id="magicdomid23" class=""><br>
</div>
<div id="magicdomid7219" class="ace-line"><span class="">The
"stablerange" implementation currently still[1] lives in the
evolve extension, so for simplicity we put the "pullbundle"
extension in the same repository. This is not ideal but
was a simple solution in the time we could dedicate. This
extension is not part of any official release yet. To test it
you have to install it from the repository for now: </span><span
class=" url"><a
href="https://www.mercurial-scm.org/repo/evolve/#default">https://www.mercurial-scm.org/repo/evolve/#default</a></span></div>
<div id="magicdomid25" class=""><br>
</div>
<div id="magicdomid7220" class="ace-line"><span class="">The
extension</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">'</span><span
class="">s code is here: </span><span class=" url"><a
href="https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py">https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py</a></span></div>
<div id="magicdomid27" class=""><br>
</div>
<div id="magicdomid7221" class="ace-line"><span class="">The
prototype performance </span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">is not
stellar, but good enough to give useful results in a reasonable
amount of time. A production-grade implementation of the stablerange
algorithm and storage will fix that. There is also room for
improvement in the algorithms themselves: multiple
sub-problems can be improved. We started having regular meetings
with university researchers working on graph theory, who are
interested in the problem space</span><span class="">.</span></div>
<div id="magicdomid29" class=""><br>
</div>
<div id="magicdomid7222" class="ace-line"><span class="">[1] The
stablerange principle has been validated in the field and is
ready to get upstreamed.</span></div>
<div id="magicdomid31" class=""><br>
</div>
<div id="magicdomid3931" class="ace-line">
<h2><span class="">Example results.</span></h2>
</div>
<div id="magicdomid33" class=""><br>
</div>
<div id="magicdomid7223" class="ace-line"><span class="">The
extension comes</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z"> with
a command to simulate multiple pulls</span><span class=""> of a
random set of revisions (from a larger set of revisions we
define). This starts with a cold cache for simplicity.</span></div>
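<div class="ace-line"><span class="">The shape of such a simulation
can be sketched in a few lines of Python. This is a simplified model
and not the debug command itself: it assumes a linear history, each
simulated pull is a clone up to a random revision, ranges are the
aligned power-of-two blocks produced by the hypothetical slice_range
helper, and the simulate function and its parameters are names
invented for the example.</span></div>

```python
import random


def slice_range(start, stop):
    """Cut [start, stop) into aligned power-of-two blocks (linear history only)."""
    ranges = []
    while stop > start:
        size = 1
        while start % (size * 2) == 0 and stop >= start + size * 2:
            size *= 2
        ranges.append((start, start + size))
        start += size
    return ranges


def simulate(repo_size, window, pulls, min_cache=1, seed=0):
    """Replay random clones against an initially empty (cold) range cache."""
    rng = random.Random(seed)
    cached = set()
    served = hits = 0
    for _ in range(pulls):
        # Each simulated client clones up to a random revision in the window.
        stop = rng.randint(repo_size - window + 1, repo_size)
        for block in slice_range(0, stop):
            size = block[1] - block[0]
            served += size
            # Ranges below the threshold are always regenerated, never cached.
            if size >= min_cache:
                if block in cached:
                    hits += size
                else:
                    cached.add(block)
    return served, hits, len(cached)


served, hits, distinct = simulate(20000, 5000, 100)
print("hit ratio: %.2f, distinct ranges: %d" % (hits / served, distinct))
```

<div class="ace-line"><span class="">The three values returned
correspond to the "total changeset served", "from cache" and
"distinct range cached" figures reported below, and the min_cache
parameter plays the role of the 256-changeset limit discussed
later. The numbers this toy model produces are not comparable
to the real measurements.</span></div>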
<div id="magicdomid35" class=""><br>
</div>
<div id="magicdomid3933" class="ace-line">
<h3><span class="">Mozilla Central</span></h3>
</div>
<div id="magicdomid37" class=""><br>
</div>
<div id="magicdomid3934" class="ace-line"><span class="">We
gathered 100 sample pulls within 20443 revisions:</span></div>
<div id="magicdomid3935" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 18 338</span></li>
</ul>
</div>
<div id="magicdomid3936" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroups used: 132</span></li>
</ul>
</div>
<div id="magicdomid3937" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 88</span></li>
</ul>
</div>
<div id="magicdomid3938" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 99.5%</span></li>
</ul>
</div>
<div id="magicdomid44" class=""><br>
</div>
<div id="magicdomid7224" class="ace-line"><span class="">The number
of different ranges stay</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s</span><span
class=""> under control as expected:</span></div>
<div id="magicdomid3940" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 1 817 955</span></li>
</ul>
</div>
<div id="magicdomid3941" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 96% (from a
cold cache)</span></li>
</ul>
</div>
<div id="magicdomid3942" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 12 983, most of them
very small (90% ≤ 32 changesets)</span></li>
</ul>
</div>
<div id="magicdomid3943" class="ace-line">
<ul class="list-bullet1">
<li><span class="">A small number of (larger) ranges get most of
the cache hits.</span></li>
</ul>
</div>
<div id="magicdomid49" class=""><br>
</div>
<div id="magicdomid7225" class="ace-line"><span class="">Providing
the smaller ranges from the cache might not be a good tradeoff. If we
skip using the cache for smaller range</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s we
still get interesting results</span><span class="">:</span></div>
<div id="magicdomid51" class=""><br>
</div>
<div id="magicdomid3945" class="ace-line"><span class="">Only
caching ranges containing 256 changesets or more:</span></div>
<div id="magicdomid2029" class="ace-line"><br>
</div>
<div id="magicdomid3946" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 18 940</span></li>
</ul>
</div>
<div id="magicdomid3947" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroup used: 12</span></li>
</ul>
</div>
<div id="magicdomid3948" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 1 949</span></li>
</ul>
</div>
<div id="magicdomid3949" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 90%</span></li>
</ul>
</div>
<div id="magicdomid3950" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 1 850 243</span></li>
</ul>
</div>
<div id="magicdomid3951" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 87%</span></li>
</ul>
</div>
<div id="magicdomid3952" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 1 150</span></li>
</ul>
</div>
<div id="magicdomid3179" class="ace-line"><br>
</div>
<div id="magicdomid7226" class="ace-line"><span class="">Another way
to reduce the number of server bundles would be </span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">to d</span><span
class="">o some "over serving": using bundles that also contain some
common changesets (ones the client already has).</span></div>
<div id="magicdomid2795" class="ace-line"><br>
</div>
<div id="magicdomid3954" class="ace-line"><span class="">(See the
end of the email for full details)</span></div>
<div id="magicdomid2037" class="ace-line"><br>
</div>
<div id="magicdomid3955" class="ace-line">
<h3><span class="">netbsd</span></h3>
</div>
<div id="magicdomid2460" class="ace-line"><br>
</div>
<div id="magicdomid3956" class="ace-line"><span class="">This time,
we issued more random pulls (1000) within a slightly smaller set
of 10 000 changesets.</span></div>
<div id="magicdomid2622" class="ace-line"><br>
</div>
<div id="magicdomid3957" class="ace-line"><span class="">This
resulted in smaller pulls, which also show good results:</span></div>
<div id="magicdomid2297" class="ace-line"><br>
</div>
<div id="magicdomid3958" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 1673</span></li>
</ul>
</div>
<div id="magicdomid3959" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroup used: 51</span></li>
</ul>
</div>
<div id="magicdomid3960" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 16</span></li>
</ul>
</div>
<div id="magicdomid3961" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 99%</span></li>
</ul>
</div>
<div id="magicdomid3962" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 1601087</span></li>
</ul>
</div>
<div id="magicdomid3963" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 96%</span></li>
</ul>
</div>
<div id="magicdomid3964" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 51751</span></li>
</ul>
</div>
<div id="magicdomid2509" class="ace-line"><br>
</div>
<div id="magicdomid3965" class="ace-line"><span class="">Trying the
same 256+ changeset limit on caching, we see a stronger impact,
probably because of the smaller pulls:</span></div>
<div id="magicdomid2287" class="ace-line"><br>
</div>
<div id="magicdomid3966" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 1592</span></li>
</ul>
</div>
<div id="magicdomid3967" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroup used: 2</span></li>
</ul>
</div>
<div id="magicdomid3968" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 663</span></li>
</ul>
</div>
<div id="magicdomid3969" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 46%</span></li>
</ul>
</div>
<div id="magicdomid3970" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 1 554 227</span></li>
</ul>
</div>
<div id="magicdomid3971" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 56%</span></li>
</ul>
</div>
<div id="magicdomid3972" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 1 914</span></li>
</ul>
</div>
<div id="magicdomid2820" class="ace-line"><br>
</div>
<div id="magicdomid3973" class="ace-line"><span class="">(See the
end of the email for full details)</span></div>
<div id="magicdomid2247" class="ace-line"><br>
</div>
<div id="magicdomid3974" class="ace-line">
<h3><span class="">pypy</span></h3>
</div>
<div id="magicdomid2848" class="ace-line"><br>
</div>
<div id="magicdomid3975" class="ace-line"><span class="">pypy
testing was done using 1 000 pulls within 16687 changesets.</span></div>
<div id="magicdomid3083" class="ace-line"><br>
</div>
<div id="magicdomid2138" class="ace-line"><br>
</div>
<div id="magicdomid3976" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 11375</span></li>
</ul>
</div>
<div id="magicdomid3977" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroup used: 1206</span></li>
</ul>
</div>
<div id="magicdomid3978" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 12</span></li>
</ul>
</div>
<div id="magicdomid3979" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 100%</span></li>
</ul>
</div>
<div id="magicdomid3980" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 11 167 863</span></li>
</ul>
</div>
<div id="magicdomid3981" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 99%</span></li>
</ul>
</div>
<div id="magicdomid3982" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 1 139 537</span></li>
</ul>
</div>
<div id="magicdomid3084" class="ace-line"><br>
</div>
<div id="magicdomid7227" class="ace-line"><span class="">Applying
the 256+ changeset limit gives worse result</span><span
class="author-a-jez81zxsz86z6z88zz76zz80zuz90z2tz79zz84z">s.
This is probably the result of a shallower pull space and the
number of merges in the pypy repository. The pypy repository is
significantly more branchy than the other ones. There are</span><span
class=""> some known ways we could improve stablerange
partitioning in such cases (to produce larger ranges).</span></div>
<div id="magicdomid2314" class="ace-line"><br>
</div>
<div id="magicdomid3984" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median pull size: 11457</span></li>
</ul>
</div>
<div id="magicdomid3985" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median number of changegroup used: 9</span></li>
</ul>
</div>
<div id="magicdomid3986" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median changeset not cached: 7276</span></li>
</ul>
</div>
<div id="magicdomid3987" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Median ratio of changeset already in the
cache: 37%</span></li>
</ul>
</div>
<div id="magicdomid3988" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset served: 11211093</span></li>
</ul>
</div>
<div id="magicdomid3989" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Total changeset cache hit ratio: 36%</span></li>
</ul>
</div>
<div id="magicdomid3990" class="ace-line">
<ul class="list-bullet1">
<li><span class="">Distinct range cached: 8964</span></li>
</ul>
</div>
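<div class="ace-line"><span class="">The overall cache hit ratios
quoted in this section can be re-derived from the "changesets
served" totals printed in the full details. The short Python check
below only copies those totals from the command output; the
dictionary layout and the labels are choices made for the
example.</span></div>

```python
# (total changesets served, changesets served from cache), as printed
# in the "changesets served" part of each run in the full details.
runs = {
    "mozilla-central, no limit": (1817955, 1752642),
    "mozilla-central, 256+": (1850243, 1617379),
    "netbsd-src, no limit": (1601087, 1539491),
    "netbsd-src, 256+": (1554227, 871680),
    "pypy, no limit": (11167863, 11060195),
    "pypy, 256+": (11211093, 4066703),
}

# Hit ratio in percent, rounded to the nearest integer.
ratios = {
    name: round(100 * cached / total)
    for name, (total, cached) in runs.items()
}
for name, percent in ratios.items():
    print("%s: %d%%" % (name, percent))
```

<div class="ace-line"><span class="">This gives 96%, 87%, 96%, 56%,
99% and 36% respectively, matching the percentages reported by the
command.</span></div>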
<div id="magicdomid3082" class="ace-line"><br>
</div>
<div id="magicdomid3991" class="ace-line">
<h2><span class="">Full details:</span></h2>
</div>
<div id="magicdomid71" class=""><br>
</div>
<div id="magicdomid3992" class="ace-line">
<h3><span class="">mozilla central, 100 pulls in 20443 revisions,
no limit</span></h3>
<pre> mozilla-central > hg debugpullbundlecacheoverlap --count 100 -- 'tip~10000:'
gathering 100 sample pulls within 20443 revisions
pull size:
min: 13176
10%: 16425
25%: 17344
50%: 18338
75%: 19192
90%: 19719
95%: 19902
max: 20272
non-cached changesets:
min: 4
10%: 10
25%: 24
50%: 88
75%: 237
90%: 956
95%: 3152
max: 17440
ratio of cached changesets:
min: 0.0
10%: 0.947941624918
25%: 0.987343800064
50%: 0.995036297078
75%: 0.998774313882
90%: 0.999421798208
95%: 0.999634750848
max: 0.999795354548
bundle count:
min: 74
10%: 99
25%: 113
50%: 132
75%: 146
90%: 158
95%: 169
max: 186
ratio of cached bundles:
min: 0.0
10%: 0.685082872928
25%: 0.810810810811
50%: 0.911392405063
75%: 0.953020134228
90%: 0.974683544304
95%: 0.98125
max: 0.993377483444
changesets served:
total: 1817955
from cache: 1752642 (96%)
bundle: 12983
size of cached bundles:
min: 1
10%: 1
25%: 2
50%: 4
75%: 9
90%: 32
95%: 64
max: 8165
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 4
90%: 14
95%: 20
max: 100
</pre>
</div>
<div id="magicdomid4062" class="ace-line">
<h3><span class="">mozilla central, 100 pulls in 20443 revisions,
only caching ranges of 256 changesets and above</span></h3>
</div>
<pre> mozilla-central > hg debugpullbundlecacheoverlap --count 100 --min-cache=256 -- 'tip~10000:'
gathering 100 sample pulls within 20443 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 14060
10%: 16910
25%: 17923
50%: 18940
75%: 19471
90%: 19884
95%: 20029
max: 20309
non-cached changesets:
min: 973
10%: 1398
25%: 1707
50%: 1949
75%: 2246
90%: 2590
95%: 3448
max: 19551
ratio of cached changesets:
min: 0.0
10%: 0.839908649729
25%: 0.884512085944
50%: 0.897293365889
75%: 0.91018907563
90%: 0.926944971537
95%: 0.935139218649
max: 0.946839315959
bundle count:
min: 4
10%: 10
25%: 11
50%: 12
75%: 12
90%: 13
95%: 15
max: 16
ratio of cached bundles:
min: 0.0
10%: 0.909090909091
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1850243
from cache: 1617379 (87%)
bundle: 1150
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 512
75%: 1024
90%: 1024
95%: 1024
max: 8165
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 7
75%: 44
90%: 44
95%: 59
max: 98
</pre>
<div id="magicdomid216" class=""><br>
</div>
<div id="magicdomid4133" class="ace-line">
<h3><span class="">netbsd-src 1000 pulls within 10000 revisions:</span></h3>
</div>
<pre>
netbsd-src > hg debugpullbundlecacheoverlap --count 1000 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
pull size:
min: 10
10%: 339
25%: 865
50%: 1673
75%: 2330
90%: 2752
95%: 2893
max: 3466
non-cached changesets:
min: 0
10%: 3
25%: 7
50%: 16
75%: 50
90%: 137
95%: 239
max: 2787
ratio of cached changesets:
min: 0.0
10%: 0.781553398058
25%: 0.940663176265
50%: 0.987631184408
75%: 0.996178830722
90%: 0.99843688941
95%: 0.998939554613
max: 1.0
bundle count:
min: 10
10%: 28
25%: 37
50%: 51
75%: 65
90%: 78
95%: 88
max: 121
ratio of cached bundles:
min: 0.0
10%: 0.446808510638
25%: 0.673076923077
50%: 0.826086956522
75%: 0.901639344262
90%: 0.942857142857
95%: 0.96
max: 1.0
changesets served:
total: 1601087
from cache: 1539491 (96%)
bundle: 51751
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 1
75%: 4
90%: 8
95%: 16
max: 2048
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 3
90%: 8
95%: 13
max: 291
</pre>
<div id="magicdomid4203" class="ace-line">
<h3><span class="">netbsd-src 1000 pulls within 10000 revisions,
not caching ranges smaller than 256:</span></h3>
</div>
<pre>
netbsd-src > hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- '-10000:'
gathering 1000 sample pulls within 10000 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 10
10%: 329
25%: 813
50%: 1592
75%: 2271
90%: 2745
95%: 2922
max: 3719
non-cached changesets:
min: 10
10%: 265
25%: 440
50%: 663
75%: 911
90%: 1111
95%: 1229
max: 2852
ratio of cached changesets:
min: 0.0
10%: 0.0
25%: 0.136752136752
50%: 0.461261261261
75%: 0.686327077748
90%: 0.792263056093
95%: 0.829552819183
max: 0.959700093721
bundle count:
min: 0
10%: 0
25%: 1
50%: 2
75%: 3
90%: 4
95%: 4
max: 6
ratio of cached bundles:
min: 0.0
10%: 1.0
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 1554227
from cache: 871680 (56%)
bundle: 1914
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 256
75%: 512
90%: 512
95%: 512
max: 2048
hit on cached bundles:
min: 2
10%: 3
25%: 7
50%: 61
75%: 113
90%: 113
95%: 117
max: 267
</pre>
<div id="magicdomid4274" class="ace-line">
<h3><span class="">pypy 1000 pulls within 16687 changesets:</span></h3>
</div>
<pre>
pypy > time hg debugpullbundlecacheoverlap --count 1000 -- 'tip~2000:'
gathering 1000 sample pulls within 16687 revisions
pull size:
min: 5835
10%: 9165
25%: 10323
50%: 11375
75%: 12248
90%: 12904
95%: 13181
max: 14221
non-cached changesets:
min: 0
10%: 1
25%: 4
50%: 12
75%: 39
90%: 142
95%: 453
max: 12046
ratio of cached changesets:
min: 0.0
10%: 0.986539780521
25%: 0.99640167364
50%: 0.99889963059
75%: 0.99964598637
90%: 0.999911496593
95%: 1.0
max: 1.0
bundle count:
min: 183
10%: 762
25%: 1045
50%: 1206
75%: 1308
90%: 1387
95%: 1427
max: 1631
ratio of cached bundles:
min: 0.0
10%: 0.972310126582
25%: 0.990171990172
50%: 0.995529061103
75%: 0.997882851094
90%: 0.999203821656
95%: 1.0
max: 1.0
changesets served:
total: 11167863
from cache: 11060195 (99%)
bundle: 1139537
size of cached bundles:
min: 1
10%: 1
25%: 1
50%: 2
75%: 4
90%: 15
95%: 30
max: 2041
hit on cached bundles:
min: 1
10%: 1
25%: 1
50%: 3
75%: 20
90%: 245
95%: 848
max: 999
</pre>
<div id="magicdomid4344" class="ace-line">
<h3><span class="">pypy 1000 pulls within 16687 changesets,
only caching ranges of 256 changesets and above:</span></h3>
</div>
<pre>
time hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- 'tip~2000:'
gathering 1000 sample pulls within 16687 revisions
not caching ranges smaller than 256 changesets
pull size:
min: 3629
10%: 9075
25%: 10278
50%: 11457
75%: 12325
90%: 12961
95%: 13245
max: 14330
non-cached changesets:
min: 2605
10%: 5885
25%: 6619
50%: 7276
75%: 7813
90%: 8319
95%: 8577
max: 11815
ratio of cached changesets:
min: 0.0
10%: 0.296190172981
25%: 0.344310129221
50%: 0.368110984417
75%: 0.391840607211
90%: 0.417344173442
95%: 0.445362934971
max: 0.544580009385
bundle count:
min: 1
10%: 7
25%: 8
50%: 9
75%: 10
90%: 11
95%: 11
max: 12
ratio of cached bundles:
min: 0.0
10%: 1.0
25%: 1.0
50%: 1.0
75%: 1.0
90%: 1.0
95%: 1.0
max: 1.0
changesets served:
total: 11211093
from cache: 4066703 (36%)
bundle: 8964
size of cached bundles:
min: 256
10%: 256
25%: 256
50%: 468
75%: 510
90%: 1019
95%: 512
max: 1916
hit on cached bundles:
min: 1
10%: 1
25%: 3
50%: 13
75%: 63
90%: 823
95%: 153
max: 958
</pre>
</body>
</html>