caching pull - stable partitioning of bundle requests

Gregory Szorc gregory.szorc at gmail.com
Thu Sep 27 12:15:50 EDT 2018


On Wed, Sep 26, 2018 at 11:13 AM Boris FELD <boris.feld at octobus.net> wrote:

> Hi everyone,
>
> Pulling from a server involves expensive server-side computation that we
> wish to cache. However, since the client can pull any arbitrary set of
> revisions, grouping and dispatching the data to be cached is a hard
> problem.
>
> When we implemented the new discovery protocol for obsolescence markers,
> we developed a "stablerange" method as an efficient way to slice the
> changeset graph into ranges. In addition to solving the obsolescence
> markers discovery problem, this "stablerange" principle seemed useful for
> other purposes, in particular the caching of pulls.
>
> Right now, here is how the current pull bundle implementation works: you
> manually create and manually declare bundles containing either all
> changesets (these could also be used for clone bundles) or more specific
> sets. When the client requests some changesets, the server searches for a
> bundle containing the needed range and sends it. This often involves more
> than the requested data, so the client needs to filter out the extraneous
> data. Then the client runs a discovery to catch any changesets missing
> from the bundle. If the server doesn't find a valid pull bundle, a normal
> discovery is done. The manual bundle management is suboptimal, the search
> for an appropriate bundle has poor algorithmic complexity, and the extra
> roundtrip and discovery add extra slowness.
>
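> As a rough illustration of that search cost (a minimal sketch only:
> "covers" and the surrounding names are illustrative placeholders, not the
> actual core implementation):
>
>     def find_pull_bundle(declared_bundles, common, heads):
>         # Every request scans the whole list of declared bundles, so the
>         # lookup cost grows with the number of cached bundles.
>         for bundle in declared_bundles:
>             # Hypothetical test: do this bundle's advertised heads/bases
>             # cover the requested range?
>             if bundle.covers(common, heads):
>                 return bundle  # may contain more than the requested data
>         return None  # no valid bundle: fall back to a normal discovery
>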
> This weekend, we built a "simple" prototype that uses "stablerange" to
> slice the changegroup requested in "getbundle" into multiple bundles that
> can be reused from one pull to another. That slicing happens as part of a
> normal pull, during the getbundle call and after the normal discovery.
> There is no need for an extra discovery and getbundle call afterward.
>
> With this "stablerange"-based strategy, we start from the set of
> requested changesets and generate a set of "standard" ranges covering all
> of them. This slicing has a good algorithmic complexity that depends on
> the size of the selected "missing" set of changesets, so the associated
> cost will scale well with the size of the associated pull. In addition,
> we no longer have to do an expensive search through a list of existing
> bundles. This helps small pulls scale and increases the number of bundles
> we can cache, as the time we spend selecting bundles no longer depends on
> the number of cached ones. Since we can exactly cover the client request,
> we also no longer need to issue an extra pull roundtrip after the cache
> retrieval.
>
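> To give an intuition of the slicing (a minimal sketch only: it assumes
> linear history and power-of-two alignment, whereas the real stablerange
> algorithm also handles merges), the idea is to cut the missing span along
> aligned power-of-two boundaries so that the same large sub-ranges keep
> reappearing across pulls:
>
>     def standard_slices(start, stop):
>         """Slice the revision span [start, stop) into "standard" ranges.
>
>         Each range is the largest power-of-two block aligned on its own
>         size, so two pulls sharing a prefix of history select the same
>         large blocks and only differ in the small trailing ones.
>         """
>         ranges = []
>         cur = start
>         while cur < stop:
>             # largest size `cur` is aligned on (whole span when cur == 0)
>             align = cur & -cur or stop - start
>             # largest power-of-two block that still fits before `stop`
>             fit = 1 << ((stop - cur).bit_length() - 1)
>             size = min(align, fit)
>             ranges.append((cur, size))
>             cur += size
>         return ranges
>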
> That slicing focuses on producing ranges that:
>
>    - have a high chance of being reusable by a pull selecting similar
>    changesets;
>    - gather most of the changesets into large bundles.
>
> This caching strategy inherits the nice "stablerange" properties
> regarding repository growth (see the toy illustration below):
>
>    - When a few changesets are appended to a repository, only a few
>    ranges have to be added.
>    - The overall number of ranges (and associated bundles) needed to
>    represent all possible ranges grows in O(N log(N)).
>
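> As a toy illustration of that growth property with the sketch above (the
> numbers are made up, not taken from a real repository): appending a
> handful of revisions only replaces the small trailing ranges, while the
> large leading ones are reused as-is:
>
>     >>> standard_slices(0, 300)
>     [(0, 256), (256, 32), (288, 8), (296, 4)]
>     >>> standard_slices(0, 305)
>     [(0, 256), (256, 32), (288, 8), (296, 8), (304, 1)]
>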
> For example, here are the 15 ranges selected for a full clone of
> mozilla-central:
>
>    - 262114 changesets
>    - 30 changesets
>    - 65536 changesets
>    - 32741 changesets
>    - 20 changesets
>    - 7 changesets
>    - 8192 changesets
>    - 243 changesets
>    - 13 changesets
>    - 114 changesets
>    - 14 changesets
>    - 32 changesets
>    - 16 changesets
>    - 8 changesets
>    - 1 changeset
>
> If we only clone a subset of the repository, the larger ranges get reused
> (hg clone --rev -5000):
>
>    - 262114 changesets found in caches
>    - 30 changesets found in caches
>    - 65536 changesets found in caches
>    - 32741 changesets found in caches
>    - 20 changesets found in caches
>    - 7 changesets found in caches
>    - 2048 changesets found
>    - 1024 changesets found
>    - 482 changesets found
>    - 30 changesets found
>    - 32 changesets found
>    - 1 changeset found
>    - 7 changesets found
>    - 4 changesets found
>    - 2 changesets found
>    - 1 changeset found
>
> As you can see, the larger ranges of this second pull are shared with the
> previous pull, allowing cached bundles to be reused.
>
> The prototype is available in a small "pullbundle" extension. It focuses
> on the slicing itself; we did not implement anything fancy for the cache
> storage and delivery. We simply store generated bundles on disk and read
> them back from disk when they are needed again. Others, like Joerg
> Sonnenberger or Gregory Szorc, are already working on the "cache
> delivery" problem.
>
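> The storage side is deliberately simple; something along these lines (a
> hedged sketch: "range_key" and "write_changegroup_for_range" are
> illustrative placeholders, not the extension's actual helpers):
>
>     import os
>
>     def serve_range(repo, cachedir, rng):
>         """Return the on-disk bundle for `rng`, creating it on a miss."""
>         # hypothetical stable cache key, e.g. "<head-hash>-<depth>-<length>.hg"
>         path = os.path.join(cachedir, range_key(rng))
>         if not os.path.exists(path):
>             # cache miss: generate the changegroup for this range once
>             write_changegroup_for_range(repo, rng, path)
>         return open(path, 'rb')  # cache hit on any later pull
>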
> We are getting good results out of that prototype when testing it on
> clones of mozilla-central and netbsd-src. See the "Example results"
> section for details.
>
> The prototype is up and running on our hgweb "mirror" instance
> https://mirror.octobus.net/.
>
> The extension comes with a small debug command that produces statistics
> about the ranges that multiple random pulls would use.
>
> The "stablerange" implementation currently still[1] live in the evolve
> extensions, so we put the extensions in the same repository for simplicity
> as "pullbundle". This is not ideal but was a simple solution in the time we
> could dedicate. This extension is not part of any official release yet. To
> test it you have to install it from the repository for now:
> https://www.mercurial-scm.org/repo/evolve/#default
>
> The extension's code is here:
> https://www.mercurial-scm.org/repo/evolve/file/tip/hgext3rd/pullbundle.py
>
> The prototype's performance is not stellar, but good enough to give
> useful results in a reasonable amount of time. A production-grade
> implementation of the stablerange algorithm and storage will fix that.
> There is also room for improvement in the algorithms themselves; multiple
> sub-problems can be improved. We have started regular meetings with
> university researchers working on graph theory who are interested in the
> problem space.
>
> [1] The stablerange principle has been validated in the field and is
> ready to be upstreamed.
>

This experimentation is very useful and shows great promise!

Intelligent "slicing" of data in order to facilitate high cache hit rates
and lower server work and client overhead is definitely worth pursuing.

FWIW I submitted my caching series for wire protocol version 2 a few hours
ago. The meaningful patches are at https://phab.mercurial-scm.org/D4773 and
https://phab.mercurial-scm.org/D4774. This work focuses on transparent
whole-response caching, however, and does not (yet) implement a caching
layer for e.g. storage access. I will note that I *really* want to enable
Mercurial servers to offload as much data transfer to static file servers
as possible.

An open item to solve is how we're going to facilitate bulk data transfer
using fewer roundtrips in wire protocol version 2. Right now, the client
has to send a *lot* of commands and data to the server (all the manifest
and file nodes). We know we must do something better. I think this slicing
idea could definitely be employed to facilitate higher cache hit rates.
How, exactly, I'm not sure.

Something else to consider is that right now, an `hg clone` will preserve
changelog order from server to client. This is trivial to change. And it is
arguably an implementation detail of revlogs, which preserve insertion
order. (And until
https://www.mercurial-scm.org/repo/hg-committed/rev/db5501d93bcf it was the
behavior for manifests and filelogs as well.) I'm not sure how important it
is to preserve this property. But if we start slicing changesets, we could
easily break this behavior.

This email caused me to consider the possibility that the "content
redirects" feature proposed in D4774 should (eventually) support multiple
locations. Then, a server could slice a large response into multiple
payloads and have the client retrieve them separately.

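For illustration only (a speculative shape, not what D4774 currently
encodes), a multi-location redirect might look like a list of alternate
URLs in the response envelope, one per slice:

    # Hypothetical wire protocol version 2 response carrying several
    # content redirects; the client would fetch each URL in turn and
    # apply the payloads.
    response = {
        b'status': b'redirect',
        b'locations': [
            {b'url': b'https://cdn.example.com/slice-0-65536.cg'},
            {b'url': b'https://cdn.example.com/slice-65536-8192.cg'},
        ],
    }
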
Alternatively, since wire protocol version 2 currently performs a lot of
granular requests, perhaps the server could inform the client of preferred
slicing for data fetching and the client could follow up accordingly. How
this would interact with bulk data transfer commands, I'm not sure.

There's definitely a lot to think about and I hope this slicing
experimentation finds a home in core someday!


>
> Example results.
>
> The extension comes with a command to simulate multiple pulls of random
> sets of revisions (from a larger set of revisions we define). This starts
> with a cold cache for simplicity.
>
> Mozilla Central
>
> We gathered 100 sample pulls within 20443 revisions:
>
>    - Median pull size: 18 338 changesets
>    - Median number of changegroups used: 132
>    - Median changesets not cached: 88
>    - Median ratio of changesets already in the cache: 99.5%
>
> The number of distinct ranges stays under control, as expected:
>
>    - Total changesets served: 1 817 955
>    - Total changeset cache hit ratio: 96% (from a cold cache)
>    - Distinct ranges cached: 12 983, most of them very small (90% ≤ 32
>    changesets)
>    - A small number of (larger) ranges get most of the cache hits.
>
> Providing the smaller ranges from the cache might not be a good tradeoff.
> If we skip the cache for smaller ranges, we still get interesting
> results.
>
> Only caching ranges containing 256 changesets or more:
>
>    - Median pull size: 18 940 changesets
>    - Median number of changegroups used: 12
>    - Median changesets not cached: 1 949
>    - Median ratio of changesets already in the cache: 90%
>    - Total changesets served: 1 850 243
>    - Total changeset cache hit ratio: 87%
>    - Distinct ranges cached: 1 150
>
> Another way to reduce the number of served bundles would be to do some
> "over-serving": using bundles containing some common changesets.
>
> (See the end of the email for full details)
>
> netbsd
>
> This time, we issued more random pulls (1000) within a somewhat smaller
> set of 10 000 changesets.
>
> The resulting smaller pulls also show good results:
>
>    - Median pull size: 1 673 changesets
>    - Median number of changegroups used: 51
>    - Median changesets not cached: 50
>    - Median ratio of changesets already in the cache: 99%
>    - Total changesets served: 1 601 087
>    - Total changeset cache hit ratio: 96%
>    - Distinct ranges cached: 51 751
>
> Trying the same 256+ changeset limit on caching has a stronger impact,
> probably because of the smaller pulls:
>
>    - Median pull size: 1 592 changesets
>    - Median number of changegroups used: 2
>    - Median changesets not cached: 663
>    - Median ratio of changesets already in the cache: 46%
>    - Total changesets served: 1 554 227
>    - Total changeset cache hit ratio: 56%
>    - Distinct ranges cached: 1 914
>
> (See the end of the email for full details)
>
> pypy
>
> pypy testing was done using 1 000 pulls within 16687 changesets.
>
>    - Median pull size: 11 375 changesets
>    - Median number of changegroups used: 1 206
>    - Median changesets not cached: 12
>    - Median ratio of changesets already in the cache: 100%
>    - Total changesets served: 11 167 863
>    - Total changeset cache hit ratio: 99%
>    - Distinct ranges cached: 1 139 537
>
> Installing the 256+ changeset limit gives worse results. This is probably
> a consequence of a shallower pull space and of the number of merges in
> the pypy repository. The pypy repository is significantly more branchy
> than the other ones; there are known ways we could improve stablerange
> partitioning in such cases (to produce larger ranges).
>
>    - Median pull size: 11 457 changesets
>    - Median number of changegroups used: 9
>    - Median changesets not cached: 7 276
>    - Median ratio of changesets already in the cache: 37%
>    - Total changesets served: 11 211 093
>    - Total changeset cache hit ratio: 36%
>    - Distinct ranges cached: 8 964
>
> Full details:
>
> mozilla-central, 100 pulls in 20443 revisions, no limit:
>
>     mozilla-central> hg debugpullbundlecacheoverlap --count 100 -- 'tip~10000:'
>     gathering 100 sample pulls within 20443 revisions
>     pull size:
>       min: 13176
>       10%: 16425
>       25%: 17344
>       50%: 18338
>       75%: 19192
>       90%: 19719
>       95%: 19902
>       max: 20272
>     non-cached changesets:
>       min: 4
>       10%: 10
>       25%: 24
>       50%: 88
>       75%: 237
>       90%: 956
>       95%: 3152
>       max: 17440
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.947941624918
>       25%: 0.987343800064
>       50%: 0.995036297078
>       75%: 0.998774313882
>       90%: 0.999421798208
>       95%: 0.999634750848
>       max: 0.999795354548
>     bundle count:
>       min: 74
>       10%: 99
>       25%: 113
>       50%: 132
>       75%: 146
>       90%: 158
>       95%: 169
>       max: 186
>     ratio of cached bundles:
>       min: 0.0
>       10%: 0.685082872928
>       25%: 0.810810810811
>       50%: 0.911392405063
>       75%: 0.953020134228
>       90%: 0.974683544304
>       95%: 0.98125
>       max: 0.993377483444
>     changesets served:
>       total:      1817955
>       from cache: 1752642 (96%)
>       bundle:       12983
>     size of cached bundles:
>       min: 1
>       10%: 1
>       25%: 2
>       50%: 4
>       75%: 9
>       90%: 32
>       95%: 64
>       max: 8165
>     hit on cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 2
>       75%: 4
>       90%: 14
>       95%: 20
>       max: 100
>
> mozilla-central, 100 pulls in 20443 revisions, only caching ranges of 256
> changesets and above:
>
>     mozilla-central > hg debugpullbundlecacheoverlap --count 100 --min-cache=256 -- 'tip~10000:'
>     gathering 100 sample pulls within 20443 revisions
>       not caching ranges smaller than 256 changesets
>     pull size:
>       min: 14060
>       10%: 16910
>       25%: 17923
>       50%: 18940
>       75%: 19471
>       90%: 19884
>       95%: 20029
>       max: 20309
>     non-cached changesets:
>       min: 973
>       10%: 1398
>       25%: 1707
>       50%: 1949
>       75%: 2246
>       90%: 2590
>       95%: 3448
>       max: 19551
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.839908649729
>       25%: 0.884512085944
>       50%: 0.897293365889
>       75%: 0.91018907563
>       90%: 0.926944971537
>       95%: 0.935139218649
>       max: 0.946839315959
>     bundle count:
>       min: 4
>       10%: 10
>       25%: 11
>       50%: 12
>       75%: 12
>       90%: 13
>       95%: 15
>       max: 16
>     ratio of cached bundles:
>       min: 0.0
>       10%: 0.909090909091
>       25%: 1.0
>       50%: 1.0
>       75%: 1.0
>       90%: 1.0
>       95%: 1.0
>       max: 1.0
>     changesets served:
>       total:      1850243
>       from cache: 1617379 (87%)
>       bundle:        1150
>     size of cached bundles:
>       min: 256
>       10%: 256
>       25%: 256
>       50%: 512
>       75%: 1024
>       90%: 1024
>       95%: 1024
>       max: 8165
>     hit on cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 7
>       75%: 44
>       90%: 44
>       95%: 59
>       max: 98
>
> netbsd-src, 1000 pulls within 10000 revisions:
>
>     netbsd-src > hg debugpullbundlecacheoverlap --count 1000 -- '-10000:'
>     gathering 1000 sample pulls within 10000 revisions
>     pull size:
>       min: 10
>       10%: 339
>       25%: 865
>       50%: 1673
>       75%: 2330
>       90%: 2752
>       95%: 2893
>       max: 3466
>     non-cached changesets:
>       min: 0
>       10%: 3
>       25%: 7
>       50%: 16
>       75%: 50
>       90%: 137
>       95%: 239
>       max: 2787
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.781553398058
>       25%: 0.940663176265
>       50%: 0.987631184408
>       75%: 0.996178830722
>       90%: 0.99843688941
>       95%: 0.998939554613
>       max: 1.0
>     bundle count:
>       min: 10
>       10%: 28
>       25%: 37
>       50%: 51
>       75%: 65
>       90%: 78
>       95%: 88
>       max: 121
>     ratio of cached bundles:
>       min: 0.0
>       10%: 0.446808510638
>       25%: 0.673076923077
>       50%: 0.826086956522
>       75%: 0.901639344262
>       90%: 0.942857142857
>       95%: 0.96
>       max: 1.0
>     changesets served:
>       total:      1601087
>       from cache: 1539491 (96%)
>       bundle:       51751
>     size of cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 1
>       75%: 4
>       90%: 8
>       95%: 16
>       max: 2048
>     hit on cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 2
>       75%: 3
>       90%: 8
>       95%: 13
>       max: 291
>
> netbsd-src, 1000 pulls within 10000 revisions, not caching ranges smaller
> than 256 changesets:
>
>     netbsd-src > hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- '-10000:'
>     gathering 1000 sample pulls within 10000 revisions
>       not caching ranges smaller than 256 changesets
>     pull size:
>       min: 10
>       10%: 329
>       25%: 813
>       50%: 1592
>       75%: 2271
>       90%: 2745
>       95%: 2922
>       max: 3719
>     non-cached changesets:
>       min: 10
>       10%: 265
>       25%: 440
>       50%: 663
>       75%: 911
>       90%: 1111
>       95%: 1229
>       max: 2852
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.0
>       25%: 0.136752136752
>       50%: 0.461261261261
>       75%: 0.686327077748
>       90%: 0.792263056093
>       95%: 0.829552819183
>       max: 0.959700093721
>     bundle count:
>       min: 0
>       10%: 0
>       25%: 1
>       50%: 2
>       75%: 3
>       90%: 4
>       95%: 4
>       max: 6
>     ratio of cached bundles:
>       min: 0.0
>       10%: 1.0
>       25%: 1.0
>       50%: 1.0
>       75%: 1.0
>       90%: 1.0
>       95%: 1.0
>       max: 1.0
>     changesets served:
>       total:      1554227
>       from cache:  871680 (56%)
>       bundle:        1914
>     size of cached bundles:
>       min: 256
>       10%: 256
>       25%: 256
>       50%: 256
>       75%: 512
>       90%: 512
>       95%: 512
>       max: 2048
>     hit on cached bundles:
>       min: 2
>       10%: 3
>       25%: 7
>       50%: 61
>       75%: 113
>       90%: 113
>       95%: 117
>       max: 267
>
> pypy, 1000 pulls within 16687 changesets:
>
>     pypy > time hg debugpullbundlecacheoverlap --count 1000 -- 'tip~2000:'
>     gathering 1000 sample pulls within 16687 revisions
>     pull size:
>       min: 5835
>       10%: 9165
>       25%: 10323
>       50%: 11375
>       75%: 12248
>       90%: 12904
>       95%: 13181
>       max: 14221
>     non-cached changesets:
>       min: 0
>       10%: 1
>       25%: 4
>       50%: 12
>       75%: 39
>       90%: 142
>       95%: 453
>       max: 12046
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.986539780521
>       25%: 0.99640167364
>       50%: 0.99889963059
>       75%: 0.99964598637
>       90%: 0.999911496593
>       95%: 1.0
>       max: 1.0
>     bundle count:
>       min: 183
>       10%: 762
>       25%: 1045
>       50%: 1206
>       75%: 1308
>       90%: 1387
>       95%: 1427
>       max: 1631
>     ratio of cached bundles:
>       min: 0.0
>       10%: 0.972310126582
>       25%: 0.990171990172
>       50%: 0.995529061103
>       75%: 0.997882851094
>       90%: 0.999203821656
>       95%: 1.0
>       max: 1.0
>     changesets served:
>       total:      11167863
>       from cache: 11060195 (99%)
>       bundle:     1139537
>     size of cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 2
>       75%: 4
>       90%: 15
>       95%: 30
>       max: 2041
>     hit on cached bundles:
>       min: 1
>       10%: 1
>       25%: 1
>       50%: 3
>       75%: 20
>       90%: 245
>       95%: 848
>       max: 999
>
> pypy, 1000 pulls within 16687 changesets, caching ranges of 256
> changesets and above only:
>
>     time hg debugpullbundlecacheoverlap --count 1000 --min-cache=256 -- 'tip~2000:'
>     gathering 1000 sample pulls within 16687 revisions
>       not caching ranges smaller than 256 changesets
>     pull size:
>       min: 3629
>       10%: 9075
>       25%: 10278
>       50%: 11457
>       75%: 12325
>       90%: 12961
>       95%: 13245
>       max: 14330
>     non-cached changesets:
>       min: 2605
>       10%: 5885
>       25%: 6619
>       50%: 7276
>       75%: 7813
>       90%: 8319
>       95%: 8577
>       max: 11815
>     ratio of cached changesets:
>       min: 0.0
>       10%: 0.296190172981
>       25%: 0.344310129221
>       50%: 0.368110984417
>       75%: 0.391840607211
>       90%: 0.417344173442
>       95%: 0.445362934971
>       max: 0.544580009385
>     bundle count:
>       min: 1
>       10%: 7
>       25%: 8
>       50%: 9
>       75%: 10
>       90%: 11
>       95%: 11
>       max: 12
>     ratio of cached bundles:
>       min: 0.0
>       10%: 1.0
>       25%: 1.0
>       50%: 1.0
>       75%: 1.0
>       90%: 1.0
>       95%: 1.0
>       max: 1.0
>     changesets served:
>       total:      11211093
>       from cache: 4066703 (36%)
>       bundle:        8964
>     size of cached bundles:
>       min: 256
>       10%: 256
>       25%: 256
>       50%: 468
>       75%: 510
>       90%: 1019
>       95%: 512
>       max: 1916
>     hit on cached bundles:
>       min: 1
>       10%: 1
>       25%: 3
>       50%: 13
>       75%: 63
>       90%: 823
>       95%: 153
>       max: 958
>