[PATCH 12 of 14] obscache: use the obscache to compute the obsolete set

Augie Fackler raf at durin42.com
Fri Jul 14 14:16:16 EDT 2017


On Sun, Jul 09, 2017 at 07:55:24PM +0200, Boris Feld wrote:
> # HG changeset patch
> # User Boris Feld <boris.feld at octobus.net>
> # Date 1495198021 -7200
> #      Fri May 19 14:47:01 2017 +0200
> # Node ID 3a93e99b0e718befd57a32615a14fd0d3c194456
> # Parent  f8953ed43f8d1b146dcff688030133f0d6123a49
> # EXP-Topic obs-cache
> obscache: use the obscache to compute the obsolete set
>
> Now that we have a cache and that the cache is kept up to date, we can use it to
> speeds up the obsolete set computation. This way, we no longer need to load the
> obsstore for most operation.
>
> On the mercurial-core repository, this provides a significant speed up for
> operation that do not need further obsolescence information:
>
> | command                  | before | after |
> | id -r @                  |  0.670 | 0.093 |
> | log -r @                 |  0.951 | 0.916 |
> | diff                     |  0.070 | 0.069 |
> | diff -r .~1              |  0.705 | 0.120 |
> | export @                 |  0.672 | 0.103 |
> | log -G -r draft()        |  1.117 | 1.084 |
> | log -G -r draft()        |  1.063 | 1.064 |
> |   -T "{node} {troubles}" |        |       |
> | log -G -r tip            |  0.658 | 0.086 |
> |   -T "{node} {desc}"     |        |       |

These timings look pretty good, though I find some of the numbers
underwhelming (eg 'hg log -r @' seems like it shouldn't be anywhere
*near* that slow).

I think given the history here, and that I know Jun has something
related coming in a few weeks, I'm going to put this series on hold
until after 4.3 is done so we can get some comparisons between
approaches, as I believe his solution requires a little less in the
way of creative caching layers.

In more detail: the radixlink approach to speeding up obsmarkers
interests me a lot because it speeds up more than this does, and is
close on the things where they're similar. If I can have one caching
layer instead of two, I'd like that, and given how late in the cycle
it is and the timetable I've heard from Jun I'm happy to let hg 4.3
not get perf wins today and just make 4.4 the release where we get
this sorted out.

>
> The obsstore loading operation usually disappear from execution profile.
>
> The speeds up rely on a couple of different mechanism:
>
> * First, not having to parse the obsstore provides a massive speedup:
>
>   Time spent computing 'obsolete', no obsstore pre-loaded.
>
>     before: wall 0.449502 comb 0.460000 user 0.420000 sys 0.040000 (best of 17)
>     after:  wall 0.005752 comb 0.010000 user 0.010000 sys 0.000000 (best of 340)
>
> * Second keeping the computation fully in the revision space (no usage of node),
>   raise and extra 4x speedup.
>
>   Time spent computing 'obsolete', obsstore preloaded:
>
>     before: wall 0.007684 comb 0.000000 user 0.000000 sys 0.000000 (best of 305)
>     after:  wall 0.001928 comb 0.000000 user 0.000000 sys 0.000000 (best of 1250)
>
> To keep the changeset simple, we assume the cache is up to date (from the last
> transaction). This won't be true when both old and new clients access the
> repository. (with the special case of no new transactions since last upgrade).
> We'll address this issue in the next couple of changesets.
>
> This changesets is a first step to install the basic. There are a couple of easy
> improvement that can further improve this cache:
>
>  * Improving handling of outdated cache on read only operation (see above),
>
>  * Avoid reaading the full obsstore data from disk to check the cache key
>    (about -4ms, 3x speedup)
>
>  * Optimise the python code to reduce attribute lookup
>    (about 25% of the remaining of the time spent there).
>
> diff -r f8953ed43f8d -r 3a93e99b0e71 mercurial/obsolete.py
> --- a/mercurial/obsolete.py	Fri May 19 14:46:46 2017 +0200
> +++ b/mercurial/obsolete.py	Fri May 19 14:47:01 2017 +0200
> @@ -1096,11 +1096,24 @@
>  @cachefor('obsolete')
>  def _computeobsoleteset(repo):
>      """the set of obsolete revisions"""
> -    getnode = repo.changelog.node
>      notpublic = _mutablerevs(repo)
> -    isobs = repo.obsstore.successors.__contains__
> -    obs = set(r for r in notpublic if isobs(getnode(r)))
> -    return obs
> +    if not notpublic or not repo.obsstore:
> +        # all changeset are public, none are obsolete
> +        return set()
> +
> +    # XXX There are a couple of case where the cache could not be up to date:
> +    #
> +    # 1) no transaction happened in the repository since the upgrade,
> +    # 2) both old and new client touches that repository
> +    #
> +    # recomputing the whole cache in these case is a bit slower that using the
> +    # good old version (parsing markers and checking them). We could add some
> +    # logic to fall back to the old way in these cases.
> +    obscache = repo.obsstore.obscache
> +    obscache.update(repo) # ensure it is up to date:
> +    isobs = obscache.get
> +
> +    return set(r for r in notpublic if isobs(r))
>
>  @cachefor('unstable')
>  def _computeunstableset(repo):
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


More information about the Mercurial-devel mailing list