[PATCH 6 of 6 V2] obscache: use the obscache to compute the obsolete set

Jun Wu quark at fb.com
Sat May 20 13:37:23 EDT 2017


While this does speed up commands like "hg id", people more frequently use
"hg log [-G]", which calls "ctx.troubles" by default. That requires a full
"obsolete()" revset in multiple places (the unstable, bumped and divergent
calculations), which this cache won't help.

Do you plan to build a similar bitmap cache for unstable, bumped and
divergent too?

Besides, even with this series, _computeobsoleteset is still O(N). The time
complexity remains unchanged. This is not a long-term solution.

I believe the long-term solution is to make testing O(1) and remove all
enumeration from all revsets.
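
To illustrate the difference I mean, here is a rough sketch. It is not
Mercurial code: "ToyObsCache", "compute_obsolete_set" and "is_obsolete" are
hypothetical names, and the one-byte-per-revision layout is only an assumption
about what such a cache might look like.

```python
class ToyObsCache(object):
    """Hypothetical stand-in for an obsolescence cache (not Mercurial's API).

    Stores one byte per revision: 1 if obsolete, 0 otherwise.
    """

    def __init__(self, nrevs):
        self._data = bytearray(nrevs)

    def mark(self, rev):
        self._data[rev] = 1

    def get(self, rev):
        # O(1) lookup; revisions beyond the cache count as not obsolete
        return rev < len(self._data) and self._data[rev] == 1


def compute_obsolete_set(notpublic, cache):
    """Enumerating approach: walks every non-public revision, so O(N)."""
    return set(r for r in notpublic if cache.get(r))


def is_obsolete(rev, notpublic, cache):
    """Per-revision test: O(1), no other revision is visited."""
    return rev in notpublic and cache.get(rev)


# toy repository with 6 revisions; 0 and 1 are public
cache = ToyObsCache(6)
cache.mark(2)
cache.mark(4)
notpublic = {2, 3, 4, 5}
obsolete = compute_obsolete_set(notpublic, cache)
```

With the second form, a caller that only needs to know about a few revisions
never pays for the full walk, whatever the size of the non-public set.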

Excerpts from Pierre-Yves David's message of 2017-05-20 17:30:20 +0200:
> # HG changeset patch
> # User Pierre-Yves David <pierre-yves.david at octobus.net>
> # Date 1495198021 -7200
> #      Fri May 19 14:47:01 2017 +0200
> # Node ID eb7674b12d5a15fc53f10b075dcac7bee91379d2
> # Parent  3c2a082a590aa8b57693c24b8461c2afdb8d5556
> # EXP-Topic obscache
> # Available At https://www.mercurial-scm.org/repo/users/marmoute/mercurial/ 
> #              hg pull https://www.mercurial-scm.org/repo/users/marmoute/mercurial/  -r eb7674b12d5a
> obscache: use the obscache to compute the obsolete set
> 
> Now that we have a cache and that the cache is kept up to date, we can use it to
> speed up the obsolete set computation. This way, we no longer need to load the
> obsstore for most operations.
> 
> On the mercurial-core repository, this provides a significant speedup:
> 
> Running "hg id -r ."
> - before: 0.630 second (0.56s user 0.06s system 99% cpu 0.630)
> - after:  0.129 second (0.11s user 0.02s system 98% cpu 0.129)
> 
> The obsstore loading operation also disappears from the execution profile.
> 
> (note: time spent inside the command drops from 0.4s to 0.04s)
> 
> To keep the changeset simple, the handling of the case where the cache has not
> been kept up to date is pretty simple. That might introduce a small performance
> impact during the transition in some cases. This will get improved in a later
> changeset.
> 
> In addition, the cache still needs to parse the full obsstore when updating.
> There are known ways to skip parsing the full obsstore for write operations too.
> This will also get improved later.
> 
> diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
> --- a/mercurial/obsolete.py
> +++ b/mercurial/obsolete.py
> @@ -1546,10 +1546,26 @@ def clearobscaches(repo):
>  def _computeobsoleteset(repo):
>      """the set of obsolete revisions"""
>      obs = set()
> -    getnode = repo.changelog.node
>      notpublic = repo._phasecache.getrevset(repo, (phases.draft, phases.secret))
> +    if not notpublic:
> +        # all changesets are public, none are obsolete
> +        return obs
> +
> +    # XXX There are a couple of cases where the cache might not be up to date:
> +    #
> +    # 1) no transaction happened in the repository since the upgrade,
> +    # 2) both old and new clients touch that repository.
> +    #
> +    # Recomputing the whole cache in these cases is a bit slower than using the
> +    # good old version (parsing markers and checking them). We could add some
> +    # logic to fall back to the old way in these cases.
> +    obscache = repo.obsstore.obscache
> +    obscache.update(repo) # ensure it is up to date
> +    isobs = obscache.get
> +
> +    # actually compute the obsolete set
>      for r in notpublic:
> -        if getnode(r) in repo.obsstore.successors:
> +        if isobs(r):
>              obs.add(r)
>      return obs
>  

