[PATCH 8 of 9] localrepo: add findbyhash for convenience

Gregory Szorc gregory.szorc at gmail.com
Sun Mar 29 14:17:00 CDT 2015


On Sun, Mar 29, 2015 at 11:34 AM, FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
wrote:

> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1427653752 -32400
> #      Mon Mar 30 03:29:12 2015 +0900
> # Node ID 8a0e73ddbd401236f42f0afd65f254c4cdcdce08
> # Parent  ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
> localrepo: add findbyhash for convenience
>
> This allows to use "context.findbyhash()" easily, without additional
> importing "context".
>
> Using "repo.findbyhash(hashid)" instead of "hashid in repo" is also
> useful to avoid meaningless matching against "repo.dirstate.p1()",
> reserved names (e.g. "null", "tip", "."), bookmarks, tags, branches
> and so on.
>
> Matching against "repo.dirstate.p1()" is especially inefficient,
> because it may cause useless (re-)loading from ".hg/dirstate" or
> invoking "changelog.rev(bin(hashid))" for validation.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -1915,6 +1915,20 @@ class localrepository(object):
>          """
>          return context.findbyrevnum(self, revnum, abort=abort)
>
> +    def findbyhash(self, hashid, abort=False):
> +        """Find the revision by hash ID in this repository
> +
> +        ``hashid`` should be ``str``.
> +
> +        This returns ``(node, revnum)`` tuple, if the revision
> +        corresponded to the specified ``hashid`` is found. If that
> +        revision is hidden one, this raises FilteredRepoLookupError.
> +
> +        Other wise, this returns ``None`` (or raises RepoLookupError if
> +        ``abort``).
> +        """
> +        return context.findbyhash(self, hashid, abort=abort)
> +
>  # used to avoid circular references so destructors work
>  def aftertrans(files):
>      renamefiles = [tuple(t) for t in files]
>

This patch is related to tags and names performance issues. When using
"node in repo" or "repo[node]" with a 20 or 40 byte value that must be a
node but isn't present, context.changectx.__init__ falls back to names
lookup [1]. This can be quite expensive for the tags cache, as it means
calculating heads and .hgtags filenodes. It seems like avoidable overhead.

For consumers that know they are operating on node values and not symbolic
names (wire protocol, caches, etc), I think it makes sense to have a "get
context" API that only operates on rev or node inputs. This is very similar
to what's implemented in this patch series. The big difference is I'd be
looking for something that returns a context instance, not just rev and
node. How would such an API fit into what you've implemented here?

[1] http://selenic.com/repo/hg/file/efa094701a05/mercurial/context.py#l430
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150329/2e6a873c/attachment.html>


More information about the Mercurial-devel mailing list