[PATCH 7 of 9] context: enhance findbyhash for partial hash ID

Tue Mar 31 05:00:06 CDT 2015

At Mon, 30 Mar 2015 15:13:41 -0700,
Pierre-Yves David wrote:
> 
> 
> 
> On 03/29/2015 11:34 AM, FUJIWARA Katsunori wrote:
> > # HG changeset patch
> > # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> > # Date 1427653752 -32400
> > #      Mon Mar 30 03:29:12 2015 +0900
> > # Node ID ce47b26e3c0d6104b5bfb2d15e59b5a2af573956
> > # Parent  5070134e83f1a46a55f74e6e6dce6ebd75a5eee9
> > context: enhance findbyhash for partial hash ID
> >
> > Before this patch, there is no easy way to examine existence of the
> > revision by partial hash ID strictly.
> >
> > "hashid in repo" may match against bookmarks, tags and so on.
> >
> > This patch factors out the logic for it in "changectx.__init__()" and
> > enhance "findbyhash()" with it, to avoid duplication of similar code.
> >
> > This is a preparation to fix equivalence problem of revset predicate
> > "id()".
> >
> > diff --git a/mercurial/context.py b/mercurial/context.py
> > --- a/mercurial/context.py
> > +++ b/mercurial/context.py
> > @@ -426,6 +426,12 @@ def _findby40hash(repo, changeid):
> >       except (TypeError, LookupError):
> >           return None
> >
> > +def _findbypartialhash(repo, changeid):
> > +    node = repo.unfiltered().changelog._partialmatch(changeid)
> > +    if node is not None:
> > +        return (node, repo.changelog.rev(node))
> > +    return None
> > +
> >   def findbyhash(repo, hashid, abort=False):
> >       """Find the revision by hash ID
> >
> > @@ -443,8 +449,12 @@ def findbyhash(repo, hashid, abort=False
> >       branches and so on.
> >       """
> >       assert isinstance(hashid, str)
> > -    assert len(hashid) == 40
> > -    return _wraplookup(_findby40hash, repo, hashid, abort)
> > +    assert len(hashid) <= 40
> > +    if len(hashid) == 40:
> > +        lookupfunc = _findby40hash
> > +    else:
> > +        lookupfunc = _findbypartialhash
> 
> Why are we not using _partialmatch in all case ?.

Because "revlog.rev()" is more efficient than "revlog._partialmatch()"
for 40 letter hash ID, which doesn't have any ambiguity.

According to revlog implementation, "revlog._partialmatch()" implies
both "revlog.index.partialmatch()" and "revlog.rev()" (via
"revlog.hasnode()"), if specified id is valid.

Even if C implementation of "index.partialmatch()" is much faster than
"revlog.rev()", it is just overhead of "revlog._partialmatch()" at
comparison with "revlog.rev()".

Do I overlook any reasons (perhaps, around C implementation) to have
to use "revlog._partialmatch()" in both cases ?

> > +    return _wraplookup(lookupfunc, repo, hashid, abort)
> >
> >   class changectx(basectx):
> >       """A changecontext object makes access to data related to a particular
> > @@ -528,9 +538,9 @@ class changectx(basectx):
> >               except error.RepoLookupError:
> >                   pass
> >
> > -            self._node = repo.unfiltered().changelog._partialmatch(changeid)
> > -            if self._node is not None:
> > -                self._rev = repo.changelog.rev(self._node)
> > +            found = _findbypartialhash(repo, changeid)
> > +            if found:
> > +                self._node, self._rev = found
> >                   return
> >
> >               # lookup failed
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel at selenic.com
> > http://selenic.com/mailman/listinfo/mercurial-devel
> >
> 
> -- 
> Pierre-Yves David
> 

----------------------------------------------------------------------
[FUJIWARA Katsunori]                             foozy at lares.dti.ne.jp