[PATCH 2 of 2] revlog: do inclusive descendant testing (API)

Mon Jun 25 12:54:23 EDT 2018

On 21/06/18 18:30, Martin von Zweigbergk wrote:
> On Thu, Jun 21, 2018 at 7:24 AM Paul Morelle <paul.morelle at octobus.net
> <mailto:paul.morelle at octobus.net>> wrote:
>
>     # HG changeset patch
>     # User Boris Feld <boris.feld at octobus.net
>     <mailto:boris.feld at octobus.net>>
>     # Date 1529585081 -3600
>     #      Thu Jun 21 13:44:41 2018 +0100
>     # Node ID 59fea52e54e014722486f7c049e192fa505032d8
>     # Parent  8d20b1b4b6a0297e7f9640d285b15a5d6647369e
>     # EXP-Topic descendant
>     # Available At https://bitbucket.org/octobus/mercurial-devel/
>     #              hg pull
>     https://bitbucket.org/octobus/mercurial-devel/ -r 59fea52e54e0
>     revlog: do inclusive descendant testing (API)
>
>     In many other places, a revision is considered a descendant of itself.
>     We update the behavior of `revlog.descendant()` to match this.
>
>     No test breaks, so it seems safe to do so.
>
>     diff -r 8d20b1b4b6a0 -r 59fea52e54e0 mercurial/revlog.py
>     --- a/mercurial/revlog.py       Thu Jun 21 13:32:07 2018 +0100
>     +++ b/mercurial/revlog.py       Thu Jun 21 13:44:41 2018 +0100
>     @@ -1378,7 +1378,7 @@
>          def descendant(self, start, end):
>              if start == nullrev:
>                  return True
>     -        return start in self.ancestors([end])
>     +        return start in self.ancestors([end], inclusive=True)
>
>
> Is this now equivalent to self.isancestor(start, end)? That method
> relies on commonancestorsheads instead of lazyancestors. What are the
> performance trade-offs? Equivalent both when there are many ancestors
> and when there are many descendants?
Hello Martin,

Interestingly, it turns out that we have the following flock of functions:

  * ancestors: commonancestorsheads(parent_func, *revs)
      o uses revnum
      o any number of arguments
      o written in Python
  * cext/revlog.c: revlog.index.commonancestorsheads(*revs)
      o uses revnum
      o any number of arguments
      o written in C
  * revlog: revlog.commonancestorsheads(node-a, node-b)
      o uses nodes
      o takes exactly two nodes
      o Calls either self.index.c…a…heads  or ancestors.c…a…heads
  * revlog: revlog.isancestor(anc, desc)
      o uses nodes
      o calls revlog.commonancestorsheads
  * revlog: revlog.descendant(rev-a, rev-b)
      o uses revs
      o has it own very slow code
  * revlog: revlog.descendant(rev-a, rev-b)
      o uses revs
      o has it own very slow code
      o non-inclusive
  * context: context.descendant(other)
      o uses contexts
      o calls revlog.descendant
      o non-inclusive

At the algorithm level, `anc in ancestors(desc)` will be faster when anc
is not an ancestor of desc (or they are many gca), since it will finish
sooner. However given `commonancestorheads` benefits from a C
implementation, it is currently the fastest option.

In short terms, I think the following actions would make sense:

 1. Extract a lower level `revlog._commonancestorheads(*revs)` from
    `revlog.commonancestorsheads`
 2. Use it in `revlog.descendant`
 3. Make `revlog.isancestor` use `revlog.descendant`

Does this seems sensible to you?

--
Paul Morelle
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20180625/38552631/attachment.html>