[PATCH] revset: make filteredset.__nonzero__ respect the order of the filteredset
Augie Fackler
raf at durin42.com
Mon Jun 6 12:03:08 EDT 2016
On Thu, Jun 02, 2016 at 10:58:31PM +0100, Kostia Balytskyi wrote:
> # HG changeset patch
> # User Kostia Balytskyi <ikostia at fb.com>
> # Date 1464903541 -3600
> # Thu Jun 02 22:39:01 2016 +0100
> # Node ID ac1d06eeb8558f93d8f013b23ca9ce20892b9fc0
> # Parent 9ac309946df9e69bb73ded75f2fc5b84a4e785c2
> revset: make filteredset.__nonzero__ respect the order of the filteredset
Seems reasonable, and a worthwhile perf fix. Queued.
>
> This fix allows __nonzero__ to respect the direction of iteration of the
> whole filteredset. Here's the case when it matters. Imagine that we have a
> very large repository and we want to execute a command like:
>
> $ hg log --rev '(tip:0) and user(ikostia)' --limit 1
>
> (we want to get the latest commit by me).
>
> Mercurial will evaluate a filteredset lazy data structure, an
> instance of the filteredset class, which will know that it has to iterate
> in a descending order (isdescending() will return True if called). This
> means that when some code iterates over the instance of this filteredset,
> the 'and user(ikostia)' condition will be first checked on the latest
> revision, then on the second latest and so on, allowing Mercurial to
> print matches as it founds them. However, cmdutil.getgraphlogrevs
> contains the following code:
>
> revs = _logrevs(repo, opts)
> if not revs:
> return revset.baseset(), None, None
>
> The "not revs" expression is evaluated by calling filteredset.__nonzero__,
> which in its current implementation will try to iterate the filteredset
> in ascending order until it finds a revision that matches the 'and user(..'
> condition. If the condition is only true on late revisions, a lot of
> useless iterations will be done. These iterations could be avoided if
> __nonzero__ followed the order of the filteredset, which in my opinion
> is a sensible thing to do here.
>
> The problem gets even worse when instead of 'user(ikostia)' some more
> expensive check is performed, like grepping the commit diff.
>
>
> I tested this fix on a very large repo where tip is my commit and my very
> first commit comes fairly late in the revision history. Results of timing
> of the above command on that very large repo.
>
> -with my fix:
> real 0m1.795s
> user 0m1.657s
> sys 0m0.135s
>
> -without my fix:
> real 1m29.245s
> user 1m28.223s
> sys 0m0.929s
>
> I understand that this is a very specific kind of problem that presents
> itself very rarely, only on very big repositories and with expensive
> checks and so on. But I don't see any disadvantages to this kind of fix
> either.
>
> diff --git a/mercurial/revset.py b/mercurial/revset.py
> --- a/mercurial/revset.py
> +++ b/mercurial/revset.py
> @@ -2743,9 +2743,16 @@
> return lambda: self._iterfilter(it())
>
> def __nonzero__(self):
> - fast = self.fastasc
> - if fast is None:
> - fast = self.fastdesc
> + fast = None
> + candidates = [self.fastasc if self.isascending() else None,
> + self.fastdesc if self.isdescending() else None,
> + self.fastasc,
> + self.fastdesc]
> + for candidate in candidates:
> + if candidate is not None:
> + fast = candidate
> + break
> +
> if fast is not None:
> it = fast()
> else:
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
More information about the Mercurial-devel
mailing list