[PATCH 2 of 4] revset: avoid "list(repo)" for efficiency on large repository

FUJIWARA Katsunori foozy at lares.dti.ne.jp
Fri Mar 29 11:16:32 CDT 2013


# HG changeset patch
# User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
# Date 1364573132 -32400
# Node ID bede15731ca703fdf8ca8815793029766c3fc893
# Parent  4cf0465cd64ff196ad83ec2d10b6e13ed89c2913
revset: avoid "list(repo)" for efficiency on large repository

Before this patch, "list(repo)" (or an equivalent such as
"range(len(repo))") is passed as the "subset" argument of the function
returned by "revset.match()" to mean "all revisions in the
repository". This immediately creates a list object containing one
integer per revision, however many revisions the repository has.
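
As a rough illustration of that cost (a sketch in Python 3, not
Mercurial code; the revision count is an assumption chosen to match the
benchmark repository size quoted in this message), building the full
list costs memory proportional to the number of revisions, while a lazy
range does not:

```python
import sys

NREVS = 40000  # assumed revision count, for illustration only

eager = list(range(NREVS))   # one list slot per revision, built up front
lazy = range(NREVS)          # small fixed-size object, values made on demand

print(sys.getsizeof(eager))  # grows with NREVS
print(sys.getsizeof(lazy))   # does not depend on NREVS

# Both still support the operations a "subset" needs: len() and "x in".
assert len(eager) == len(lazy) == NREVS
assert (NREVS - 1) in lazy and NREVS not in lazy
```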

To avoid this immediate list creation, and also for convenience, this
patch makes the "subset" argument optional instead of replacing
"list(repo)" with "revset._safesubset(repo)" on the caller side.

Before this patch, None could not be passed as the "subset" argument,
because "len()", "x in subset" and so on must be applicable to it. No
existing caller can therefore depend on passing None, so this patch can
safely make None the default value of "subset".

This patch also accepts "repo" itself as "subset". On the caller side,
this should be more readable than "subset=None" as an explicit way to
mean "all revisions in the repository".
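
The calling convention described above can be sketched outside
Mercurial as follows. This is a minimal sketch under stated
assumptions: "FakeRepo" and the predicate-based "match()" are
hypothetical stand-ins, and a Python 3 "range" takes the place of
"_safesubset()", whose implementation is not part of this patch. The
sketch also compares with "is" where the patch uses "==".

```python
class FakeRepo:
    """Hypothetical stand-in for a repository holding nrevs revisions."""

    def __init__(self, nrevs):
        self._nrevs = nrevs

    def __len__(self):
        return self._nrevs

    def __iter__(self):
        return iter(range(self._nrevs))


def match(predicate):
    """Return a matcher whose "subset" argument is optional."""
    def mfunc(repo, subset=None):
        # None or the repo itself both mean "all revisions"; a lazy range
        # stands in for _safesubset(repo) here (assumption).
        if subset is None or subset is repo:
            subset = range(len(repo))
        return [r for r in subset if predicate(r)]
    return mfunc


repo = FakeRepo(10)
even = match(lambda r: r % 2 == 0)

print(even(repo))                # subset omitted: all revisions
print(even(repo, repo))          # repo passed explicitly: same meaning
print(even(repo, [1, 2, 3, 4]))  # explicit subset still works
```

Callers that already pass an explicit subset keep working unchanged,
which is why the default can be added without touching them.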

Results of "hg perfrevset" (before/after this patch) on a repository
containing 40000 revisions are shown below:

  - "max(tip)":
    ! wall 0.000000 comb 0.000000 user 0.000000 sys 0.000000 (best of 1969)
    ! wall 0.000000 comb 0.000000 user 0.000000 sys 0.000000 (best of 30000)

diff -r 4cf0465cd64f -r bede15731ca7 mercurial/commands.py
--- a/mercurial/commands.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/commands.py	Sat Mar 30 01:05:32 2013 +0900
@@ -2367,7 +2367,7 @@
         if newtree != tree:
             ui.note(revset.prettyformat(newtree), "\n")
     func = revset.match(ui, expr)
-    for c in func(repo, range(len(repo))):
+    for c in func(repo):
         ui.write("%s\n" % c)
 
 @command('debugsetparents', [], _('REV1 [REV2]'))
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/localrepo.py
--- a/mercurial/localrepo.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/localrepo.py	Sat Mar 30 01:05:32 2013 +0900
@@ -403,7 +403,7 @@
         '''Return a list of revisions matching the given revset'''
         expr = revset.formatspec(expr, *args)
         m = revset.match(None, expr)
-        return [r for r in m(self, list(self))]
+        return [r for r in m(self)]
 
     def set(self, expr, *args):
         '''
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/revset.py
--- a/mercurial/revset.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/revset.py	Sat Mar 30 01:05:32 2013 +0900
@@ -1838,7 +1838,9 @@
     if ui:
         tree = findaliases(ui, tree)
     weight, tree = optimize(tree, True)
-    def mfunc(repo, subset):
+    def mfunc(repo, subset=None):
+        if subset is None or repo == subset:
+            subset = _safesubset(repo)
         return getset(repo, subset, tree)
     return mfunc
 
diff -r 4cf0465cd64f -r bede15731ca7 mercurial/scmutil.py
--- a/mercurial/scmutil.py	Sat Mar 30 01:05:32 2013 +0900
+++ b/mercurial/scmutil.py	Sat Mar 30 01:05:32 2013 +0900
@@ -618,7 +618,7 @@
 
         # fall through to new-style queries if old-style fails
         m = revset.match(repo.ui, spec)
-        dl = [r for r in m(repo, list(repo)) if r not in seen]
+        dl = [r for r in m(repo) if r not in seen]
         l.extend(dl)
         seen.update(dl)
 

