[PATCH 1 of 2] smartset: convert set to list lazily

Jun Wu quark at fb.com
Sat Feb 18 06:02:11 UTC 2017


# HG changeset patch
# User Jun Wu <quark at fb.com>
# Date 1487393969 28800
#      Fri Feb 17 20:59:29 2017 -0800
# Node ID b32af6cafb9b958decb7ae0db7dd754aa5a5f883
# Parent  01eebb65a61d9edcad1665ed747c7092f1ddb8b9
# Available At https://bitbucket.org/quark-zju/hg-draft
#              hg pull https://bitbucket.org/quark-zju/hg-draft -r b32af6cafb9b
smartset: convert set to list lazily

If the caller only wants to construct a baseset via a set, and then do
"__contains__" tests. It's unnecessary to initialize the list.

Testing on my unfiltered hg-committed repo where len(draft()) is 2600, this
patch shows about 6% improvement on set intensive queries:

Before:

  $ for i in `seq 5`; hg perfrevset 'draft() & draft() & draft() & draft()'
  ! wall 0.001196 comb 0.000000 user 0.000000 sys 0.000000 (best of 2011)
  ! wall 0.001191 comb 0.000000 user 0.000000 sys 0.000000 (best of 2099)
  ! wall 0.001186 comb 0.010000 user 0.010000 sys 0.000000 (best of 1953)
  ! wall 0.001182 comb 0.000000 user 0.000000 sys 0.000000 (best of 2135)
  ! wall 0.001193 comb 0.000000 user 0.000000 sys 0.000000 (best of 2177)

After:

  $ for i in `seq 5`; hg perfrevset 'draft() & draft() & draft() & draft()'
  ! wall 0.001128 comb 0.000000 user 0.000000 sys 0.000000 (best of 2247)
  ! wall 0.001119 comb 0.000000 user 0.000000 sys 0.000000 (best of 2317)
  ! wall 0.001115 comb 0.000000 user 0.000000 sys 0.000000 (best of 2244)
  ! wall 0.001131 comb 0.000000 user 0.000000 sys 0.000000 (best of 2093)
  ! wall 0.001124 comb 0.000000 user 0.000000 sys 0.000000 (best of 2134)

It could have bigger impact on larger sets in theory.

diff --git a/mercurial/smartset.py b/mercurial/smartset.py
--- a/mercurial/smartset.py
+++ b/mercurial/smartset.py
@@ -172,6 +172,10 @@ class baseset(abstractsmartset):
                 # set has no order we pick one for stability purpose
                 self._ascending = True
-            data = list(data)
-        self._list = data
+                # converting set to list has a cost, do it lazily
+                data = None
+            else:
+                data = list(data)
+        if data is not None:
+            self._list = data
         self._datarepr = datarepr
 
@@ -186,4 +190,10 @@ class baseset(abstractsmartset):
         return asclist
 
+    @util.propertycache
+    def _list(self):
+        # _list is only lazily constructed if we have _set
+        assert '_set' in self.__dict__
+        return list(self._set)
+
     def __iter__(self):
         if self._ascending is None:
@@ -205,5 +215,5 @@ class baseset(abstractsmartset):
 
     def __nonzero__(self):
-        return bool(self._list)
+        return bool(len(self))
 
     def sort(self, reverse=False):
@@ -219,5 +229,8 @@ class baseset(abstractsmartset):
 
     def __len__(self):
-        return len(self._list)
+        if '_list' in self.__dict__:
+            return len(self._list)
+        else:
+            return len(self._set)
 
     def isascending(self):


More information about the Mercurial-devel mailing list