[PATCH 2 of 4] obsstore: disable garbage collection during initialisation (issue4456)

Pierre-Yves David pierre-yves.david at ens-lyon.org
Sat Nov 29 19:57:42 CST 2014


# HG changeset patch
# User Pierre-Yves David <pierre-yves.david at fb.com>
# Date 1417049911 28800
#      Wed Nov 26 16:58:31 2014 -0800
# Node ID d0f3dac4ea2b4aff51946c7db0834aa4e5c3e82a
# Parent  04eb7e49d2b6f90f71aa85de9ad0b4d70670d688
obsstore: disable garbage collection during initialisation (issue4456)

Python garbage collection is triggered by contained creation. So code that
creates a lot of tuple tends to trigger GC a lot. We disable the gc during
obsolescence marker parsing and associated initialization. The provide and
interesting speedup (25%).

On my 58758 markers repo:
before: 0.468247 seconds
after:  0.344362 seconds

Thanks goes to Siddharth Agarwal for the lead.

diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -66,10 +66,11 @@ The file starts with a version header:
 The header is followed by the markers. Marker format depend of the version. See
 comment associated with each format for details.
 
 """
 import struct
+import gc
 import util, base85, node
 import phases
 from i18n import _
 
 _pack = struct.pack
@@ -466,12 +467,28 @@ class obsstore(object):
         self.sopener = sopener
         data = sopener.tryread('obsstore')
         self._version = defaultformat
         self._readonly = readonly
         if data:
-            self._version, markers = _readmarkers(data)
-            self._load(markers)
+            # Python's garbage collector triggers a GC each time a certain
+            # number of container objects (the number being defined by
+            # gc.get_threshold()) are allocated. Markers parsing creates
+            # multiple tuples while parsing each markers so the gc is triggered
+            # a lot while parsing an high number of markers. As a workaround,
+            # disable GC during initialisation.
+            #
+            # This would probably marker parsing during exchange but I do not
+            # expect the order of magnitude to matter outside of initialisation
+            # case.
+            gcenabled = gc.isenabled()
+            gc.disable()
+            try:
+                self._version, markers = _readmarkers(data)
+                self._load(markers)
+            finally:
+                if gcenabled:
+                    gc.enable()
 
     def __iter__(self):
         return iter(self._all)
 
     def __len__(self):


More information about the Mercurial-devel mailing list