[PATCH 19 of 22] obsstore: use radixlink to back markerindex
Jun Wu
quark at fb.com
Sun Jun 4 19:59:31 EDT 2017
# HG changeset patch
# User Jun Wu <quark at fb.com>
# Date 1496614546 25200
# Sun Jun 04 15:15:46 2017 -0700
# Node ID 160567b62f74e8e02fa2e9a7b2cb4076b105e528
# Parent 2c2cbefde423ea2537270cf5fecdf5f5d14b19e5
# Available At https://bitbucket.org/quark-zju/hg-draft
# hg pull https://bitbucket.org/quark-zju/hg-draft -r 160567b62f74
obsstore: use radixlink to back markerindex
This replaces Python dict where values are set of markers to radixlink where
values are linked lists of offsets.
hg perfvolatilesets on hg-committed shows (best wall time in seconds):
| --clear-obsstore | no --clear-obsstore
| before | after | before | after
--------------------------------------------------------
bumped | 0.791746 | 0.464715 | 0.013843 | 0.019339
divergent | 0.823026 | 0.524332 | 0.032086 | 0.048416
extinct | 0.597182 | 0.409694 | 0.035425 | 0.035519
obsolete | 0.583754 | 0.378805 | 0.006312 | 0.006591
suspended | 0.613364 | 0.434053 | 0.034716 | 0.035313
unstable | 0.579994 | 0.394650 | 0.016263 | 0.016496
base | 0.001387 | 0.001205 | 0.001048 | 0.001038
immutable | 0.006587 | 0.006494 | 0.005581 | 0.005561
served | 0.469272 | 0.369578 | 0.012776 | 0.012895
visible | 0.573799 | 0.392960 | 0.009748 | 0.010171
Notice that --clear-obsolete reflects real-world usage - you have to load
the obsstore before calculating the sets, and C radixlink has lower overhead
than Python dict + set. Without --clear-obsstore, things are slower because
parsing from offsets is on-demand. But that does not reflect real-world
usage.
Real-world commands on hg-committed become faster as expected:
| before | after
-------------------------------
hg id | 0.73 | 0.55
hg log -r . | 1.05 | 0.74
Note: pure will be slower. But I'd prefer code simplify to optimizing pure
with complex code. Most of our users should use cext version any way.
diff --git a/mercurial/obsolete.py b/mercurial/obsolete.py
--- a/mercurial/obsolete.py
+++ b/mercurial/obsolete.py
@@ -85,4 +85,5 @@ from . import (
parsers = policy.importmod(r'parsers')
+radixlink = policy.importmod(r'radixlink')
_pack = struct.pack
@@ -526,13 +527,15 @@ class marker(object):
return self._data[2]
-class markerindex(dict):
+class markerindex(radixlink.radixlink):
"""key is usually a node, value is a set of markers"""
def __init__(self, obsstore, name, keyfunc):
"""keyfunc: rawmarker -> [key]"""
+ super(markerindex, self).__init__()
self._keyfunc = keyfunc
self._obsstore = weakref.proxy(obsstore)
self.name = name
self.sourceoftruthsize = 0
+ self._readmarker = formats[obsstore._version][0]
self.update()
@@ -544,11 +547,24 @@ class markerindex(dict):
return
keyfunc = self._keyfunc
- setdefault = self.setdefault
markers, offsets = self._obsstore._readmarkers(self.sourceoftruthsize)
- for marker in markers:
+ for i, marker in enumerate(markers):
+ offset = offsets[i]
for k in keyfunc(marker):
- setdefault(k, set()).add(marker)
+ self.insert(k, offset)
self.sourceoftruthsize = len(data)
+ def __getitem__(self, key):
+ offsets = super(markerindex, self).__getitem__(key)
+ # convert offsets to markers
+ data = self._obsstore._data
+ read = self._readmarker
+ return set(read(data, off, off + 1)[0] for off in offsets)
+
+ def get(self, key, default=None):
+ try:
+ return self[key]
+ except KeyError:
+ return default
+
class markerreader(object):
"""read a range of markers with proper caching"""
More information about the Mercurial-devel
mailing list