Testing very long delta chains
Matt Mackall
mpm at selenic.com
Tue Dec 22 23:41:47 CST 2015
On Tue, 2015-12-22 at 17:27 -0800, Gregory Szorc wrote:
> https://www.mercurial-scm.org/wiki/BigRepositories has been updated with a
> link to
> https://hg.mozilla.org/users/gszorc_mozilla.com/mozilla-central-aggressivemergedeltas,
> which is a generaldelta clone of mozilla-central with
> format.aggressivemergedeltas enabled.
>
> The last manifest delta chain in this repo is over 45,000 entries deep and
> it makes for a good benchmark for testing revlog reading performance.
>
> Remember: `hg clone --uncompressed` to preserve the delta chains from the
> server or your client will recompute them as part of applying the
> changegroup.
Without my threaded zlib hack:
$ hg perfmanifest 277045
! wall 0.749929 comb 0.740000 user 0.730000 sys 0.010000 (best of 13)
(25% CPU usage on a CPU with 4 threads)
With my threaded zlib hack (threads = 4):
$ hg perfmanifest 277045
! wall 0.480251 comb 1.090000 user 0.990000 sys 0.100000 (best of 20)
(50% CPU usage on a CPU with 4 threads)
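To put the two runs side by side, here is a small sketch that only does arithmetic on the numbers printed above (the variable names are mine, nothing here comes from perf.py). It suggests the hack buys roughly a 1.56x wall-clock speedup while spending about 1.47x the CPU time, so the threads are far from free.

```python
# Timings copied from the two `hg perfmanifest 277045` runs above.
baseline = {"wall": 0.749929, "comb": 0.740000}
threaded = {"wall": 0.480251, "comb": 1.090000}

# How much faster the threaded run finishes in wall-clock terms.
wall_speedup = baseline["wall"] / threaded["wall"]

# How much more total CPU time (user + sys) the threaded run burns.
cpu_overhead = threaded["comb"] / baseline["comb"]

# Average number of cores kept busy: CPU time divided by wall time.
busy_cores = threaded["comb"] / threaded["wall"]

print("wall-clock speedup: %.2fx" % wall_speedup)
print("CPU time ratio:     %.2fx" % cpu_overhead)
print("average busy cores: %.2f of 4" % busy_cores)
```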
Things we can do better:
- add a C decompress helper
  - that works on lists of buffers
  - that calls zlib directly
  - that uses threads
  - that uses larger buffers
  - that uses a faster zlib
(For this last, the Cloudflare fork of zlib has a faster CRC function that
seems to be worth about 20%)
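As a rough illustration of the "works on lists of buffers" and "uses threads" items, here is a minimal Python 3 sketch of what such a helper's interface might look like. It is not the C helper proposed above: `decompress_chunk` and `decompress_many` are hypothetical names, and the type-byte handling ('x' for zlib, 'u' for stored, NUL for raw) reflects my understanding of revlog chunks rather than code taken from revlog.py.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def decompress_chunk(buf):
    """Decode one revlog-style chunk according to its first byte.

    Assumed mapping: b'x' = zlib stream, b'u' = uncompressed payload
    behind a one-byte marker, b'\\0' = raw data returned as-is.
    """
    if not buf:
        return b""
    head = bytes(buf[:1])
    if head == b"x":
        return zlib.decompress(buf)
    if head == b"u":
        return bytes(buf[1:])
    if head == b"\0":
        return bytes(buf)
    raise ValueError("unknown compression type %r" % head)


def decompress_many(bufs, threads=4):
    """Decompress a whole list of buffers, preserving input order.

    Executor.map() yields results in the order of its inputs, so no
    explicit slot bookkeeping is needed here.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(decompress_chunk, bufs))


if __name__ == "__main__":
    originals = [(b"manifest entry %d\n" % i) * 1000 for i in range(8)]
    chunks = [zlib.compress(o) for o in originals]
    assert decompress_many(chunks) == originals
```

Threads can help here because CPython's zlib.decompress releases the GIL while inflating, but every chunk still pays Python-level call overhead, which is part of why a C helper operating on the whole list looks attractive.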
# HG changeset patch
# User Matt Mackall <mpm at selenic.com>
# Date 1450727921 21600
# Mon Dec 21 13:58:41 2015 -0600
# Node ID b56bc1676b5d4a14167be2498921b57f06ddcd69
# Parent 3dea4eae4eebac11741f0c1dc5dcd9c88d8f4554
revlog: thread decompress
diff -r 3dea4eae4eeb -r b56bc1676b5d mercurial/revlog.py
--- a/mercurial/revlog.py Mon Dec 21 14:52:18 2015 -0600
+++ b/mercurial/revlog.py Mon Dec 21 13:58:41 2015 -0600
@@ -17,6 +17,8 @@
 import errno
 import os
 import struct
+import threading
+import Queue
 import zlib

 # import stuff from node for others to import from revlog
@@ -1132,14 +1134,38 @@
             # 2G on Windows
             return [self._chunk(rev, df=df) for rev in revs]

-        for rev in revs:
+        slots = [None] * len(revs)
+
+        work = []
+        done = Queue.Queue()
+
+        for slot, rev in enumerate(revs):
             chunkstart = start(rev)
             if inline:
                 chunkstart += (rev + 1) * iosize
             chunklength = length(rev)
-            ladd(decompress(buffer(data, chunkstart - offset, chunklength)))
+            buf = buffer(data, chunkstart - offset, chunklength)
+            if buf and buf[0] == 'x':
+                work.append((slot, buf))
+            else:
+                slots[slot] = decompress(buf)

-        return l
+        def worker():
+            try:
+                while True:
+                    slot, buf = work.pop()
+                    slots[slot] = _decompress(buf)
+            except:
+                done.put(1)
+
+        tcount = 4
+        for w in xrange(tcount - 1):
+            threading.Thread(target=worker).start()
+        worker()
+        for w in xrange(tcount):
+            done.get()
+
+        return slots

     def _chunkclear(self):
         """Clear the raw chunk cache."""
--
Mathematics is the supreme nostalgia of our time.