[PATCH 2 of 2] revlog: add Mercurial config variable for limiting delta-chain length
Augie Fackler
raf at durin42.com
Tue Nov 11 08:49:55 CST 2014
On Mon, Nov 10, 2014 at 11:27:41AM -0800, Mateusz Kwapich wrote:
> # HG changeset patch
> # User Mateusz Kwapich <mitrandir at fb.com>
> # Date 1415312405 28800
> # Thu Nov 06 14:20:05 2014 -0800
> # Node ID 0c2718661ea13e8054ab0d336cb5784b85991a5a
> # Parent 79ae6c4132b5c582ea7dbd1aa4af8e2bcd2f5973
> revlog: add Mercurial config variable for limiting delta-chain length
>
> The current heuristic for deciding between storing delta and full texts
> is based on ratio of (sizeofdeltas)/(sizeoffulltext).
>
> In some cases (for example for mercurial Manifest for huge repo) this approach
> can result in extremely long delta chains (~30,000) which are very slow to
> read. (In case of Manifest ~500ms are added to every hg command because of that).
>
> This commit introduces "revlog.maxchainlength" configuration variable that will
> limit delta chain length.
>
> diff --git a/mercurial/localrepo.py b/mercurial/localrepo.py
> --- a/mercurial/localrepo.py
> +++ b/mercurial/localrepo.py
> @@ -316,6 +316,9 @@
> chunkcachesize = self.ui.configint('format', 'chunkcachesize')
> if chunkcachesize is not None:
> self.sopener.options['chunkcachesize'] = chunkcachesize
> + maxchainlen = self.ui.configint('revlog', 'maxchainlen')
> + if maxchainlen is not None:
> + self.sopener.options['maxchainlen'] = maxchainlen
>
> def _writerequirements(self):
> reqfile = self.opener("requires", "w")
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -204,6 +204,7 @@
> self._basecache = None
> self._chunkcache = (0, '')
> self._chunkcachesize = 65536
> + self._maxchainlen = None
> self.index = []
> self._pcache = {}
> self._nodecache = {nullid: nullrev}
> @@ -219,6 +220,8 @@
> v = 0
> if 'chunkcachesize' in opts:
> self._chunkcachesize = opts['chunkcachesize']
> + if 'maxchainlen' in opts:
> + self._maxchainlen = opts['maxchainlen']
>
> if self._chunkcachesize <= 0:
> raise RevlogError(_('revlog chunk cache size %r is not greater '
> @@ -1216,11 +1219,13 @@
> base = rev
> else:
> base = chainbase
> - return dist, l, data, base, chainbase
> + chainlen = self.chainlen(rev) + 1
> + return dist, l, data, base, chainbase, chainlen
>
> curr = len(self)
> prev = curr - 1
> base = chainbase = curr
> + chainlen = None
> offset = self.end(prev)
> flags = 0
> d = None
> @@ -1240,7 +1245,7 @@
> d = builddelta(prev)
> else:
> d = builddelta(prev)
> - dist, l, data, base, chainbase = d
> + dist, l, data, base, chainbase, chainlen = d
>
> # full versions are inserted when the needed deltas
> # become comparable to the uncompressed text
> @@ -1249,7 +1254,8 @@
> cachedelta[1])
> else:
> textlen = len(text)
> - if d is None or dist > textlen * 2:
> + if (d is None or dist > textlen * 2 or
> + self._maxchainlen and chainlen > self._maxchainlen):
> text = buildtext()
> data = self.compress(text)
> l = len(data[1]) + len(data[0])
> diff --git a/tests/test-debugcommands.t b/tests/test-debugcommands.t
> --- a/tests/test-debugcommands.t
> +++ b/tests/test-debugcommands.t
> @@ -24,6 +24,40 @@
> full revision size (min/max/avg) : 44 / 44 / 44
> delta size (min/max/avg) : 0 / 0 / 0
>
> +Test max chain len
> + $ cat >> $HGRCPATH << EOF
> + > [revlog]
> + > maxchainlen=4
> + > EOF
> +
> + $ echo "This test checks if maxchainlen config value is respected also it can serve as basic test for debugrevlog -d <file>.\n" >> a
> + $ hg ci -m a
> + $ echo "b\n" >> a
> + $ hg ci -m a
> + $ echo "c\n" >> a
> + $ hg ci -m a
> + $ echo "d\n" >> a
> + $ hg ci -m a
> + $ echo "e\n" >> a
> + $ hg ci -m a
> + $ echo "f\n" >> a
> + $ hg ci -m a
> + $ echo 'g\n' >> a
> + $ hg ci -m a
> + $ echo 'h\n' >> a
check-code sends its regards here, though I think it's an overzealous check. Investigating.
> + $ hg ci -m a
> + $ hg debugrevlog -d a
> + # rev p1rev p2rev start end deltastart base p1 p2 rawsize totalsize compression heads chainlen
> + 0 -1 -1 0 ??? 0 0 0 0 ??? ???? ? 1 0 (glob)
> + 1 0 -1 ??? ??? 0 0 0 0 ??? ???? ? 1 1 (glob)
> + 2 1 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 2 (glob)
> + 3 2 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 3 (glob)
> + 4 3 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 4 (glob)
> + 5 4 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 0 (glob)
> + 6 5 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 1 (glob)
> + 7 6 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 2 (glob)
> + 8 7 -1 ??? ??? ??? ??? ??? 0 ??? ???? ? 1 3 (glob)
> + $ cd ..
>
> Test internal debugstacktrace command
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at selenic.com
> http://selenic.com/mailman/listinfo/mercurial-devel
More information about the Mercurial-devel
mailing list