[PATCH 5 of 7] revlog: add a fast path for non-raw revision
Ryan McElroy
rm at fb.com
Wed Mar 29 08:41:21 EDT 2017
On 3/28/17 8:49 AM, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark at fb.com>
> # Date 1490683355 25200
> # Mon Mar 27 23:42:35 2017 -0700
> # Node ID aecce2adbd64b25325e559798baa617e7311e85f
> # Parent a9d87712bec99abe109c155948ee4b7f1f5ec208
> # Available At https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Jw8rundaE7TbmqBYd1txIQ&m=HZJ95FJtSQKfeTym596Dsrw1nj2_uFqbuHYgwPBJDS0&s=0utxjEpjZ6EWkHbcRaaNJmSr6QF_yI6rObSckqL3BII&e=
> # hg pull https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Jw8rundaE7TbmqBYd1txIQ&m=HZJ95FJtSQKfeTym596Dsrw1nj2_uFqbuHYgwPBJDS0&s=0utxjEpjZ6EWkHbcRaaNJmSr6QF_yI6rObSckqL3BII&e= -r aecce2adbd64
> revlog: add a fast path for non-raw revision
>
> If flags are empty, it's equal to "raw=True". Since "raw=True" is less
> common but "flags are empty" is very common, add the flags check. So on
> cache hit, revlog.revision could go through the fast path in the most common
> cases.
>
> This also makes some hash check unnecessary, so the test is changed a bit.
Thanks for explaining this. It's a non-obvious change :-)
>
> diff --git a/mercurial/revlog.py b/mercurial/revlog.py
> --- a/mercurial/revlog.py
> +++ b/mercurial/revlog.py
> @@ -1258,4 +1258,5 @@ class revlog(object):
>
> cachedrev = None
> + flags = None
Why is this initialized to None here? Why not initialize to
self.flags(rev) and avoid the two None checks later? Add a comment if
you keep it this way. The self.flags method looks quite cheap but maybe
I'm missing something.
> if node == nullid:
> return ""
> @@ -1264,4 +1265,9 @@ class revlog(object):
> if raw:
> return self._cache[2]
> + if rev is None:
> + rev = self.rev(node)
> + flags = self.flags(rev)
> + if not flags:
> + return self._cache[2]
I think this needs some comments and perhaps some refactoring.
I'd suggest something like this:
# useful comment
if raw or not self.flags:
This would be made possible if you eagerly initialize flags above.
Upon closer inspection, maybe its self.rev() that you're trying to avoid
here, but is the expense there the radix tree? In this case, it should
always be a full hash so we shouldn't need a radixtree right? Can we
fastpath this somehow instead?
That may be out of the scope of this series but a comment to explain
your choice is important.
> cachedrev = self._cache[1]
>
> @@ -1285,6 +1291,7 @@ class revlog(object):
> text = mdiff.patches(text, bins)
>
> - newtext, validatehash = self._processflags(text, self.flags(rev),
> - 'read', raw=raw)
> + if flags is None:
> + flags = self.flags(rev)
> + newtext, validatehash = self._processflags(text, flags, 'read', raw=raw)
> if validatehash:
> self.checkhash(newtext, node, rev=rev)
> diff --git a/tests/test-revlog-raw.py.out b/tests/test-revlog-raw.py.out
> --- a/tests/test-revlog-raw.py.out
> +++ b/tests/test-revlog-raw.py.out
> @@ -23,5 +23,4 @@ REV RAW DATA CACHE BEFORE -> AFTER
> # 3338e2e7da35 = hash(7bc003d2bcd3, 02T35)
> 2 y 02T35 1 +012T34 2 02T35
> - # 3338e2e7da35 = hash(7bc003d2bcd3, 02T35)
> 2 02T35 2 02T35 2 02T35
>
> @@ -44,5 +43,4 @@ REV RAW DATA CACHE BEFORE -> AFTER
> # 3338e2e7da35 = hash(7bc003d2bcd3, 02T35)
> 2 y 02T35 None 2 02T35
> - # 3338e2e7da35 = hash(7bc003d2bcd3, 02T35)
> 2 02T35 2 02T35 2 02T35
> 1 y +012T34 2 02T35 1 +012T34
>
More information about the Mercurial-devel
mailing list