[PATCH] manifest: improve filesnotin performance by using lazymanifest diff
Tony Tung
tonytung at instagram.com
Wed Apr 27 11:32:56 EDT 2016
Comments inline.
On Apr 27, 2016, at 6:20 AM, Martin von Zweigbergk <martinvonz at google.com> wrote:
On Wed, Apr 27, 2016, 01:07 Sean Farley <sean at farley.io> wrote:
Tony Tung <ttung at fb.com> writes:
> # HG changeset patch
> # User Tony Tung <tonytung at merly.org>
> # Date 1461740718 25200
> # Wed Apr 27 00:05:18 2016 -0700
> # Branch stable
> # Node ID 7f80dce78781f5fe691a23f1b7f5a110ed170f32
> # Parent 97811ff7964710d32cae951df1da8019b46151a2
> manifest: improve filesnotin performance by using lazymanifest diff
>
> lazymanifests can compute diffs significantly faster than taking the set
> of two manifests and calculating the delta.
FYI, we're currently in a feature freeze:
https://www.mercurial-scm.org/wiki/TimeBasedReleasePlan
Will resubmit.
> diff --git a/mercurial/manifest.py b/mercurial/manifest.py
> --- a/mercurial/manifest.py
> +++ b/mercurial/manifest.py
> @@ -211,8 +211,10 @@
>
>      def filesnotin(self, m2):
>          '''Set of files in this manifest that are not in the other'''
> -        files = set(self)
> -        files.difference_update(m2)
> +        diff = self.diff(m2)
> +        files = set(filepath
> +                    for filepath, hashflags in diff.items()
iteritems() may be noticeably faster on large diffs
I was under the impression that items() had the same performance characteristics as iteritems(), but apparently that’s only true on Python 3. Will fix.
> +                    if hashflags[1][0] is None)
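For readers following the hunk above, here is a minimal sketch of why the diff-based filter selects the same files as the old set difference. It assumes the diff maps each differing path to ((node1, flags1), (node2, flags2)) with (None, '') standing in for a missing side, which is the shape the patch's `hashflags[1][0] is None` test implies. `toy_diff`, `filesnotin_set`, `filesnotin_diff`, and the plain-dict manifests are hypothetical stand-ins, not Mercurial's lazymanifest.

```python
# Toy stand-in for a manifest: path -> (node, flags). Not Mercurial's real type.
MISSING = (None, '')


def toy_diff(m1, m2):
    """Return {path: ((node1, flags1), (node2, flags2))} for differing paths.

    A side that lacks the path is reported as (None, ''). This shape is an
    assumption inferred from the patch, not taken from Mercurial's source.
    """
    result = {}
    for path in set(m1) | set(m2):
        left = m1.get(path, MISSING)
        right = m2.get(path, MISSING)
        if left != right:
            result[path] = (left, right)
    return result


def filesnotin_set(m1, m2):
    # Old approach from the patch: materialize every path, then subtract.
    files = set(m1)
    files.difference_update(m2)
    return files


def filesnotin_diff(m1, m2):
    # New approach from the patch: keep paths whose right-hand node is None,
    # i.e. paths that exist in m1 but not in m2.
    return set(path
               for path, hashflags in toy_diff(m1, m2).items()
               if hashflags[1][0] is None)


m1 = {'a.txt': ('n1', ''), 'b.txt': ('n2', 'x'), 'c.txt': ('n3', '')}
m2 = {'a.txt': ('n1', ''), 'b.txt': ('n9', 'x'), 'd.txt': ('n4', '')}

print(filesnotin_set(m1, m2) == filesnotin_diff(m1, m2))
```

In this toy the diff contains b.txt (modified), c.txt (only in m1) and d.txt (only in m2); only c.txt has a right-hand node of None, so both functions agree. The toy says nothing about speed, since any gain in the real patch would come from lazymanifest's own diff implementation.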
(for after May 1st) Would it be feasible to have a perf test for this
(and some sweet, sweet performance numbers in the commit message)?
As usual, I will request real-world perf numbers (instead of, or in addition to, a perf test). It would be nice to have them for both a good case (small diff) and a bad case (large diff). Thanks.
hg diff -c . on Facebook’s large repo takes 1.5s with this change, down from 2.1s. For large diffs, I suspect any performance regression would be drowned out by the file system operations.
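One possible way to collect the good-case and bad-case numbers requested above is a small timeit harness like the sketch below. It is only an illustration: `make_manifests`, `filesnotin_set`, and `best_time` are hypothetical helpers, the manifests are plain dicts, and only the old set-based approach is timed. The diff-based candidate would have to be plugged in and measured against a real repository to say anything about lazymanifest.

```python
import timeit


def make_manifests(total, changed):
    """Build two toy manifests (path -> node); `changed` paths exist only in m1."""
    m1 = {'file%d' % i: 'node%d' % i for i in range(total)}
    m2 = dict(m1)
    for i in range(changed):
        del m2['file%d' % i]
    return m1, m2


def filesnotin_set(m1, m2):
    # The pre-patch approach: set of all paths in m1 minus the paths in m2.
    files = set(m1)
    files.difference_update(m2)
    return files


def best_time(func, m1, m2, repeat=3, number=5):
    """Best per-call wall time in seconds over `repeat` runs of `number` calls."""
    timings = timeit.repeat(lambda: func(m1, m2), repeat=repeat, number=number)
    return min(timings) / number


# Good case (few files differ) and bad case (most files differ).
for label, changed in (('small diff', 10), ('large diff', 9000)):
    m1, m2 = make_manifests(10000, changed)
    print('%-10s %.6f s' % (label, best_time(filesnotin_set, m1, m2)))
```

Taking the minimum over several repeats reduces noise from other processes; the manifest sizes here are arbitrary placeholders.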