[PATCH V5] copy: add flag for disabling copy tracing
Pierre-Yves David
pierre-yves.david at ens-lyon.org
Tue Aug 4 20:10:38 CDT 2015
On 08/03/2015 11:39 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham at fb.com>
> # Date 1422386787 28800
> # Tue Jan 27 11:26:27 2015 -0800
> # Branch stable
> # Node ID 3ec836fca4d2f2b5d18a7e1cbfec6ea8a1d93f7b
> # Parent 79f0cb97d7537a7c2948f8f9b0a89148825a3a1d
> copy: add flag for disabling copy tracing
>
> Copy tracing can be up to 80% of rebase time when rebasing stacks
> of commits in large repos (hundreds of thousands of files).
> This provides the option of turning off the majority of copy tracing. It
> does not turn off _forwardcopies() since that is used to carry copy
> information inside a commit across a rebase.
>
> This will affect the situation where a user edits a file, then rebases on top of
> commits that have moved that file. The move will not be detected and the user
> will have to manually resolve the issue (possibly by redoing the rebase with
> this flag off).
>
> This takes rebasing a 3 commit stack that's a couple weeks old from 65s to 12s.
>
> diff --git a/mercurial/copies.py b/mercurial/copies.py
> --- a/mercurial/copies.py
> +++ b/mercurial/copies.py
> @@ -185,6 +185,9 @@ def _forwardcopies(a, b, match=None):
> return cm
>
> def _backwardrenames(a, b):
> + if a._repo.ui.configbool('experimental', 'disablecopytrace'):
> + return {}
> +
Is there any reason you don't just patch the four public methods in
copies? I suspect there is some case we want to not disable it (like
when preserving data in a rebase or an amend. Can we get details on that
(why some method are patched or not)?
> # Even though we're not taking copies into account, 1:n rename situations
> # can still exist (e.g. hg cp a b; hg mv a c). In those cases we
> # arbitrarily pick one of the renames.
> @@ -258,6 +261,9 @@ def mergecopies(repo, c1, c2, ca):
> if c2.node() is None and c1.node() == repo.dirstate.p1():
> return repo.dirstate.copies(), {}, {}, {}
>
> + if repo.ui.configbool('experimental', 'disablecopytrace'):
> + return {}, {}, {}, {}
> +
(small-nit?) Any reason this is not the very first line of the method
(for consistency?)
> limit = _findlimit(repo, c1.rev(), c2.rev())
> if limit is None:
> # no common ancestor, no copies
> @@ -507,7 +513,8 @@ def duplicatecopies(repo, rev, fromrev,
> copies between fromrev and rev.
> '''
> exclude = {}
> - if skiprev is not None:
> + if (skiprev is not None and
> + not repo.ui.configbool('experimental', 'disablecopytrace')):
> exclude = pathcopies(repo[fromrev], repo[skiprev])
In the same spirit we probably want a comment explaining why we skip the
first one, but not the second.
> for dst, src in pathcopies(repo[fromrev], repo[rev]).iteritems():
> # copies.pathcopies returns backward renames, so dst might not
--
Pierre-Yves David
More information about the Mercurial-devel
mailing list