[PATCH V5] copy: add flag for disabling copy tracing

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Aug 4 20:10:38 CDT 2015



On 08/03/2015 11:39 AM, Durham Goode wrote:
> # HG changeset patch
> # User Durham Goode <durham at fb.com>
> # Date 1422386787 28800
> #      Tue Jan 27 11:26:27 2015 -0800
> # Branch stable
> # Node ID 3ec836fca4d2f2b5d18a7e1cbfec6ea8a1d93f7b
> # Parent  79f0cb97d7537a7c2948f8f9b0a89148825a3a1d
> copy: add flag for disabling copy tracing
>
> Copy tracing can be up to 80% of rebase time when rebasing stacks
> of commits in large repos (hundreds of thousands of files).
> This provides the option of turning off the majority of copy tracing. It
> does not turn off _forwardcopies() since that is used to carry copy
> information inside a commit across a rebase.
>
> This will affect the situation where a user edits a file, then rebases on top of
> commits that have moved that file. The move will not be detected and the user
> will have to manually resolve the issue (possibly by redoing the rebase with
> this flag off).
>
> This takes rebasing a 3 commit stack that's a couple weeks old from 65s to 12s.
>
> diff --git a/mercurial/copies.py b/mercurial/copies.py
> --- a/mercurial/copies.py
> +++ b/mercurial/copies.py
> @@ -185,6 +185,9 @@ def _forwardcopies(a, b, match=None):
>       return cm
>
>   def _backwardrenames(a, b):
> +    if a._repo.ui.configbool('experimental', 'disablecopytrace'):
> +        return {}
> +

Is there any reason you don't just patch the four public methods in 
copies? I suspect there is some case we want to not disable it (like 
when preserving data in a rebase or an amend. Can we get details on that 
(why some method are patched or not)?

>       # Even though we're not taking copies into account, 1:n rename situations
>       # can still exist (e.g. hg cp a b; hg mv a c). In those cases we
>       # arbitrarily pick one of the renames.
> @@ -258,6 +261,9 @@ def mergecopies(repo, c1, c2, ca):
>       if c2.node() is None and c1.node() == repo.dirstate.p1():
>           return repo.dirstate.copies(), {}, {}, {}
>
> +    if repo.ui.configbool('experimental', 'disablecopytrace'):
> +        return {}, {}, {}, {}
> +

(small-nit?) Any reason this is not the very first line of the method 
(for consistency?)

>       limit = _findlimit(repo, c1.rev(), c2.rev())
>       if limit is None:
>           # no common ancestor, no copies
> @@ -507,7 +513,8 @@ def duplicatecopies(repo, rev, fromrev,
>       copies between fromrev and rev.
>       '''
>       exclude = {}
> -    if skiprev is not None:
> +    if (skiprev is not None and
> +        not repo.ui.configbool('experimental', 'disablecopytrace')):
>           exclude = pathcopies(repo[fromrev], repo[skiprev])

In the same spirit we probably want a comment explaining why we skip the 
first one, but not the second.

>       for dst, src in pathcopies(repo[fromrev], repo[rev]).iteritems():
>           # copies.pathcopies returns backward renames, so dst might not

-- 
Pierre-Yves David


More information about the Mercurial-devel mailing list