[PATCH STABLE] icasefs: improve rename awareness of case-folding collision detection (issue3452)

Mads Kiilerich mads at kiilerich.com
Fri Feb 1 04:25:42 CST 2013


On 02/01/2013 02:44 AM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1359682698 -32400
> # Branch stable
> # Node ID 3eca4bf6f77f64a83835466f9b07022f5152a219
> # Parent  2a1fac3650a5b4d650198604c82ab59969500374
> icasefs: improve rename awareness of case-folding collision detection (issue3452)
>
> Before this patch, merging at (6) in the example below is aborted for
> case-folding collision unexpectedly, because 'a' at (4) and 'A' at (5)
> aren't recognized as source/destination of renaming.
>
>    (0) -- (2) -- (4) -------
>        \      \             \
>         \      \             \
>           (1) -- (3) -- (5) -- (6)
>
>    0: add file 'a'
>    1: rename from 'a' to 'A'
>    2: add file 'x'
>    3: merge
>    4: modify 'a'
>    5: modify 'A'
>    6: merge
>
> "copies.pathcopies()" used for case-folding collision detection scans
> filelog entries only for the revisions which are grater than common
> ancestor of merged revisions: in the example above, renaming from 'a'
> to 'A' at (1) is not scanned in the merging at (6), because common
> ancestor of merged revisions is (2) in this case.
>
> "copies.mergecopies()" can detect renaming in such case.
>
> But in the other hand, in the case merging branches without
> modification of renamed file (= issue3370 case), like (3) in the
> example above, "copies.mergecopies()" can't detect renaming.
>
> So, this patch uses both "copies.pathcopies()" and
> "copies.mergecopies()".
>
> To prevent "copies.mergecopies()" from being called in
> "manifestmerge()" again, "_checkcollision()" returns result of it, and
> "manifestmerge()" reuses it, if it is invoked in "_checkcollision()".

A tricky one ...

It is also not exactly a one-liner patch. Considering the size of the 
patch, the frequency this case is hit, the impact when it is hit, and 
the timing: I don't think this is for stable now. Doing it right seems 
to require some refactorings - and it would be better to do it right.

Some overall comments:

It is a bit messy that it calls copies.mergecopies from different places 
far from each other. A first improvement to the patch would be to 
calculate it and pass it as a parameter to _checkcollision.

It is also not obvious why we have the separation between 
calculateupdates and manifestmerge. You demonstrate that we sometimes 
need copies.mergecopies in calculateupdates, sometimes in manifestmerge. 
I think it would be better to start with a refactoring that merges 
calculateupdates into manifestmerge. It seems like that would simplify 
things.

And ...

> diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
> --- a/hgext/largefiles/overrides.py
> +++ b/hgext/largefiles/overrides.py
> @@ -360,8 +360,9 @@
>   # Finally, the merge.applyupdates function will then take care of
>   # writing the files into the working copy and lfcommands.updatelfiles
>   # will update the largefiles.
> -def overridemanifestmerge(origfn, repo, p1, p2, pa, overwrite, partial):
> -    actions = origfn(repo, p1, p2, pa, overwrite, partial)
> +def overridemanifestmerge(origfn, repo, p1, p2, pa, overwrite, partial,
> +                          mergecopies=None):
> +    actions = origfn(repo, p1, p2, pa, overwrite, partial, mergecopies)
>       processed = []
>   
>       for action in actions:
> diff --git a/mercurial/merge.py b/mercurial/merge.py
> --- a/mercurial/merge.py
> +++ b/mercurial/merge.py
> @@ -141,21 +141,33 @@
>       if extractxs:
>           wctx, actx = extractxs
>           # class to delay looking up copy mapping
> -        class pathcopies(object):
> +        class copymap(object):
> +            def __init__(self):
> +                self._mergecopies = None
>               @util.propertycache
> -            def map(self):
> +            def pathcopies(self):
>                   # {dst at mctx: src at wctx} copy mapping
>                   return copies.pathcopies(wctx, mctx)
> -        pc = pathcopies()
> +            @util.propertycache
> +            def mergecopies(self):
> +                self._mergecopies = copies.mergecopies(wctx._repo,
> +                                                       wctx, mctx, actx)
> +                return self._mergecopies[0]

Calling the property mergecopies but only returning mergecopies[0] seems 
a bit confusing.

> +            def find(self, f1, f2):
> +                return (self.pathcopies.get(f1) == f2 or
> +                        self.mergecopies.get(f1) == f2 or
> +                        self.mergecopies.get(f2) == f1)
> +        cm = copymap()
>   
>           for fn in wctx:
>               fold = util.normcase(fn)
>               mfn = folded.get(fold, None)
> -            if (mfn and mfn != fn and pc.map.get(mfn) != fn and
> +            if (mfn and mfn != fn and not cm.find(mfn, fn) and
>                   _remains(fn, wctx.manifest(), actx.manifest(), True) and
>                   _remains(mfn, mctx.manifest(), actx.manifest())):
>                   raise util.Abort(_("case-folding collision between %s and %s")
>                                    % (mfn, fn))
> +        return cm._mergecopies

Why make it private when you access it from the outside anyway?

>   
>   def _forgetremoved(wctx, mctx, branchmerge):
>       """
> @@ -185,7 +197,7 @@
>   
>       return actions
>   
> -def manifestmerge(repo, p1, p2, pa, overwrite, partial):
> +def manifestmerge(repo, p1, p2, pa, overwrite, partial, mergecopies=None):
>       """
>       Merge p1 and p2 with ancestor pa and generate merge action list
>   
> @@ -204,7 +216,7 @@
>       elif pa == p2: # backwards
>           pa = p1.p1()
>       elif pa and repo.ui.configbool("merge", "followcopies", True):
> -        ret = copies.mergecopies(repo, p1, p2, pa)
> +        ret = mergecopies or copies.mergecopies(repo, p1, p2, pa)
>           copy, movewithdir, diverge, renamedelete = ret
>           for of, fl in diverge.iteritems():
>               act("divergent renames", "dr", of, fl)
> @@ -440,13 +452,14 @@
>       "Calculate the actions needed to merge mctx into tctx"
>       actions = []
>       folding = not util.checkcase(repo.path)
> +    mergecopies = None
>       if folding:
>           # collision check is not needed for clean update
>           if (not branchmerge and
>               (force or not tctx.dirty(missing=True, branch=False))):
>               _checkcollision(mctx, None)
>           else:
> -            _checkcollision(mctx, (tctx, ancestor))
> +            mergecopies = _checkcollision(mctx, (tctx, ancestor))
>       if not force:
>           _checkunknown(repo, tctx, mctx)
>       if tctx.rev() is None:
> @@ -454,7 +467,8 @@
>       actions += manifestmerge(repo, tctx, mctx,
>                                ancestor,
>                                force and not branchmerge,
> -                             partial)
> +                             partial,
> +                             mergecopies)
>       return actions
>   
>   def recordupdates(repo, actions, branchmerge):
> diff --git a/tests/test-casecollision-merge.t b/tests/test-casecollision-merge.t
> --- a/tests/test-casecollision-merge.t
> +++ b/tests/test-casecollision-merge.t
> @@ -22,33 +22,53 @@
>     $ hg commit -m '#1'
>     $ hg update 0
>     1 files updated, 0 files merged, 1 files removed, 0 files unresolved
> -  $ echo 'modified at #2' > a
> +  $ echo x > x
> +  $ hg add x
>     $ hg commit -m '#2'
>     created new head
>   
> -  $ hg merge
> -  merging a and A to A
> -  0 files updated, 1 files merged, 0 files removed, 0 files unresolved
> -  (branch merge, don't forget to commit)
> +  $ hg merge -q
> +  $ hg status -A
> +  M A
> +  R a
> +  C x
> +
> +  $ hg update -q --clean 1
> +  $ hg merge -q
> +  $ hg status -A
> +  M x
> +  C A
> +
> +additional test for issue3452:
> +
> +  $ hg commit -m '#3'
> +
> +  $ hg update -q --clean 2
> +  $ echo 'a at 4' > a
> +  $ hg commit -m '#4'
> +  created new head
> +
> +  $ hg update -q --clean 3
> +  $ echo 'A at 5' > A
> +  $ hg commit -m '#5'
> +
> +  $ hg merge -q --tool internal:other 4
>     $ hg status -A
>     M A
>       a
> -  R a
> +  C x
>     $ cat A
> -  modified at #2
> +  a at 4
>   
> -  $ hg update --clean 1
> -  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
> -  $ hg merge
> -  merging A and a to A
> -  0 files updated, 1 files merged, 0 files removed, 0 files unresolved
> -  (branch merge, don't forget to commit)
> -  $ hg status -A
> +  $ hg update -q --clean 4
> +  $ hg merge -q --tool internal:other 5
> +  $ hg statu -A

Reusing tests is fine, but refactoring tests while doing a fix makes it 
very hard to see whether the fix actually introduces a regression. 
Please separate them somehow - for example by an initial 'refactor test' 
or 'improve test coverage' patch.

>     M A
>       a
> +  R a
> +  C x
>     $ cat A
> -  modified at #2
> -
> +  A at 5
>     $ cd ..
>   
>   (2) colliding file is not related to collided file

/Mads


More information about the Mercurial-devel mailing list