[PATCH STABLE] icasefs: improve rename awareness of case-folding collision detection (issue3452)
Mads Kiilerich
mads at kiilerich.com
Fri Feb 1 04:25:42 CST 2013
On 02/01/2013 02:44 AM, FUJIWARA Katsunori wrote:
> # HG changeset patch
> # User FUJIWARA Katsunori <foozy at lares.dti.ne.jp>
> # Date 1359682698 -32400
> # Branch stable
> # Node ID 3eca4bf6f77f64a83835466f9b07022f5152a219
> # Parent 2a1fac3650a5b4d650198604c82ab59969500374
> icasefs: improve rename awareness of case-folding collision detection (issue3452)
>
> Before this patch, merging at (6) in the example below is aborted for
> case-folding collision unexpectedly, because 'a' at (4) and 'A' at (5)
> aren't recognized as source/destination of renaming.
>
> (0) -- (2) -- (4) -------
> \ \ \
> \ \ \
> (1) -- (3) -- (5) -- (6)
>
> 0: add file 'a'
> 1: rename from 'a' to 'A'
> 2: add file 'x'
> 3: merge
> 4: modify 'a'
> 5: modify 'A'
> 6: merge
>
> "copies.pathcopies()" used for case-folding collision detection scans
> filelog entries only for the revisions which are grater than common
> ancestor of merged revisions: in the example above, renaming from 'a'
> to 'A' at (1) is not scanned in the merging at (6), because common
> ancestor of merged revisions is (2) in this case.
>
> "copies.mergecopies()" can detect renaming in such case.
>
> But in the other hand, in the case merging branches without
> modification of renamed file (= issue3370 case), like (3) in the
> example above, "copies.mergecopies()" can't detect renaming.
>
> So, this patch uses both "copies.pathcopies()" and
> "copies.mergecopies()".
>
> To prevent "copies.mergecopies()" from being called in
> "manifestmerge()" again, "_checkcollision()" returns result of it, and
> "manifestmerge()" reuses it, if it is invoked in "_checkcollision()".
A tricky one ...
It is also not exactly a one-liner patch. Considering the size of the
patch, the frequency this case is hit, the impact when it is hit, and
the timing: I don't think this is for stable now. Doing it right seems
to require some refactorings - and it would be better to do it right.
Some overall comments:
It is a bit messy that it calls copies.mergecopies from different places
far from each other. A first improvement to the patch would be to
calculate it and pass it as a parameter to _checkcollision.
It is also not obvious why we have the separation between
calculateupdates and manifestmerge. You demonstrate that we sometimes
need copies.mergecopies in calculateupdates, sometimes in manifestmerge.
I think it would be better to start with a refactoring that merges
calculateupdates into manifestmerge. It seems like that would simplify
things.
And ...
> diff --git a/hgext/largefiles/overrides.py b/hgext/largefiles/overrides.py
> --- a/hgext/largefiles/overrides.py
> +++ b/hgext/largefiles/overrides.py
> @@ -360,8 +360,9 @@
> # Finally, the merge.applyupdates function will then take care of
> # writing the files into the working copy and lfcommands.updatelfiles
> # will update the largefiles.
> -def overridemanifestmerge(origfn, repo, p1, p2, pa, overwrite, partial):
> - actions = origfn(repo, p1, p2, pa, overwrite, partial)
> +def overridemanifestmerge(origfn, repo, p1, p2, pa, overwrite, partial,
> + mergecopies=None):
> + actions = origfn(repo, p1, p2, pa, overwrite, partial, mergecopies)
> processed = []
>
> for action in actions:
> diff --git a/mercurial/merge.py b/mercurial/merge.py
> --- a/mercurial/merge.py
> +++ b/mercurial/merge.py
> @@ -141,21 +141,33 @@
> if extractxs:
> wctx, actx = extractxs
> # class to delay looking up copy mapping
> - class pathcopies(object):
> + class copymap(object):
> + def __init__(self):
> + self._mergecopies = None
> @util.propertycache
> - def map(self):
> + def pathcopies(self):
> # {dst at mctx: src at wctx} copy mapping
> return copies.pathcopies(wctx, mctx)
> - pc = pathcopies()
> + @util.propertycache
> + def mergecopies(self):
> + self._mergecopies = copies.mergecopies(wctx._repo,
> + wctx, mctx, actx)
> + return self._mergecopies[0]
Calling the property mergecopies but only returning mergecopies[0] seems
a bit confusing.
> + def find(self, f1, f2):
> + return (self.pathcopies.get(f1) == f2 or
> + self.mergecopies.get(f1) == f2 or
> + self.mergecopies.get(f2) == f1)
> + cm = copymap()
>
> for fn in wctx:
> fold = util.normcase(fn)
> mfn = folded.get(fold, None)
> - if (mfn and mfn != fn and pc.map.get(mfn) != fn and
> + if (mfn and mfn != fn and not cm.find(mfn, fn) and
> _remains(fn, wctx.manifest(), actx.manifest(), True) and
> _remains(mfn, mctx.manifest(), actx.manifest())):
> raise util.Abort(_("case-folding collision between %s and %s")
> % (mfn, fn))
> + return cm._mergecopies
Why make it private when you access it from the outside anyway?
>
> def _forgetremoved(wctx, mctx, branchmerge):
> """
> @@ -185,7 +197,7 @@
>
> return actions
>
> -def manifestmerge(repo, p1, p2, pa, overwrite, partial):
> +def manifestmerge(repo, p1, p2, pa, overwrite, partial, mergecopies=None):
> """
> Merge p1 and p2 with ancestor pa and generate merge action list
>
> @@ -204,7 +216,7 @@
> elif pa == p2: # backwards
> pa = p1.p1()
> elif pa and repo.ui.configbool("merge", "followcopies", True):
> - ret = copies.mergecopies(repo, p1, p2, pa)
> + ret = mergecopies or copies.mergecopies(repo, p1, p2, pa)
> copy, movewithdir, diverge, renamedelete = ret
> for of, fl in diverge.iteritems():
> act("divergent renames", "dr", of, fl)
> @@ -440,13 +452,14 @@
> "Calculate the actions needed to merge mctx into tctx"
> actions = []
> folding = not util.checkcase(repo.path)
> + mergecopies = None
> if folding:
> # collision check is not needed for clean update
> if (not branchmerge and
> (force or not tctx.dirty(missing=True, branch=False))):
> _checkcollision(mctx, None)
> else:
> - _checkcollision(mctx, (tctx, ancestor))
> + mergecopies = _checkcollision(mctx, (tctx, ancestor))
> if not force:
> _checkunknown(repo, tctx, mctx)
> if tctx.rev() is None:
> @@ -454,7 +467,8 @@
> actions += manifestmerge(repo, tctx, mctx,
> ancestor,
> force and not branchmerge,
> - partial)
> + partial,
> + mergecopies)
> return actions
>
> def recordupdates(repo, actions, branchmerge):
> diff --git a/tests/test-casecollision-merge.t b/tests/test-casecollision-merge.t
> --- a/tests/test-casecollision-merge.t
> +++ b/tests/test-casecollision-merge.t
> @@ -22,33 +22,53 @@
> $ hg commit -m '#1'
> $ hg update 0
> 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
> - $ echo 'modified at #2' > a
> + $ echo x > x
> + $ hg add x
> $ hg commit -m '#2'
> created new head
>
> - $ hg merge
> - merging a and A to A
> - 0 files updated, 1 files merged, 0 files removed, 0 files unresolved
> - (branch merge, don't forget to commit)
> + $ hg merge -q
> + $ hg status -A
> + M A
> + R a
> + C x
> +
> + $ hg update -q --clean 1
> + $ hg merge -q
> + $ hg status -A
> + M x
> + C A
> +
> +additional test for issue3452:
> +
> + $ hg commit -m '#3'
> +
> + $ hg update -q --clean 2
> + $ echo 'a at 4' > a
> + $ hg commit -m '#4'
> + created new head
> +
> + $ hg update -q --clean 3
> + $ echo 'A at 5' > A
> + $ hg commit -m '#5'
> +
> + $ hg merge -q --tool internal:other 4
> $ hg status -A
> M A
> a
> - R a
> + C x
> $ cat A
> - modified at #2
> + a at 4
>
> - $ hg update --clean 1
> - 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
> - $ hg merge
> - merging A and a to A
> - 0 files updated, 1 files merged, 0 files removed, 0 files unresolved
> - (branch merge, don't forget to commit)
> - $ hg status -A
> + $ hg update -q --clean 4
> + $ hg merge -q --tool internal:other 5
> + $ hg statu -A
Reusing tests is fine, but refactoring tests while doing a fix makes it
very hard to see whether the fix actually introduces a regression.
Please separate them somehow - for example by an initial 'refactor test'
or 'improve test coverage' patch.
> M A
> a
> + R a
> + C x
> $ cat A
> - modified at #2
> -
> + A at 5
> $ cd ..
>
> (2) colliding file is not related to collided file
/Mads
More information about the Mercurial-devel
mailing list