[PATCH] convert: svn-sink: test if execute bit was actually stored

Maxim Dounin mdounin at mdounin.ru
Sat Dec 29 08:41:02 CST 2007


On Thu, Dec 27, 2007 at 01:23:08PM -0600, Matt Mackall wrote:
>
>On Wed, 2007-12-26 at 01:43 +0300, Maxim Dounin wrote:
>> On Wed, 19 Dec 2007, Matt Mackall wrote:
>> 
>> > On Wed, Dec 19, 2007 at 06:13:43PM +0300, Maxim Dounin wrote:
>> >> Hello!
>> >>
>> >> On Sat, 15 Dec 2007, Maxim Dounin wrote:
>> >>
>> >>> Hello!
>> >>>
>> >>> As of now, the command sequence in tests/test-convert-svn-sink leads
>> >>> to conversion crash while converting revision where execute bit is set. As a
>> >>> result - execute bit isn't stored in converted repo. Attached patch makes it
>> >>> clear.
>> >>>
>> >>> As far as I debugged this, actual problem isn't in subversion sink code - but
>> >>> instead in mercurial conversion source code. It reports extra copy operation
>> >>> when setting execute bit follows copy for the particular file - because
>> >>> execute bit is stored in manifest and file itself "thinks" it was copied in
>> >>> last revision.
>> >>>
>> >>> It would be good if somebody with better knowledge of mercurial internals
>> >>> (BOS?) take a look at this.
>> >>
>> >> I've spent some more time investigating this - and it looks like 'hg log
>> >> -vC' also does wrong thing in this case depite the fact it has much more
>> >> code to correctly detect copies.
>> >>
>> >> Testcase:
>> >>
>> >> hg init a
>> >> cd a
>> >> echo a > a
>> >> hg ci -Ama
>> >> hg cp a b
>> >> hg ci -A -m 'copy a to b'
>> >> chmod +x b
>> >> hg ci -A -m 'set execute bit on b'
>> >> hg log -vC --template '{rev} {file_copies%filecopy}\n'
>> >>
>> >> I can't figure out how to fix this correctly though. Not reporting copies
>> >> when filelog wasn't changed by particular changelog revision doesn't look
>> >> like a way to go - it will break test/test-log, notably issue391 test.
>> >
>> > The problem is that we're using a heuristic (whether the file is in
>> > the list of changed files in a changeset) to decide whether to do a
>> > copy check. But that will list files whose modes have changed as well
>> > as just their contents.
>> >
>> > When we detect a case where a file may have been copied, we can do two
>> > additional checks:
>> >
>> > - does the file revision's linkrev point back to the changeset in
>> >  question? if yes, then the contents were definitely changed in this
>> >  revision, and thus the copy happened in this revision.
>> >
>> > - if the above quick check fails, we can check the file revision in
>> >  the parent changeset, and if it is different, we know the copy
>> >  happened in the child. This test is potentially slow, but should be
>> >  very infrequent.
>> 
>> Attached patch (against crew) implements the above algorithm in log 
>> --copies and convert (mercurial as a source) and adds appropriate 
>> tests.
>
>It'd be better if this intelligence were moved into the filectx object.

I agree that this have to be in something more generic.

As for filectx, currently it have to be fixed to preserve original changeset 
revision for this logic to work. Attached patch "patch-context.txt" does the 
trick. 

Note: This change will broke old behaviour that changectx.filectx(file).rev() 
report last revision where file was modified. I hope nothing in code relies 
on this (and some cases was already broken anyway).

The only test that fails due to this change is test-issue672. As far as I 
see it's minor and completely safe debug output change in filectx priting, 
so I've updated test-issue672.out. 

Second patch, patch-rename-2.txt, adds function renamedhere() to filectx 
object and uses it to fix original problem. Notes:

   - This nukes part of optimization in 'log --copies'. But only for rare case 
     when rev != linkrev.

   - I'm not sure if this should replace original renamed() function in 
     filectx. At least one additional test failed when I tried this, so I've
     decided to leave it separate for now.

Maxim Dounin
-------------- next part --------------
# HG changeset patch
# User Maxim Dounin <mdounin at mdounin.ru>
# Date 1198936663 -10800
# Node ID 7471976f1ffd2626c9a02efb723cfbdb026e1040
# Parent  027264e720aaa1ec77882465f8d2cde415ee6194
context: preserve changeset in filectx if we have one

If we know original changeset we are working with - try hard to
preserve it. Fallback to filelog.linkrev() only if we have no way
to get original changeset, since linkrev() may point to other
changeset.

diff -r 027264e720aa -r 7471976f1ffd mercurial/context.py
--- a/mercurial/context.py	Sat Dec 29 01:14:45 2007 +0100
+++ b/mercurial/context.py	Sat Dec 29 16:57:43 2007 +0300
@@ -159,12 +159,11 @@ class filectx(object):
         if filelog:
             self._filelog = filelog
 
-        if fileid is None:
-            if changectx is None:
-                self._changeid = changeid
-            else:
-                self._changectx = changectx
-        else:
+        if changeid is not None:
+            self._changeid = changeid
+        if changectx is not None:
+            self._changectx = changectx
+        if fileid is not None:
             self._fileid = fileid
 
     def __getattr__(self, name):
@@ -175,7 +174,10 @@ class filectx(object):
             self._filelog = self._repo.file(self._path)
             return self._filelog
         elif name == '_changeid':
-            self._changeid = self._filelog.linkrev(self._filenode)
+            if '_changectx' in self.__dict__:
+                self._changeid = self._changectx.rev()
+            else:
+                self._changeid = self._filelog.linkrev(self._filenode)
             return self._changeid
         elif name == '_filenode':
             if '_fileid' in self.__dict__:
@@ -229,6 +231,8 @@ class filectx(object):
     def rev(self):
         if '_changectx' in self.__dict__:
             return self._changectx.rev()
+        if '_changeid' in self.__dict__:
+            return self._changectx.rev() 
         return self._filelog.linkrev(self._filenode)
 
     def node(self): return self._changectx.node()
diff -r 027264e720aa -r 7471976f1ffd tests/test-issue672.out
--- a/tests/test-issue672.out	Sat Dec 29 01:14:45 2007 +0100
+++ b/tests/test-issue672.out	Sat Dec 29 16:57:43 2007 +0300
@@ -45,7 +45,7 @@ resolving manifests
  1: remote moved to 1a -> m
 copying 1 to 1a
 merging 1 and 1a
-my 1 at 746e9549ea96+ other 1a at 2f8037f47a5c ancestor 1 at 81f4b099af3d
+my 1 at 746e9549ea96+ other 1a at ac7575e3c052 ancestor 1 at 81f4b099af3d
 removing 1
 0 files updated, 1 files merged, 0 files removed, 0 files unresolved
 (branch merge, don't forget to commit)
-------------- next part --------------
# HG changeset patch
# User Maxim Dounin <mdounin at mdounin.ru>
# Date 1198937508 -10800
# Node ID 94e498fd21e829792bd822e035aa361463242509
# Parent  7471976f1ffd2626c9a02efb723cfbdb026e1040
Fix copies reporting in log and convert.

If copy logged in file revision, we report copy for changeset only
if file revisions linkrev points back to the changeset in question
or both changeset parents contain different file revisions.

This fixes extra copies reported when executable bit was changed for
previously copied file.

diff -r 7471976f1ffd -r 94e498fd21e8 hgext/convert/hg.py
--- a/hgext/convert/hg.py	Sat Dec 29 16:57:43 2007 +0300
+++ b/hgext/convert/hg.py	Sat Dec 29 17:11:48 2007 +0300
@@ -238,7 +238,7 @@ class mercurial_source(converter_source)
         copies = {}
         for name in files:
             try:
-                copies[name] = ctx.filectx(name).renamed()[0]
+                copies[name] = ctx.filectx(name).renamedhere()[0]
             except TypeError:
                 pass
         return copies
diff -r 7471976f1ffd -r 94e498fd21e8 mercurial/commands.py
--- a/mercurial/commands.py	Sat Dec 29 16:57:43 2007 +0300
+++ b/mercurial/commands.py	Sat Dec 29 17:11:48 2007 +0300
@@ -1617,8 +1617,7 @@ def log(ui, repo, *pats, **opts):
         endrev = repo.changelog.count()
     rcache = {}
     ncache = {}
-    dcache = []
-    def getrenamed(fn, rev, man):
+    def getrenamed(fn, rev):
         '''looks up all renames for a file (up to endrev) the first
         time the file is given. It indexes on the changerev and only
         parses the manifest if linkrev != changerev.
@@ -1638,13 +1637,14 @@ def log(ui, repo, *pats, **opts):
                     break
         if rev in rcache[fn]:
             return rcache[fn][rev]
-        mr = repo.manifest.rev(man)
-        if repo.manifest.parentrevs(mr) != (mr - 1, nullrev):
-            return ncache[fn].get(repo.manifest.find(man, fn)[0])
-        if not dcache or dcache[0] != man:
-            dcache[:] = [man, repo.manifest.readdelta(man)]
-        if fn in dcache[1]:
-            return ncache[fn].get(dcache[1][fn])
+
+        # If linkrev != rev (i.e. rev not found in rcache) fallback to
+        # filectx logic.
+
+        try:
+            return repo.changectx(rev).filectx(fn).renamedhere()
+        except revlog.LookupError:
+            pass
         return None
 
     df = False
@@ -1681,9 +1681,8 @@ def log(ui, repo, *pats, **opts):
 
             copies = []
             if opts.get('copies') and rev:
-                mf = get(rev)[0]
                 for fn in get(rev)[3]:
-                    rename = getrenamed(fn, rev, mf)
+                    rename = getrenamed(fn, rev)
                     if rename:
                         copies.append((fn, rename[0]))
             displayer.show(rev, changenode, copies=copies)
diff -r 7471976f1ffd -r 94e498fd21e8 mercurial/context.py
--- a/mercurial/context.py	Sat Dec 29 16:57:43 2007 +0300
+++ b/mercurial/context.py	Sat Dec 29 17:11:48 2007 +0300
@@ -250,6 +250,31 @@ class filectx(object):
     def size(self): return self._filelog.size(self._filerev)
 
     def cmp(self, text): return self._filelog.cmp(self._filenode, text)
+
+    def renamedhere(self):
+        """check if file was actually renamed in this changeset revision
+
+        If rename logged in file revision, we report copy for changeset only
+        if file revisions linkrev points back to the changeset in question
+        or both changeset parents contain different file revisions.
+        """
+
+        renamed = self.renamed()
+        if not renamed:
+            return renamed
+
+        if self.rev() == self._filelog.linkrev(self._filenode):
+            return renamed
+
+        name = self.path()
+        fnode = self._filenode
+        for p in self._changectx.parents():
+            try:
+                if fnode == p.filenode(name):
+                    return None
+            except revlog.LookupError:
+                pass
+        return renamed
 
     def parents(self):
         p = self._path
diff -r 7471976f1ffd -r 94e498fd21e8 tests/test-convert-hg-source
--- a/tests/test-convert-hg-source	Sat Dec 29 16:57:43 2007 +0300
+++ b/tests/test-convert-hg-source	Sat Dec 29 17:11:48 2007 +0300
@@ -29,6 +29,9 @@ hg merge 2
 hg merge 2
 hg ci -m 'merge remote copy' -d '4 0'
 
+chmod +x baz
+hg ci -m 'mark baz executable' -d '5 0'
+
 cd ..
 hg convert --datesort orig new 2>&1 | grep -v 'subversion python bindings could not be loaded'
 cd new
diff -r 7471976f1ffd -r 94e498fd21e8 tests/test-convert-hg-source.out
--- a/tests/test-convert-hg-source.out	Sat Dec 29 16:57:43 2007 +0300
+++ b/tests/test-convert-hg-source.out	Sat Dec 29 17:11:48 2007 +0300
@@ -9,11 +9,12 @@ scanning source...
 scanning source...
 sorting...
 converting...
-4 add foo bar
-3 change foo
-2 make bar and baz copies of foo
-1 merge local copy
-0 merge remote copy
+5 add foo bar
+4 change foo
+3 make bar and baz copies of foo
+2 merge local copy
+1 merge remote copy
+0 mark baz executable
 comparing with ../orig
 searching for changes
 no changes found
diff -r 7471976f1ffd -r 94e498fd21e8 tests/test-convert-svn-sink.out
--- a/tests/test-convert-svn-sink.out	Sat Dec 29 16:57:43 2007 +0300
+++ b/tests/test-convert-svn-sink.out	Sat Dec 29 17:11:48 2007 +0300
@@ -167,31 +167,29 @@ d1
 d1
 % executable
 5:f205b3636d77
-svn: Path 'b' does not exist
 assuming destination a-hg
 initializing svn wc 'a-hg-wc'
 scanning source...
 sorting...
 converting...
 0 make a file executable
-abort: svn exited with status 1
-At revision 5.
-                5        5 test         .
- M              5        4 test         c
-                5        1 test         d1
-                5        1 test         d1/d2
-                5        1 test         d1/d2/b
+At revision 6.
+                6        6 test         .
+                6        6 test         c
+                6        1 test         d1
+                6        1 test         d1/d2
+                6        1 test         d1/d2/b
 <?xml version="1.0"?>
 <log>
 <logentry
-   revision="5">
+   revision="6">
 <author>test</author>
 <date/>
 <paths>
 <path
-   action="D">/b</path>
+   action="M">/c</path>
 </paths>
-<msg>remove a file</msg>
+<msg>make a file executable</msg>
 </logentry>
 </log>
 executable
diff -r 7471976f1ffd -r 94e498fd21e8 tests/test-log
--- a/tests/test-log	Sat Dec 29 16:57:43 2007 +0300
+++ b/tests/test-log	Sat Dec 29 17:11:48 2007 +0300
@@ -40,6 +40,11 @@ echo foo > foo
 echo foo > foo
 hg ci -Ame2 -d '6 0'
 hg log -vC --template '{rev} {file_copies%filecopy}\n' -r 5
+
+echo % log copies, execute bit set
+chmod +x e
+hg ci -me3 -d '7 0'
+hg log -vC --template '{rev} {file_copies%filecopy}\n' -r 6
 
 echo '% log -p d'
 hg log -pv d
diff -r 7471976f1ffd -r 94e498fd21e8 tests/test-log.out
--- a/tests/test-log.out	Sat Dec 29 16:57:43 2007 +0300
+++ b/tests/test-log.out	Sat Dec 29 17:11:48 2007 +0300
@@ -86,6 +86,8 @@ 1 files updated, 0 files merged, 1 files
 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
 adding foo
 5 e (dir/b)
+% log copies, execute bit set
+6 
 % log -p d
 changeset:   3:16b60bf3f99a
 user:        test


More information about the Mercurial-devel mailing list