Request for review: bzr import support for Convert Extension

Patrick Mézard pmezard at gmail.com
Tue Sep 23 06:08:30 CDT 2008


Marek Kubica wrote:
> Hi,
> 
> you probably forgot about me since I posted first exactly one month
> ago, on the 22nd of August.
> 
> Since that time, I continued to work on it in a repository that pmezard
> set up on Bitbucket: <http://www.bitbucket.org/pmezard/hg-bzr/>
> 
> You can clone freely from there and tell me your thoughts - I'd like to
> get this stuff into the mainline to be able to work in the repository
> on the bzr export. So I'm willing to put this up for discussion now.
> 
> I really have to thank Patrick Mézard for his help by answering some of
> my stupid questions on IRC and the periodic reviews which made sure
> that I didn't screw up the import completely.
> 
> Now, why did it take 5 weeks? I tried making the thing as robust as
> possible, I added quite a few tests and ran it against huge
> repositories.
> 
> I was testing mainly on the Python bzr-repository which is available
> from Python.org (it is already a mirror of the svn version), but
> besides taking two days for importing the 69000+ revisions it went
> fine. For the last weeks, I've been busy importing the currently 19640
> revisions of bzr.dev, the main Bazaar development repository, which
> stressed the code a lot more as the history of bzr.dev is quite quirky.
> 
> Fun facts: the bzr repository's .hg takes up 42MB compared with
> Mercurial's 10MB and has nearly three times as many revisions. I
> consider this code stress-tested :) Some of the strange test-cases
> were reproduced by creating small programs which use the bzrlib API, so
> the tests catch about all strange stuff that can happen.
> 
> If you're interested, I can provide a clone of bzr.dev - running hg
> glog on it is quite fun, as there are quite a lot of branches in
> parallel.
> 
> So - I'd be happy if this could be integrated into the mainline, so I
> could take a look at the bzr export.

I ran a not completely up-to-date version against bzr.dev and was puzzled by the slowness of the conversion, especially with respect to merge revisions. Do you know whether revisiontree.iter_changes() returns exactly all changes between the current tree and the other one? If so, getchanges() can be simplified to the following (and maybe further):

diff --git a/hgext/convert/bzr.py b/hgext/convert/bzr.py
--- a/hgext/convert/bzr.py
+++ b/hgext/convert/bzr.py
@@ -89,15 +89,9 @@
             # we still need to get rid of ghost ids
             parentids = self._filter_ghosts(parentids)
 
-        changes = []
-        renames = {}
-        prevtrees = [self.sourcerepo.revision_tree(rev) for rev in parentids]
-        for prevtree in prevtrees:
-            change, rename = self._gettreechanges(self._revtree, prevtree)
-            changes.extend(change)
-            renames.update(rename)
-
-        return (changes, renames)
+        # Diff against first parent only
+        prevtree = self.sourcerepo.revision_tree(parentids[0])
+        return self._gettreechanges(self._revtree, prevtree)
 
     def getcommit(self, version):
         """Gets the metainformation of the commit"""



We don't need the changes against the second parent to build the merged changeset, so this should be much faster. All tests pass with this change, but I did not run it against a real bzr repository. Also, perhaps you could time the impact of caching the parentids; that would halve the number of calls to _filter_ghosts() and the number of lock calls (and please remove the middle underscore, i.e. rename _filter_ghosts() to _filterghosts()).
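
To make the caching idea concrete, here is a rough sketch of what I have in mind. It is not taken from bzr.py: the ParentCache class, its method names and the two callables passed to it are hypothetical stand-ins, and I am assuming the parent ids of a given revision are currently computed more than once during a conversion.

```python
class ParentCache(object):
    """Hypothetical helper: remembers the ghost-filtered parent ids
    of each revision so the filtering runs only once per revision."""

    def __init__(self, getparents, filterghosts):
        # getparents(revid) -> list of raw parent ids (assumed callable)
        # filterghosts(ids) -> list of ids without ghosts (assumed callable)
        self._getparents = getparents
        self._filterghosts = filterghosts
        self._cache = {}

    def parents(self, revid):
        """Return the filtered parent ids, computing them on first use."""
        try:
            return self._cache[revid]
        except KeyError:
            parents = self._filterghosts(self._getparents(revid))
            self._cache[revid] = parents
            return parents


# Stand-ins for the real repository calls, only here to count invocations.
calls = []

def fake_getparents(revid):
    return {"rev2": ["rev1", "ghost"]}.get(revid, [])

def fake_filterghosts(ids):
    calls.append(tuple(ids))
    return [i for i in ids if i != "ghost"]

cache = ParentCache(fake_getparents, fake_filterghosts)
first = cache.parents("rev2")   # filters once
second = cache.parents("rev2")  # served from the cache
```

With something like this, whichever methods need the parents of a revision would ask the cache instead of filtering again. Whether it actually helps depends on how expensive the filtering and locking really are, hence the request to time it.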

--
Patrick Mézard



