size of repository with many branches, vs. git

Dov Feldstern dfeldstern at fastimap.com
Sat Mar 29 15:59:40 CDT 2008


Hi!

I'm a minor contributor to LyX (http://www.lyx.org), which uses 
subversion as its VCS. LyX is a very active project, and the repository 
goes back about 9 years, so it's currently up to about revision 24,000.

About six months ago I started playing around with mercurial, and being 
very pleased with it, I decided to start tracking LyX's svn repository 
with mercurial. I've been using hgsvn to pull trunk, plus the latest 
three release branches, into a mercurial repository. The size of the 
entire repository (trunk + 3 branches) and working directory was about 
200MB, which seems very reasonable --- roughly comparable to the size of 
an svn checkout.

Around the same time, some other developers on the project started 
playing around with git, so there is also a git repository tracking the 
project.

Recently, a new tag was created for LyX 1.6.0alpha1. AFAIK, tags are not 
reflected in hgsvn. I noticed that the git clone reflects exactly the 
svn repository (with all branches and tags), and since svn support in 
mercurial's convert extension had been extended, I decided to convert 
the entire repository --- branches and tags and all --- to mercurial.

The conversion went well (with some help from Patrick), but the result 
was disappointing to me: the size of the cloned repository ranges from 
~700MB (with no --datesort, converted in chunks of 1000 revisions at a 
time) to ~1GB (with --datesort, which probably better reflects what 
would happen over time as the project is tracked in real-time from svn). 
By comparison, the entire git repository (freshly cloned) is only ~200MB!
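
Since some of the figures above include the working directory and others 
may not, it can help to measure the history store separately from the 
checked-out files before comparing the two tools. The following is only a 
minimal sketch, assuming Python is available; the function names and the 
directory names lyx-hg and lyx-git are hypothetical, not taken from the 
actual setup.

```python
import os


def dir_size(path):
    """Sum the sizes (in bytes) of regular files under path.

    Symlinks are skipped. A path that does not exist yields 0,
    because os.walk simply produces nothing for it.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def split_sizes(repo_root, meta_dir=".hg"):
    """Return (metadata_bytes, working_bytes) for a checkout.

    meta_dir is the VCS metadata directory: ".hg" for mercurial,
    ".git" for git. Everything else under repo_root is counted as
    working files.
    """
    meta = dir_size(os.path.join(repo_root, meta_dir))
    total = dir_size(repo_root)
    return meta, total - meta


if __name__ == "__main__":
    # Hypothetical checkout locations; adjust to the real paths.
    for root, meta in (("lyx-hg", ".hg"), ("lyx-git", ".git")):
        store, working = split_sizes(root, meta)
        print("%s: store %.1f MB, working files %.1f MB"
              % (root, store / 1e6, working / 1e6))
```

Note that this sums apparent file sizes rather than disk blocks, so the 
numbers can differ somewhat from what du reports.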

I find the difference between 200MB and 1GB to be quite significant. Just 
cloning the newly created mercurial repository from LyX's server took 
about two hours (hg clone -e "ssh -C", which I believe is the most 
efficient way over the internet, through ssh?). And I foresee a lot of 
trouble trying to convince other developers to switch over to mercurial 
rather than to git, with this kind of performance disparity... Also, one 
of my aims was to publish the mercurial clone publicly somewhere 
(http://freehg.org, http://sharesource.org, ...), but I don't know if 
they will allow / can handle repositories of this size... (does anyone 
know about this?)

Does this make sense, or am I doing something wrong? I like mercurial a 
lot (and I *love* mercurial queues!), but this is causing me to have a 
crisis of faith in it. So I would be very happy for someone to point out 
to me something that I'm doing wrong...

(Just in case someone wants to delve a little deeper into this: the 
original svn repository is available at 
svn://svn.lyx.org/lyx/lyx-devel/trunk, or browsable through trac at 
http://www.lyx.org/trac/browser/ . The unofficial git repository is at 
http://repo.or.cz/w/lyx.git . As I mentioned, the mercurial repository 
is not currently publicly available, for want of a place to host it...).

Thanks!
Dov


