Performance issues when merging branches (1000+ files being merged takes ~75 seconds)

Matt Mackall mpm at selenic.com
Thu Jun 17 09:37:09 CDT 2010


On Thu, 2010-06-17 at 09:15 -0500, Matt Mackall wrote:
> On Thu, 2010-06-17 at 00:46 +0300, Idan K wrote:
> > Hi,
> > 
> > As part of choosing an SCM, the usual debate between Mercurial and Git
> > has begun in our workplace, so we decided to do some benchmarks. One of
> > the tests was this:
> > 
> > 1. Clone Mercurial's repository (http://hg.intevation.org/mercurial/),
> > updated to default.
> > 2. Create a named branch "branch-a".
> > 3. For every file in the repository, insert a new line after line #3
> > (if the file contains 3 lines), containing "### ADDED TEXT ###".
> > 4. Commit.
> > 5. Update back to default.
> > 6. Create a named branch "branch-b".
> > 7. Repeat step 3, but for line #9.
> > 8. Commit.
> > 9. Merge "branch-a" to "branch-b".
> > 
> > I ran this test on Ubuntu 10.04 with Mercurial 1.5.4 and the result
> > was quite surprising to say the least: Step #9 took about 75 seconds
> > (average of 5 runs) on a Q6600 with 4GB RAM, ext4 drive.
> > I also ran the same test with Git (using fast-export to convert
> > Mercurial's repository to Git) to get an idea of what I should be
> > getting, and this time step #9 took an average of 1 second. That's
> > quite a difference. I know Mercurial is written in Python (with some C
> > modules) and Git in C, but can the gap be that big?
> 
> Oh, probably. This is the first time in our history that file-level
> merging has been raised as a performance issue, so it's not exactly
> been something we've put time into optimizing.
> 
> Your profile didn't show anything interesting, I'm afraid - all the
> numbers are way too small to account for the slowness. Can you send the
> result of using time(1) too? Perhaps there's a lot of extra I/O
> happening.
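
For anyone wanting to reproduce the test, the per-file edit in steps 3
and 7 of the quoted recipe might look roughly like the sketch below. It
is an illustration only, not the script Idan used: the function name is
made up, and it assumes "contains 3 lines" means "has at least that many
lines", with shorter files left untouched.

```python
MARKER = "### ADDED TEXT ###"  # the text named in step 3 of the quoted test


def insert_after_line(text, line_no, marker=MARKER):
    """Return `text` with `marker` added as a new line after line `line_no`.

    Line numbers are 1-based. Text with fewer than `line_no` lines is
    returned unchanged (an assumption about the original test).
    """
    lines = text.splitlines(keepends=True)
    if len(lines) < line_no:
        return text
    # If the target line is the last one and has no trailing newline,
    # terminate it so the marker lands on a line of its own.
    if not lines[line_no - 1].endswith(("\n", "\r")):
        lines[line_no - 1] += "\n"
    lines.insert(line_no, marker + "\n")
    return "".join(lines)
```

Calling it with line_no=3 on one branch and line_no=9 on the other gives
edits to the same files that do not overlap, so every file ends up going
through a real file-level merge.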

It looks like for every merged file, we:

- do a fairly extensive merge-tool selection process that tries to find
the best tool on your system (with no caching)
- back up the local pre-merge state in .hg/merge/ for later resolve
- write out two temporary files (other and ancestor)
- make a backup .orig file in the working directory
- do a pre-merge (read three files in, merge, write one out)

Most of that time (except the second step) could be eliminated for
internal merge.
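
To make the "no caching" point in the first item concrete, here is a
minimal sketch of doing the tool lookup once and reusing the answer for
every file. This is not Mercurial's actual filemerge code: the candidate
list and the function name are hypothetical, and a real version would
also have to account for per-file merge patterns, which this ignores.

```python
import shutil
from functools import lru_cache

# Hypothetical candidate list, not Mercurial's real merge-tools config.
CANDIDATES = ("kdiff3", "meld", "diff3")


@lru_cache(maxsize=None)
def find_tool(candidates=CANDIDATES):
    """Return the first candidate found on PATH, else "internal:merge".

    The PATH search runs once per distinct `candidates` tuple; later
    calls with the same tuple are answered from the cache.
    """
    for name in candidates:
        if shutil.which(name):
            return name
    return "internal:merge"
```

With something like this, merging 1000 files costs one search of PATH
rather than 1000, at the price of not noticing a tool installed while
the merge is running.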

-- 
Mathematics is the supreme nostalgia of our time.



