Odd behaviour while cloning (redundant seeks?)

Matt Mackall mpm at selenic.com
Tue May 5 17:02:12 CDT 2009


On Tue, May 05, 2009 at 03:48:32PM -0400, Greg Ward wrote:
> I got bored waiting for Mercurial to clone a local 900MB repository,
> so I ran lsof and strace on to see what it was up to.  I noticed this
> odd-looking sequence of system calls:
> 
> [...many identical _llseek() calls...]
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> read(3, "\0\0\0\363Bf\0\0\0\0\0\341\0\0\0019\0\1\36\177\0\1\36\200\0\1\36\177\377\377\377\377"...,
> 4096) = 4096
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> [...many identical _llseek() calls...]
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> read(3, "\0\0\0\363\202\16\0\0\0\0\1\32\0\0\1\314\0\1\36\277\0\1\36\300\0\1\36\277\377\377\377\377"...,
> 4096) = 4096
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> [...many identical _llseek() calls...]
> 
> Those sequences of identical _llseek() calls seem to be 30-40 calls
> long.  File descriptor 3 was .hg/store/00changelog.i.  After a few
> repetitions of the above cycle, I saw this:

That points to something in the lazyparser code that we use on large
indexes, which is rather suboptimal. How big is your 00changelog.i
file?

> Any ideas, suggestions, thoughts?  Have I uncovered a hidden
> performance bug, or is this expected?  Worth investigating further?

Poking at it a bit, I see there are a lot of avoidable seeks when
doing a clone. I've killed about half of them, but I haven't looked
into the lazyparser code yet.

-- 
Mathematics is the supreme nostalgia of our time.


More information about the Mercurial-devel mailing list