Odd behaviour while cloning (redundant seeks?)
Matt Mackall
mpm at selenic.com
Tue May 5 17:02:12 CDT 2009
On Tue, May 05, 2009 at 03:48:32PM -0400, Greg Ward wrote:
> I got bored waiting for Mercurial to clone a local 900MB repository,
> so I ran lsof and strace on to see what it was up to. I noticed this
> odd-looking sequence of system calls:
>
> [...many identical _llseek() calls...]
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> _llseek(3, 4694016, [4694016], SEEK_SET) = 0
> read(3, "\0\0\0\363Bf\0\0\0\0\0\341\0\0\0019\0\1\36\177\0\1\36\200\0\1\36\177\377\377\377\377"...,
> 4096) = 4096
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> [...many identical _llseek() calls...]
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> _llseek(3, 4698112, [4698112], SEEK_SET) = 0
> read(3, "\0\0\0\363\202\16\0\0\0\0\1\32\0\0\1\314\0\1\36\277\0\1\36\300\0\1\36\277\377\377\377\377"...,
> 4096) = 4096
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> _llseek(3, 4702208, [4702208], SEEK_SET) = 0
> [...many identical _llseek() calls...]
>
> Those sequences of identical _llseek() calls seem to be 30-40 calls
> long. File descriptor 3 was .hg/store/00changelog.i. After a few
> repetitions of the above cycle, I saw this:
That points to something in the lazyparser code that we use on large
indexes, which is rather suboptimal. How big is your 00changelog.i
file?
> Any ideas, suggestions, thoughts? Have I uncovered a hidden
> performance bug, or is this expected? Worth investigating further?
Poking at it a bit, I see there are a lot of avoidable seeks when
doing a clone. I've killed about half of them, but I haven't looked
into the lazyparser code yet.
--
Mathematics is the supreme nostalgia of our time.
More information about the Mercurial-devel
mailing list