largefiles: overheads to hg startup

Matt Mackall mpm at selenic.com
Mon Oct 24 09:55:17 CDT 2011


On Mon, 2011-10-24 at 00:03 -0700, Victor Suba wrote:
> Hi,
> 
> Sorry for starting with just describing an issue instead of providing a
> patch.  Still trying to figure out the code fully before I can consider
> contributing a patch.
> 
> I realized that largefiles (and kbfiles) were adding significant time to "hg
> status" and "hg commit".
> When either of these extensions is enabled they have a cost, perhaps even
> more so if large files are not used in the repository.
> 
> On mozilla-central, "hg status" goes from 0.7s to 6.5s with "largefiles"
> enabled (on Win 7 with a warm file system cache; from a cold cache it's
> much slower).

Ouch. Reproduced that here as well: "hg status" goes from 0.5s to 3.8s
on Linux with an SSD on my linux-kernel repo. I think all that's needed
to hit this is a repo with lots of files. Another test:

$ time hg tip -q
269627:f5c909c23dc1

real	0m0.241s
user	0m0.177s
sys	0m0.063s

$ time hg tip -q --config extensions.largefiles=
269627:f5c909c23dc1

real	0m1.953s
user	0m1.767s
sys	0m0.180s

> A significant portion (3.5s) is in checkrequireslfiles, which I guess is
> searching for the existence of '.hglf'.  On a repo that doesn't use
> largefiles, this check will always run (hence a worse slowdown for repos
> that don't use "largefiles" than for those that do).  Speeding up the
> check here would make a big difference.

Yeah, this cost basically needs to vanish entirely. Having a measurable
hit on non-largefile repos is extremely bad news. It's probably enough
to simply check for a .hglf directory in the working directory rather
than grovel through all the files in the store on every startup.
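
A minimal sketch of that cheaper check, using only the Python standard
library. The function name and the repo_root parameter are hypothetical
and this is not the extension's actual code; it only illustrates testing
one fixed path instead of walking the store.

```python
import os


def working_dir_uses_largefiles(repo_root, standin_dir=".hglf"):
    """Return True if the working directory has a standin directory.

    Hypothetical helper: repo_root is assumed to be the top of the
    working directory, and '.hglf' is the directory named in this thread.
    """
    # One path lookup, so the cost does not grow with the number of
    # files tracked in the repository.
    return os.path.isdir(os.path.join(repo_root, standin_dir))
```

Whether a working-directory check alone covers every case is an open
question here; the sketch only shows why it would be cheap.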

> The second portion is that the extension calls dirstate.status twice
> instead of the usual once and does some extra processing, which makes up
> the rest of the time.

Ouch!

Really, destroying status performance just by enabling an extension is
extremely bad news. This is one of the most heavily-used and
heavily-optimized code paths we have. We already want to minimize this
overhead in the using-largefiles case, but it really should be zero in
the no-largefiles case.
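
One way to picture the zero-overhead goal is a wrapper with a fast path.
This is a rough sketch with hypothetical names (make_status_wrapper,
largefiles_status, uses_largefiles), not Mercurial's or the extension's
real API: when the repo does not use largefiles, the original status
function is called exactly once and nothing else runs.

```python
def make_status_wrapper(orig_status, largefiles_status, uses_largefiles):
    """Build a status function that only pays for largefiles when needed.

    All three arguments are hypothetical callables:
      orig_status       -- the unwrapped status implementation
      largefiles_status -- the slower largefiles-aware variant, which
                           receives orig_status as its first argument
      uses_largefiles   -- cheap predicate, True if the repo has largefiles
    """
    def wrapper(*args, **kwargs):
        if not uses_largefiles():
            # Fast path: a single call to the original, no extra processing.
            return orig_status(*args, **kwargs)
        # Slow path: delegate to the largefiles-aware implementation.
        return largefiles_status(orig_status, *args, **kwargs)

    return wrapper
```

The predicate itself has to be cheap for this to help, since it runs on
every call.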

-- 
Mathematics is the supreme nostalgia of our time.



