statdaemon extension

Matt Mackall mpm at selenic.com
Tue Aug 21 14:15:19 CDT 2012


On Tue, 2012-08-21 at 12:59 +0200, Martin Geisler wrote:
> Hi guys,
> 
> I've been working on making Mercurial faster on Windows: a client
> contacted us because TortoiseHg was super slow: clicking on the "working
> directory" line took 15 seconds. This is with Windows XP on a fast
> machine. The repository has about 75k files, is on an SSD, and virus
> scanning was disabled for the working copy.
> 
> Some spring cleaning brought down the number of files, but to make
> things faster I've begun implementing a statdaemon -- an inotify-like
> daemon that will keep track of the file system state.
> 
> It's on Bitbucket:
> 
>   https://bitbucket.org/aragost/statdaemon
> 
> On a repository with 63k files, 'hg status --time' goes from 2.9 sec to
> 1.8 sec with the extension enabled. I get the same result with
> Measure-Command in PowerShell.
> 
> Strangely, I see the time go from 2.4 sec to 2.7 sec when I use 'hg
> perfstatus' to measure the time.
> 
> 
> Bugs:
> 
> With the big tree, I see stale results in 'hg status', even though I see
> that the statdaemon is updating its cache when I make changes. I haven't
> tracked this down yet and the tests (yay, tests on Windows!) haven't
> captured it yet.
> 
> 
> The README is below:
> 
> ====================
> statdaemon extension
> ====================
> 
> This is an extension for speeding up ``hg status`` and related
> commands that traverse the filesystem. This is done by spawning a
> daemon that listens for filesystem events and keeps an up-to-date view
> of the filesystem. When ``hg status`` is run, it contacts the daemon
> and quickly retrieves the stat data it needs.
> 
> Design
> ======
> 
> The extension is similar to the `inotify extension`_ in that it
> listens for file system events. However, unlike inotify, the
> statdaemon only caches filesystem information and knows nothing
> about Mercurial. The hope is that this will make the design simpler
> and correct.
> 
> Server
> ------
> 
> The server (daemon) is launched automatically when needed but it can
> also be started by hand with ``hg statdaemon``. This is useful for
> debugging.
> 
> When started, the server listens on a random port for incoming
> requests. Clients can send ``listdir`` and ``fetchall`` queries to the
> server:
> 
> * ``listdir(path)``: The server begins watching ``path`` for file
>   system events and sends back the output of
>   ``mercurial.osutil.listdir(path)``.
> 
> * ``fetchall()``: The server sends back all cached data. The client
>   should call this initially to reduce the number of round-trips.
> 
> Client
> ------
> 
> The extension wraps ``dirstate.status`` to intercept status
> calls. When status data is needed, the client will first call
> ``fetchall()`` to retrieve a snapshot of status data for the entire
> working copy. The client caches this locally.
> 
> Calls to ``osutil.listdir`` are intercepted and will first look in the
> cache for status data. If not found there, the server is queried with
> a ``listdir(path)`` query so that it can update its cache. This will
> make the next ``hg status`` call fast since the server will now keep
> track of the needed paths.
> 
> .. _inotify extension: http://mercurial.selenic.com/wiki/InotifyExtension
> 
> Platforms
> =========
> 
> The extension currently has support for Windows. It should work on
> Windows XP and later, but has only been tested on Windows 7. Backends
> for other platforms are very welcome.

This looks interesting. Two notes:

- as we've learned from the inotify extension, it's really hard to get
this sort of thing to be race-free unless you design it to be race-free
from the get-go
- marshal and pickle both effectively let the sender execute arbitrary
code as the receiver, so you'd better be sure you trust the status
daemon
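
To make the second point concrete, here is a minimal sketch of how
unpickling runs code chosen by the sender. It is not taken from the
statdaemon source; the Payload class and the layout of stat_reply are
hypothetical, and a harmless expression stands in for whatever a
compromised daemon might send. The second half shows a data-only
encoding (JSON) as one possible alternative, assuming the stat data can
be expressed as plain lists and numbers.

```python
import json
import pickle


class Payload(object):
    """Hypothetical stand-in for a reply crafted by an untrusted daemon."""

    def __reduce__(self):
        # pickle records "call eval with this string"; the receiver then
        # performs that call inside pickle.loads(). The expression here is
        # harmless, but it could be any Python expression.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
# The receiver never asked to evaluate anything, yet loads() does so.
result = pickle.loads(blob)
print(result)

# Data-only alternative: JSON decoding can only yield dicts, lists,
# strings, numbers, booleans and None, so no callable is ever invoked.
# The (mode, size, mtime) layout below is an assumption for illustration.
stat_reply = {"README": [33188, 1024, 1345576519]}
wire = json.dumps(stat_reply)
decoded = json.loads(wire)
print(decoded == stat_reply)
```

marshal does not have the same __reduce__ hook, but its documentation
also warns against loading data from untrusted sources, so the same
caution applies.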

-- 
Mathematics is the supreme nostalgia of our time.



