statdaemon extension

Matt Mackall mpm at
Tue Aug 21 14:15:19 CDT 2012

On Tue, 2012-08-21 at 12:59 +0200, Martin Geisler wrote:
> Hi guys,
> I've been working on making Mercurial faster on Windows: a client
> contacted us because TortoiseHg was super slow: clicking on the "working
> directory" line took 15 seconds. This is with Windows XP on a fast
> machine. The repository has about 75k files, is on a SSD, and virus
> scanning was disabled for the working copy.
> Some spring cleaning brought down the number of files, but to make
> things faster I've begun implementing a statdaemon -- an inotify-like
> daemon that will keep track of the file system state.
> It's on Bitbucket:
> On a repository with 63k files, 'hg status --time' goes from 2.9 sec to
> 1.8 sec with the extension enabled. I get the same result with
> Measure-Command in PowerShell.
> Strangely, I see the time go from 2.4 sec to 2.7 sec when I use 'hg
> perfstatus' to measure the time.
> Bugs:
> With the big tree, I see stale results in 'hg status', even though I see
> that the statdaemon is updating its cache when I make changes. I haven't
> tracked this down yet and the tests (yay, tests on Windows!) haven't
> captured it yet.
> The README is below:
> ====================
> statdaemon extension
> ====================
> This is an extension for speeding up ``hg status`` and related
> commands that traverse the filesystem. This is done by spawning a
> daemon that listens for filesystem events and keeps an up-to-date
> view of the filesystem. When ``hg status`` is run, it contacts the
> daemon and quickly retrieves the stat data it needs.
> Design
> ======
> The extension is similar to the `inotify extension`_ in that it
> listens for file system events. However, unlike inotify, the
> statdaemon only caches filesystem information and knows nothing
> about Mercurial. The hope is that this will make the design simpler
> and correct.
> Server
> ------
> The server (daemon) is launched automatically when needed but it can
> also be started by hand with ``hg statdaemon``. This is useful for
> debugging.
> When started, the server listens on a random port for incoming
> requests. Clients can send ``listdir`` and ``fetchall`` queries to the
> server:
> * ``listdir(path)``: The server begins watching ``path`` for file
>   system events and sends back the output of
>   ``mercurial.osutil.listdir(path)``.
> * ``fetchall()``: The server sends back all cached data. The client
>   should call this initially to reduce the number of round-trips.
> Client
> ------
> The extension wraps ``dirstate.status`` to intercept status
> calls. When status data is needed, the client will first call
> ``fetchall()`` to retrieve a snapshot of status data for the entire
> working copy. The client caches this locally.
> Calls to ``osutil.listdir`` are intercepted and will first look in the
> cache for status data. If not found there, the server is queried with
> a ``listdir(path)`` query so that it can update its cache. This will
> make the next ``hg status`` call fast since the server will now keep
> track of the needed paths.
> .. _inotify extension:
> Platforms
> =========
> The extension currently has support for Windows. It should work on
> Windows XP and later, but has only been tested on Windows 7. Backends
> for other platforms are very welcome.

This looks interesting. Two notes:

- as we've learned from the inotify extension, it's really hard to get
this sort of thing to be race-free unless you design it to be race-free
from the get-go
- marshal and pickle both effectively let the sender execute arbitrary
code as the receiver, so you'd better be sure you trust the status
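To make the second note concrete, here is a minimal sketch in Python. It first shows a pickle payload making the receiver call a function the sender picked, then shows a data-only JSON encoding for a ``listdir`` reply. The reply layout (a path plus a list of ``(name, kind, size, mtime)`` tuples) and the helper names ``encode_listdir_reply`` and ``decode_listdir_reply`` are assumptions made up for illustration; they are not taken from the statdaemon code.

```python
import json
import pickle


class Payload(object):
    """Object whose pickled form asks the receiver to call eval()."""

    def __reduce__(self):
        # The receiver calls eval("6 * 7") while unpickling.
        # A hostile sender could name any importable callable here.
        return (eval, ("6 * 7",))


# Unpickling returns whatever the sender-chosen call returned,
# not a Payload instance.
result = pickle.loads(pickle.dumps(Payload()))


def encode_listdir_reply(path, entries):
    """Encode a reply as JSON bytes (hypothetical wire format).

    entries is assumed to be a list of (name, kind, size, mtime) tuples.
    """
    return json.dumps({"path": path, "entries": entries}).encode("utf-8")


def decode_listdir_reply(data):
    """Decode a reply; JSON yields only plain data, never calls."""
    reply = json.loads(data.decode("utf-8"))
    if not isinstance(reply, dict):
        raise ValueError("malformed reply")
    # JSON has no tuple type, so rebuild the tuples after decoding.
    entries = [tuple(entry) for entry in reply["entries"]]
    return reply["path"], entries


wire = encode_listdir_reply("src", [("a.py", "f", 120, 1345576519)])
path, entries = decode_listdir_reply(wire)
```

A data-only format still needs validation of the decoded values, but it removes the code-execution path that pickle and marshal leave open.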

Mathematics is the supreme nostalgia of our time.
