statdaemon extension

Martin Geisler mg at aragost.com
Tue Aug 21 05:59:01 CDT 2012


Hi guys,

I've been working on making Mercurial faster on Windows. A client
contacted us because TortoiseHg was super slow: clicking on the "working
directory" line took 15 seconds. This was with Windows XP on a fast
machine. The repository had about 75k files and was on an SSD, and virus
scanning was disabled for the working copy.

Some spring cleaning brought down the number of files, but to make
things faster I've begun implementing a statdaemon -- an inotify-like
daemon that will keep track of the file system state.

It's on Bitbucket:

  https://bitbucket.org/aragost/statdaemon

On a repository with 63k files, 'hg status --time' goes from 2.9 sec to
1.8 sec with the extension enabled. I get the same result with
Measure-Command in PowerShell.

Strangely, I see the time go from 2.4 sec to 2.7 sec when I use 'hg
perfstatus' to measure the time.


Bugs:

With the big tree, I see stale results in 'hg status', even though I see
that the statdaemon is updating its cache when I make changes. I haven't
tracked this down yet and the tests (yay, tests on Windows!) haven't
captured it yet.


The README is below:

====================
statdaemon extension
====================

This is an extension for speeding up ``hg status`` and related
commands that traverse the filesystem. This is done by spawning a
daemon that listens for filesystem events and keeps an up-to-date view
of the filesystem. When ``hg status`` is run, it contacts the daemon
and quickly retrieves the stat data it needs.

Design
======

The extension is similar to the `inotify extension`_ in that it
listens for file system events. However, unlike inotify, the
statdaemon only caches filesystem information and knows nothing
about Mercurial. The hope is that this will make the design simpler
and easier to get correct.

Server
------

The server (daemon) is launched automatically when needed, but it can
also be started by hand with ``hg statdaemon``. This is useful for
debugging.

When started, the server listens on a random port for incoming
requests. Clients can send ``listdir`` and ``fetchall`` queries to the
server:

* ``listdir(path)``: The server begins watching ``path`` for file
  system events and sends back the output of
  ``mercurial.osutil.listdir(path)``.

* ``fetchall()``: The server sends back all cached data. The client
  should call this initially to reduce the number of round trips.
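
The two queries can be modelled roughly as follows. This is only a
sketch and not the extension's actual code: the ``StatCache`` class and
the shape of its entries are hypothetical, ``os.scandir`` stands in for
``mercurial.osutil.listdir``, and neither the network protocol nor the
real file system watching is shown.

```python
import os
import stat


class StatCache(object):
    """Toy in-memory model of the daemon's cache (hypothetical)."""

    def __init__(self):
        self.watched = set()  # directories the daemon would watch
        self.entries = {}     # directory -> sorted list of entry tuples

    def listdir(self, path):
        # A real daemon would register for file system events on
        # ``path`` here; this sketch only records the path.
        self.watched.add(path)
        result = []
        with os.scandir(path) as it:
            for entry in it:
                st = entry.stat(follow_symlinks=False)
                # Assumed entry shape: (name, file type, size, mtime).
                result.append((entry.name, stat.S_IFMT(st.st_mode),
                               st.st_size, int(st.st_mtime)))
        result.sort()
        self.entries[path] = result
        return result

    def fetchall(self):
        # Return a copy of everything cached so far, so a client can
        # fill its local cache in a single round trip.
        return dict(self.entries)
```

Each ``listdir`` call adds one directory to the cache, so a later
``fetchall`` returns everything earlier ``listdir`` calls have asked
for.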

Client
------

The extension wraps ``dirstate.status`` to intercept status
calls. When status data is needed, the client will first call
``fetchall()`` to retrieve a snapshot of status data for the entire
working copy. The client caches this locally.

Calls to ``osutil.listdir`` are intercepted and will first look in the
cache for status data. If not found there, the server is queried with
a ``listdir(path)`` query so that it can update its cache. This will
make the next ``hg status`` call fast since the server will now keep
track of the needed paths.
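
The lookup order described above can be sketched like this. The
``CachingClient`` class and its two callables are hypothetical names
used only for illustration; the real extension talks to the server over
a socket and wraps ``osutil.listdir`` rather than taking callables.

```python
class CachingClient(object):
    """Toy model of the client-side cache (hypothetical)."""

    def __init__(self, fetchall, listdir):
        # ``fetchall`` and ``listdir`` stand in for the two server
        # queries; here they are plain Python callables.
        self._listdir = listdir
        # One initial round trip fetches everything the server knows.
        self._cache = dict(fetchall())

    def listdir(self, path):
        try:
            # Fast path: answered from the local snapshot.
            return self._cache[path]
        except KeyError:
            # Slow path: ask the server, which then starts tracking
            # ``path`` so the next run finds it in the snapshot.
            result = self._listdir(path)
            self._cache[path] = result
            return result
```

With this structure, only directories the server has never seen cost an
extra round trip.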

.. _inotify extension: http://mercurial.selenic.com/wiki/InotifyExtension

Platforms
=========

The extension currently supports Windows only. It should work on
Windows XP and later, but has only been tested on Windows 7. Backends
for other platforms are very welcome.


-- 
Martin Geisler

aragost Trifork
Commercial Mercurial support
http://aragost.com/mercurial/

