statdaemon extension

Nicolas Dumazet nicdumz at gmail.com
Wed Aug 22 04:24:57 CDT 2012


2012/8/22 Martin Geisler <mg at aragost.com>:
>
> Nicolas wrote some questions for me on Facebook:
>
>> Interesting. Shouldn't be difficult to piggyback on the inotify code to
>> get linux & mac backends.
>
> Exactly -- that's definitely the hope! I looked at the inotify and
> FSEvents APIs and it seems that the "tell me the state of this
> directory" API of the statdaemon fits both of them well.
>
> One thing that could be better: for Windows, the daemon will listen on
> events in the working copy -- recursively. This means that it will also
> get events for (huge) ignored directories.
>
> I did it like this since I was wary of making the daemon understand the
> .hgignore file (and in particular understand updates to it) and since it
> seems costly to begin monitoring each directory by itself.
>
> As I understand it, you would use the WaitForMultipleObjects function to
> wait on a file handle in Windows. That function can only wait for
> MAXIMUM_WAIT_OBJECTS, which is normally only 64. Since the extension is
> meant for file trees with many thousands of directories, one would need
> to make some tree-like structure of objects to wait on. That seemed a
> bit too cumbersome for a first prototype :)
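
For reference, the per-directory wait you describe would look roughly
like this -- a minimal, untested ctypes sketch; wait_on_dirs and the
single notify filter are hypothetical, not the daemon's actual code:

import ctypes
import ctypes.wintypes as wt

kernel32 = ctypes.windll.kernel32
kernel32.FindFirstChangeNotificationW.restype = wt.HANDLE
MAXIMUM_WAIT_OBJECTS = 64               # hard Win32 limit per wait call
FILE_NOTIFY_CHANGE_LAST_WRITE = 0x10
INFINITE = 0xFFFFFFFF

def wait_on_dirs(paths):
    # One change-notification handle per directory (non-recursive);
    # a single call can only wait on MAXIMUM_WAIT_OBJECTS of them.
    handles = [kernel32.FindFirstChangeNotificationW(
                   p, False, FILE_NOTIFY_CHANGE_LAST_WRITE)
               for p in paths]
    assert len(handles) <= MAXIMUM_WAIT_OBJECTS
    arr = (wt.HANDLE * len(handles))(*handles)
    # Returns WAIT_OBJECT_0 (0) + index of the directory that changed.
    return kernel32.WaitForMultipleObjects(len(handles), arr,
                                           False, INFINITE)

With thousands of directories one would indeed need a tree of waiters,
each handling at most 64 handles.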
>
>> Something that really isn't clear is how you fix the never-ending sync
>> issues. In particular:
>>
>> 1) when your StatServer replies to fetchall, you have no guarantee
>> that the latest scan has completed. Which means that you're
>> potentially replying with old data
>>
>> 2) when your listener modifies self._cache, your StatServer might be
>> reading it. Which means that fetchall() replies might be reflecting
>> some incomplete / inconsistent state
>>
>> 2) is easy to fix (Lock!)
>>
>> Mitigating 1) seems harder since you seem to be scanning all the time.
>> Any way to reply "I'm sure there was no events since I received your
>> message, and here's the state?"
>
> I actually don't think it needs fixing. The way I see it, the daemon has
> to reply with a consistent snapshot: that is, a snapshot that did exist
> at some point in time. To be useful, the snapshot should preferably have
> existed at some very recent point in time :)
>
> So if lots of events are coming in and the daemon is busy updating its
> cache, then it can still send out a copy of its cache: this cached
> snapshot represents the state you would see if you could do 'hg status'
> really, really fast and thus capture the half-updated working copy.
>
> It might very well be that a subsequent 'hg status' will show more
> changes -- but that's just because it was run at a later point in time.
>
> What is not allowed to happen is that the daemon returns a cache
> mentioning only
>
>   ['bbb', 'xxx']
>
> when whatever process is modifying the working copy did
>
>   write('aaa')
>   write('bbb')
>   write('xxx')
>   write('yyy')
>
> The snapshot with only ['bbb', 'xxx'] is inconsistent, but any snapshot
> with a "prefix" of ['aaa', 'bbb', 'xxx', 'yyy'] is okay.
>
> This has some connection with the idea of "serializability" -- that the
> final result will look *as if* the overlapping operations were executed
> serially. Here a cache with ['aaa', 'bbb'] would be okay, since it would
> be the result you would expect if the serial order had been:
>
>   write('aaa')
>   write('bbb')
>
>   hg status
>
>   write('xxx')
>   write('yyy')

Sure, there is some "eventual consistency": if the user calls "hg
status" N times with N big enough, the Nth call will reflect the exact
state of the system.

This, however, seems broken from a user's perspective.

Taking your example again, and adding numbers:
1. write('aaa')
2. write('bbb')
3. write('xxx')
4. write('yyy')

If a user makes those 4 changes and issues an `hg status` after 4.,
she will expect ['aaa', 'bbb', 'xxx', 'yyy'] as the result. Anything
else is arguably okay from a system-design perspective, but wrong for
the user.
Does the current implementation guarantee this?

As far as I understand it, your daemon can return anything in [ [],
['aaa'], ['aaa', 'bbb'], ['aaa', 'bbb', 'xxx'], ['aaa', 'bbb', 'xxx',
'yyy'] ], which means that, in turn, hg status can return any of those
5 replies.
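
Just to pin the property down: the acceptable replies are exactly the
prefixes of the event sequence. A tiny hypothetical check, not daemon
code:

events = ['aaa', 'bbb', 'xxx', 'yyy']    # the writes, in order

def is_acceptable(snapshot):
    # Consistent iff the snapshot is a prefix of the event sequence,
    # i.e. some serial cut of the writes could have produced it.
    return snapshot == events[:len(snapshot)]

assert is_acceptable(['aaa', 'bbb'])        # a valid serial cut
assert not is_acceptable(['bbb', 'xxx'])    # inconsistent: 'aaa' missing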

I don't think there is a bullet-proof way to always return complete answers:
1) You need to guarantee that the request handler waits for FS events
to be propagated to and handled by the FS events thread.
2) You need to synchronize the two threads.

And I think that you should not care about the last case:
3) When the request handler receives a request at t0, more events
might still be arriving (t1) by the time you eventually reply (t2).

Maybe something better than the current implementation (a Python
sketch; poll() stands in for the platform event API, and reply() for
the IPC back to the client):

import threading
import time

safety_margin_secs = ...        # to be tuned: upper bound on event latency
lock_ = threading.Lock()        # guards cache_ and last_poll_
cache_ = {}                     # path -> status info
last_poll_ = 0.0

def fsevents_thread():
    global last_poll_
    while True:
        # Poll outside the lock so that request handlers are not
        # blocked for the whole timeout.
        data = poll(timeout=safety_margin_secs)
        with lock_:
            last_poll_ = time.time()
            cache_.update(data)

def handle_request(request):
    # Throttle: waiting out one full poll cycle (plus margin) ensures
    # that any event that happened before the request arrived has been
    # folded into cache_ by fsevents_thread().
    time.sleep(time.time() - last_poll_ + 2 * safety_margin_secs)
    with lock_:
        snapshot = cache_.copy()    # consistent copy, taken under the lock
    reply(request, snapshot)

In other words, I suggest adding explicit throttling before answering
requests, trading speed for correctness.
Waiting is _never_ the perfect way to handle synchronization issues,
but given that we don't have access to OS internals, that's currently
the best thing I can think of.

On Linux you could do something like ioctl(inotify_fd, FIONREAD,
&bufsize); to tell whether there are any FS events still waiting to be
processed before replying to the client request, but that won't work
on Windows or Mac.
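
In Python, that check could look like this (untested sketch; the
ctypes inotify setup and the watched path are placeholders, not the
extension's actual code):

import array, ctypes, fcntl, termios

libc = ctypes.CDLL('libc.so.6', use_errno=True)
IN_MODIFY, IN_CREATE, IN_DELETE = 0x2, 0x100, 0x200

inotify_fd = libc.inotify_init()
libc.inotify_add_watch(inotify_fd, b'/path/to/repo',
                       IN_MODIFY | IN_CREATE | IN_DELETE)

def pending_event_bytes(fd):
    # FIONREAD reports how many bytes of queued inotify events are
    # still waiting to be read; zero means the kernel queue is
    # drained and it is safe to snapshot the cache.
    buf = array.array('i', [0])
    fcntl.ioctl(fd, termios.FIONREAD, buf)
    return buf[0]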

Thoughts?
-- 
Nicolas Dumazet — NicDumZ

