statdaemon extension

Martin Geisler mg at aragost.com
Mon Aug 27 06:11:32 CDT 2012


Matt Mackall <mpm at selenic.com> writes:

> On Thu, 2012-08-23 at 15:47 +0200, Martin Geisler wrote:
>> Matt Mackall <mpm at selenic.com> writes:
>> 
>> > On Wed, 2012-08-22 at 11:24 +0200, Nicolas Dumazet wrote:
>> >
>> >> Thoughts?
>> >
>> > Some thoughts on correctness..
>> >
>> > There are several different models of consistency that we can imagine:
>> >
>> > 1) status results are correct at _time of use_ (ie when we do a commit)
>> > 2) results are correct at _time of return_ (when status function exits)
>> > 3) results are consistent snapshot of some point of time during the call
>> > 4) results are consistent snapshot at the start of the call
>> > 5) results are collection of file states observable during the call
>> >
>> > Mercurial itself uses the weakest of these, 5. The others are
>> > basically impossible to implement, requiring either full directory
>> > tree locking or some sort of arbitrary restartable transaction scheme
>> > + reliable notification daemon. Also, the only one that's an actual
>> > meaningful improvement on 5 is 1. So we basically have a model that
>> > requires the user to serialize their changes and calls to hg for
>> > reliability.
>> 
>> I agree.
>> 
>> > The only model that is not acceptable is one that doesn't include all
>> > events that occur before the call, because then even the serializing
>> > user loses. So this should be the benchmark of correctness: it is
>> > impossible to miss an event that occurred before the status call.
>> 
>> Is this not option 2) above?
>> 
>> That is, the status call must report all events that occurred before the
>> call -- and we don't know how long we need to wait in order to ensure
>> that we've received all events.
>
> No, not at all.

I was talking about a daemon that listens for file system events and
also ensures that "it is impossible to miss an event that occurred
before the status call". I think that is close to 2) above and thus
impossible to implement.

> First, (2) includes a notion of instantaneous correctness; (5) does
> not. (5) doesn't even include a notion of causal ordering: you can
> observe status results that were -never- true on the filesystem due to
> differences between hg's scanning order and modifications on the
> filesystem. So, (5) doesn't even guarantee self-consistent(!) results
> with parallel writers.
>
> All (5) guarantees is that anything that happened before the call is
> seen; what happens to anything after that point is undefined.
>
> So how does this relate to a stat daemon? If at time x, we arrive at
> an empty fs event queue, we can be confident that we've observed every
> event before time x (putting aside queue spills for the moment).

I don't think that's how it works: we don't know how long the delay is
between when a file is written (and visible to a listdir call) and when
the event is posted (and thus visible to the daemon).

So even if the daemon can say that it has processed all events it was
notified about, we still don't know if there is an event "on the way".
The gap should be really small, but I expect it to be there.

> So, if we call the daemon and the daemon sees it has no work to do,
> we're done. But if the daemon instead sees it has lots of work queued
> up (possibly a never-ending amount!), it needs to just give up and
> tell hg to do status the old way. Defining 'lots' here is tricky, but
> in practice it should probably be a small number of events like 100 or
> 1000.
>
> Another possibility is that on each client request, we create a new
> event queue, disable new notifications on the existing queue with a
> filter, and then empty and close it.
>
> Relatedly, I think that a stat daemon should be 'lazy' in the sense
> that it avoids interleaving its own I/O requests with whatever is
> generating events (compiles, etc.). If we do, we risk the sorts of bad
> I/O patterns that Windows virus scanners and things like the indexers
> on Windows and Linux tend to cause. A lot of the effort here also ends
> up wasted: tools like compilers and editors create, modify, and delete
> things like temporary files over and over and we may not even run hg
> between compiles. So we could end up doing tons of work on files that
> hg itself never sees.

If I understand you correctly, then yes. I've thought of making the
daemon delay the listdir calls for some time, maybe even until status is
called. Delaying the calls will help when there are lots of events in
the same directory.
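
To make the delaying idea concrete, here is a rough Python sketch. The
class and method names are made up for illustration, and events are fed
in by hand instead of coming from a real notification backend. The daemon
only records which directories are dirty and does the listdir calls when
status is asked for:

```python
import os
import tempfile


class LazyDirCache:
    """Hypothetical daemon core: note dirty directories, rescan on demand."""

    def __init__(self):
        self.dirty = set()   # directories with unprocessed events
        self.cache = {}      # directory -> sorted list of entries
        self.scans = 0       # number of listdir calls, for illustration

    def on_event(self, directory):
        # Called by the (not shown) notification backend. No I/O here,
        # so we do not compete with whatever is generating the events.
        self.dirty.add(directory)

    def status(self):
        # Only now do we touch the disk, once per dirty directory.
        for d in sorted(self.dirty):
            try:
                self.cache[d] = sorted(os.listdir(d))
            except OSError:
                # The directory disappeared in the meantime.
                self.cache.pop(d, None)
            self.scans += 1
        self.dirty.clear()
        return dict(self.cache)


def demo():
    root = tempfile.mkdtemp()
    daemon = LazyDirCache()
    for name in ("a.o", "b.o", "c.o"):
        with open(os.path.join(root, name), "w"):
            pass
        daemon.on_event(root)   # three events in the same directory
    result = daemon.status()
    return result[root], daemon.scans
```

Three events in one directory cost a single listdir call here, which is
the saving I have in mind. The sketch does nothing about the delay
between a write and its event discussed above.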

The daemon would still rescan a directory where a temporary file was
created and then deleted -- unless more logic is added so that we note
when a file is created and later deleted. That goes against the idea of
making the daemon simple, and the FSEvents API on OS X doesn't give us
per-file information anyway. But maybe it's worth it.
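
For a backend that does report per-file events (inotify on Linux does),
the extra logic could look roughly like this. Again, all names are
hypothetical and this is only a sketch of the bookkeeping, not a tested
design:

```python
import os


class PendingChanges:
    """Hypothetical bookkeeping that cancels out create+delete pairs."""

    def __init__(self):
        self.created = set()     # paths created since the last status call
        self.dirty_dirs = set()  # directories that really need a rescan

    def on_create(self, path):
        self.created.add(path)

    def on_modify(self, path):
        # A file we saw being created is already accounted for.
        if path not in self.created:
            self.dirty_dirs.add(os.path.dirname(path))

    def on_delete(self, path):
        if path in self.created:
            # Created and deleted between two status calls: no net change.
            self.created.discard(path)
        else:
            self.dirty_dirs.add(os.path.dirname(path))

    def dirs_to_rescan(self):
        # Surviving new files still make their directory dirty.
        return self.dirty_dirs | {os.path.dirname(p) for p in self.created}
```

With this, a temporary file that comes and goes between two status calls
leaves its directory clean, while a new file that survives or a deleted
old file still triggers a rescan.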

-- 
Martin Geisler

aragost Trifork
Commercial Mercurial support
http://aragost.com/mercurial/

