OS X FSEvents and temporary files

Matt Mackall mpm at selenic.com
Sun Jun 13 17:14:37 CDT 2010


On Sun, 2010-06-13 at 23:38 +0200, Jason Harris wrote:
> On Jun 13, 2010, at 10:40 PM, Matt Mackall wrote:
> 
> > I talked with Martin a bit about the FSEvents problem you're having with
> > MacHG and took a closer look at what Mercurial's doing and how Apple's
> > API works.
> 
> Thanks!
> 
> > In summary, it looks like this:
> > 
> > 1. MacHG gets notified of a change in repo/
> > 2. MacHG asks Mercurial for status info on repo/
> > 3. Mercurial scans repo/ and encounters a file with the x bit changed
> > 4. Mercurial attempts to detect whether this is a real change by
> > detecting whether the filesystem that repo/ is on actually supports the
> > x bit by creating a temp file and changing its mode
> > 5. MacHG gets notified of another change in repo/
> > 6. MacHG calls status
> 
> I am not sure about steps 3 and 4 of course since they are internal to Mercurial but the other steps are indeed exactly what happens and we start racing...
> 
> 
> > So there's a couple salient facts here:
> > 
> > a) the FSEvents API only reports changes at a directory granularity
> > b) it also aggregates events in directory trees
> > c) if directory events are ignored, detection of real changes may be
> > delayed indefinitely
> 
> Directory events are the only kind of events ever reported. So you are
> saying if we turn off notifications then... well... we won't be
> notified. But this seems too much like a tautology so you are probably
> saying something more, but I am not sure what...

Some of this is for other folks following along. But my point is that we
can't disregard any events without inspecting them.

> > d) Mercurial needs these tests to deal with non-Unix-like filesystems,
> > which may be present on Macs
> 
> Ahhhh.... which type of systems??

Anything that's not a local Unix filesystem with normal exec bits and
symlinks. I forget how/if checkcase gets involved here but it may.
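
For anyone following along, the kind of probe being discussed can be
sketched roughly as below. This is not Mercurial's actual
checkexec/checklink code: the names supports_exec_bit and
supports_symlinks and the .probe- file names are made up for
illustration. It does show why a status call can itself trigger a
directory event: the probe creates and deletes a file inside the
directory being watched.

```python
import os
import stat
import tempfile


def supports_exec_bit(path):
    """Guess whether the filesystem holding `path` keeps the x bit.

    Illustrative only: creates a temp file in `path`, flips its exec
    bits, and checks whether the change is visible afterwards.
    """
    fd, name = tempfile.mkstemp(dir=path, prefix=".probe-")
    try:
        os.close(fd)
        mode = stat.S_IMODE(os.stat(name).st_mode)
        wanted = mode ^ 0o111  # toggle all three exec bits
        os.chmod(name, wanted)
        return stat.S_IMODE(os.stat(name).st_mode) == wanted
    finally:
        os.unlink(name)  # the create/delete pair is what the watcher sees


def supports_symlinks(path):
    """Guess whether symlinks can be created inside `path`."""
    name = os.path.join(path, ".probe-link-%d" % os.getpid())
    try:
        os.symlink(".", name)
    except (OSError, NotImplementedError):
        return False
    os.unlink(name)
    return True
```

Both helpers clean up after themselves, so the directory listing is the
same before and after, even though the watcher is still woken up.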

>  If I knew this then I could check the entirety of the repository when
> MacHg loads the repository and then we wouldn't need to do this
> checking every single time Mercurial is called (Mercurial is called
> lots and lots of times from MacHg, in fact whenever there are
> changes ;) ). Thus MacHg (or other clients) could be responsible for
> doing the checkexec and checklink checks which Mercurial is now doing.
> In fact it would be nice to know this so that MacHg would be able to
> report a nice warning / error message to users. (of course this needs
> to be a switch with default behavior the way things currently work...)
> 
> 
> > e) it appears the temp files only get created/destroyed when there are
> > files with the exec or symlink bits changed
> 
> 
> 
> > One proposed fix is to make the test files in .hg/, but that will get us
> > in trouble if someone decides to use a symlink for .hg (not a terribly
> > unreasonable thing to do).
> 
> Yep. MacHg could also check that .hg is not a symlink and report an error / warning as well...
> 
> > I propose to instead fix it by inserting these steps:
> > 
> > 1a: MacHG immediately grabs a directory listing of repo/
> > 4b: If there are pending events on repo/ upon return from calling
> > Mercurial,
> 
> events get delivered asynchronously and sometimes up to 2 or 3 seconds
> after the changes have taken place.

Really? That's unfortunate. But I think we can still work around it.

>  (you can set this parameter but normally there is at least some
> delay...) MacHg gets a lot of its speed by doing things in a threaded
> asynchronous manner. (You can try flushing the events, but sometimes
> the flush will be done mid status check, etc.)
> 
> Thus sometimes several status requests are done asynchronously and it's hard to know which directory is paired with which result.
> 
> > 4c: If it has the kFSEventStreamEventFlagMustScanSubDirs flag set, go to
> > step 2 - a genuine file change appeared during status
> 
> This flag very very rarely comes up. It happens when, for instance, the processor loads are maxed out and somehow the events are missed. It's a very exceptional case... So maybe you meant something else?
> 
> 
> > 4d: Else, grab a directory listing of repo/ and compare it against the
> > one from 1a, if there are any changes, go to step 2
> 
> By comparing you mean to compare the status of all the top level files
> and directories. Where the status is their sizes, modification dates,
> permissions, and any other meta-data associated with the file right?
> We would do this in order to detect if the file "changed" right?  That
> is basically you are suggesting to do a top level walk of the
> directory looking for changes manually right?
> 
> Buttt.... say something changes in a low level directory but a status
> is also done at an upper level in an asynchronous way, then the
> directories will be coalesced by FSEvents, and then the FSEvents
> monitor will pass back "something changed" inside the whole tree. And
> thus, the top level manual walk of the directory wouldn't pick up this
> change at the lower levels.
> 
> If we had to walk all of the files in the whole repository then we are
> basically doing the whole job of the FSEvents monitor and moreover it
> wouldn't be at all fast. It would be far too slow and that's why I am
> using FSEvents monitor in the first place. (of course we would still
> have FSEvents monitor telling us that something changed in the first
> place...)
> 
> I tried such tricks as you mentioned in steps 4 and in step 1. I tried
> a large variety of them. They all failed in various aspects. I devoted
> a significant chunk of code to it. I can point to the details of where
> this happened in the Cocoa code if you are really interested in the
> nitty gritty details :)
> 
> I have to say the usability of MacHg increased markedly from a user
> perspective once I found out I could turn off checkexec and checklink.
> (Well, I hadn't released it at that time, but I was of course using it
> to develop itself.) The discovery was fantastic. This issue was
> likely the central problem I have had while developing MacHg. I.e., there
> are other more complicated things but this one troubled me for the
> longest amount of time and was the most problematic, until finally I
> found I could shut it off with two simple lines of code. After that
> change MacHg was much more reliable and various files didn't slip
> through the net. There are still fringe cases lurking: if you madly
> click around, switching repositories, doing random things, and
> loading things up asynchronously, I have seen the occasional status
> glitch. But more or less, the transient status problems which were
> plaguing me are no longer present at all.
> Fantastic.

The basic observation of my approach is that if you have:

known before state in directory X
any number of events in directory X that don't touch subdirectories
known after state in directory X

and before=after, you can ignore all the events between the two
snapshots because you know nothing changed.
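
A minimal sketch of that check, assuming a non-recursive listing keyed
by name is a good enough notion of "state". snapshot and stable_status
are hypothetical names, and run_status stands in for however the client
actually invokes Mercurial.

```python
import os


def snapshot(directory):
    """Return {name: (mode, size, mtime_ns, inode)} for one directory.

    Symlinks are not followed and subdirectories are not descended into.
    """
    snap = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            snap[entry.name] = (st.st_mode, st.st_size,
                                st.st_mtime_ns, st.st_ino)
    return snap


def stable_status(repo, run_status, max_tries=5):
    """Call run_status(repo) until the listing is the same before and after.

    run_status is a caller-supplied callback (an assumption of this
    sketch). If the listing never settles, the last result is returned.
    """
    result = None
    for _ in range(max_tries):
        before = snapshot(repo)      # known "before" state
        result = run_status(repo)    # may create and remove temp files
        if snapshot(repo) == before:
            return result            # before == after: events are noise
    return result
```

The comparison looks at the entries only, not at the directory's own
mtime, so a temp file that is created and removed during the status call
leaves before and after equal.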

A 'normal' FSEvents watcher would have to basically look at the contents
of each directory in the event queue and compare it to its last known or
startup state like the above to figure out what files have changed.
MacHG can mostly get away without tracking that stuff itself because it
can rely on Mercurial's internal dirstate. But if it did a bit of both, it
could be smarter and faster: for instance, by only calling for status on
files it knows have changed.
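
As a rough illustration of "only calling for status on files it knows
have changed", assume two listings of one directory, each a dict mapping
a name to some tuple of stat data. changed_names and status_command are
hypothetical helpers; the only thing taken from Mercurial here is that
hg status accepts file names as arguments.

```python
def changed_names(before, after):
    """Return the names added, removed, or modified between two listings."""
    added = after.keys() - before.keys()
    removed = before.keys() - after.keys()
    modified = {name for name in before.keys() & after.keys()
                if before[name] != after[name]}
    return sorted(added | removed | modified)


def status_command(names):
    """Build an argument list asking for status on just those names.

    Meant to be run with the repository root as the working directory.
    """
    return ["hg", "status"] + list(names)
```

With an empty result from changed_names there is nothing to ask
Mercurial about at all.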

> > By confirming that the listings in 1a and 4d match, we can prove that
> > nothing has slipped by us in the affected directory while Mercurial was
> > doing its thing. This test should be pretty cheap as all the data will
> > generally be cache hot, moderately sized, and recursion isn't necessary.
> > 
> > Also note that it's possible to add:
> > 
> > 1b: If we have a cached copy, compare it. If it's unchanged, we're done
> > - no need to call hg.
> > 
> > ..which might be nice for dealing with temp files created by editors,
> > compilers, etc.
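
Step 1b could look something like the sketch below. ListingCache and
needs_status are invented names, and the (mode, size, mtime) tuple is
just one plausible choice of what to compare.

```python
import os


def listing(directory):
    """Map each entry name to (mode, size, mtime_ns); no recursion."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            st = entry.stat(follow_symlinks=False)
            result[entry.name] = (st.st_mode, st.st_size, st.st_mtime_ns)
    return result


class ListingCache:
    """Remember the last listing seen for each watched directory."""

    def __init__(self):
        self._last = {}

    def needs_status(self, directory):
        """Return False when the directory looks unchanged since last time."""
        current = listing(directory)
        unchanged = self._last.get(directory) == current
        self._last[directory] = current
        return not unchanged
```

The first call for a directory always reports that status is needed,
since there is nothing cached yet. One caveat: a same-size rewrite that
lands within the filesystem's timestamp granularity would go unnoticed
by a comparison like this.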
> 
> So, thanks for looking at this problem which is a real sticking point!
> 
> However it would be really nice if as a client I knew what I was
> looking for in this checkexec and checklink calls, and MacHg could
> really easily scan the repo once for the problematic bits you are
> looking for in the first place and then somehow set some environment
> variable in passing through to Mercurial saying that MacHg has handled
> this problem, and not to create temporary files, e.g.
> GUICLIENTDONTCHECKEXECORLINK = 1, but a better name :)

We could do something like that, but it's probably getting a bit too
intimate with the internals.

> I could of course right now just traverse every single directory in
> the repository (not following symlinks) and make sure checkexec and
> checklink pass in every single directory of the repository. Thus the
> repository would be able to work with MacHg, and MacHg would issue
> some meaningful error message if this wasn't the case like "The
> repository FooBar located on the file system BugSplatter cannot work
> with MacHg because BugSplatter is of type Blargh... Please move FooBar
> to a file system of type MOO..." sort of thing. But likely if I know
> exactly which conditions we are looking for and why I could fine tune
> this message a great deal...

Not allowing people to use repos on flash drives and NFS doesn't sound
like a win (though I'm pretty sure FSEvents doesn't/can't work with NFS
anyway - you're gonna have to fall back to polling).

-- 
Mathematics is the supreme nostalgia of our time.



