[PATCH 1 of 3] util: add fswatcher class

Fri May 27 11:55:11 CDT 2011

On Fri, 2011-05-27 at 16:18 +0200, Sune Foldager wrote:
> On 2011-05-27 07:40, Matt Mackall wrote:
> >On Fri, 2011-05-27 at 12:22 +0200, Sune Foldager wrote:
> >> >
> >> >Not at all sure about this mtime/ctime comparison business you're doing.
> >> >Not only is its safety suspect, several filesystems don't have the
> >> >concept of a ctime.
> >>
> >> Well, the ctime is from time.time(), so not really related to the file system.
> >
> >Ahh, I see. Please, pick another variable name, it's impossible for a
> >Unix hacker to see mtime and ctime on the same line and not think
> >they're both referring to stat fields.
> 
> Alright.
> 
> >> Benoit told me this was the usual way to sidestep the sub-second problem.
> >
> >Well, no, not exactly. We can't actually trust that the system's notion
> >of time is connected to the filesystem's notion of time (see network
> >filesystem). If all parties don't have a reliable NTP setup, it's quite
> >easy for causality to appear to be violated. Dirstate does:
> >
> >        # use the modification time of the newly created temporary file as the
> >        # filesystem's notion of 'now'
> >        now = int(util.fstat(st).st_mtime)
> >
> >For our watcher purposes, I'm not sure if there's a good way to do this
> >short of writing a temp file (to the same fs).
> 
> Somewhat annoying to have to do that, and where should we write to, etc.
> It requires more knowledge than the watcher has currently.

The safest choice is in the same directory as the watched file.
And yes, highly annoying.

But can be skipped entirely if we've got subsecond timestamps.
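
To make the temp-file idea concrete, here is a minimal Python sketch, for
illustration only. The names fs_now and is_ambiguous are hypothetical and
not part of the patch, and the sketch assumes the process may create files
in the watched file's directory. It truncates to whole seconds, as the
dirstate snippet quoted above does.

```python
import os
import tempfile


def fs_now(watched_path):
    """Return the filesystem's notion of 'now', in whole seconds.

    Hypothetical helper: creates a temp file next to the watched file
    so that both timestamps come from the same filesystem clock.
    """
    directory = os.path.dirname(os.path.abspath(watched_path))
    fd, tmp = tempfile.mkstemp(prefix=".fsnow-", dir=directory)
    try:
        return int(os.fstat(fd).st_mtime)
    finally:
        os.close(fd)
        os.unlink(tmp)


def is_ambiguous(watched_path):
    """True if the file's mtime is not strictly older than fs 'now'.

    In that case a later write within the same second could leave the
    mtime unchanged, so the recorded mtime cannot be trusted.
    """
    mtime = int(os.stat(watched_path).st_mtime)
    return mtime >= fs_now(watched_path)
```

A caller would treat an ambiguous file as "possibly changed" and recheck
it later, rather than caching its mtime as known-good.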

> >> Dirstate tracks and compares mtime and size and contents, that's a difference.
> >> Size alone is not enough, as an external tool could, in a very contrived
> >> scenario, go in and modify the repo.
> >
> >Ok, but no one has ever suggested that size alone was enough. The idea
> >would be to force a reload if either mtime OR size (OR inode OR ctime,
> >etc.) changed.
> 
> But I suspect you can always cook up some pathological situation that
> defeats anything we try here.

We're only interested in 'realistic cases'. People with write access who
want to break things don't need to trick us to do it, so we needn't
concern ourselves with anything that wouldn't happen in normal use.
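
As a rough sketch of the "reload if any field changed" check for those
realistic cases (Python 3; stat_signature and changed are hypothetical
names, not anything in the patch):

```python
import os


def stat_signature(path):
    """Collect the stat fields whose change should force a reload."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None  # a missing file counts as a distinct state
    return (st.st_mtime, st.st_size, st.st_ino, st.st_ctime)


def changed(path, old_signature):
    """True if ANY tracked field differs from the recorded signature."""
    return stat_signature(path) != old_signature
```

Note that this alone still misses a same-size rewrite that lands within
the timestamp granularity, which is the sub-second problem discussed above.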

>  Maybe we should, for the purpose of hgweb and
> hgwebdir, just accept that we can't catch all cases and then do one of
> two things:
> 
> - Do pretty well, e.g. maybe not start writing files to get a concept of
>    current time, unless we feel it's really needed. And maybe or maybe not
>    comparing sizes etc. We won't catch all cases, but some.
> - Document that it's people's own responsibility to make sure the repo isn't
>    modified under their noses while they hold a repo instance, and then not
>    really check for it at all.

Maybe. Maybe it's also time to introduce the reader lock I talked about
earlier:

http://markmail.org/message/2kz57imteeu5tqmx

If we think of things in terms of readers, appenders, and destroyers
rather than readers and writers, we can see that we actually don't care
all that much about appenders. That is, so long as people aren't
actually deleting things, the view exported by hgweb is still 'causally
plausible' and valid, even if it gets a little behind.

> We would also exploit that for revlogs in particular, the size will almost
> always change if the repo is modified.
> 
> In either case, I agree with the .revalidate (or similar) "internal" approach,
> instead of the external watcher through .getwatcher one.

-- 
Mathematics is the supreme nostalgia of our time.