[PATCH V6] dirstate: avoid a race with multiple commits in the same process

Matt Mackall mpm at selenic.com
Tue Mar 22 13:31:08 CDT 2011


On Tue, 2011-03-22 at 19:10 +0100, Adrian Buehlmann wrote:
> On 2011-03-22 18:11, Matt Mackall wrote:
> > On Tue, 2011-03-22 at 18:03 +0100, Adrian Buehlmann wrote:
> >> On 2011-03-22 14:56, Matt Mackall wrote:
> >>> On Tue, 2011-03-22 at 09:13 +0100, Adrian Buehlmann wrote:
> >>>> On 2011-03-22 02:09, Greg Ward wrote:
> >>>>> On 21 March 2011, Adrian Buehlmann said:
> >>>>>>     def normal(self, f):
> >>>>>>         '''Mark a file normal and clean.'''
> >>>>>>         self._dirty = True
> >>>>>>         self._addpath(f)
> >>>>>>         s = os.lstat(self._join(f))
> >>>>>>         self._map[f] = ('n', s.st_mode, s.st_size, int(s.st_mtime))
> >>>>>>         if f in self._copymap:
> >>>>>>             del self._copymap[f]
> >>>>>>
> >>>>>>         # Right now, this file is clean: but if some code in this
> >>>>>>         # process modifies it without changing its size before the clock
> >>>>>>         # ticks over to the next second, then it won't be clean anymore.
> >>>>>>         # So make sure that status() will look harder at it.
> >>>>>>         self._lastnormal.add(f)
> >>>>>>
> >>>>>> Hmmm.
> >>>>>>
> >>>>>> local.status does a full compare for files that need to be looked into,
> >>>>>> which IIUC now includes the files in self._lastnormal (see
> >>>>>> dirstate.status() ).
> >>>>>
> >>>>> The set "files that need to be looked into" *may* include files in
> >>>>> self._lastnormal.  If one of them has changed size or mtime, it takes
> >>>>> the fast path to M status.
> >>>>
> >>>> Right. But it's only fast for the files which *have* been changed.
> >>>
> >>> You can only take things out of the _lastnormal state when you know
> >>> their timestamp is older than the filesystem time. In particular, you
> >>> cannot do it when you've confirmed the contents are unchanged. Then
> >>> changes to a file -in process- may be missed.
> >>>
> >>> But I think I see how to fix this: add _lastnormaltime. Whenever we call
> >>> normal(), we stat the file to get its timestamp and size for writing to
> >>> dirstate. Given that we're only worried about files in the current
> >>> second (or whatever the fs quantum is), if the current timestamp is
> >>> newer than the timestamp on the _lastnormal list, we can dump the
> >>> current list.
> >>>
> >>> diff -r f52e5211450a mercurial/dirstate.py
> >>> --- a/mercurial/dirstate.py	Mon Mar 21 15:30:20 2011 -0500
> >>> +++ b/mercurial/dirstate.py	Tue Mar 22 08:54:48 2011 -0500
> >>> @@ -50,6 +50,7 @@
> >>>          self._dirty = False
> >>>          self._dirtypl = False
> >>>          self._lastnormal = set()        # files believed to be normal
> >>> +        self._lastnormaltime = None
> >>>          self._ui = ui
> >>>  
> >>>      @propertycache
> >>> @@ -290,6 +291,9 @@
> >>>          # process modifies it without changing its size before the clock
> >>>          # ticks over to the next second, then it won't be clean anymore.
> >>>          # So make sure that status() will look harder at it.
> >>> +        if self._lastnormaltime < s.st_mtime:
> >>> +            self._lastnormaltime = s.st_mtime
> >>> +            self._lastnormal = set()
> >>>          self._lastnormal.add(f)
> >>>  
> >>>      def normallookup(self, f):
> >>>
> >>
> >> Excellent idea.
> >>
> >> I think we can even do != instead of <:
> > 
> > I don't this will work, we may mark a file normal that was last written
> > to hours ago. This will be the usual case with commit.
> 
> Hmm. Right. But then, a single odd toxic file with a faked mtime in the
> distant enough future will bomb your _lastnormal set.

Yes. But that's ok: if you fake mtimes, you can already expect to lose.

-- 
Mathematics is the supreme nostalgia of our time.




More information about the Mercurial-devel mailing list