[PATCH V6] dirstate: avoid a race with multiple commits in the same process
Adrian Buehlmann
adrian at cadifra.com
Tue Mar 22 12:03:23 CDT 2011
On 2011-03-22 14:56, Matt Mackall wrote:
> On Tue, 2011-03-22 at 09:13 +0100, Adrian Buehlmann wrote:
>> On 2011-03-22 02:09, Greg Ward wrote:
>>> On 21 March 2011, Adrian Buehlmann said:
>>>> def normal(self, f):
>>>> '''Mark a file normal and clean.'''
>>>> self._dirty = True
>>>> self._addpath(f)
>>>> s = os.lstat(self._join(f))
>>>> self._map[f] = ('n', s.st_mode, s.st_size, int(s.st_mtime))
>>>> if f in self._copymap:
>>>> del self._copymap[f]
>>>>
>>>> # Right now, this file is clean: but if some code in this
>>>> # process modifies it without changing its size before the clock
>>>> # ticks over to the next second, then it won't be clean anymore.
>>>> # So make sure that status() will look harder at it.
>>>> self._lastnormal.add(f)
>>>>
>>>> Hmmm.
>>>>
>>>> local.status does a full compare for files that need to be looked into,
>>>> which IIUC now includes the files in self._lastnormal (see
>>>> dirstate.status() ).
>>>
>>> The set "files that need to be looked into" *may* include files in
>>> self._lastnormal. If one of them has changed size or mtime, it takes
>>> the fast path to M status.
>>
>> Right. But it's only fast for the files which *have* been changed.
>
> You can only take things out of the _lastnormal state when you know
> their timestamp is older than the filesystem time. In particular, you
> cannot do it when you've confirmed the contents are unchanged. Then
> changes to a file -in process- may be missed.
>
> But I think I see how to fix this: add _lastnormaltime. Whenever we call
> normal(), we stat the file to get its timestamp and size for writing to
> dirstate. Given that we're only worried about files in the current
> second (or whatever the fs quantum is), if the current timestamp is
> newer than the timestamp on the _lastnormal list, we can dump the
> current list.
>
> diff -r f52e5211450a mercurial/dirstate.py
> --- a/mercurial/dirstate.py Mon Mar 21 15:30:20 2011 -0500
> +++ b/mercurial/dirstate.py Tue Mar 22 08:54:48 2011 -0500
> @@ -50,6 +50,7 @@
> self._dirty = False
> self._dirtypl = False
> self._lastnormal = set() # files believed to be normal
> + self._lastnormaltime = None
> self._ui = ui
>
> @propertycache
> @@ -290,6 +291,9 @@
> # process modifies it without changing its size before the clock
> # ticks over to the next second, then it won't be clean anymore.
> # So make sure that status() will look harder at it.
> + if self._lastnormaltime < s.st_mtime:
> + self._lastnormaltime = s.st_mtime
> + self._lastnormal = set()
> self._lastnormal.add(f)
>
> def normallookup(self, f):
>
Excellent idea.
I think we can even do != instead of <:
@@ -290,6 +291,9 @@
# process modifies it without changing its size before the clock
# ticks over to the next second, then it won't be clean anymore.
# So make sure that status() will look harder at it.
+ if self._lastnormaltime != s.st_mtime:
+ self._lastnormaltime = s.st_mtime
+ self._lastnormal = set()
self._lastnormal.add(f)
We only need to care about the things that happen in the same most
recent filesystem timeslot we've seen.
If we see a different timeslot, we can forget about the previous one.
I think it doesn't even matter if the clock made a jump backwards.
More information about the Mercurial-devel
mailing list