[PATCH V6] dirstate: avoid a race with multiple commits in the same process
Adrian Buehlmann
adrian at cadifra.com
Tue Mar 22 13:10:44 CDT 2011
On 2011-03-22 18:11, Matt Mackall wrote:
> On Tue, 2011-03-22 at 18:03 +0100, Adrian Buehlmann wrote:
>> On 2011-03-22 14:56, Matt Mackall wrote:
>>> On Tue, 2011-03-22 at 09:13 +0100, Adrian Buehlmann wrote:
>>>> On 2011-03-22 02:09, Greg Ward wrote:
>>>>> On 21 March 2011, Adrian Buehlmann said:
>>>>>> def normal(self, f):
>>>>>> '''Mark a file normal and clean.'''
>>>>>> self._dirty = True
>>>>>> self._addpath(f)
>>>>>> s = os.lstat(self._join(f))
>>>>>> self._map[f] = ('n', s.st_mode, s.st_size, int(s.st_mtime))
>>>>>> if f in self._copymap:
>>>>>> del self._copymap[f]
>>>>>>
>>>>>> # Right now, this file is clean: but if some code in this
>>>>>> # process modifies it without changing its size before the clock
>>>>>> # ticks over to the next second, then it won't be clean anymore.
>>>>>> # So make sure that status() will look harder at it.
>>>>>> self._lastnormal.add(f)
>>>>>>
>>>>>> Hmmm.
>>>>>>
>>>>>> local.status does a full compare for files that need to be looked into,
>>>>>> which IIUC now includes the files in self._lastnormal (see
>>>>>> dirstate.status() ).
>>>>>
>>>>> The set "files that need to be looked into" *may* include files in
>>>>> self._lastnormal. If one of them has changed size or mtime, it takes
>>>>> the fast path to M status.
>>>>
>>>> Right. But it's only fast for the files which *have* been changed.
>>>
>>> You can only take things out of the _lastnormal state when you know
>>> their timestamp is older than the filesystem time. In particular, you
>>> cannot do it when you've confirmed the contents are unchanged. Then
>>> changes to a file -in process- may be missed.
>>>
>>> But I think I see how to fix this: add _lastnormaltime. Whenever we call
>>> normal(), we stat the file to get its timestamp and size for writing to
>>> dirstate. Given that we're only worried about files in the current
>>> second (or whatever the fs quantum is), if the current timestamp is
>>> newer than the timestamp on the _lastnormal list, we can dump the
>>> current list.
>>>
>>> diff -r f52e5211450a mercurial/dirstate.py
>>> --- a/mercurial/dirstate.py Mon Mar 21 15:30:20 2011 -0500
>>> +++ b/mercurial/dirstate.py Tue Mar 22 08:54:48 2011 -0500
>>> @@ -50,6 +50,7 @@
>>> self._dirty = False
>>> self._dirtypl = False
>>> self._lastnormal = set() # files believed to be normal
>>> + self._lastnormaltime = None
>>> self._ui = ui
>>>
>>> @propertycache
>>> @@ -290,6 +291,9 @@
>>> # process modifies it without changing its size before the clock
>>> # ticks over to the next second, then it won't be clean anymore.
>>> # So make sure that status() will look harder at it.
>>> + if self._lastnormaltime < s.st_mtime:
>>> + self._lastnormaltime = s.st_mtime
>>> + self._lastnormal = set()
>>> self._lastnormal.add(f)
>>>
>>> def normallookup(self, f):
>>>
>>
>> Excellent idea.
>>
>> I think we can even do != instead of <:
>
> I don't this will work, we may mark a file normal that was last written
> to hours ago. This will be the usual case with commit.
Hmm. Right. But then, a single odd toxic file with a faked mtime in the
distant enough future will bomb your _lastnormal set.
More information about the Mercurial-devel
mailing list