[PATCH 2 of 2] Reorder rename operations to minimise risk of leaving repository in unknown state

Adrian Buehlmann adrian at cadifra.com
Sat Oct 3 15:57:09 CDT 2009



On 03.10.2009 20:07, Adrian Buehlmann wrote:
> On 03.10.2009 19:13, Steve Borho wrote:
>> On Sat, Oct 3, 2009 at 6:35 AM, Sune Foldager <cryo at cyanite.org> wrote:
>>
>>> Laurens Holst wrote:
>>>> I think the core problem here is that in Windows, there is simply not
>>>> a concept of an atomic rename to an existing file.
>>> Indeed. It works like this: You can never rename a file into an existing
>>> file in any way. Also, you CAN delete an open file (if opened with
>>> correct share modes), but it will not disappear from the directory list
>>> until it is closed. Finally, you CAN rename an open file (if opened with
>>> correct share modes), and the rename WILL take place immediately,
>>> fortunately. This is why the code looks as it does right now.
>>>
>>>
>> Based on this, I think the patch would help.  I'll take leaked temp files
>> over missing repository files any day of the week.  Doing the unlink last
>> will still throw an exception, which is the right thing to do.  Mercurial
>> should throw a great big fit when it is interfered with like this (so people
>> switch A/V tools), but reordering the calls makes us more resistant to data
>> loss.
>>
> 
> And what if atomictemp's are used multiple times in a hg run
> and it chokes on a rename that is not the last one?
> 
> Then you have some files done and some not, thanks to the
> abort inflicted by the scanner.

An interesting execution path is for example doing a pull.

If I understand the code correctly, a pull may lead to calls to
revlog.checkinlinesize, which does a rename() on an atomic temp on line 1042
in revlog.py when switching from an inlined revlog (everything in the .i file)
to a non-inlined revlog (separate .i and .d files -- used for larger
revlogs).

So, that means while adding revisions during a pull, we create
tempfiles in the store (in .hg/store/data) and the scanner may catch
one of them and inflict an abort on it if we try to do the unlink
on it. In a racy manner.

So there may indeed be multiple tempfile deletes in a single hg run.






More information about the Mercurial-devel mailing list