[PATCH 2 of 2] Reorder rename operations to minimise risk of leaving repository in unknown state

Adrian Buehlmann adrian at cadifra.com
Sat Oct 3 16:18:19 CDT 2009


On 03.10.2009 22:42, Steve Borho wrote:
> On Sat, Oct 3, 2009 at 1:07 PM, Adrian Buehlmann <adrian at cadifra.com> wrote:
> 
>> On 03.10.2009 19:13, Steve Borho wrote:
>>> On Sat, Oct 3, 2009 at 6:35 AM, Sune Foldager <cryo at cyanite.org> wrote:
>>>
>>>> Laurens Holst wrote:
>>>>> I think the core problem here is that in Windows, there is simply not
>>>>> a concept of an atomic rename to an existing file.
>>>> Indeed. It works like this: You can never rename a file into an existing
>>>> file in any way. Also, you CAN delete an open file (if opened with
>>>> correct share modes), but it will not disappear from the directory list
>>>> until it is closed. Finally, you CAN rename an open file (if opened with
>>>> correct share modes), and the rename WILL take place immediately,
>>>> fortunately. This is why the code looks as it does right now.
>>>>
>>>>
>>> Based on this, I think the patch would help.  I'll take leaked temp files
>>> over missing repository files any day of the week.  Doing the unlink last
>>> will still throw an exception, which is the right thing to do.  Mercurial
>>> should throw a great big fit when it is interfered with like this (so
>>> people switch A/V tools), but reordering the calls makes us more resistant
>>> to data loss.
>>>
>> And what if atomictemp files are used multiple times in one hg run
>> and it chokes on a rename that is not the last one?
>>
>> Then you have some files done and some not, thanks to the
>> abort inflicted by the scanner.
>>
> 
> At some point we have to depend on Mercurial's journaling to roll back
> incomplete transactions when exceptions occur.  Fortunately most operations
> are append-only and not rename / replace.

Oh, and the scanner might catch the transaction journal file and inflict
an abort if we try to unlink it (in transaction.close, line 130 in
transaction.py).

Hehe, and transaction.close is called from transaction.abort
("generally called on error").

If everything goes optimally wrong, the scanner might hit us twice.
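
For anyone following the thread, the reordering under discussion can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patch and not Mercurial's actual rename
helper: the function name rename_over and the ".old-" prefix are made up
for the example. The idea is to move the existing destination aside first,
put the new file in place, and unlink the old copy last, so a failure at
the end leaks a temp file instead of losing the repository file.

```python
import os
import tempfile


def rename_over(src, dst):
    """Replace dst with src, deleting the old dst last.

    Hypothetical helper for illustration only; not Mercurial's API.
    """
    if not os.path.exists(dst):
        # Nothing to replace, so a plain rename is enough.
        os.rename(src, dst)
        return

    # Pick a unique name next to dst. mkstemp creates the file, so close
    # and remove it again: we only want the name. (Another process could
    # grab the name in between; this sketch ignores that race.)
    dstdir = os.path.dirname(os.path.abspath(dst))
    fd, aside = tempfile.mkstemp(prefix=".old-", dir=dstdir)
    os.close(fd)
    os.unlink(aside)

    # Step 1: move the existing file out of the way.
    os.rename(dst, aside)

    # Step 2: move the new file into place. If that fails, try to put
    # the old file back before re-raising.
    try:
        os.rename(src, dst)
    except OSError:
        os.rename(aside, dst)
        raise

    # Step 3: delete the old copy last. If something (e.g. a scanner)
    # interferes here, this raises, but dst already holds the new
    # content and only the set-aside file is leaked.
    os.unlink(aside)
```

Note that this does nothing for the multi-file case raised earlier in the
thread: if one of several such renames aborts, the files already replaced
stay replaced.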




