[PATCH 2 of 2] Reorder rename operations to minimise risk of leaving repository in unknown state

Adrian Buehlmann adrian at cadifra.com
Sun Oct 4 05:17:30 CDT 2009


On 03.10.2009 22:42, Steve Borho wrote:
> On Sat, Oct 3, 2009 at 1:07 PM, Adrian Buehlmann <adrian at cadifra.com> wrote:
> 
>> On 03.10.2009 19:13, Steve Borho wrote:
>>> On Sat, Oct 3, 2009 at 6:35 AM, Sune Foldager <cryo at cyanite.org> wrote:
>>>
>>>> Laurens Holst wrote:
>>>>> I think the core problem here is that in Windows, there is simply not
>>>>> a concept of an atomic rename to an existing file.
>>>> Indeed. It works like this: You can never rename a file into an existing
>>>> file in any way. Also, you CAN delete an open file (if opened with
>>>> correct share modes), but it will not disappear from the directory list
>>>> until it is closed. Finally, you CAN rename an open file (if opened with
>>>> correct share modes), and the rename WILL take place immediately,
>>>> fortunately. This is why the code looks as it does right now.
>>>>
>>>>
>>> Based on this, I think the patch would help.  I'll take leaked temp files
>>> over missing repository files any day of the week.  Doing the unlink last
>>> will still throw an exception, which is the right thing to do.  Mercurial
>>> should throw a great big fit when it is interfered with like this (so
>> people
>>> switch A/V tools), but reordering the calls makes us more resistant to
>> data
>>> loss.
>>>
>> And what if atomictemp's are used multiple times in a hg run
>> and it chokes on a rename that is not the last one?
>>
>> Then you have some files done and some not, thanks to the
>> abort inflicted by the scanner.
>>
> 
> At some point we have to depend on Mercurial's journaling to rollback
> incomplete transactions when exceptions occur.  Fortunately most operations
> are append-only and not rename / replace.

The critical thing is the word "most" in our context here.

I fear the reordering done by the patch might have the potential to
make things worser in revlog.checkinlinesize.

If we reorder the renames (i.e. put the unlink at the end), then the new
indexfile will have been put in place by fp.rename(), but line 1044 in
revlog.py ("tr.replace(...)") will not have been reached if the rename aborts
(due to the os.unlink failing, inflicted by the scanner holding the
temporary open by denying other processes to delete it).

Excerpt from revlog.py (lines 1033 to 1045):

        fp = self.opener(self.indexfile, 'w', atomictemp=True)
        self.version &= ~(REVLOGNGINLINEDATA)
        self._inline = False
        for i in self:
            e = self._io.packentry(self.index[i], self.node, self.version, i)
            fp.write(e)

        # if we don't call rename, the temp file will never replace the
        # real index
        fp.rename()

        tr.replace(self.indexfile, trindex * self._io.size)
        self._chunkclear()

So the new index file will be put in place but it will not be recorded
in the transaction.

Wouldn't this leave the repo in an inconsistent state if the transaction
is then rolled back (because of the abort)?

revlog.checkinlinesize is called when adding revisions to the repo.


More information about the Mercurial-devel mailing list