D588: win32: use fewer system calls for unlink()
Augie Fackler
raf at durin42.com
Fri Sep 1 12:43:46 EDT 2017
(replying to this thread to see if the response shows up in Phabricator, Kevin just configured that)
> On Sep 1, 2017, at 02:16, Adrian Buehlmann <adrian at cadifra.com> wrote:
>
> Ugh. Can I reply to a phabricator notification by email?
>
> Adding gregory.szorc at gmail.com <mailto:gregory.szorc at gmail.com> manually, as I'm not sure replaying to
> those nasty phabricator emails is going to work...
>
> On 2017-09-01 00:32, indygreg (Gregory Szorc) wrote:
>> indygreg created this revision.
>> Herald added a subscriber: mercurial-devel.
>> Herald added a reviewer: hg-reviewers.
>>
>> REVISION SUMMARY
>> unlink() on Windows is complicated by the fact that Windows doesn't
>> support removing an open file. Our unlink wrapper currently goes
>> through a dance where it generates a random filename, renames the
>> file, then attempts to unlink.
>>
>> This is a lot of extra work for what is likely the special case of
>> attempting to unlink a file that is open or can't be simply unlinked.
>>
>> In this commit, we refactor the code to use fewer system calls in
>> the common cases of 1) unlink just works 2) file doesn't exist.
>>
>> For the file doesn't exist case (before)
>>
>> 1. stat() to determine if path is directory
>> 2. generate random filename
>> 3. rename()
>>
>> after:
>>
>> 1. stat() to get path info
>>
>> For the file not open case (before)
>>
>> 1. stat() to determine if path is directory
>> 2. generate random filename
>> 3. rename()
>> 4. unlink()
>>
>> after:
>>
>> 1. stat()
>> 2. unlink()
>>
>> For the file open case (before)
>>
>> 1. stat() to determine if path is directory
>> 2. generate random filename
>> 3. rename()
>> 4. unlink()
>>
>> after:
>>
>> 1. stat()
>> 2. unlink()
>> 3. generate random filename
>> 4. rename()
>> 5. unlink()
>>
>> There is also a scenario where the unlink fails due to the file being
>> marked as read-only. In this case we also introduce an extra unlink()
>> call. However, I reckon the common case is the file isn't marked as
>> read-only and skipping the random generation and rename is worth it.
>>
>> I /think/ this change makes bulk file writing on Windows faster. This
>> is because vfs.__call__ calls unlink() when a file is opened for
>> writing. When writing a few hundred files files as part of a clone or
>> working directory update, the extra system calls can matter.
>> win32.unlink() did show up in performance profiles, which is what
>> caused me to look at this code. But I/O performance is hard to measure
>> and I'm not sure if the ~30s off of ~620s for a stream clone unbundle
>> on the Firefox repo is indicative of real world performance. I do know
>> the new code uses fewer system calls and shouldn't be slower.
>>
>> REPOSITORY
>> rHG Mercurial
>>
>> REVISION DETAIL
>> https://phab.mercurial-scm.org/D588
>>
>> AFFECTED FILES
>> mercurial/win32.py
>>
>> CHANGE DETAILS
>>
>> diff --git a/mercurial/win32.py b/mercurial/win32.py
>> --- a/mercurial/win32.py
>> +++ b/mercurial/win32.py
>> @@ -12,6 +12,7 @@
>> import msvcrt
>> import os
>> import random
>> +import stat
>> import subprocess
>>
>> from . import (
>> @@ -522,11 +523,26 @@
>> def unlink(f):
>> '''try to implement POSIX' unlink semantics on Windows'''
>>
>> - if os.path.isdir(f):
>> - # use EPERM because it is POSIX prescribed value, even though
>> - # unlink(2) on directories returns EISDIR on Linux
>> - raise IOError(errno.EPERM,
>> - "Unlinking directory not permitted: '%s'" % f)
>> + # If the path doesn't exist, raise that exception.
>> + # If it is a directory, emulate POSIX behavior.
>> + try:
>> + st = os.stat(f)
>> + if stat.S_ISDIR(st.st_mode):
>> + # use EPERM because it is POSIX prescribed value, even though
>> + # unlink(2) on directories returns EISDIR on Linux
>> + raise IOError(errno.EPERM,
>> + "Unlinking directory not permitted: '%s'" % f)
>> + except OSError as e:
>> + if e.errno == errno.ENOENT:
>> + raise
>> +
>> + # In the common case, a normal unlink will work. Try that first and fall
>> + # back to more complexity if and only if we need to.
>> + try:
>> + os.unlink(f)
>> + return
>> + except (IOError, OSError) as e:
>> + pass
>>
>> # POSIX allows to unlink and rename open files. Windows has serious
>> # problems with doing that:
>
> Do you get an error at all, if a file, which is in open state, is unlinked?
>
> My fear is: You won't get an error, but instead the filename is blocked
> by the file being held in place by the other process, until the other
> process closes it. Which means: You already lost the game.
>
> Which would explain why we didn't do things like you propose here.
>
> See also
>
> https://www.mercurial-scm.org/wiki/UnlinkingFilesOnWindows <https://www.mercurial-scm.org/wiki/UnlinkingFilesOnWindows>
>
> (specifically, heading 2)
>
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org <mailto:Mercurial-devel at mercurial-scm.org>
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel <https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.mercurial-scm.org/pipermail/mercurial-devel/attachments/20170901/13cc9109/attachment.html>
More information about the Mercurial-devel
mailing list