How Mercurial could be made to work with Virus Scanners on Windows

Adrian Buehlmann adrian at cadifra.com
Wed Dec 8 02:53:50 CST 2010


I'm writing this to document how far I got in dealing with deleting open
files on Windows.

Mercurial obviously deletes files in a lot of places, but it also does
so to break up any hardlinks, which is a big theme in Mercurial due to
its preference to create hardlinks as much as it can when cloning
repositories.

Files can be held open at any time, either by Mercurial itself (e.g.
the revlog lazy parser keeps the index file open) or by other programs
such as editors (although the sane ones probably read the file quickly
and then close it again), but more importantly, and quite frequently,
by the ubiquitous antivirus scanners.

As documented on

  http://mercurial.selenic.com/wiki/UnlinkingFilesOnWindows

programs on Windows mostly open files in one of two flavors:

(A) blocking rename and deletion by other processes
(B) allowing rename and deletion by other processes

'A' is done by Python's built-in 'open' function, and 'B' is done by
Mercurial itself (windows.posixfile) and -- lo and behold -- by the
majority of virus scanners, including the famous and gratis
"Microsoft Security Essentials" and other popular ones (e.g. Avast).

It's easy to investigate whether your favorite virus scanner is doing
'B' or something else. Use the free Process Monitor by Microsoft and
instruct it to log the activity of your virus scanner. Then look at
calls of the CreateFile Windows API function

  http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx

and look at the value of the parameter dwShareMode. If the bit
FILE_SHARE_DELETE is set, then the opened file can be renamed and
deleted (type 'B'). If your virus scanner does anything more harmful
than that, then please get a different one.
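
As a rough illustration of the check described above (this is not
Mercurial code), the bit test could look like the sketch below. The
three FILE_SHARE_* values are the ones defined by the Windows API; the
helper names share_flags and classify_open are made up for this
example, and it assumes you have the raw dwShareMode value at hand.

```python
# Share-mode bits as defined by the Windows API.
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
FILE_SHARE_DELETE = 0x00000004

_SHARE_BITS = (
    ("FILE_SHARE_READ", FILE_SHARE_READ),
    ("FILE_SHARE_WRITE", FILE_SHARE_WRITE),
    ("FILE_SHARE_DELETE", FILE_SHARE_DELETE),
)


def share_flags(share_mode):
    """Return the names of the share bits set in a dwShareMode value."""
    return [name for name, bit in _SHARE_BITS if share_mode & bit]


def classify_open(share_mode):
    """Return 'B' if rename/delete by others is allowed, else 'A'.

    'A' and 'B' are the two flavors used in this message, not
    official Windows terminology.
    """
    return "B" if share_mode & FILE_SHARE_DELETE else "A"


if __name__ == "__main__":
    for mode in (0x3, 0x7):
        print(hex(mode), share_flags(mode), classify_open(mode))
```

A value with only the read and write bits set is classified as 'A',
while any value that includes FILE_SHARE_DELETE is classified as 'B'.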

Now, back to Mercurial. Method 'B' is the best we can get, but it
still causes a major problem: files that were opened using 'B' are sent
into a ghost state if os.unlink is called on them. I've called that
state "scheduled delete" on the UnlinkingFilesOnWindows wiki page.

Files in this ghost state block the filename for as long as any reading
process doing 'B' holds the file open. This means that no file can be
created under that name again, even though os.unlink was called on it.
Pretty amazing.

I consider solving this problem a key point in getting Mercurial to work
with virus scanners on Windows.

How can it be done?

Files opened with method 'B' can be renamed. Processes holding the file
open continue to hold it open under the new name. So if you os.unlink a
file after renaming it, the "ghost file" state blocks the new name
instead of the original one.

So a trick to harden Mercurial against AV scanners is to rename files to
a random name before deleting them, so that the dreaded filename
blocking (the ghost state) happens on a random name and not on the
precious original name of the file, which may be needed again to
recreate the file under the same name (util.opener does that to break up
hardlinks when 'w'riting to files).
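
A minimal sketch of the rename-then-delete trick, using only the Python
standard library. This is not existing Mercurial code: the function
name unlink_via_rename and the ".hg-del-" prefix are hypothetical, and
the sketch assumes the random name must live in the same directory so
that the rename stays on one volume.

```python
import os
import uuid


def unlink_via_rename(path):
    """Rename path to a random sibling name, then delete that name.

    If a type 'B' reader still holds the file open on Windows, the
    "scheduled delete" state then blocks the random name rather than
    the original one. Returns the random name that was used.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Hypothetical naming scheme; any name unlikely to collide would do.
    temp_name = os.path.join(directory, ".hg-del-" + uuid.uuid4().hex)
    os.rename(path, temp_name)  # the original name is free from here on
    os.unlink(temp_name)        # ghost state, if any, hits temp_name
    return temp_name


def rewrite_file(path, data):
    """Replace path with a fresh file, breaking any hardlink to it."""
    if os.path.exists(path):
        unlink_via_rename(path)
    with open(path, "wb") as f:
        f.write(data)
```

Here rewrite_file only mimics the hardlink-breaking case mentioned
above: the old file is moved out of the way and deleted, and a new file
is created under the original name right away. If the file is held open
by a type 'A' reader, the os.rename call itself still fails, so this
sketch does not help in that case.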

I'm writing up these findings so that they may be picked up by whoever
is interested in trying to make Mercurial work with AV scanners on
Windows. Feel free to take on this task.


