Windows: hardlink support is broken on shared drives

Adrian Buehlmann adrian at cadifra.com
Wed Aug 18 04:50:59 CDT 2010


On 18.08.2010 10:24, Patrick Mézard wrote:
> On 18/08/10 09:57, Didly Bom wrote:
>> On Tue, Aug 17, 2010 at 10:06 PM, Patrick Mézard <pmezard at gmail.com> wrote:
>>
>>     On 17/08/10 20:10, Matt Mackall wrote:
>>     > On Tue, 2010-08-17 at 19:49 +0200, Patrick Mézard wrote:
>>     >> On 17/08/10 16:32, Matt Mackall wrote:
>>     >>> See this thread:
>>     >>>
>>     >>> http://mercurial.markmail.org/thread/4hvungefkgmq3cum
>>     >>>
>>     >>> And this bug:
>>     >>>
>>     >>> http://mercurial.selenic.com/bts/issue761
>>     >>>
>>     >>> If we remotely commit to a repo that has hardlinks because it was
>>     >>> locally cloned, we get a mess in the clone. So we either need to:
>>     >>>
>>     >>> a) figure out how to reliably get an accurate link count over the wire
>>     >>>
>>     >>> or
>>     >>>
>>     >>> b) figure out how to detect we're on a network share and assume
>>     >>> -everything- is a linked file when committing (slow!)
>>     >>>
>>     >>>
>>     >>> Even something as drastic as disabling hardlink support is insufficient
>>     >>> because there will still be repos with hardlinks on many users' systems.
>>     >>>
>>     >>> I've upped the priority on this bug to critical - we've gotten several
>>     >>> reports of it that we've failed to properly identify the cause of.
>>     >>>
>>     >>> The interesting code bits are here:
>>     >>>
>>     >>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l23
>>     >>>
>>     >>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l53
>>     >>
>>     >> And the interesting quote from:
>>     >>
>>     >>     http://msdn.microsoft.com/en-us/library/aa364952%28VS.85%29.aspx
>>     >>
>>     >> "Depending on the underlying network features of the operating system
>>     >> and the type of server connected to, the GetFileInformationByHandle
>>     >> function may fail, return partial information, or full information for
>>     >> the given file."
>>     >
>>     > So that gives us three cases:
>>     >
>>     > a) works - return real nNumberOfLinks
>>     > b) partial - pretend there are 2 to force link breaking
>>     > c) fails - pretend there are 2 to force link breaking
>>     >
>>     > The trick is to distinguish case (a) from (b). If nNumberOfLinks > 1,
>>     > then we can assume it worked. But we'll probably have to look at other
>>     > fields in the structure to sanity-check it in other cases. But we're
>>     > going to need people with Windows (ie not me) to do some testing with
>>     > different setups to figure out if we can actually reliably detect case
>>     > (b).
>>     >
>>     > Step one here is probably to write a test script that people can run to:
>>     >
>>     > a) create a hardlink
>>     > b) report the result details of GetFileInformationByHandle
>>
>>     Test script attached.
>>
>>     It implements two commands:
>>
>>     $ python testlink.py a linka => hardlink linka from a
>>     $ python testlink.py linka => display linka's link count (=2) and the volume serial number
>>
>>     Here are the link count and volume serial number when testing a local or remote hardlink.
>>
>>     setup                                      links  serial
>>     -----                                      -----  ------
>>     WinXP local NTFS                           2      != 0
>>     OSX from WinXP through Parallels           1      == 0
>>     WinXP from Win2003 through RDP             2      != 0
>>     Win2003 local NTFS                         2      != 0
>>     Win2003 from Win2003 through mounted dir   1      == 0
>>
>>     (interesting that a dir mounted through RDP works better than a regular mounted dir...)
>>
>>     At least in these cases, testing serial == 0 would tell us whether we can trust the GetFileInformationByHandle() call.
>>
>>     Patches will follow.
>>
>>
>> Patrick,
>>
>> I ran your script on my WindowsXP machine. The output of the script when I ran it on a local NTFS drive was:
>>
>> C:\test>python testlink.py linka
>> (8224, <PyTime:18/08/2010 7:46:03>, <PyTime:18/08/2010 7:46:03>, <PyTime:18/08/2
>> 010 7:46:03>, 1819093006, 0, 4, 2, 1441792, 467230)
>> 2
>>
>> I assume that "2" is the link count, but where is the "serial" in there?
> 
> Yes, I dumped all fields while investigating which ones were changing, but I realized this is not very usable. I have attached a clearer version of the script. The fields are in the same order as in the Win32 structure:
> 
>     http://msdn.microsoft.com/en-us/library/aa363788%28v=VS.85%29.aspx
> 
> So in your case, link=2 and serial=1819093006. You get an "a", congratulations.
> 
>> I can run tests on Windows XP SP (NTFS and FAT) and on Windows Server 2003 and Windows Server 2003 x64 Edition (NTFS only), both locally and on shared drives. Are there any particular tests that you'd like me to run?
>>
>> I may also be able to run tests on Windows 7. I have a colleague who has had problems when using a "central" repository located on a Windows 7 machine and accessed through a shared drive, and I suspect that his problems may be related to this issue as well. Unfortunately he is on vacation right now, so I do not know if I'll be able to access his server until he comes back.
> 
> Useful tests:
> 
> WinXP mounted NTFS from Win2003
> WinXP mounted FAT from Win2003
> WinXP RDP FAT from Win2003
> Win2003 mounted NTFS from WinXP
> Win7 locally
> Win7 mounted NTFS from WinXP


I entered the following in a cmd.exe running on Windows 7 x64 (running
Python 2.6.5) with a drive ("Z:") mounted from a Samba server (running
Samba version 3.0.26a) which runs on an ancient FreeBSD 6.2 box on my
local network:

$ cd
Z:\
$ echo foo > a
$ python testlink.py a linka
$ type linka
foo
$ python testlink.py linka
links=1
serial=110166285
$ echo bar >> linka
$ type linka
foo
bar
$ type a
foo
bar


Running 'ls -i' in a terminal on the FreeBSD box shows:

%ls -i
2779165 a               2779165 linka           2779164 testlink.py
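
The same server-side check can be repeated without a shell. The following is only a minimal sketch, assuming Python 3 and a filesystem that supports hardlinks; it is not the attached testlink.py, and the helper names link_info and demo are made up for illustration. It creates 'a', hardlinks 'linka' to it, and reports the inode and link count that os.stat() sees for each:

```python
import os
import tempfile


def link_info(path):
    """Return (inode, link_count) for path, as reported by os.stat()."""
    st = os.stat(path)
    return st.st_ino, st.st_nlink


def demo(directory):
    """Create 'a' and a hardlink 'linka' in directory; return both infos."""
    a = os.path.join(directory, "a")
    linka = os.path.join(directory, "linka")
    with open(a, "w") as f:
        f.write("foo\n")
    os.link(a, linka)  # hardlink, not a copy
    return link_info(a), link_info(linka)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        info_a, info_linka = demo(d)
        print("a:     inode=%d links=%d" % info_a)
        print("linka: inode=%d links=%d" % info_linka)
```

On the machine that owns the filesystem, both names should report the same inode and a link count of 2. What a client sees through a network mount may differ, which is the point of this thread.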

So:

* both 'a' and 'linka' have the same file serial number on FreeBSD
* modifying 'linka' through the mounted drive also modifies 'a'
* 'testlink.py' erroneously reports 'links=1' (ideally, should be: 2)
* 'serial' as reported by 'testlink.py' is non-zero
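
To make the consequence concrete, here is a rough sketch of the decision rule discussed earlier in the thread (trust nNumberOfLinks when it is > 1, otherwise pretend 2 if the call failed or the serial is 0). The function name effective_link_count and its parameters are hypothetical; this is not the code in win32.py, just the rule written down so the Samba result can be checked against it:

```python
def effective_link_count(ok, nlinks, serial):
    """Link count to act on, given one GetFileInformationByHandle-style result.

    ok     -- False if the call failed
    nlinks -- reported nNumberOfLinks
    serial -- reported volume serial number
    """
    if not ok:
        return 2        # call failed: pretend linked, force link breaking
    if nlinks > 1:
        return nlinks   # a count above 1 is assumed to be real
    if serial == 0:
        return 2        # looks like partial info: pretend linked
    return nlinks       # serial != 0: the reported count is trusted


# Results reported in this thread as (links, serial); every file tested
# really had 2 links.
print(effective_link_count(True, 2, 1819093006))  # WinXP local NTFS -> 2
print(effective_link_count(True, 1, 0))           # serial == 0 case -> 2
print(effective_link_count(True, 1, 110166285))   # Win7 via Samba   -> 1
```

With the Samba numbers above, the rule returns 1 for a file that really has 2 links, so at least in this setup the serial == 0 test alone would not catch the problem.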



