Windows: hardlink support is broken on shared drives

Adrian Buehlmann adrian at cadifra.com
Wed Aug 18 15:34:14 CDT 2010


On 18.08.2010 20:22, Patrick Mézard wrote:
> On 18/08/10 20:02, Adrian Buehlmann wrote:
>> On 18.08.2010 19:39, Matt Mackall wrote:
>>> On Wed, 2010-08-18 at 11:47 +0200, Patrick Mézard wrote:
>>>> On 17/08/10 22:34, Matt Mackall wrote:
>>>>> On Tue, 2010-08-17 at 22:06 +0200, Patrick Mézard wrote:
>>>>>> On 17/08/10 20:10, Matt Mackall wrote:
>>>>>>> On Tue, 2010-08-17 at 19:49 +0200, Patrick Mézard wrote:
>>>>>>>> On 17/08/10 16:32, Matt Mackall wrote:
>>>>>>>>> See this thread:
>>>>>>>>>
>>>>>>>>> http://mercurial.markmail.org/thread/4hvungefkgmq3cum
>>>>>>>>>
>>>>>>>>> And this bug:
>>>>>>>>>
>>>>>>>>> http://mercurial.selenic.com/bts/issue761
>>>>>>>>>
>>>>>>>>> If we remotely commit to a repo that has hardlinks because it was
>>>>>>>>> locally cloned, we get a mess in the clone. So we either need to:
>>>>>>>>>
>>>>>>>>> a) figure out how to reliably get an accurate link count over the wire
>>>>>>>>>
>>>>>>>>> or 
>>>>>>>>>
>>>>>>>>> b) figure out how to detect we're on a network share and assume
>>>>>>>>> -everything- is a linked file when committing (slow!)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Even something as drastic as disabling hardlink support is insufficient
>>>>>>>>> because there will still be repos with hardlinks on many users' systems.
>>>>>>>>>
>>>>>>>>> I've upped the priority on this bug to critical - we've gotten several
>>>>>>>>> reports of it that we've failed to properly identify the cause of.
>>>>>>>>>
>>>>>>>>> The interesting code bits are here:
>>>>>>>>>
>>>>>>>>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l23
>>>>>>>>>
>>>>>>>>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l53
>>>>>>>>
>>>>>>>> And the interesting quote from:
>>>>>>>>
>>>>>>>>     http://msdn.microsoft.com/en-us/library/aa364952%28VS.85%29.aspx
>>>>>>>>
>>>>>>>> "Depending on the underlying network features of the operating system
>>>>>>>> and the type of server connected to, the GetFileInformationByHandle
>>>>>>>> function may fail, return partial information, or full information for
>>>>>>>> the given file."
>>>>>>>
>>>>>>> So that gives us three cases:
>>>>>>>
>>>>>>> a) works - return real nNumberOfLinks
>>>>>>> b) partial - pretend there are 2 to force link breaking
>>>>>>> c) fails - pretend there are 2 to force link breaking
>>>>>>>
>>>>>>> The trick is to distinguish case (a) from (b). If nNumberOfLinks > 1,
>>>>>>> then we can assume it worked. But we'll probably have to look at other
>>>>>>> fields in the structure to sanity-check it in other cases. But we're
>>>>>>> going to need people with Windows (ie not me) to do some testing with
>>>>>>> different setups to figure out if we can actually reliably detect case
>>>>>>> (b).
>>>>>>>
>>>>>>> Step one here is probably to write a test script that people can run to:
>>>>>>>
>>>>>>> a) create a hardlink
>>>>>>> b) report the result details of GetFileInformationByHandle
>>>>>>
>>>>>> Test script attached.
>>>>>>
>>>>>> It implements two commands:
>>>>>>
>>>>>> $ python testlink.py a linka => hardlink linka from a
>>>>>> $ python testlink.py linka => display linka links count (=2) and the volume serial number
>>>>>>
>>>>>> Here are the links count and volume serial number when testing a local or remote hardlink.
>>>>>>
>>>>>> setup                                      links  serial
>>>>>> -----                                      -----  ------
>>>>>> WinXP local NTFS                           2      != 0               
>>>>>> OSX from WinXP through Parallels           1      == 0
>>>>>> WinXP from Win2003 through RDP             2      != 0
>>>>>> Win2003 local NTFS                         2      != 0
>>>>>> Win2003 from Win2003 through mounted dir   1      == 0
>>>>>
>>>>> Ok, so if we consider our cases from before:
>>>>>
>>>>> a -> works
>>>>> b -> detectable partial
>>>>> c -> fail
>>>>> x -> undetectable partial
>>>>>
>>>>> then we've got something like:
>>>>>
>>>>> server                    client
>>>>>                 winxp   2003  2000  vista  ...   
>>>>> local             a       a
>>>>> OSX parallels     b
>>>>> 2003 RDP          a
>>>>> 2003 mounted              b
>>>>> Vista
>>>>> Samba
>>>>> NetApp
>>>>> ...
>>>>>
>>>>> We probably need to fill in this table a bit more before we can declare
>>>>> this test works.
>>>>
>>>> It seems I fooled myself with testlink.py and misread the result (big
>>>> surprise). I redid the tests this morning, printing the links count
>>>> and the serial number and all cases fail except the local ones,
>>>> meaning links=1 and serial!=0 when reading network shares.
>>>
>>> Ok, so we currently have -no- known cases where we can see hardlinks on
>>> a networked drive?
>>>
>>
>> I just tried Patrick's testlink.py on a Windows 7 drive mounted on
>> another, different box running Windows 7 as well (both x64).
>>
>> Reports links=1 when it should be 2.
>>
>> So not even the -most recent- Windows platform can see links using
>> testlink.py on a volume mounted through a drive letter.
>>
>> This feels like there is a bug in testlink.py (or inside the win32api
>> module) or it might be a problem with mounting a volume through a drive
>> letter in general.
> 
> FWIW, I just installed samba on a recent Debian and:
> 1- testlink.py works on the Debian side: it creates hardlinks and reports the correct link count
> 2- testlink.py manages to create hardlinks from winxp on the share, and reports the correct link count (and a non-zero serial number)

Ok. So you found at least one good case.

testlink.py source code should be ok then.

Probably irrelevant, but what version of Python and pywin32 do you have?

Here (on Windows 7 x64) I have Python 2.6.5, and my pywin32 is "Python
2.6 pywin32-214" (must be build 214, which seems to be the most recent).
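
For anyone who wants to repeat the experiment without pywin32, here is a
minimal sketch of the same two operations (create a hardlink, report the
link count), plus the "pretend there are 2" policy Matt described. This
is not Patrick's testlink.py. It assumes a Python where os.link and
os.stat().st_nlink work on the platform, which on Windows is an
assumption about newer Python versions and not about 2.6. The names
make_link, link_count, effective_link_count and the "trusted" flag are
made up for illustration; how to decide that a volume is trusted is
exactly the open question in this thread.

```python
import os


def make_link(src, dst):
    """Create dst as a hardlink to src (same as 'testlink.py a linka')."""
    os.link(src, dst)


def link_count(path):
    """Return the link count the OS reports for path, or None if the
    query fails (case c in the thread)."""
    try:
        return os.stat(path).st_nlink
    except OSError:
        return None


def effective_link_count(reported, trusted):
    """Apply the conservative policy from the thread.

    reported: value from link_count(), possibly None.
    trusted:  hypothetical flag, True only if we believe the volume
              reports accurate counts (e.g. a local disk).
    """
    if reported is None:
        # Case c: the query failed, force link breaking.
        return 2
    if reported > 1:
        # A count above 1 can only come from a real answer (case a).
        return reported
    if trusted:
        # A count of 1 from a volume we believe is accurate.
        return reported
    # Possible partial information (case b): a count of 1 may be
    # wrong, so pretend there are 2 to force link breaking.
    return 2
```

Running make_link and link_count against a file on a mounted share and
comparing the result with the count seen on the server side should show
whether a given client/server pair falls into case a or case b.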

