Windows: hardlink support is broken on shared drives

Patrick Mézard pmezard at gmail.com
Wed Aug 18 17:29:55 CDT 2010


On 18/08/10 22:34, Adrian Buehlmann wrote:
> On 18.08.2010 20:22, Patrick Mézard wrote:
>> On 18/08/10 20:02, Adrian Buehlmann wrote:
>>> On 18.08.2010 19:39, Matt Mackall wrote:
>>>> On Wed, 2010-08-18 at 11:47 +0200, Patrick Mézard wrote:
>>>>> On 17/08/10 22:34, Matt Mackall wrote:
>>>>>> On Tue, 2010-08-17 at 22:06 +0200, Patrick Mézard wrote:
>>>>>>> On 17/08/10 20:10, Matt Mackall wrote:
>>>>>>>> On Tue, 2010-08-17 at 19:49 +0200, Patrick Mézard wrote:
>>>>>>>>> On 17/08/10 16:32, Matt Mackall wrote:
>>>>>>>>>> See this thread:
>>>>>>>>>>
>>>>>>>>>> http://mercurial.markmail.org/thread/4hvungefkgmq3cum
>>>>>>>>>>
>>>>>>>>>> And this bug:
>>>>>>>>>>
>>>>>>>>>> http://mercurial.selenic.com/bts/issue761
>>>>>>>>>>
>>>>>>>>>> If we remotely commit to a repo that has hardlinks because it was
>>>>>>>>>> locally cloned, we get a mess in the clone. So we either need to:
>>>>>>>>>>
>>>>>>>>>> a) figure out how to reliably get an accurate link count over the wire
>>>>>>>>>>
>>>>>>>>>> or 
>>>>>>>>>>
>>>>>>>>>> b) figure out how to detect we're on a network share and assume
>>>>>>>>>> -everything- is a linked file when committing (slow!)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Even something as drastic as disabling hardlink support is insufficient
>>>>>>>>>> because there will still be repos with hardlinks on many users' systems.
>>>>>>>>>>
>>>>>>>>>> I've upped the priority on this bug to critical - we've gotten several
>>>>>>>>>> reports of it that we've failed to properly identify the cause of.
>>>>>>>>>>
>>>>>>>>>> The interesting code bits are here:
>>>>>>>>>>
>>>>>>>>>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l23
>>>>>>>>>>
>>>>>>>>>> http://www.selenic.com/hg/file/0b84864d1325/mercurial/win32.py#l53
>>>>>>>>>
>>>>>>>>> And the interesting quote from:
>>>>>>>>>
>>>>>>>>>     http://msdn.microsoft.com/en-us/library/aa364952%28VS.85%29.aspx
>>>>>>>>>
>>>>>>>>> "Depending on the underlying network features of the operating system
>>>>>>>>> and the type of server connected to, the GetFileInformationByHandle
>>>>>>>>> function may fail, return partial information, or full information for
>>>>>>>>> the given file."
>>>>>>>>
>>>>>>>> So that gives us three cases:
>>>>>>>>
>>>>>>>> a) works - return real nNumberOfLinks
>>>>>>>> b) partial - pretend there are 2 to force link breaking
>>>>>>>> c) fails - pretend there are 2 to force link breaking
>>>>>>>>
>>>>>>>> The trick is to distinguish case (a) from (b). If nNumberOfLinks > 1,
>>>>>>>> then we can assume it worked. But we'll probably have to look at other
>>>>>>>> fields in the structure to sanity-check it in other cases. But we're
>>>>>>>> going to need people with Windows (ie not me) to do some testing with
>>>>>>>> different setups to figure out if we can actually reliably detect case
>>>>>>>> (b).
>>>>>>>>
>>>>>>>> Step one here is probably to write a test script that people can run to:
>>>>>>>>
>>>>>>>> a) create a hardlink
>>>>>>>> b) report the result details of GetFileInformationByHandle
>>>>>>>
>>>>>>> Test script attached.
>>>>>>>
>>>>>>> It implements two commands:
>>>>>>>
>>>>>>> $ python testlink.py a linka => hardlink linka from a
>>>>>>> $ python testlink.py linka => display linka's link count (=2) and the volume serial number
>>>>>>>
>>>>>>> Here are the link count and volume serial number when testing a local or remote hardlink.
>>>>>>>
>>>>>>> setup                                      links  serial
>>>>>>> -----                                      -----  ------
>>>>>>> WinXP local NTFS                           2      != 0               
>>>>>>> OSX from WinXP through Parallels           1      == 0
>>>>>>> WinXP from Win2003 through RDP             2      != 0
>>>>>>> Win2003 local NTFS                         2      != 0
>>>>>>> Win2003 from Win2003 through mounted dir   1      == 0
>>>>>>
>>>>>> Ok, so if we consider our cases from before:
>>>>>>
>>>>>> a -> works
>>>>>> b -> detectable partial
>>>>>> c -> fail
>>>>>> x -> undetectable partial
>>>>>>
>>>>>> then we've got something like:
>>>>>>
>>>>>> server                    client
>>>>>>                 winxp   2003  2000  vista  ...   
>>>>>> local             a       a
>>>>>> OSX parallels     b
>>>>>> 2003 RDP          a
>>>>>> 2003 mounted              b
>>>>>> Vista
>>>>>> Samba
>>>>>> NetApp
>>>>>> ...
>>>>>>
>>>>>> We probably need to fill in this table a bit more before we can declare
>>>>>> this test works.
>>>>>
>>>>> It seems I fooled myself with testlink.py and misread the result (big
>>>>> surprise). I redid the tests this morning, printing the link count
>>>>> and the serial number, and all cases fail except the local ones,
>>>>> meaning links=1 and serial!=0 when reading network shares.
>>>>
>>>> Ok, so we currently have -no- known cases where we can see hardlinks on
>>>> a networked drive?
>>>>
>>>
>>> I just tried Patrick's testlink.py on a Windows 7 drive mounted on
>>> another, different box running Windows 7 as well (both x64).
>>>
>>> Reports links=1 when it should be 2.
>>>
>>> So not even the -most recent- Windows platform can see links using
>>> testlink.py on a volume mounted through a drive letter.
>>>
>>> This feels like there is a bug in testlink.py (or inside the win32api
>>> module) or it might be a problem with mounting a volume through a drive
>>> letter in general.
>>
>> FWIW, I just installed Samba on a recent Debian and:
>> 1- testlink.py works on the Debian side: it creates hardlinks and reports the correct link count
>> 2- testlink.py manages to create hardlinks from WinXP on the share, and reports the correct link count (and a non-null serial number)
> 
> Ok. So you found at least one good case.
> 
> testlink.py source code should be ok then.
> 
> Probably irrelevant, but what version of Python and pywin32 do you have?
> 
> Here (on Windows 7 x64) I have Python 2.6.5, and my pywin32 is "Python
> 2.6 pywin32-214" (must be build 214, which seems to be the most recent).

Sorry, I cannot really say. It depends on the machine I was testing on, and ranges from Python 2.5.4 with (probably) pywin32 213 to Python 2.6 with pywin32 214.

I have written a simple C program that prints the link count from GetFileInformationByHandle(), and it returns the same results as testlink.py, both when reading the Debian Samba share from WinXP and when reading the Win2003 mounted drive from Win2003.
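
For anyone who wants to repeat the check without pywin32, here is a minimal sketch. It is not the attached testlink.py nor the C program. It assumes Python 3, where os.link() works on Windows and os.stat() fills st_nlink from the link count and, as far as I know, st_dev from the volume serial number. The function names are made up for illustration.

```python
import os


def make_hardlink(src, dst):
    # Create dst as a second directory entry (hardlink) for src.
    os.link(src, dst)


def link_info(path):
    # Return (link count, device id) exactly as the OS reports them.
    # Assumption: on Windows, st_dev carries the volume serial number.
    st = os.stat(path)
    return st.st_nlink, st.st_dev


def report(path):
    # One line per file, in the same spirit as the table in this thread.
    links, serial = link_info(path)
    return "%s links=%d serial=%d" % (path, links, serial)
```

Calling make_hardlink("a", "linka") and then report("linka") on the share under test should show links=2 when the server reports the count correctly.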

I have also tried to detect mounted drives, assuming they would carry the reparse point flag (from GetFileAttributes()), without success. When passed the drive root ("z:\"), it just reports a plain directory.
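
The attribute check can be sketched as follows. This is only an illustration of the flag test, assuming Python 3.5 or later on Windows, where os.stat() exposes the attribute bits as st_file_attributes. The two constants are the documented Win32 values; the function names are hypothetical.

```python
import os

# Win32 file attribute bits (values from the Windows headers).
FILE_ATTRIBUTE_DIRECTORY = 0x10
FILE_ATTRIBUTE_REPARSE_POINT = 0x400


def describe_attributes(attrs):
    # Decode the two attribute bits that matter for this check.
    names = []
    if attrs & FILE_ATTRIBUTE_DIRECTORY:
        names.append("directory")
    if attrs & FILE_ATTRIBUTE_REPARSE_POINT:
        names.append("reparse point")
    return names


def path_attributes(path):
    # st_file_attributes only exists on Windows; return None elsewhere.
    return getattr(os.stat(path), "st_file_attributes", None)
```

For the mapped drive root described above, describe_attributes() would give only ["directory"], which is why this approach does not identify the share.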

And I have posted this out of despair:

    http://stackoverflow.com/questions/3517175/detecting-if-path-is-on-a-windows-mapped-network-drive
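
One candidate that has not been tested in this thread is the Win32 GetDriveType() call, which returns DRIVE_REMOTE (4) for network drives. The sketch below assumes that call is reliable for mapped drive letters and treats UNC paths as remote without asking the OS. The helper names are hypothetical, and on non-Windows systems the drive query is skipped.

```python
import ctypes
import ntpath
import sys

DRIVE_REMOTE = 4  # GetDriveTypeW result for a network drive


def drive_root(path):
    # Root in the form GetDriveTypeW expects: drive part plus a backslash.
    drive, _ = ntpath.splitdrive(path)
    return drive + "\\" if drive else ""


def is_unc(path):
    # UNC paths have a drive part of the form \\server\share.
    drive, _ = ntpath.splitdrive(path.replace("/", "\\"))
    return drive.startswith("\\\\")


def is_network_path(path):
    if is_unc(path):
        return True
    if sys.platform != "win32":
        return False  # no drive letters to query on this platform
    root = drive_root(ntpath.abspath(path))
    return ctypes.windll.kernel32.GetDriveTypeW(root) == DRIVE_REMOTE
```

If this holds up on the setups listed earlier, it would support option (b): assume every file is linked whenever is_network_path() is true.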

--
Patrick Mézard


